Re: [R] Sorting based a custom sorting function
In the spirit of 'advent of code', maybe it is better to exploit the features of the particular language you've chosen? Then the use of factors seems very relevant.

    value_levels <- c("Small", "Medium", "Large")
    df <- data.frame(
        person = c("Alice", "Bob", "Bob", "Charlie"),
        value = factor(
            c("Medium", "Large", "Small", "Large"),
            levels = value_levels
        )
    )
    df[with(df, order(person, value)),]

Likely this is more efficient than the hints of your existing solution, because it will act on vectors rather than iterating through individual elements of the 'person' and 'value' vectors.

For a more general solution, I don't think I'd follow the low-level approach Duncan suggests (maybe see also ?Math for S3 generics), but rather define a class (e.g., that requires vectors person and value) and implement a corresponding `xtfrm()` method.

Have fun with the remainder of the advent!

Another Martin

From: R-help on behalf of Martin Møller Skarbiniks Pedersen
Date: Thursday, December 14, 2023 at 6:42 AM
To: R mailing list
Subject: Re: [R] Sorting based a custom sorting function

On Thu, 14 Dec 2023 at 12:02, Duncan Murdoch wrote:
>
> class(df$value) <- "sizeclass"
>
> `>.sizeclass` <- function(left, right) custom_sort(unclass(left),
> unclass(right)) == 1
>
> `==.sizeclass` <- function(left, right) custom_sort(unclass(left),
> unclass(right)) == 0
>
> `[.sizeclass` <- function(x, i) structure(unclass(x)[i], class="sizeclass")
>
> df[order(df$value),]
>
> All the "unclass()" calls are needed to avoid infinite recursion. For a
> more complex kind of object where you are extracting attributes to
> compare, you probably wouldn't need so many of those.

Great! Just what I need. I will create a class and overwrite > and ==.
I didn't know that order() used these exact methods.
My best solution was something like this:

    quicksort <- function(arr, compare_func) {
        if (length(arr) <= 1) {
            return(arr)
        } else {
            pivot <- arr[[1]]
            less <- arr[-1][compare_func(arr[-1], pivot) <= 0]
            greater <- arr[-1][compare_func(arr[-1], pivot) > 0]
            return(c(quicksort(less, compare_func), pivot, quicksort(greater, compare_func)))
        }
    }

    persons <- c("alfa", "bravo", "charlie", "delta", "echo", "foxtrot", "golf",
                 "hotel", "india", "juliett", "kilo", "lima", "mike", "november",
                 "oscar", "papa", "quebec", "romeo", "sierra", "tango", "uniform",
                 "victor", "whiskey", "x-ray", "yankee", "zulu")

    quicksort(persons, function(left, right) {
        nchar(left) - nchar(right)
    })

Regards
Martin

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
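The `xtfrm()` route suggested earlier in this thread can be sketched as follows. This is a minimal illustration under stated assumptions, not code from any of the posters: the `sizeclass()` constructor and the `xtfrm.sizeclass()` method are hypothetical names, and the intended ordering is assumed to be Small < Medium < Large. Because `order()` calls `xtfrm()` on classed (non-factor) arguments, one method that maps values to numeric sort keys is enough for this use.

```r
value_levels <- c("Small", "Medium", "Large")

# Hypothetical constructor: tag a character vector with class "sizeclass".
sizeclass <- function(x) structure(x, class = "sizeclass")

# xtfrm() returns a numeric vector that sorts in the desired order;
# here the key is the position of each value in 'value_levels'.
xtfrm.sizeclass <- function(x) match(unclass(x), value_levels)

person <- c("Alice", "Bob", "Bob", "Charlie")
sizes <- sizeclass(c("Medium", "Large", "Small", "Large"))

# order() dispatches to xtfrm.sizeclass() for the classed argument.
ord <- order(person, sizes)
data.frame(person = person, value = unclass(sizes))[ord, ]
```

Compared with defining `>`, `==` and `[` methods, this keeps the work vectorised: the comparison logic runs once over the whole vector rather than once per pair of elements.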
Re: [R] R does not run under latest RStudio
Do you mean here https://community.rstudio.com/ ? There seems to be quite a bit of activity. Here's a very similar post to yours from the last day

https://community.rstudio.com/t/latest-rstudio-version-did-not-launch-appropriately-in-my-computer/163585

with a response from an RStudio / Posit employee.

Martin Morgan

From: R-help on behalf of Steven Yen
Date: Thursday, April 6, 2023 at 3:20 PM
To: Uwe Ligges
Cc: R-help Mailing List, Steven T. Yen
Subject: Re: [R] R does not run under latest RStudio

The RStudio list generally does not respond to free version users. I was hoping someone on this (R) list would be kind enough to help me.

Steven from iPhone

> On Apr 6, 2023, at 6:22 PM, Uwe Ligges wrote:
>
> No, but you need to ask on an RStudio mailing list.
> This one is about R.
>
> Best,
> Uwe Ligges
>
>> On 06.04.2023 11:28, Steven T. Yen wrote:
>> I updated to latest RStudio (RStudio-2023.03.0-386.exe) but
>> R would not run. Error message:
>> Error Starting R
>> The R session failed to start.
>> RSTUDIO VERSION
>> RStudio 2023.03.0+386 "Cherry Blossom" (3c53477a, 2023-03-09) for Windows
>> [No error available]
>> I also tried RStudio 2022.12.0+353 --- same problem.
>> I then tried another older version of RStudio (not sure version
>> as I changed file name by accident) and R ran.
>> Any clues? Please help. Thanks.
Re: [R] How to access source code
    showMethods(LGD, includeDefs = TRUE)

shows the implementation of all methods on the LGD generic, and can be a useful fast track to getting an overview of what is going on.

Martin Morgan

From: R-help on behalf of Ivan Krylov
Date: Thursday, December 8, 2022 at 11:23 AM
To: Christofer Bogaso
Cc: r-help
Subject: Re: [R] How to access source code

On Thu, 8 Dec 2022 20:56:12 +0530 Christofer Bogaso wrote:

> > showMethods(LGD)
> > Function: LGD (package GCPM)
> > this="GCPM"

Almost there! Try getMethod(LGD, signature = 'GCPM'). Not sure if this is going to work as written, but if you need to see an S4 method definition, getMethod is the way.

--
Best regards,
Ivan
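To try both suggestions without installing GCPM, the following self-contained sketch defines a toy S4 generic and method and then inspects them. The class `Portfolio` and the generic `lgd_demo()` are hypothetical stand-ins for GCPM's class and its `LGD` generic; only `showMethods()`, `getMethod()` and the other calls from the methods package are real.

```r
library(methods)

# Toy stand-ins (hypothetical): "Portfolio" plays the role of the GCPM class,
# lgd_demo() plays the role of the LGD generic.
setClass("Portfolio", representation(loss = "numeric"))
setGeneric("lgd_demo", function(this) standardGeneric("lgd_demo"))
setMethod("lgd_demo", "Portfolio", function(this) this@loss)

# Overview: every method defined on the generic, including its definition.
showMethods("lgd_demo", includeDefs = TRUE)

# One specific method, selected by its signature.
m <- getMethod("lgd_demo", signature = "Portfolio")
m

# Just the body of the method's function.
body(m)
```

With the real package loaded, the same two calls would presumably be applied to `LGD` and the signature `"GCPM"`, as in the replies above.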
Re: [R] interval between specific characters in a string...
You could split the string into letters and figure out which ones are 'b'

    which(strsplit(x, "")[[1]] == "b")

and then find the difference between each position, 'anchoring' at position 0

    > diff(c(0, which(strsplit(x, "")[[1]] == "b")))
    [1] 2 4 1 6 4

From: R-help on behalf of Evan Cooch
Date: Friday, December 2, 2022 at 6:56 PM
To: r-help@r-project.org
Subject: [R] interval between specific characters in a string...

Was wondering if there is an 'efficient/elegant' way to do the following (without tidyverse). Take a string

abaaabbabaaab

It's easy enough to count the number of times the character 'b' shows up in the string, but...what I'm looking for is outputting the 'intervals' between occurrences of 'b' (starting the counter at the beginning of the string). So, for the preceding example, 'b' shows up in positions

2, 6, 7, 13, 17

So, the interval data would be: 2, 4, 1, 6, 4

My main approach has been to simply output positions (say, something like unlist(gregexpr('b', target_string))), and 'do the math' between successive positions. Can anyone suggest a more elegant approach?

Thanks in advance...
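The approach in the reply can be wrapped in a small function. This is only a sketch: the name `char_intervals()` is made up for illustration. Note also that the string as it appears in the question has 13 characters while the stated positions run up to 17, so the example below assumes a 17-character string chosen to reproduce the stated positions 2, 6, 7, 13, 17.

```r
# Hypothetical helper: gaps between successive occurrences of a single
# character 'ch' in the string 'x', counting from the start of the string.
char_intervals <- function(x, ch) {
    pos <- which(strsplit(x, "")[[1]] == ch)  # positions of 'ch'
    diff(c(0L, pos))                          # anchor the counter at position 0
}

# Assumed input: reproduces the positions given in the question.
x <- "abaaabbaaaaabaaab"
char_intervals(x, "b")
# [1] 2 4 1 6 4
```

If the character does not occur at all, `which()` returns an empty vector and the function returns `integer(0)` rather than failing.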
Re: [R] Partition vector of strings into lines of preferred width
    > strwrap(text)
    [1] "What is the best way to split/cut a vector of strings into lines of"
    [2] "preferred width? I have come up with a simple solution, albeit naive,"
    [3] "as it involves many arithmetic divisions. I have an alternative idea"
    [4] "which avoids this problem. But I may miss some existing functionality!"

Maybe used as

    > strwrap(text) |> paste(collapse = "\n") |> cat("\n")
    What is the best way to split/cut a vector of strings into lines of
    preferred width? I have come up with a simple solution, albeit naive,
    as it involves many arithmetic divisions. I have an alternative idea
    which avoids this problem. But I may miss some existing functionality!
    >

?

From: R-help on behalf of Leonard Mada via R-help
Date: Friday, October 28, 2022 at 5:42 PM
To: R-help Mailing List
Subject: [R] Partition vector of strings into lines of preferred width

Dear R-Users,

    text = "
    What is the best way to split/cut a vector of strings into lines of preferred width?
    I have come up with a simple solution, albeit naive, as it involves many arithmetic divisions.
    I have an alternative idea which avoids this problem.
    But I may miss some existing functionality!"

    # Long vector of strings:
    str = strsplit(text, " |(?<=\n)", perl=TRUE)[[1]];
    lenWords = nchar(str);

    # simple, but naive solution:
    # - it involves many divisions;
    cut.character.int = function(n, w) {
        ncm = cumsum(n);
        nwd = ncm %/% w;
        count = rle(nwd)$lengths;
        pos = cumsum(count);
        posS = pos[ - length(pos)] + 1;
        posS = c(1, posS);
        pos = rbind(posS, pos);
        return(pos);
    }

    npos = cut.character.int(lenWords, w=30);

    # lets print the results;
    for(id in seq(ncol(npos))) {
        len = npos[2, id] - npos[1, id];
        cat(str[seq(npos[1, id], npos[2, id])], c(rep(" ", len), "\n"));
    }

The first solution performs an arithmetic division on all string lengths. It is possible to find out the total length and divide only the last element of the cumsum. Something like this should work (although it is not properly tested).
    w = 30;
    cumlen = cumsum(lenWords);
    max = tail(cumlen, 1) %/% w + 1;
    pos = cut(cumlen, seq(0, max) * w);
    count = rle(as.numeric(pos))$lengths;
    # everything else is the same;
    pos = cumsum(count);
    posS = pos[ - length(pos)] + 1;
    posS = c(1, posS);
    pos = rbind(posS, pos);
    npos = pos;
    # then print

The cut() may be optimized as well, as the cumsum is sorted ascending. I did not evaluate the efficiency of the code either.

But do I miss some existing functionality?

Note:
- technically, the cut() function should probably return a vector of indices (something like: rep(seq_along(count), count)), but it was more practical to have both the start and end positions.

Many thanks,

Leonard
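For the target width of 30 used in the question, the existing functionality pointed to in the reply can be called with its `width` argument. A minimal sketch, assuming a shortened version of the question's text as input; `strwrap()` and its `width` argument are base R, and the variable names are illustrative only.

```r
# Shortened sample text (assumption: taken from the question above).
text <- paste(
    "What is the best way to split/cut a vector of strings into lines of",
    "preferred width? I have come up with a simple solution, albeit naive,",
    "as it involves many arithmetic divisions."
)

# Wrap at word boundaries; each output line is shorter than 'width'
# as long as no single word is longer than that.
wrapped <- strwrap(text, width = 30)

# One element per output line.
cat(wrapped, sep = "\n")

# Line lengths, to check against the requested width.
nchar(wrapped)
```

Unlike the cumsum-based approaches, `strwrap()` restarts the count at each line break, so a word that does not fit on the current line moves whole to the next line.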
Re: [R] Handling dependencies on Bioconductor packages for packages on CRAN
One possibility is to make graph a Suggests: dependency, and preface any code using it (or, e.g., in an .onLoad function) with

    if (!requireNamespace("graph", quietly = TRUE))
        stop(
            "install the Bioconductor 'graph' package using these commands\n\n",
            ## standard Bioconductor package installation instructions
            "  if (!requireNamespace('BiocManager', quietly = TRUE))\n",
            "      install.packages('BiocManager')\n",
            "  BiocManager::install('graph')\n\n"
        )

Use graph:: for any function used in the graph package. The code could be simplified if BiocManager were an Imports: dependency of your package -- it would already be installed.

The 'Suggests:' dependency would not cause problems with CRAN, because Suggest'ed packages are available when the package is built / checked. The user experience of package installation would be 'non-standard' (didn't I just install gRbase??), so this is not an ideal solution.

Martin

On 12/4/21, 10:55 AM, "R-help on behalf of Søren Højsgaard" wrote:

Dear all

My gRbase package imports the packages from Bioconductor: graph, RBGL and Rgraphviz

If these packages are not installed, then gRbase can not be installed. The error message is:

    ERROR: dependency ‘graph’ is not available for package ‘gRbase’

If I, prior to installation, run setRepositories and highlight 'BioC software', then gRbase installs as it should, because the graph package from Bioconductor is installed along with it. However, this extra step is an obstacle to many users of the package which means that either people do not use the package or people ask questions about this issue on stack overflow, R-help, by email to me etc. It is not a problem to get the package on CRAN because, I guess, the CRAN check machines already have the three bioconductor packages installed.

Therefore, I wonder if there is a way of specifying, in the DESCRIPTION file or elsewhere, that these packages should be installed automatically from bioconductor.
An alternative would be if one could force the error message

    ERROR: dependency ‘graph’ is not available for package ‘gRbase’

to be accompanied by a message about what the user then should do.

Any good suggestions? Thanks in advance.

Best regards
Søren
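The check suggested in the reply could be factored into one helper so that every function needing a suggested Bioconductor package gives the same instructions. This is a sketch under assumptions: `require_bioc()` and `node_count()` are hypothetical names, not part of gRbase or BiocManager, and `graph::nodes()` is used only as an example of a namespaced call into the graph package.

```r
# Hypothetical helper: stop with installation instructions when a
# suggested Bioconductor package is not installed.
require_bioc <- function(pkg) {
    if (!requireNamespace(pkg, quietly = TRUE))
        stop(
            "install the Bioconductor '", pkg, "' package using these commands\n\n",
            "  if (!requireNamespace('BiocManager', quietly = TRUE))\n",
            "      install.packages('BiocManager')\n",
            "  BiocManager::install('", pkg, "')\n\n",
            call. = FALSE
        )
    invisible(TRUE)
}

# Hypothetical package function that depends on the suggested package:
# check first, then call into it with the graph:: prefix.
node_count <- function(g) {
    require_bioc("graph")
    length(graph::nodes(g))
}
```

The trade-off noted in the reply still applies: the package installs cleanly without the Bioconductor dependency, but the user only learns about the missing package when they first call a function that needs it.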
Re: [R] How to create a proper S4 class?
From my example,

    as(employees, "People")

more general coercion is possible; see the documentation ?setAs.

From your problem description I would have opted for the solution that you now have, with two slots rather than inheritance. Inheritance has a kind of weird contract when using another package, where you're agreeing to inherit whatever generics and methods are / will be defined on the object, by any package loaded by the user; a usual benefit of object-oriented programming is better control over object type, and this contract seems to totally undermine that.

I actually don't know how to navigate the mysterious error; the secret is somewhere in the cbind / cbind2 documentation, but this seems quite opaque to me.

Martin

On 11/17/21, 8:00 PM, "Leonard Mada" wrote:

Dear Martin,

thank you very much for the guidance.

Ultimately, I got it running. But, for mysterious reasons, it was challenging:
- I skipped for now the inheritance (and used 2 explicit non-inherited slots): this is still unresolved; [*]
- the code is definitely cleaner;

[*] Mysterious errors, like:
"Error in cbind(deparse.level, ...) :
cbind for agentMatrix is only defined for 2 agentMatrices"

One last question pops up: If B inherits from A, how can I down-cast back to A?

    b = new("B", someA);
    ??? as.A(b) ???

Is there a direct method? I could not explore this, as I am still struggling with the inheritance. The information may be useful, though: it helps in deciding the design of the data-structures. [Actually, all base-methods should work natively as well - but to have a solution in any case.]

Sincerely,

Leonard

On 11/17/2021 5:48 PM, Martin Morgan wrote:
> Hi Leonard --
>
> Remember that a class can have 'has a' and 'is a' relationships.
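The coercion mentioned at the top of this reply, going from a subclass instance back to its parent class, can be tried with the People / Employees classes from the earlier message. A minimal sketch reusing those definitions; nothing here is specific to agentMatrix, and whether the same call behaves well for that class is not something this example can confirm.

```r
library(methods)

# Class definitions as in the earlier message of this thread.
.People <- setClass(
    "People",
    slots = c(name = "character", age = "numeric")
)
.Employees <- setClass(
    "Employees",
    contains = "People",
    slots = c(years_of_employment = "numeric", job_title = "character")
)

employees <- .Employees(
    .People(name = c("Simon", "Andre"), age = c(30, 60)),
    years_of_employment = c(3, 30),
    job_title = c("hard worker", "manager")
)

# Coerce to the parent class: the result has class "People" and keeps
# only the slots that "People" defines.
people <- as(employees, "People")
class(people)
slotNames(people)

# The original object still counts as a "People" via inheritance.
is(employees, "People")
```

For conversions between classes that are not related by `contains =`, a coercion method would have to be supplied explicitly; see ?setAs as noted in the reply.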
Re: [R] How to create a proper S4 class?
Hi Leonard --

Remember that a class can have 'has a' and 'is a' relationships. For instance, a People class might HAVE slots name and age

    .People <- setClass(
        "People",
        slots = c(name = "character", age = "numeric")
    )

while an Employees class might be described as an 'is a' relationship -- all employees are people -- while also having slots like years_of_employment and job_title

    .Employees <- setClass(
        "Employees",
        contains = "People",
        slots = c(years_of_employment = "numeric", job_title = "character")
    )

I've used .People and .Employees to capture the return value of setClass(), and these can be used as constructors

    people <- .People(
        name = c("Simon", "Andre"),
        age = c(30, 60)
    )

    employees = .Employees(
        people, # unnamed arguments are class(es) contained in 'Employees'
        years_of_employment = c(3, 30),
        job_title = c("hard worker", "manager")
    )

I would not suggest using attributes in addition to slots. Rather, embrace the paradigm and represent attributes as additional slots. In practice it is often helpful to write a constructor function that might transform between formats useful for users to formats useful for programming, and that can be easily documented.

    Employees <-
        function(name, age, years_of_employment, job_title)
    {
        ## implement sanity checks here, or in validity methods
        people <- .People(name = name, age = age)
        .Employees(people, years_of_employment = years_of_employment, job_title = job_title)
    }

plot() and lines() are both S3 generics, and the rules for S3 generics using S4 objects are described in the help page ?Methods_for_S3. Likely you will want to implement a show() method; show() is an S4 generic, so see ?Methods_Details. Typically this should use accessors rather than relying on direct slot access, e.g.,

    person_names <- function(x) x@name
    employee_names <- person_names

The next method implemented is often the [ (single bracket subset) function; this is relatively complicated to get right, but worth exploring.
I hope that gets you a little further along the road.

Martin Morgan

On 11/16/21, 11:34 PM, "R-help on behalf of Leonard Mada via R-help" wrote:

Dear List-Members,

I want to create an S4 class with 2 data slots, as well as a plot and a line method.

Unfortunately I lack any experience with S4 classes. I have put together some working code - but I presume that it is not the best way to do it. The actual code is also available on Github (see below).

1.) S4 class
- should contain 2 data slots:
  Slot 1: the agents:
    = agentMatrix class (defined externally, NetlogoR S4 class);
  Slot 2: the path traveled by the agents:
    = a data frame: (x, y, id);
- my current code: defines only the agents ("t");

    setClass("agentsWithPath", contains = c(t="agentMatrix"));

1.b.) Attribute with colors specific for each agent
- should be probably an attribute attached to the agentMatrix and not a proper data slot;
Note:
- it is currently an attribute on the path data.frame, but I will probably change this once I get the S4 class properly implemented;
- the agentMatrix does NOT store the colors (which are stored in another class - but it is useful to have this information available with the agents);

2.) plot & line methods for this class

    plot.agentsWithPath;
    lines.agentsWithPath;

I somehow got stuck with the S4 class definition. Though it may be a good opportunity to learn about S4 classes (and it is probably better suited as an S4 class than polynomials).

The GitHub code draws the agents, but was somehow hacked together. For anyone interested:
https://github.com/discoleo/R/blob/master/Stat/ABM.Models.Particles.R

Many thanks,

Leonard
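The show(), accessor, validity and single-bracket suggestions from the reply can be put together for the People class. This is a sketch under assumptions rather than a complete design: the accessor `person_ages()` and the specific validity rule are illustrative additions, and the `[` method shown here handles only the simple `x[i]` case for People itself.

```r
library(methods)

.People <- setClass(
    "People",
    slots = c(name = "character", age = "numeric")
)

# Validity method (assumed rule): both slots describe the same people.
setValidity("People", function(object) {
    if (length(object@name) != length(object@age))
        "'name' and 'age' must have the same length"
    else
        TRUE
})

# Accessors, so that other code does not depend on slot names.
person_names <- function(x) x@name
person_ages <- function(x) x@age    # hypothetical companion accessor

# show() is what gets called when an object is printed at the prompt.
setMethod("show", "People", function(object) {
    cat("People of length", length(person_names(object)), "\n")
    cat("  name:", person_names(object), "\n")
    cat("  age: ", person_ages(object), "\n")
})

# Single-bracket subset: subset every slot consistently, then rebuild.
setMethod("[", "People", function(x, i, j, ..., drop = TRUE) {
    .People(name = person_names(x)[i], age = person_ages(x)[i])
})

people <- .People(name = c("Simon", "Andre"), age = c(30, 60))
people[2]
```

A subclass such as Employees would need its own `[` method (or a more general one) so that its additional slots are subset as well; that is part of what makes this method harder to get right than it first appears.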
Re: [R] RSQLite slowness
https://support.bioconductor.org and the community slack (sign up at https://bioc-community.herokuapp.com/ ) as well as the general site https://bioconductor.org . Actually your question sounds like a SQLite question -- JOIN a table, versus parameterized query. One could perhaps construct the relevant example at the sqlite command line?

Martin Morgan

On 10/6/21, 2:50 PM, "R-help" wrote:

Thank you Bert, I set up a new thread on BioStars [1]. So far, I'm a bit unfamiliar with Bioconductor (but will hopefully attend a course about it in November, which I'm kinda hyped about), other than installing and updating R packages using BiocManager.

Did you think of something else than BioStars.org when saying "Bioconductor?"

The question could be viewed as gene related, but I think it is really about how one can handle large tsv files more easily than with sqlite, and why that parser thing is so slow ... I think this is more like a core R thing than a gene related question ...

[1] https://www.biostars.org/p/9492486/
Re: [R] inverse of the methods function
    > methods(class = "lm")
     [1] add1           alias          anova          case.names     coerce
     [6] confint        cooks.distance deviance       dfbeta         dfbetas
    [11] drop1          dummy.coef     effects        extractAIC     family
    [16] formula        hatvalues      influence      initialize     kappa
    [21] labels         logLik         model.frame    model.matrix   nobs
    [26] plot           predict        print          proj           qr
    [31] residuals      rstandard      rstudent       show           simulate
    [36] slotsFromS3    summary        variable.names vcov
    see '?methods' for accessing help and source code

Martin Morgan

On 5/3/21, 6:34 PM, "R-help on behalf of Therneau, Terry M., Ph.D. via R-help" wrote:

Is there a complement to the methods function, that will list all the defined methods for a class? One solution is to look directly at the NAMESPACE file, for the package that defines it, and parse out the entries. I was looking for something built-in, i.e., easier.

--
Terry M Therneau, PhD
Department of Health Science Research
Mayo Clinic
thern...@mayo.edu

"TERR-ree THUR-noh"
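The two directions of lookup can be compared side by side. A minimal sketch using only `methods()` from the utils package; the variable names are illustrative, and the exact set of methods listed depends on which packages are attached in the session.

```r
# By class: every method (S3 and S4) that applies to objects of class "lm".
m_class <- methods(class = "lm")
m_class

# By generic: every class-specific method of one generic function.
m_generic <- methods("summary")
m_generic

# The result also carries an "info" attribute with details such as the
# generic each method belongs to and where it was found.
head(attr(m_class, "info"))
```

Because the listing is built from what is currently visible in the session, loading another package that defines methods for "lm" would presumably lengthen the first result.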
Re: [R] parallel: socket connection behind a NAT router
A different approach uses doRedis https://CRAN.R-project.org/package=doRedis (currently archived, but actively developed) for use with the foreach package, or RedisParam https://github.com/mtmorgan/RedisParam (not released) for use with Bioconductor's BiocParallel package.

These use a redis server https://redis.io/ to communicate -- the manager submits jobs / obtains results from the redis server, the workers retrieve jobs / submit results to the redis server. Manager and worker need to know the (http) address of the server, etc, but there are no other ports involved.

Redis servers are easy to establish in a cloud environment, using e.g., existing AWS or docker images. The README for doRedis https://github.com/bwlewis/doRedis probably provides the easiest introduction.

The (not mature) k8sredis Kubernetes / helm chart https://github.com/Bioconductor/k8sredis illustrates a complete system using RedisParam, deploying manager and workers locally or in the google cloud; the app could be modified to only start the workers in the cloud, exposing the redis server for access by a local 'manager'; this would be cool.

Martin

On 1/19/21, 1:50 AM, "R-help on behalf of Henrik Bengtsson" wrote:

On Mon, Jan 18, 2021 at 9:42 PM Jiefei Wang wrote:
>
> Thanks for introducing this interesting package to me! It is great to know a new powerful tool, but it seems like this method does not work in my environment. `parallelly::makeClusterPSOCK` will hang until timeout.
>
> I checked the verbose output and it looks like the parallelly package also depends on `parallel:::.slaveRSOCK` on the remote instance to build the connection. This explains why it failed for the local machine does not have a public IP and the remote does not know how to build the connection.

It's correct that the worker does attempt to connect back to the parent R process that runs on your local machine.
However, it does *not* do so by your local machine's public IP address but it does it by connecting to a port on its own machine - a port that was set up by SSH. More specifically, when parallelly::makeClusterPSOCK() connects to the remote machine over SSH it also sets up a so-called reverse SSH tunnel between a certain port on your local machine and a certain port on your remote machine. This is what happens:

    > cl <- parallelly::makeClusterPSOCK("machine1.example.org", verbose=TRUE)
    [local output] Workers: [n = 1] 'machine1.example.org'
    [local output] Base port: 11019
    ...
    [local output] Starting worker #1 on 'machine1.example.org': '/usr/bin/ssh' -R 11068:localhost:11068 machine1.example.org "'Rscript' --default-packages=datasets,utils,grDevices,graphics,stats,methods -e 'workRSOCK <- tryCatch(parallel:::.slaveRSOCK, error=function(e) parallel:::.workRSOCK); workRSOCK()' MASTER=localhost PORT=11068 OUT=/dev/null TIMEOUT=2592000 XDR=FALSE"
    [local output] - Exit code of system() call: 0
    [local output] Waiting for worker #1 on 'machine1.example.org' to connect back '/usr/bin/ssh' -R 11019:localhost:11019 machine1.example.org "'Rscript' --default-packages=datasets,utils,grDevices,graphics,stats,methods -e 'workRSOCK <- tryCatch(parallel:::.slaveRSOCK, error=function(e) parallel:::.workRSOCK); workRSOCK()' MASTER=localhost PORT=11019 OUT=/dev/null TIMEOUT=2592000 XDR=FALSE"

All the magic is in that SSH option '-R 11068:localhost:11068', which allows the parent R process on your local machine to communicate with the remote worker R process on its own port 11068, and vice versa, the worker R process will communicate with the parent R process as if it was running on MASTER=localhost PORT=11068. Basically, for all that the worker R process knows, the parent R process runs on the same machine as itself.
You haven't said what operating system you're running on your local machine, but if it's MS Windows, know that the 'ssh' client that comes with Windows 10 has some bugs in its reverse tunneling. See ?parallelly::makeClusterPSOCK for lots of details. You also haven't said what OS the cloud workers run, but I assume it's Linux.

So, my guess on your setup is, the above "should work" for you. For your troubleshooting, you can also set argument outfile=NULL. Then you'll also see output from the worker R process. There are additional troubleshooting suggestions in Section 'Failing to set up remote workers' of ?parallelly::makeClusterPSOCK that will help you figure out what the problem is.

>
> I see in README the package states it works with "remote clusters without knowing public IP". I think this might be where the confusion is, it may mean the remote machine does not have a public IP, but the server machine does. I'm in the opposite situation, the server does not have a public IP, but the remote does. I'm not sure if t
Re: [R] error in installing limma
Show the entire command and output. I have

    > BiocManager::install("limma")
    Bioconductor version 3.12 (BiocManager 1.30.10), R 4.0.3 (2020-10-10)
    Installing package(s) 'limma'
    trying URL 'https://bioconductor.org/packages/3.12/bioc/src/contrib/limma_3.46.0.tar.gz'
    Content type 'application/x-gzip' length 1527170 bytes (1.5 MB)
    ==
    downloaded 1.5 MB

    * installing *source* package ‘limma’ ...
    ** using staged installation
    ** libs
    gcc -I"/usr/local/lib/R/include" -DNDEBUG -I/usr/local/include -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c init.c -o init.o
    gcc -I"/usr/local/lib/R/include" -DNDEBUG -I/usr/local/include -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c normexp.c -o normexp.o
    gcc -I"/usr/local/lib/R/include" -DNDEBUG -I/usr/local/include -fpic -g -O2 -fstack-protector-strong -Wformat -Werror=format-security -Wdate-time -D_FORTIFY_SOURCE=2 -g -c weighted_lowess.c -o weighted_lowess.o
    gcc -shared -L/usr/local/lib/R/lib -L/usr/local/lib -o limma.so init.o normexp.o weighted_lowess.o -L/usr/local/lib/R/lib -lR
    installing to /usr/local/lib/R/site-library/00LOCK-limma/00new/limma/libs
    ** R
    ** inst
    ** byte-compile and prepare package for lazy loading
    ** help
    *** installing help indices
    ** building package indices
    ** installing vignettes
    ** testing if installed package can be loaded from temporary location
    ** checking absolute paths in shared objects and dynamic libraries
    ** testing if installed package can be loaded from final location
    ** testing if installed package keeps a record of temporary installation path
    * DONE (limma)

From: Ayushi Dwivedi
Date: Wednesday, December 23, 2020 at 12:52 AM
To: Martin Morgan
Cc: "r-help@r-project.org", "r-help-requ...@r-project.org"
Subject: Re: [R] error in installing limma

hey..
I used this command to install limma but after running sometime it terminated with error "installation of package ‘limma’ had non-zero exit status". if (!requireNamespace("BiocManager", quietly = TRUE)) + install.packages("BiocManager") > BiocManager::install("limma") Ayushi Dwivedi Ph.D. Scholar Dept. of Biotechnology & Bioinformatics, School of Life Sciences, University of Hyderabad, Hyderabad - 500046 ( India ). Phone No. :- +91 - 8858037252 Email Id :- mailto:ayushi.crea...@gmail.com On Wed, Dec 23, 2020 at 12:21 AM Martin Morgan <mailto:mtmorgan.b...@gmail.com> wrote: limma is a Bioconductor package so you should use https://support.bioconductor.org I'd guess that you've trimmed your screen shot just after the informative information. Just copy and paste as plain text the entire output of your installation attempt. Presumably you are using standard practices documented on, e.g., https://bioconductor.org/packages/limma to install packages BiocManager::install("limma") Martin Morgan On 12/22/20, 1:11 PM, "R-help on behalf of Ayushi Dwivedi" <mailto:r-help-boun...@r-project.org on behalf of mailto:ayushi.crea...@gmail.com> wrote: Good afternoon Sir, With due respect I want to convey that while installing limma package in R, I am getting the error message, not just limma If I am installing any package in R like biomaRt the same error message is coming it is terminating with "installation of package ‘limma’ had non-zero exit status". Hereby, I am attaching the screenshot of the error. Kindly, go through it. I shall be highly obliged. *Ayushi Dwivedi* *Ph.D. Scholar* *Dept. of Biotechnology & Bioinformatics,* School of Life Sciences, University of Hyderabad, Hyderabad - 500046 ( India ). Phone No. 
:- +91 - 8858037252 Email Id :- mailto:ayushi.crea...@gmail.com* <mailto:swapnilkr...@gmail.com>** <mailto:swapnil...@yahoo.com>* __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] error in installing limma
limma is a Bioconductor package so you should use https://support.bioconductor.org I'd guess that you've trimmed your screen shot just after the informative information. Just copy and paste as plain text the entire output of your installation attempt. Presumably you are using standard practices documented on, e.g., https://bioconductor.org/packages/limma to install packages BiocManager::install("limma") Martin Morgan On 12/22/20, 1:11 PM, "R-help on behalf of Ayushi Dwivedi" wrote: Good afternoon Sir, With due respect I want to convey that while installing limma package in R, I am getting the error message, not just limma If I am installing any package in R like biomaRt the same error message is coming it is terminating with "installation of package ‘limma’ had non-zero exit status". Hereby, I am attaching the screenshot of the error. Kindly, go through it. I shall be highly obliged. *Ayushi Dwivedi* *Ph.D. Scholar* *Dept. of Biotechnology & Bioinformatics,* School of Life Sciences, University of Hyderabad, Hyderabad - 500046 ( India ). Phone No. :- +91 - 8858037252 Email Id :- ayushi.crea...@gmail.com* ** * __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Inappropriate color name
Lainey wishes to report a bug, so should see ?bug.report. Mail sent to R-core will be held for moderator approval, and relevant input or ultimate resolution would not be visible to the wider community; it is not a good place to report bugs. Martin Morgan On 11/16/20, 4:48 PM, "R-help on behalf of Mitchell Maltenfort" wrote: r-c...@r-project.org. would be the first stop. On Mon, Nov 16, 2020 at 4:37 PM Lainey Gallenberg < laineygallenb...@gmail.com> wrote: > Whether or not you agree with my reason for doing so, my question was how > to contact the creator of the "colors" function. If you do not have advice > on this, please refrain from weighing in. > > On Mon, Nov 16, 2020 at 12:03 PM Bert Gunter > wrote: > > > WIth all due respect, can we end this thread NOW. This is not a forum to > > discuss social or political viewpoints. I consider it a disservice to > make > > it one. > > > > Bert Gunter > > > > "The trouble with having an open mind is that people keep coming along > and > > sticking things into it." > > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) > > > > > > On Mon, Nov 16, 2020 at 12:54 PM Jim Lemon wrote: > > > >> Hi Elaine, > >> There seems to be a popular contest to discover offence everywhere. I > >> don't > >> think that it does anything against racism, sexism or > >> antidisestablishmentarianism. Words are plucked from our vast lexicon to > >> comfort or insult our fellows depending upon the intent of the user. It > is > >> the intent that matters, not the poor word. Chasing the words wastes > your > >> time, blames those who use the words harmlessly, and gives the real > >> offender time to find another epithet. > >> > >> Jim > >> > >> On Tue, Nov 17, 2020 at 5:39 AM Lainey Gallenberg < > >> laineygallenb...@gmail.com> wrote: > >> > >> > Hello, > >> > > >> > I'm hoping someone on here knows the appropriate place/contact for me > to > >> > lodge a complaint about a color name in the "colors" function. 
I was > >> > shocked to see there are four named color options that include the > term > >> > "indianred." Surely these colors can be changed to something less > >> > offensive- my suggestion is "blush." How can I find out who to contact > >> > about making this happen? > >> > > >> > Thank you in advance for any suggestions. > >> > > >> > Sincerely, > >> > Elaine Gallenberg > >> > > >> > [[alternative HTML version deleted]] > >> > > >> > __ > >> > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > >> > https://stat.ethz.ch/mailman/listinfo/r-help > >> > PLEASE do read the posting guide > >> > http://www.R-project.org/posting-guide.html > >> > and provide commented, minimal, self-contained, reproducible code. > >> > > >> > >> [[alternative HTML version deleted]] > >> > >> __ > >> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > >> https://stat.ethz.ch/mailman/listinfo/r-help > >> PLEASE do read the posting guide > >> http://www.R-project.org/posting-guide.html > >> and provide commented, minimal, self-contained, reproducible code. > >> > > > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Installing bioconduction packages in connection with loading an R package
An alternative to setRepositories() is to use (the CRAN package) BiocManager,

    BiocManager::install("gRbase")

instead of install.packages(). BiocManager installs CRAN packages as well as Bioconductor packages.

Another, more transparent, solution is to use

    install.packages("gRbase", repos = BiocManager::repositories())

where the key idea is to include Bioconductor repositories explicitly.

These approaches are preferred to setRepositories() because of the details of the twice-yearly Bioconductor release cycle, compared to the annual R release and patch cycles.

The usual approach to your problem is to move the package to Suggests:. But then the namespace commands like Imports, and the direct use of imported package functions, are not possible; you'll need to litter your code with fully resolved functions (graph::foo() instead of foo()). Also, Suggests: is usually home to packages that have a limited role to play, but that does not seem likely for RBGL etc. in your package. Also, in implementing this approach one would normally check that the package was installed, and fail with an error message telling the user how to fix the problem (e.g., by installing the package). This doesn't really sound like progress.

If you instead try to automatically install the package (in .onAttach(), I guess, was your plan) you'll shortly run into users who need to use arguments to install.packages() that you have not made available to them.

Your CRAN page took me quickly to your package web site and clear installation instructions; I do not think use of Bioc packages is a particular barrier to use.

Martin Morgan

On 10/11/20, 2:52 PM, "R-help on behalf of Søren Højsgaard" wrote:

    Dear all,

    My gRbase package imports functionality from the bioconductor packages graph, Rgraphviz and RBGL.

    To make installation of gRbase easy, I would like to have these bioconductor packages installed in connection with installation of gRbase, but to do so the user must use setRepositories() to make sure that R also installs packages from bioconductor.

    Having to call setRepositories causes what can perhaps be called an (unnecessary?) obstacle. Therefore I have been experimenting with deferring installation of these bioc-packages until gRbase is loaded the first time using .onAttach; please see my attempt below.

    However, if the bioc-packages are not installed I can not install gRbase so that does not seem to be a viable approach. (The bioc-packages appear as Imports: in DESCRIPTION).

    Can anyone tell if it is a futile approach and / or perhaps suggest a solution. (I would guess that there are many CRAN packages that use bioc-packages, so other people must have faced this challenge before).

    Thanks in advance.

    Best regards
    Søren

    .onAttach<-function(libname, pkgname) {

        ## package startup check
        toinstall=c(
            "graph",
            "Rgraphviz",
            "RBGL"
        )

        already_installed <- sapply(toinstall, function(pkg)
            requireNamespace(pkg, quietly=TRUE))

        if (any(!already_installed)){
            packageStartupMessage("Need to install the following package(s): ",
                                  toString(toinstall[!already_installed]), "\n")
        }

        ## install if needed
        if(!base::all(already_installed)){
            if (!requireNamespace("BiocManager", quietly=TRUE))
                install.packages("BiocManager")
            BiocManager::install(toinstall[!already_installed], dependencies=TRUE)
        }
    }
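The Suggests: approach described in the reply (check that the package is installed, fail with a message that tells the user how to fix it, then call functions by their fully qualified names) might look roughly like the sketch below. The helper name require_suggested() and the wrapper node_names() are hypothetical and not part of gRbase; graph::nodes() is used only as an example of a fully qualified call.

```r
## Hypothetical helper (an assumption, not gRbase code): stop early with an
## actionable message when a package listed in Suggests: is not installed.
require_suggested <- function(pkg) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
        stop(
            "package '", pkg, "' is required for this function; install it with\n",
            "  install.packages(\"", pkg, "\", repos = BiocManager::repositories())",
            call. = FALSE
        )
    }
    invisible(TRUE)
}

## Hypothetical package function: the suggested package is checked first and
## then used through a fully qualified name rather than an import.
node_names <- function(g) {
    require_suggested("graph")
    graph::nodes(g)
}
```

Nothing is installed automatically here, so users keep full control over the arguments they pass to install.packages() or BiocManager::install().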
Re: [R] combine filter() and select()
A kind of hybrid answer is to use base::subset(), which supports non-standard evaluation (it searches for unquoted symbols like 'files' in the code line below in the object that is its first argument; %>% puts 'mytbl' in that first position) and row (filter) and column (select) subsets

    > mytbl %>% subset(files %in% "a", files)
    # A tibble: 1 x 1
      files
    1 a

Or subset(grepl("a", files), files) if that was what you meant.

One important idea that the tidyverse implements is, in my opinion, 'endomorphism' -- you get back the same type of object as you put in -- so I wouldn't use a base R idiom that returned a vector unless that were somehow essential for the next step in the analysis.

There is value in having separate functions for filter() and select(), and probably there are edge cases where filter(), select(), and subset() behave differently, but for what it's worth subset() can be used to perform these operations individually

    > mytbl %>% subset(, files)
    # A tibble: 6 x 1
      files
    1 a
    2 b
    3 c
    4 d
    5 e
    6 f
    > mytbl %>% subset(grepl("a", files), )
    # A tibble: 1 x 2
      files  prop
    1 a         1

Martin Morgan

On 8/20/20, 2:48 AM, "R-help on behalf of Ivan Calandra" wrote:

    Hi Jeff,

    The code you show is exactly what I usually do, in base R; but I wanted to play with tidyverse to learn it (and also understand when it makes sense and when it doesn't).

    And yes, of course, in the example I gave, I end up with a 1-cell tibble, which could be better extracted as a length-1 vector. But my real goal is not to end up with a single value or even a single column. I just thought that simplifying my example was the best approach to ask for advice.

    But thank you for letting me know that what I'm doing is pointless!

    Ivan

    --
    Dr. Ivan Calandra
    TraCEr, laboratory for Traceology and Controlled Experiments
    MONREPOS Archaeological Research Centre and
    Museum for Human Behavioural Evolution
    Schloss Monrepos
    56567 Neuwied, Germany
    +49 (0) 2631 9772-243
    https://www.researchgate.net/profile/Ivan_Calandra

    On 19/08/2020 19:27, Jeff Newmiller wrote:
    > The whole point of dplyr primitives is to support data frames... that is, lists of columns. When you pare your data frame down to one column you are almost certainly using the wrong tool for the job.
    >
    > So, sure, your code works... and it even does what you wanted in the dplyr style, but what a pointless exercise.
    >
    > grep( "a", mytbl$files, value=TRUE )
    >
    > On August 19, 2020 7:56:32 AM PDT, Ivan Calandra wrote:
    >> Dear useRs,
    >>
    >> I'm new to the tidyverse world and I need some help on basic things.
    >>
    >> I have the following tibble:
    >> mytbl <- structure(list(files = c("a", "b", "c", "d", "e", "f"), prop =
    >> 1:6), row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame"))
    >>
    >> I want to subset the rows with "a" in the column "files", and keep only
    >> that column.
    >>
    >> So I did:
    >> myfile <- mytbl %>%
    >>   filter(grepl("a", files)) %>%
    >>   select(files)
    >>
    >> It works, but I believe there must be an easier way to combine filter()
    >> and select(), right?
    >>
    >> Thank you!
    >> Ivan
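The subset() calls discussed in this thread can also be tried without any packages. The sketch below assumes a plain data.frame stand-in for the tibble (so printing differs from the output shown above) and uses the named select= argument instead of the positional form; the variable names both, rows_only and cols_only are illustrative only.

```r
## Plain data.frame stand-in for the tibble in the thread (an assumption made
## so that the example needs no packages).
mytbl <- data.frame(files = c("a", "b", "c", "d", "e", "f"), prop = 1:6)

## Rows and columns in one call: the second argument filters rows and the
## third selects columns, both evaluated inside 'mytbl'.
both <- subset(mytbl, grepl("a", files), files)

## Rows only, or columns only.
rows_only <- subset(mytbl, grepl("a", files))
cols_only <- subset(mytbl, select = files)

## Each result is still a data.frame, whereas
## mytbl$files[grepl("a", mytbl$files)] would be a bare vector.
```

Keeping the result a data.frame is what makes the call composable with further data-frame steps, which is the 'endomorphism' point made in the reply.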
Re: [R] Best settings for RStudio video recording?
Excellent question! I think most R courses use RStudio, so it is completely appropriate to ask about how to help people learn R using RStudio. I don't have a lot of experience with virtual teaching, and very limited experience with anything other than short-term workshops.

I think that there is tremendous value, during the 'in person' portion of a course, in doing interactive and even 'ad hoc' analysis, perhaps especially handling the off-the-wall questions that participants might raise (when I have to struggle to figure out what the R answer is, and then convey to the attendees my thinking process), and making all kinds of mistakes, including simple typos (requiring me to explain what the error message means, and how I diagnosed the problem and arrived at a solution that was other than a pull-it-out-of-the-hat miracle).

With this in mind, I try to increase the prominence of the console portion of the RStudio interface. I place it at the top left of the screen (this might be a remnant of in-person presentations, where the heads of people in front often block the view of the lines where code is being entered; this is obviously not relevant in a virtual context). Usually I keep the script portion of the display visible at the bottom left, with only a few lines showing, as a kind of cheat sheet for me rather than for the students to 'follow along'. I use a large font, which I think helps in both virtual and physical sessions, in part because it limits the amount of information on the screen, causing me to slow my presentation enough that the students can absorb what I am saying.

Perhaps as a consequence of the limited screen real estate, students often ask 'to see the last command', so I now include the 'History' tab in the right panel. The division is asymmetric, so the console continues to take up the majority of screen real estate.
The end result of a sequence of operations is often a pretty picture, but since this is only the end result and not the meat of the learning experience I tend to keep the plot window (lower right) relatively small, and try to remember to expand things at the time when the end result is in sight (so to speak;)). I hope others with more direct experience are not dissuaded by Bert's opinions, and offer up their own experiences or resource recommendations. Martin Morgan On 8/13/20, 6:05 PM, "R-help on behalf of Jonathan Greenberg" wrote: Folks: I was wondering if you all would suggest some helpful RStudio configurations that make recording a session via e.g. zoom the most useful for students doing remote learning. Thoughts? --j -- Jonathan A. Greenberg, PhD Randall Endowed Professor and Associate Professor of Remote Sensing Global Environmental Analysis and Remote Sensing (GEARS) Laboratory Natural Resources & Environmental Science University of Nevada, Reno 1664 N Virginia St MS/0186 Reno, NV 89557 Phone: 415-763-5476 https://www.gearslab.org/ [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Plotting DMRs (Differentially Methylated Regions) using Gviz package in R
You will probably have more success asking on https://support.bioconductor.org.

Martin Morgan

On 2/7/20, 12:57 PM, "R-help on behalf of pooja sinha" wrote:

    Hi All,

    I have a file list consisting of Chromosome, Start, End & Methylation Difference in the following format in excel:

    Chrom  Start     End       Meth. Diff
    chr1   38565900  38566000  -0.20276818
    chr1   38870400  38870500  -0.342342342
    chr1   39469400  39469500  -0.250260552
    chr1   52013600  52013700  -0.37797619
    chr1   52751700  52751800   0.257575758
    chr1   75505100  75505200  -0.262847308

    I need help in plotting the DMRs using Gviz package in R. I tried a code below but it doesn't turn out correct.

    library(GenomicRanges)
    library(grid)
    library(Gviz)
    library(rtracklayer)
    library(BSgenome)
    library(readxl)
    library(BSgenome.Rnorvegicus.UCSC.rn6)
    genome <- getBSgenome("BSgenome.Rnorvegicus.UCSC.rn6")
    genome
    data1 <- read_excel("DMRs_plots.xlsx")
    head(data1)
    data1$Chrom = Chrom$chr1
    track1 <- DataTrack(data = data1, from = "38565900" , to = "28225",
                        chromosome = Chrom$chr1, name = "DMRs")
    itrack <- IdeogramTrack(genome = genome, chromosome = chr)
    plotTracks(track1, itrack)

    If anyone know how to plot and correct my code including how to add methylation difference values, then that will be of great help.

    Thanks,
    Puja
Re: [R] how to find number of unique rows for combination of r columns
With this example

    > df = data.frame(a = c(1, 1, 2, 2), b = c(1, 1, 2, 3), value = 1:4)
    > df
      a b value
    1 1 1     1
    2 1 1     2
    3 2 2     3
    4 2 3     4

The approach to drop duplicates in the first and second columns has as a consequence the arbitrary choice of 'value' for the duplicate entries -- why choose a value of '1' rather than '2' (or the average of 1 and 2, or a list containing all possible values, or...) for the rows duplicated in columns a and b?

    > df[!duplicated(df[,1:2]),]
      a b value
    1 1 1     1
    3 2 2     3
    4 2 3     4

In base R one might

    > aggregate(value ~ a + b, df, mean)
      a b value
    1 1 1   1.5
    2 2 2   3.0
    3 2 3   4.0
    > aggregate(value ~ a + b, df, list)
      a b value
    1 1 1  1, 2
    2 2 2     3
    3 2 3     4

but handling several value-like columns would be hard(?)

Using library(dplyr), I have

    > group_by(df, a, b) %>% summarize(mean_value = mean(value))
    # A tibble: 3 x 3
    # Groups:   a [2]
          a     b mean_value
    1     1     1        1.5
    2     2     2        3
    3     2     3        4

or

    > group_by(df, a, b) %>% summarize(values = list(value))
    # A tibble: 3 x 3
    # Groups:   a [2]
          a     b values
    1     1     1 <dbl [2]>
    2     2     2 <dbl [1]>
    3     2     3 <dbl [1]>

summarizing multiple columns with dplyr

    > df$v1 = 1:4
    > df$v2 = 4:1
    > group_by(df, a, b) %>% summarize(v1_mean = mean(v1), v2_median = median(v2))
    # A tibble: 3 x 4
    # Groups:   a [2]
          a     b v1_mean v2_median
    1     1     1     1.5       3.5
    2     2     2     3         2
    3     2     3     4         1

I do not know how performant this would be with data of your size.

Martin Morgan

On 11/8/19, 1:39 PM, "R-help on behalf of Ana Marija" wrote:

    Thank you so much!!!

    On Fri, Nov 8, 2019 at 11:40 AM Bert Gunter wrote:
    >
    > Correction:
    > df <- data.frame(a = 1:3, b = letters[c(1,1,2)], d = LETTERS[c(1,1,2)])
    > df[!duplicated(df[,2:3]), ] ## Note the ! sign
    >
    > Bert Gunter
    >
    > "The trouble with having an open mind is that people keep coming along and sticking things into it."
    > -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
    >
    >
    > On Fri, Nov 8, 2019 at 7:59 AM Bert Gunter wrote:
    >>
    >> Sorry, but you ask basic questions. You really need to spend some more time with an R tutorial or two. This list is not meant to replace your own learning efforts.
    >>
    >> You also do not seem to be reading the docs carefully. Under ?unique, it links ?duplicated and tells you that it gives indices of duplicated rows of a data frame. These then can be used by subscripting to remove those rows from the data frame. Here is a reproducible example:
    >>
    >> df <- data.frame(a = 1:3, b = letters[c(1,1,2)], d = LETTERS[c(1,1,2)])
    >> df[-duplicated(df[,2:3]), ] ## Note the - sign
    >>
    >> If you prefer, the "Tidyverse" world has what are purported to be more user-friendly versions of such data handling functionality that you can use instead.
    >>
    >>
    >> Bert
    >>
    >> On Fri, Nov 8, 2019 at 7:38 AM Ana Marija wrote:
    >>>
    >>> would you know how would I extract from my original data frame, just
    >>> these unique rows?
    >>> because this gives me only those 3 columns, and I want all columns
    >>> from the original data frame
    >>>
    >>> > head(udt)
    >>>    chr   pos     gene_id
    >>> 1 chr1 54490 ENSG0227232
    >>> 2 chr1 58814 ENSG0227232
    >>> 3 chr1 60351 ENSG0227232
    >>> 4 chr1 61920 ENSG0227232
    >>> 5 chr1 63671 ENSG0227232
    >>> 6 chr1 64931 ENSG0227232
    >>>
    >>> > head(dt)
    >>>     chr   pos     gene_id pval_nominal pval_ret       wl      wr      META
    >>> 1: chr1 54490 ENSG0227232     0.608495 0.783778 31.62278 21.2838 0.7475480
    >>> 2: chr1 58814 ENSG0227232     0.295211 0.897582 31.62278 21.2838 0.6031214
    >>> 3: chr1 60351 ENSG0227232     0.439788 0.867959 31.62278 21.2838 0.6907182
    >>> 4: chr1 61920 ENSG0227232     0.319528 0.601809 31.62278 21.2838 0.4032200
    >>> 5: chr1 63671 ENSG0227232     0.237739 0.988039 31.62278 21.2838 0.7482519
    >>> 6: chr1 64931 ENSG0227232     0.276679 0.907037 31.62278 21.2838 0.5974800
    >>>
    >>> On Fri, Nov 8, 2019 at 9:30 AM Ana Marija wrote:
    >>> >
    >>> > Thank you so much! Converting it to data frame resolved the issue!
    >>> >
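On the open question in the reply of whether several value-like columns are hard to handle in base R: one possibility, assuming the same summary function is applied to every column, is to put cbind() on the left-hand side of the aggregate() formula. The sketch reuses the toy data from the reply (columns v1 and v2 as defined there); the names first_rows, n_unique and means are illustrative.

```r
df <- data.frame(a = c(1, 1, 2, 2), b = c(1, 1, 2, 3), v1 = 1:4, v2 = 4:1)

## Keep the first row of each (a, b) combination together with all other
## columns; note the logical '!' rather than the arithmetic '-'.
first_rows <- df[!duplicated(df[, c("a", "b")]), ]

## Number of unique (a, b) combinations.
n_unique <- nrow(unique(df[, c("a", "b")]))

## Summarize several value columns at once with the same function.
means <- aggregate(cbind(v1, v2) ~ a + b, data = df, FUN = mean)
```

Applying a different function to each column (mean for one, median for another, as in the dplyr call above) is not covered by this form and would need separate aggregate() calls or another approach.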
Re: [R] how to use a matrix as an index to another matrix?
A matrix can be subset by another 2-column matrix, where the first column is the row index and the second column the column index. So

    idx = matrix(c(B, col(B)), ncol = 2)
    A[] <- A[idx]

Martin Morgan

On 10/11/19, 6:31 AM, "R-help on behalf of Eric Berger" wrote:

    Here is one way

    A <- sapply(1:ncol(A), function(i) {A[,i][B[,i]]})

    On Fri, Oct 11, 2019 at 12:44 PM Jinsong Zhao wrote:

    > Hi there,
    >
    > I have two matrices, A and B. The columns of B is the index of the
    > corresponding columns of A. I hope to rearrange of A by B. A minimal
    > example is following:
    >
    > > set.seed(123)
    > > A <- matrix(sample(1:10), nrow = 5)
    > > B <- matrix(c(sample(1:5), sample(1:5)), nrow =5, byrow = FALSE)
    > > A
    >      [,1] [,2]
    > [1,]    3    9
    > [2,]   10    1
    > [3,]    2    7
    > [4,]    8    5
    > [5,]    6    4
    > > B
    >      [,1] [,2]
    > [1,]    2    1
    > [2,]    3    4
    > [3,]    1    5
    > [4,]    4    3
    > [5,]    5    2
    > > A[,1] <- A[,1][B[,1]]
    > > A[,2] <- A[,2][B[,2]]
    > > A
    >      [,1] [,2]
    > [1,]   10    9
    > [2,]    2    5
    > [3,]    3    4
    > [4,]    8    7
    > [5,]    6    1
    >
    > My question is whether there is any elegant or generalized way to replace:
    >
    > > A[,1] <- A[,1][B[,1]]
    > > A[,2] <- A[,2][B[,2]]
    >
    > Thanks in advance.
    >
    > PS., I know how to do the above thing by loop.
    >
    > Best,
    > Jinsong
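The two-column index matrix from the reply can be checked against the column-by-column version on the values printed in the question. The sketch below enters A and B by hand from that printed output rather than regenerating them with set.seed(), since sample() results depend on the R version; the names reordered and by_column are illustrative.

```r
## Values copied from the printed matrices in the question.
A <- matrix(c(3, 10, 2, 8, 6, 9, 1, 7, 5, 4), nrow = 5)
B <- matrix(c(2, 3, 1, 4, 5, 1, 4, 5, 3, 2), nrow = 5)

## Each row of idx is a (row, column) pair. c(B) and c(col(B)) are both in
## column-major order, so the pairs line up with the cells of A.
idx <- cbind(c(B), c(col(B)))

## Assigning into reordered[] keeps the dimensions while replacing every cell.
reordered <- A
reordered[] <- A[idx]

## The per-column version from the thread, for comparison.
by_column <- sapply(seq_len(ncol(A)), function(i) A[B[, i], i])
```

Here idx is built with cbind() instead of matrix(..., ncol = 2); the two are equivalent for this purpose.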
Re: [R] BiocManager problem.
Please follow the response to your question on the Bioconductor support site https://support.bioconductor.org/p/125493/ Martin Morgan On 10/10/19, 12:23 PM, "R-help on behalf of Ali Siavosh" wrote: Hi, I have installation of R in a server running on redhat 7. I have upgraded R and now to upgrade BiocManager I get error messages as below: > install.packages("BiocManager") Installing package into ‘/usr/lib64/R/library’ (as ‘lib’ is unspecified) trying URL 'https://cran.revolutionanalytics.com/src/contrib/BiocManager_1.30.7.tar.gz' Content type 'application/octet-stream' length 38020 bytes (37 KB) == downloaded 37 KB * installing *source* package ‘BiocManager’ ... ** package ‘BiocManager’ successfully unpacked and MD5 sums checked ** using staged installation ** R ** inst ** byte-compile and prepare package for lazy loading ** help *** installing help indices converting help for package ‘BiocManager’ finding HTML links ... done BiocManager-pkg html available html install html repositorieshtml valid html version html ** building package indices ** installing vignettes ** testing if installed package can be loaded from temporary location ** testing if installed package can be loaded from final location ** testing if installed package keeps a record of temporary installation path * DONE (BiocManager) Making 'packages.html' ... done The downloaded source packages are in ‘/tmp/RtmpgHhwMp/downloaded_packages’ Updating HTML index of packages in '.Library' Making 'packages.html' ... 
done > BiocManager::version() Error: .onLoad failed in loadNamespace() for 'BiocManager', details: call: NULL error: Bioconductor version '3.8' requires R version '3.5'; see https://bioconductor.org/install > BiocManager::valid() Error: .onLoad failed in loadNamespace() for 'BiocManager', details: call: NULL error: Bioconductor version '3.8' requires R version '3.5'; see https://bioconductor.org/install > BiocManager::install(version="3.5") Error: .onLoad failed in loadNamespace() for 'BiocManager', details: call: NULL error: Bioconductor version '3.8' requires R version '3.5'; see https://bioconductor.org/install > BiocManager::install(version="3.7") Error: .onLoad failed in loadNamespace() for 'BiocManager', details: call: NULL error: Bioconductor version '3.8' requires R version '3.5'; see https://bioconductor.org/install <https://bioconductor.org/install> I appreciate any help with regard to this. Thank you [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Trying to coerce an AnnotatedDataFrame in order to access Probeset Info
Are you remembering to attach the Biobase package to your R session? > AnnotatedDataFrame() Error in AnnotatedDataFrame() : could not find function "AnnotatedDataFrame" > suppressPackageStartupMessages({ library(Biobase) }) > AnnotatedDataFrame() An object of class 'AnnotatedDataFrame': none Biobase is a Bioconductor package, so support questions should more appropriately go to https://support.bioconductor.org Martin On 7/17/19, 4:20 PM, "R-help on behalf of Spencer Brackett" wrote: Good evening, I downloaded the Biobase package in order to utilize the ExpressionSet and other features hosted there to examine annotations for probeset data, which I seek to visualize. I currently have pre-analyzed object located in my environment containing said probeset info, along with gene id and location. After experimenting with the following approaches, I'm am at a loss for as to why the AnnotatedDataFrame function is not being recognized by R. ##Example of some of my attempts and their respective error messages## >AnnotatedDataFrame() Error in AnnotatedDataFrame() : could not find function "AnnotatedDataFrame" signature(object="assayData") object "assayData" > annotatedDataFrameFrom("assayData", byrow=FALSE) Error in annotatedDataFrameFrom("assayData", byrow = FALSE) : could not find function "annotatedDataFrameFrom" >as(data.frame, "AnnotatedDataFrame") Error in as(data.frame, "AnnotatedDataFrame") : no method or default for coercing “function” to “AnnotatedDataFrame” Best, Spencer [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] Was there a change to R ver. 3.5.2 so that it now treats warnings during installs as errors?
Looks like you're using remotes::install_github(), which in turn uses remotes::install(). The README

  https://github.com/r-lib/remotes/blob/254c67ed6502e092a316553f2a44f04b0e595b64/README.md

says

  "Setting R_REMOTES_NO_ERRORS_FROM_WARNINGS=true avoids stopping the installation for warning messages. Warnings usually mean installation errors, so by default remotes stops for a warning. However, sometimes other warnings might happen, that could be ignored by setting this environment variable."

So I'd guess

  Sys.setenv(R_REMOTES_NO_ERRORS_FROM_WARNINGS = TRUE)

before installing the package would address this problem.

Martin Morgan

On 1/20/19, 6:58 AM, "R-help on behalf of Duncan Murdoch" wrote:

On 19/01/2019 8:22 p.m., Peter Waltman wrote:
> I'm trying to install a devel package called gGnome (
> https://github.com/mskilab/gGnome). One of its dependencies is another
> package from the same group, called gTrack, which causes several warning
> messages to be generated because it overloads a couple of functions that
> are part of other packages that gTrack is dependent upon. The specific
> warnings are provided below. During the lazy-loading step of gGnome's
> install, gTrack is loaded, and when these warnings come up, they are
> converted to errors, causing the install to fail. This behavior is new to
> version 3.5.2, as I've been able to successfully install these packages
> with R versions 3.5.0 and 3.5.1. Is there a workaround for this for version
> 3.5.2?
>
> Thanks!
>
> Error message during gGnome install:
>
>> install_github('mskilab/gGnome')
> Downloading GitHub repo mskilab/gGnome@master
> Skipping 3 packages not available: GenomicRanges, rtracklayer,
> VariantAnnotation
> ✔ checking for file
> ‘/tmp/Rtmp4hnMMO/remotes7fb938cd0553/mskilab-gGnome-81f661e/DESCRIPTION’ ...
> ─ preparing ‘gGnome’:
> ✔ checking DESCRIPTION meta-information ...
> ─ checking for LF line-endings in source and make files and shell scripts
> ─ checking for empty or unneeded directories
> Removed empty directory ‘gGnome/inst/extdata/gTrack.js’
> ─ building ‘gGnome_0.1.tar.gz’
>
> * installing *source* package ‘gGnome’ ...
> ** R
> ** inst
> ** byte-compile and prepare package for lazy loading
> Error: package or namespace load failed for ‘gTrack’:
> *(converted from warning)* multiple methods tables found for ‘seqinfo<-’
> Error : package ‘gTrack’ could not be loaded
> ERROR: lazy loading failed for package ‘gGnome’
> * removing ‘/home/waltman/bin/R/3.5.2/lib/R/library/gGnome’
> Error in i.p(...) :
> (converted from warning) installation of package
> ‘/tmp/Rtmp4hnMMO/file7fb929638ed8/gGnome_0.1.tar.gz’ had non-zero exit
> status

That message indicates that options("warn") is 2 or higher when the warning occurs. What is its setting before you start the install?

Duncan Murdoch
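A minimal sketch of the two settings discussed in this thread, using base R only. The warning text is a stand-in borrowed from the install log, and the lowercase string "true" follows the wording of the remotes README quoted above; nothing is installed here, so this only illustrates the mechanism and is not a confirmed fix for the gGnome install.

```r
## Duncan's point: with options(warn = 2) a warning is promoted to an error.
old <- options(warn = 2)
res <- tryCatch({
    warning("multiple methods tables found for 'seqinfo<-'")  # stand-in warning
    "no error"
}, error = function(e) conditionMessage(e))
options(old)   # restore the previous setting
print(res)     # the message now carries the "(converted from warning)" prefix

## Martin's suggestion: ask remotes not to stop on warnings.
## This has to be set before calling remotes::install_github().
Sys.setenv(R_REMOTES_NO_ERRORS_FROM_WARNINGS = "true")
Sys.getenv("R_REMOTES_NO_ERRORS_FROM_WARNINGS")
```

Checking getOption("warn") in a fresh session, as Duncan asks, would show whether the conversion comes from the user's own options or from the installing tool.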
Re: [R] Efficient way of loading files in R
Ask on the Bioconductor support site https://support.bioconductor.org

Provide (on the support site) the output of the R commands

  library(GEOquery)
  sessionInfo()

Also include (copy and paste) the output of the command that fails. I have

> gseEset2 <- getGEO('GSE76896')[[1]]
Found 1 file(s)
GSE76896_series_matrix.txt.gz
trying URL 'https://ftp.ncbi.nlm.nih.gov/geo/series/GSE76nnn/GSE76896/matrix/GSE76896_series_matrix.txt.gz'
Content type 'application/x-gzip' length 40561936 bytes (38.7 MB)
==
downloaded 38.7 MB

Parsed with column specification:
cols(
  .default = col_double(),
  ID_REF = col_character()
)
See spec(...) for full column specifications.
|=| 100% 84 MB
File stored at:
/tmp/Rtmpe4NWji/GPL570.soft
|=| 100% 75 MB
> sessionInfo()
R version 3.5.1 Patched (2018-08-22 r75177)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 16.04.5 LTS

Matrix products: default
BLAS: /home/mtmorgan/bin/R-3-5-branch/lib/libRblas.so
LAPACK: /home/mtmorgan/bin/R-3-5-branch/lib/libRlapack.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C
 [3] LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C
 [9] LC_ADDRESS=C               LC_TELEPHONE=C
[11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] parallel  stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] bindrcpp_0.2.2      GEOquery_2.49.1     Biobase_2.41.2
[4] BiocGenerics_0.27.1 BiocManager_1.30.2

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.18     tidyr_0.8.1      crayon_1.3.4     dplyr_0.7.6
 [5] assertthat_0.2.0 R6_2.2.2         magrittr_1.5     pillar_1.3.0
 [9] stringi_1.2.4    rlang_0.2.2      curl_3.2         limma_3.37.4
[13] xml2_1.2.0       tools_3.5.1      readr_1.1.1      glue_1.3.0
[17] purrr_0.2.5      hms_0.4.2        compiler_3.5.1   pkgconfig_2.0.2
[21] tidyselect_0.2.4 bindr_0.1.1      tibble_1.4.2

On 09/07/2018 06:08 AM, Deepa wrote:

Hello,

I am using a Bioconductor package in R. The command that I use reads the contents of a file downloaded from a database and creates an expression object. The syntax works perfectly fine when the input size is 10 MB, whereas when the file size is around 40 MB the object isn't created. Is there an efficient way of loading a large input file to create the expression object?

This is my code:

  library(gcrma)
  library(limma)
  library(biomaRt)
  library(GEOquery)
  library(Biobase)
  require(GEOquery)
  require(Biobase)
  gseEset1 <- getGEO('GSE53454')[[1]]  # file size 10MB
  gseEset2 <- getGEO('GSE76896')[[1]]  # file size 40MB
  ## gseEset2 doesn't load and isn't created

Many thanks
Re: [R] mzR fails to install/compile (linuxes)
mzR is a Bioconductor package so you might have more luck contacting the maintainer on the Bioconductor support site

  https://support.bioconductor.org

or on the 'bioc-devel' mailing list

  https://stat.ethz.ch/mailman/listinfo/bioc-devel

or most directly by opening an issue on the maintainer's github

  https://github.com/sneumann/mzR/issues/

this is linked to from the package 'landing page'

  https://bioconductor.org/packages/mzR

Martin Morgan

On 06/15/2018 10:49 AM, lejeczek via R-help wrote:

hi guys, just an admin here.

I wonder if anybody sees what I see, or similar? I'm on Centos 7.x and this occurs with R 3.4.x, 3.5.x and probably earlier versions too. Every time I use something like -j>1 to pass to a compiler, e.g.

$ echo -ne "Sys.setenv(MAKEFLAGS = \"-j2\")\\n source(\"https://bioconductor.org/biocLite.R\")\\n biocLite(c(\"mzR\"), suppressUpdates=FALSE, suppressAutoUpdate=FALSE, ask=FALSE)" | /usr/bin/R --vanilla

mzR fails to compile:

...
g++ -m64 -std=gnu++11 -shared -L/usr/lib64/R/lib -Wl,-z,relro -o mzR.so cramp.o ramp_base64.o ramp.o RcppRamp.o RcppRampModule.o rnetCDF.o RcppPwiz.o RcppPwizModule.o RcppIdent.o RcppIdentModule.o ./boost/libs/system/src/error_code.o ./boost/libs/regex/src/posix_api.o ./boost/libs/regex/src/fileiter.o ./boost/libs/regex/src/regex_raw_buffer.o ./boost/libs/regex/src/cregex.o ./boost/libs/regex/src/regex_debug.o ./boost/libs/regex/src/instances.o ./boost/libs/regex/src/icu.o ./boost/libs/regex/src/usinstances.o ./boost/libs/regex/src/regex.o ./boost/libs/regex/src/wide_posix_api.o ./boost/libs/regex/src/regex_traits_defaults.o ./boost/libs/regex/src/winstances.o ./boost/libs/regex/src/wc_regex_traits.o ./boost/libs/regex/src/c_regex_traits.o ./boost/libs/regex/src/cpp_regex_traits.o ./boost/libs/regex/src/static_mutex.o ./boost/libs/regex/src/w32_regex_traits.o ./boost/libs/iostreams/src/zlib.o ./boost/libs/iostreams/src/file_descriptor.o ./boost/libs/filesystem/src/operations.o ./boost/libs/filesystem/src/path.o
./boost/libs/filesystem/src/utf8_codecvt_facet.o ./boost/libs/chrono/src/chrono.o ./boost/libs/chrono/src/process_cpu_clocks.o ./boost/libs/chrono/src/thread_clock.o ./pwiz/data/msdata/Version.o ./pwiz/data/identdata/Version.o ./pwiz/data/common/MemoryIndex.o ./pwiz/data/common/CVTranslator.o ./pwiz/data/common/cv.o ./pwiz/data/common/ParamTypes.o ./pwiz/data/common/BinaryIndexStream.o ./pwiz/data/common/diff_std.o ./pwiz/data/common/Unimod.o ./pwiz/data/msdata/mz5/Configuration_mz5.o ./pwiz/data/msdata/mz5/Connection_mz5.o ./pwiz/data/msdata/mz5/Datastructures_mz5.o ./pwiz/data/msdata/mz5/ReferenceRead_mz5.o ./pwiz/data/msdata/mz5/ReferenceWrite_mz5.o ./pwiz/data/msdata/mz5/Translator_mz5.o ./pwiz/data/msdata/SpectrumList_MGF.o ./pwiz/data/msdata/DefaultReaderList.o ./pwiz/data/msdata/ChromatogramList_mzML.o ./pwiz/data/msdata/ChromatogramList_mz5.o ./pwiz/data/msdata/examples.o ./pwiz/data/msdata/Serializer_mzML.o ./pwiz/data/msdata/Serializer_MSn.o ./pwiz/data/msdata/Reader.o ./pwiz/data/msdata/Serializer_mz5.o ./pwiz/data/msdata/Serializer_MGF.o ./pwiz/data/msdata/Serializer_mzXML.o ./pwiz/data/msdata/SpectrumList_mzML.o ./pwiz/data/msdata/SpectrumList_MSn.o ./pwiz/data/msdata/SpectrumList_mz5.o ./pwiz/data/msdata/BinaryDataEncoder.o ./pwiz/data/msdata/Diff.o ./pwiz/data/msdata/MSData.o ./pwiz/data/msdata/References.o ./pwiz/data/msdata/SpectrumList_mzXML.o ./pwiz/data/msdata/IO.o ./pwiz/data/msdata/SpectrumList_BTDX.o ./pwiz/data/msdata/SpectrumInfo.o ./pwiz/data/msdata/RAMPAdapter.o ./pwiz/data/msdata/LegacyAdapter.o ./pwiz/data/msdata/SpectrumIterator.o ./pwiz/data/msdata/MSDataFile.o ./pwiz/data/msdata/MSNumpress.o ./pwiz/data/msdata/SpectrumListCache.o ./pwiz/data/msdata/Index_mzML.o ./pwiz/data/msdata/SpectrumWorkerThreads.o ./pwiz/data/identdata/IdentDataFile.o ./pwiz/data/identdata/IdentData.o ./pwiz/data/identdata/DefaultReaderList.o ./pwiz/data/identdata/Reader.o ./pwiz/data/identdata/Serializer_protXML.o ./pwiz/data/identdata/Serializer_pepXML.o 
./pwiz/data/identdata/Serializer_mzid.o ./pwiz/data/identdata/IO.o ./pwiz/data/identdata/References.o ./pwiz/data/identdata/MascotReader.o ./pwiz/data/proteome/Modification.o ./pwiz/data/proteome/Digestion.o ./pwiz/data/proteome/Peptide.o ./pwiz/data/proteome/AminoAcid.o ./pwiz/utility/minimxml/XMLWriter.o ./pwiz/utility/minimxml/SAXParser.o ./pwiz/utility/chemistry/Chemistry.o ./pwiz/utility/chemistry/ChemistryData.o ./pwiz/utility/chemistry/MZTolerance.o ./pwiz/utility/misc/IntegerSet.o ./pwiz/utility/misc/Base64.o ./pwiz/utility/misc/IterationListener.o ./pwiz/utility/misc/MSIHandler.o ./pwiz/utility/misc/Filesystem.o ./pwiz/utility/misc/TabReader.o ./pwiz/utility/misc/random_access_compressed_ifstream.o ./pwiz/utility/misc/SHA1.o ./pwiz/utility/misc/SHA1Calculator.o ./pwiz/utility/misc/sha1calc.o ./random_access_gzFile.o ./RcppE
Re: [R] S4 class slot type S4 class
On 05/21/2018 12:06 AM, Glenn Schultz wrote:

All,

I am considering creating an S4 class whose slots (2) are both S4 classes. Since an S4 slot can be an S3 class I figure this can be done. However, I am unsure of the correct syntax. Reviewing the docs I have come to the following conclusion:

  SetClass('myfoo', slots = (foo1, foo2))

Without a type I believe each slot is .Data. A get method on the above class slots would return say foo1 which will have all methods and generics belonging to foo1 class. Is this the correct approach?

Suppose you have two classes

  .A = setClass("A", slots = c(x = "numeric"))
  .B = setClass("B", slots = c(y = "numeric", z = "numeric"))

A third class containing these would be

  .C = setClass("C", slots = c(a = "A", b = "B"))

where names of the slot argument are the slot names, and the character strings "A", "B" are the type of object the slot will store.

> .C()
An object of class "C"
Slot "a":
An object of class "A"
Slot "x":
numeric(0)

Slot "b":
An object of class "B"
Slot "y":
numeric(0)

Slot "z":
numeric(0)

> .C(a = .A(x = 1:2), b = .B(y = 2:1, z = 1:2))
An object of class "C"
Slot "a":
An object of class "A"
Slot "x":
[1] 1 2

Slot "b":
An object of class "B"
Slot "y":
[1] 2 1

Slot "z":
[1] 1 2

Martin Morgan

Best,
Glenn
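As a small extension of the reply above, the sketch below adds two things that are not in the original messages and are purely illustrative: a validity function on the containing class, and a hypothetical accessor generic named aSlot(). It also shows that assigning an object of the wrong class to a typed slot is rejected.

```r
library(methods)

## Two simple classes, as in the reply above.
.A <- setClass("A", slots = c(x = "numeric"))
.B <- setClass("B", slots = c(y = "numeric", z = "numeric"))

## A class whose slots hold instances of A and B. The validity check is an
## illustrative addition, not part of the original reply.
.C <- setClass(
    "C",
    slots = c(a = "A", b = "B"),
    validity = function(object) {
        if (length(object@b@y) != length(object@b@z))
            "slots 'y' and 'z' of 'b' must have the same length"
        else
            TRUE
    }
)

## Hypothetical accessor generic returning the 'a' slot.
setGeneric("aSlot", function(object) standardGeneric("aSlot"))
setMethod("aSlot", "C", function(object) object@a)

obj <- .C(a = .A(x = c(1, 2)), b = .B(y = c(2, 1), z = c(1, 2)))
is(aSlot(obj), "A")   # the accessor returns an object of class "A"
aSlot(obj)@x          # nested slot access

## Assigning the wrong class to a slot fails at construction time.
bad <- tryCatch(.C(a = 1), error = function(e) "error")
```

Because the object returned by aSlot() is a genuine "A" instance, any methods defined for class "A" apply to it directly, which is the behaviour Glenn was asking about.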
Re: [R] Possible Improvement to sapply
On 03/13/2018 09:23 AM, Doran, Harold wrote:

While working with sapply, the documentation states that the simplify argument will yield a vector, matrix etc "when possible". I was curious how the code actually defined "as possible" and see this within the function

  if (!identical(simplify, FALSE) && length(answer))

This seems superfluous to me, in particular this part:

  !identical(simplify, FALSE)

The preceding code could be reduced to

  if (simplify && length(answer))

and it would not need to execute the call to identical in order to trigger the conditional execution, which is known from the user's simplify = TRUE or FALSE inputs. I *think* the extra call to identical is just unnecessary overhead in this instance.

Take for example, the following toy example code and benchmark results and a small modification to sapply:

  myList <- list(a = rnorm(100), b = rnorm(100))

  answer <- lapply(X = myList, FUN = length)
  simplify = TRUE

  library(microbenchmark)

  mySapply <- function (X, FUN, ..., simplify = TRUE, USE.NAMES = TRUE){
      FUN <- match.fun(FUN)
      answer <- lapply(X = X, FUN = FUN, ...)
      if (USE.NAMES && is.character(X) && is.null(names(answer)))
          names(answer) <- X
      if (simplify && length(answer))
          simplify2array(answer, higher = (simplify == "array"))
      else answer
  }

  microbenchmark(sapply(myList, length), times = 1L)
  Unit: microseconds
                     expr    min     lq     mean median     uq    max neval
   sapply(myList, length) 14.156 15.572 16.67603 15.926 16.634 650.46     1

  microbenchmark(mySapply(myList, length), times = 1L)
  Unit: microseconds
                       expr    min     lq     mean median     uq      max neval
   mySapply(myList, length) 13.095 14.864 16.02964 15.218 15.573 1671.804     1

My benchmark timings show a timing improvement with only that small change made, and it is seemingly nominal. In my actual work, the sapply function is called millions of times and this additional overhead propagates to some overall additional computing time.

I have done some limited testing on various real data to verify that the objects produced under both variants of sapply (base R and my modified version) are identical when simplify is either TRUE or FALSE. Perhaps someone else sees a counterexample where my proposed fix causes sapply not to behave as expected.

Check out ?sapply for possible values of `simplify=` to see why your proposal is not adequate.

For your example, lengths() is an order of magnitude faster than sapply(., length). This is an example of the advantages of vectorization (a single call to an R function implemented in C) versus iteration (`for` loops but also the *apply family calling an R function many times).

vapply() might also be relevant.

Often performance improvements come from looking one layer up from where the problem occurs and re-thinking the algorithm. Why would one need to call sapply() millions of times, in a situation where this becomes rate-limiting? Can the algorithm be re-implemented to avoid this step?

Martin Morgan

Harold
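The points in the reply can be sketched in a few lines of base R. This is an illustration under the assumption that the relevant documented value is simplify = "array" (see ?sapply); the variable names are made up for the example, and no timings are claimed.

```r
myList <- list(a = rnorm(100), b = rnorm(100))

## 'simplify' may be TRUE, FALSE or the string "array", so it cannot be used
## directly as the left-hand side of &&: a character operand is an error.
res <- tryCatch("array" && TRUE, error = function(e) "error")

## identical() copes with all three documented values.
ok <- !identical("array", FALSE) && length(myList) > 0

## Vectorized and type-stable alternatives for this particular task.
n1 <- sapply(myList, length)
n2 <- lengths(myList)
n3 <- vapply(myList, length, integer(1))
```

Here lengths() makes a single call into compiled code, while vapply() still iterates at the R level but declares the result type up front, so it skips the guessing that sapply() does when simplifying.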
Re: [R] UseDevel: version requires a more recent R
Ask questions about Bioconductor on the support site https://support.bioconductor.org

Bioconductor versions are tied to particular R versions. The current Bioc-devel requires use of R-devel. You're using R-3.4.2, so you need to install the devel version of R.

Additional information is at

  http://bioconductor.org/developers/how-to/useDevel/

Martin Morgan

On 01/09/2018 01:32 PM, Sariya, Sanjeev wrote:

Hello R experts:

I need a developer version of a Bioconductor library.

sessionInfo()
R version 3.4.2 (2017-09-28)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1

When I try useDevel() it fails. I've removed the packages and loaded them again but I get the same error message.

  remove.packages("BiocInstaller")
  source("https://bioconductor.org/biocLite.R")
  library(BiocInstaller)
  Bioconductor version 3.6 (BiocInstaller 1.28.0), ?biocLite for help
  useDevel()
  Error: 'devel' version requires a more recent R

I've been running into this error for a few days now. I close R after removing BiocInstaller and proceed with the steps above.

Please guide me to fix this.

Thanks,
SS
Re: [R] Facing problem in installing the package named "methyAnalysis"
On 12/29/2017 07:00 AM, Pijush Das wrote:

Thank you Michael Dewey.
Can you please send me the email id for Bioconductor.

https://support.bioconductor.org

Make sure you are using packages from a consistent version of Bioconductor

  source("https://bioconductor.org/biocLite.R")
  BiocInstaller::biocValid()

Martin

regards
Pijush

On Fri, Dec 29, 2017 at 5:20 PM, Michael Dewey wrote:

Dear Pijush

You might do better to ask on the Bioconductor list as IRanges does not seem to be on CRAN so I deduce it is a Bioconductor package too.

Michael

On 29/12/2017 07:29, Pijush Das wrote:

Dear Sir,

I have been using R for a long time. But recently I have faced a problem when installing the Bioconductor package named "methyAnalysis". Firstly it was required to update my older R (R version 3.4.3 (2017-11-30)) to a newer version. At that time I also updated the RStudio software.

After that, when I tried to install the package named "methyAnalysis", it showed the errors given below.

No methods found in package ‘IRanges’ for requests: ‘%in%’, ‘elementLengths’, ‘elementMetadata’, ‘ifelse’, ‘queryHits’, ‘Rle’, ‘subjectHits’, ‘t’ when loading ‘bumphunter’
Error: package or namespace load failed for ‘methyAnalysis’:
 objects ‘.__T__split:base’, ‘split’ are not exported by 'namespace:IRanges'
In addition: Warning message:
replacing previous import ‘BiocGenerics::image’ by ‘graphics::image’ when loading ‘methylumi’

I also tried to install the package after downloading the source package from Bioconductor, but that did not work either.

Please help me to install the package named "methyAnalysis".

Thanking you

regards
Pijush

--
Michael
http://www.dewey.myzen.co.uk/home.html
Re: [R] dplyr - add/expand rows
On 11/29/2017 05:47 PM, Tóth Dénes wrote:

Hi Martin,

On 11/29/2017 10:46 PM, Martin Morgan wrote:

On 11/29/2017 04:15 PM, Tóth Dénes wrote:

Hi,

A benchmarking study with an additional (data.table-based) solution.

I don't think speed is the right benchmark (I do agree that correctness is!).

Well, agree, and sorry for the wording. It was really just an exercise and not a full evaluation of the approaches. When I read the avalanche of solutions, none of which mentioned data.table (my first choice for data.frame manipulations), I became curious how a one-liner data.table code performs against the other solutions in terms of speed and readability. Second, I quite often have the feeling that dplyr is extremely overused among novice (and sometimes even experienced) R users nowadays. This is unfortunate, as the present example also illustrates.

Another solution is Bill's approach and dplyr's implementation (adding the 1L to keep integers integers!)

  fun_bill1 <- function(d) {
      i <- rep(seq_len(nrow(d)), d$to - d$from + 1L)
      j <- sequence(d$to - d$from + 1L)
      ## d[i,] %>% mutate(year = from + j - 1L, from = NULL, to = NULL)
      mutate(d[i,], year = from + j - 1L, from = NULL, to = NULL)
  }

which is competitive with IRanges and data.table (the more dplyr-ish? solution

  d[i, ] %>% mutate(year = from + j - 1L) %>% select(station, record, year)

has intermediate performance) and might appeal to those introduced to R through dplyr but wanting more base R knowledge, and vice versa. I think if dplyr introduces new users to R, or exposes R users to new approaches for working with data, that's great!

Martin

Regards,
Denes

For the R-help list, maybe something about least specialized R knowledge required would be appropriate? I'd say there were some 'hard' solutions -- Michael (deep understanding of Bioconductor and IRanges), Toth (deep understanding of data.table), Jim (at least for me moderate understanding of dplyr, especially the .$ notation; a simpler dplyr answer might have moved this response out of the 'difficult' category, especially given the familiarity of the OP with dplyr). I'd vote for Bill's as requiring the least specialized knowledge of R (though the +/- 1 indexing is an easy thing to get wrong).

A different criterion might be reuse across analysis scenarios. Bill seems to win here again, since the principles are very general and at least moderately efficient (both Bert and Martin's solutions are essentially R-level iterations and have poor scalability, as demonstrated in the microbenchmarks; Bill's is mostly vectorized). Certainly data.table, dplyr, and IRanges are extremely useful within the confines of the problem domains they address.

Martin

Enjoy! ;)

Cheers,
Denes

--

## packages ##
library(dplyr)
library(data.table)
library(IRanges)
library(microbenchmark)

## prepare example dataset ###
## use Bert's example, with 2000 stations instead of 2
d_df <- data.frame(
    station = rep(rep(c("one","two"),c(5,4)), 1000L),
    from = as.integer(c(60,61,71,72,76,60,65,82,83)),
    to = as.integer(c(60,70,71,76,83,64, 81, 82,83)),
    record = c("A","B","C","B","D","B","B","D","E"),
    stringsAsFactors = FALSE)
stations <- rle(d_df$station)
stations$value <- gsub(
    " ", "0",
    paste0("station", format(1:length(stations$value), width = 6)))
d_df$station <- rep(stations$value, stations$lengths)

## prepare tibble and data.table versions
d_tbl <- as_tibble(d_df)
d_dt <- as.data.table(d_df)

## solutions ##

## Bert - by
fun_bert <- function(d) {
    out <- by(
        d, d$station, function(x) with(x, {
            i <- to - from +1
            data.frame(record =rep(record,i),
                       year =sequence(i) -1 + rep(from,i),
                       stringsAsFactors = FALSE)
        }))
    data.frame(station = rep(names(out), sapply(out,nrow)),
               do.call(rbind,out),
               row.names = NULL,
               stringsAsFactors = FALSE)
}

## Bill - transform
fun_bill <- function(d) {
    i <- rep(seq_len(nrow(d)), d$to-d$from+1)
    j <- sequence(d$to-d$from+1)
    transform(d[i,], year=from+j-1, from=NULL, to=NULL)
}

## Michael - IRanges
fun_michael <- function(d) {
    df <- with(d, DataFrame(station, record, year=IRanges(from, to)))
    expand(df, "year")
}

## Jim - dplyr
fun_jim <- function(d) {
    d %>%
        rowwise() %>%
        do(tibble(station = .$station,
                  record = .$record,
                  year = seq(.$from, .$to))
        )
}

## Martin - Map
fun_martin <- function(d) {
    d$year <- with(d, Ma
Re: [R] dplyr - add/expand rows
n year record
1 07EA001 1960 QMS
2 07EA001 1961 QMC
3 07EA001 1962 QMC
4 07EA001 1963 QMC
5 07EA001 1964 QMC
... ... ... ...
20 07EA001 1979 QRC
21 07EA001 1980 QRC
22 07EA001 1981 QRC
23 07EA001 1982 QRC
24 07EA001 1983 QRC

If you tell the computer more about your data, it can do more things for you.

Michael

On Tue, Nov 28, 2017 at 7:34 AM, Martin Morgan < martin.mor...@roswellpark.org> wrote:

On 11/26/2017 08:42 PM, jim holtman wrote:

try this:

##
library(dplyr)

input <- tribble(
    ~station, ~from, ~to, ~record,
    "07EA001" , 1960 , 1960 , "QMS",
    "07EA001" , 1961 , 1970 , "QMC",
    "07EA001" , 1971 , 1971 , "QMM",
    "07EA001" , 1972 , 1976 , "QMC",
    "07EA001" , 1977 , 1983 , "QRC"
)

result <- input %>%
    rowwise() %>%
    do(tibble(station = .$station,
              year = seq(.$from, .$to),
              record = .$record)
    )
###

In a bit more 'base R' mode I did

  input$year <- with(input, Map(seq, from, to))
  res0 <- with(input, Map(data.frame, station=station, year=year, record=record))
  as_tibble(do.call(rbind, unname(res0)))

resulting in

  > as_tibble(do.call(rbind, unname(res0)))
  # A tibble: 24 x 3
     station year record
   1 07EA001 1960 QMS
   2 07EA001 1961 QMC
   3 07EA001 1962 QMC
   4 07EA001 1963 QMC
   5 07EA001 1964 QMC
   6 07EA001 1965 QMC
   7 07EA001 1966 QMC
   8 07EA001 1967 QMC
   9 07EA001 1968 QMC
  10 07EA001 1969 QMC
  # ... with 14 more rows

I thought I should have been able to use `tibble` in the second step, but that leads to a (cryptic) error

  > res0 <- with(input, Map(tibble, station=station, year=year, record=record))
  Error in captureDots(strict = `__quosured`) :
    the argument has already been evaluated

The 'station' and 'record' columns are factors, so different from the original input, but this seems the appropriate data type for these columns.

It's interesting to compare the 'specialized' knowledge needed for each approach -- rowwise(), do(), .$ for tidyverse, with(), do.call(), maybe rbind() and Map() for base R.

Martin

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it. On Sun, Nov 26, 2017 at 2:10 PM, Bert Gunter wrote: To David W.'s point about lack of a suitable reprex ("reproducible example"), Bill's solution seems to be for only one station. Here is a reprex and modification that I think does what was requested for multiple stations, again using base R and data frames, not dplyr and tibbles. First the reprex with **two** stations: d <- data.frame( station = rep(c("one","two"),c(5,4)), from = c(60,61,71,72,76,60,65,82,83), to = c(60,70,71,76,83,64, 81, 82,83), record = c("A","B","C","B","D","B","B","D","E")) d station from to record 1 one 60 60 A 2 one 61 70 B 3 one 71 71 C 4 one 72 76 B 5 one 76 83 D 6 two 60 64 B 7 two 65 81 B 8 two 82 82 D 9 two 83 83 E ## Now the conversion code using base R, especially by(): out <- by(d, d$station, function(x) with(x, { + i <- to - from +1 + data.frame(YEAR =sequence(i) -1 +rep(from,i), RECORD =rep(record,i)) + })) out <- data.frame(station = rep(names(out),sapply(out,nrow)),do.call(rbind,out), row.names = NULL) out station YEAR RECORD 1 one 60 A 2 one 61 B 3 one 62 B 4 one 63 B 5 one 64 B 6 one 65 B 7 one 66 B 8 one 67 B 9 one 68 B 10 one 69 B 11 one 70 B 12 one 71 C 13 one 72 B 14 one 73 B 15 one 74 B 16 one 75 B 17 one 76 B 18 one 76 D 19 one 77 D 20 one 78 D 21 one 79 D 22 one 80 D 23 one 81 D 24 one 82 D 25 one 83 D 26 two 60 B 27 two 61 B 28 two 62 B 29 two 63 B 30 two 64 B 31 two 65 B 32 two 66 B 33 two 67 B 34 two 68 B 35 two 69 B 36 two 70 B 37 two 71 B 38 two 72 B 39 two 73 B 40 two
Re: [R] dplyr - add/expand rows
On 11/26/2017 08:42 PM, jim holtman wrote:

try this:

##
library(dplyr)

input <- tribble(
    ~station, ~from, ~to, ~record,
    "07EA001" , 1960 , 1960 , "QMS",
    "07EA001" , 1961 , 1970 , "QMC",
    "07EA001" , 1971 , 1971 , "QMM",
    "07EA001" , 1972 , 1976 , "QMC",
    "07EA001" , 1977 , 1983 , "QRC"
)

result <- input %>%
    rowwise() %>%
    do(tibble(station = .$station,
              year = seq(.$from, .$to),
              record = .$record)
    )
###

In a bit more 'base R' mode I did

  input$year <- with(input, Map(seq, from, to))
  res0 <- with(input, Map(data.frame, station=station, year=year, record=record))
  as_tibble(do.call(rbind, unname(res0)))

resulting in

  > as_tibble(do.call(rbind, unname(res0)))
  # A tibble: 24 x 3
     station year record
   1 07EA001 1960 QMS
   2 07EA001 1961 QMC
   3 07EA001 1962 QMC
   4 07EA001 1963 QMC
   5 07EA001 1964 QMC
   6 07EA001 1965 QMC
   7 07EA001 1966 QMC
   8 07EA001 1967 QMC
   9 07EA001 1968 QMC
  10 07EA001 1969 QMC
  # ... with 14 more rows

I thought I should have been able to use `tibble` in the second step, but that leads to a (cryptic) error

  > res0 <- with(input, Map(tibble, station=station, year=year, record=record))
  Error in captureDots(strict = `__quosured`) :
    the argument has already been evaluated

The 'station' and 'record' columns are factors, so different from the original input, but this seems the appropriate data type for these columns.

It's interesting to compare the 'specialized' knowledge needed for each approach -- rowwise(), do(), .$ for tidyverse, with(), do.call(), maybe rbind() and Map() for base R.

Martin

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Sun, Nov 26, 2017 at 2:10 PM, Bert Gunter wrote:

To David W.'s point about lack of a suitable reprex ("reproducible example"), Bill's solution seems to be for only one station.

Here is a reprex and modification that I think does what was requested for multiple stations, again using base R and data frames, not dplyr and tibbles.
First the reprex with **two** stations: d <- data.frame( station = rep(c("one","two"),c(5,4)), from = c(60,61,71,72,76,60,65,82,83), to = c(60,70,71,76,83,64, 81, 82,83), record = c("A","B","C","B","D","B","B","D","E")) d station from to record 1 one 60 60 A 2 one 61 70 B 3 one 71 71 C 4 one 72 76 B 5 one 76 83 D 6 two 60 64 B 7 two 65 81 B 8 two 82 82 D 9 two 83 83 E ## Now the conversion code using base R, especially by(): out <- by(d, d$station, function(x) with(x, { +i <- to - from +1 +data.frame(YEAR =sequence(i) -1 +rep(from,i), RECORD =rep(record,i)) + })) out <- data.frame(station = rep(names(out),sapply(out,nrow)),do.call(rbind,out), row.names = NULL) out station YEAR RECORD 1 one 60 A 2 one 61 B 3 one 62 B 4 one 63 B 5 one 64 B 6 one 65 B 7 one 66 B 8 one 67 B 9 one 68 B 10 one 69 B 11 one 70 B 12 one 71 C 13 one 72 B 14 one 73 B 15 one 74 B 16 one 75 B 17 one 76 B 18 one 76 D 19 one 77 D 20 one 78 D 21 one 79 D 22 one 80 D 23 one 81 D 24 one 82 D 25 one 83 D 26 two 60 B 27 two 61 B 28 two 62 B 29 two 63 B 30 two 64 B 31 two 65 B 32 two 66 B 33 two 67 B 34 two 68 B 35 two 69 B 36 two 70 B 37 two 71 B 38 two 72 B 39 two 73 B 40 two 74 B 41 two 75 B 42 two 76 B 43 two 77 B 44 two 78 B 45 two 79 B 46 two 80 B 47 two 81 B 48 two 82 D 49 two 83 E Cheers, Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Sat, Nov 25, 2017 at 4:49 PM, William Dunlap via R-help < r-help@r-project.org> wrote: dplyr may have something for this, but in base R I think the following does what you want. I've shortened the name of your data set to 'd'. 
  i <- rep(seq_len(nrow(d)), d$YEAR_TO-d$YEAR_FROM+1)
  j <- sequence(d$YEAR_TO-d$YEAR_FROM+1)
  transform(d[i,], YEAR=YEAR_FROM+j-1, YEAR_FROM=NULL, YEAR_TO=NULL)

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Sat, Nov 25, 2017 at 11:18 AM, Hutchinson, David (EC) < david.hutchin...@canada.ca> wrote:

I have a ret
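For readers following the thread, here is a self-contained sketch of the base R row-expansion idiom, applied to Bert's two-station example with its `from`/`to` column names rather than the original poster's YEAR_FROM/YEAR_TO. The intermediate variable `n` and the final row-name reset are additions for readability, not part of any posted solution.

```r
## Bert's two-station example data.
d <- data.frame(
    station = rep(c("one", "two"), c(5, 4)),
    from = c(60, 61, 71, 72, 76, 60, 65, 82, 83),
    to   = c(60, 70, 71, 76, 83, 64, 81, 82, 83),
    record = c("A", "B", "C", "B", "D", "B", "B", "D", "E"),
    stringsAsFactors = FALSE
)

n <- d$to - d$from + 1           # number of years covered by each input row
i <- rep(seq_len(nrow(d)), n)    # repeat each row index once per year
j <- sequence(n)                 # 1, 2, ..., n within each input row

## Replicate the rows, compute the year, and drop the range columns.
out <- transform(d[i, ], year = from + j - 1, from = NULL, to = NULL)
rownames(out) <- NULL
head(out)
```

The whole expansion is three vectorized calls, which is why this approach scaled well in the benchmarks discussed in the thread; the `+ 1` / `- 1` pair is the part that is easy to get wrong.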
Re: [R] R_LIBS_USER not in libPaths
On 09/16/2017 11:29 AM, Rene J Suarez-Soto wrote:
> I have not intentionally set R_LIBS_USER. I looked for an Renviron.site file but did not see it in R/etc or my home directory. The strange part is that if I print Sys.getenv() I see a value for R_LIBS_USER. However, this directory is not showing under .libPaths(). I thought .libPaths() should contain R_LIBS_USER.

If the directory pointed to by R_LIBS_USER does not exist, then .libPaths() will not contain it. This is documented in ?.libPaths or ?R_LIBS_USER:

    Only directories which exist at the time will be included.

The file in the user home directory is .Renviron, rather than Renviron.site. This is documented at, e.g., ?Renviron:

    The name of the user file can be specified by the 'R_ENVIRON_USER' environment variable; if this is unset, the files searched for are '.Renviron' in the current or in the user's home directory (in that order).

R environment variables are set when R starts; I can discover these, on Linux, by running the relevant shell command through R CMD:

    $ env|grep "^R_"
    $

(i.e., no output) versus

    $ R CMD env|grep "^R_"
    R_UNZIPCMD=/usr/bin/unzip
    ...

Generally, ?Startup describes the startup process, and most variables are described in R via ?R_...

Martin

> I also noticed that R related variables are not in the system or user variables because I don't see them when I type SET from the Windows command line. So a related question is where does R get the system variables (e.g., R_LIBS_USER, R_HOME) if I don't see a Renviron.site file.
>
> Thanks
>
> On Sep 16, 2017 10:45 AM, "Henrik Bengtsson" wrote:
>> I'm not sure I follow what the problem is. Are you trying to set R_LIBS_USER but R does not acknowledge it, or do you observe something in R that you didn't expect to be there and you are trying to figure out why that is / where that happens?
>>
>> Henrik
>>
>> On Sep 16, 2017 07:10, "Rene J Suarez-Soto" wrote:
>>> I have a computer where R_LIBS_USER is not found in .libPaths(). This is for Windows (x64). I ran R from the command line, RGui and RStudio and I get the same results. I also ran R --vanilla and I still get the discrepancy.
>>>
>>> The only thing I found interesting was that I also ran SET from the command line and the "R related variables" (e.g., R_HOME; R_LIBS_USER) are not there. Therefore these variables are being set when I start R. I have not been able to track down where R obtains the value for these.
>>>
>>> Aside from looking at http://stat.ethz.ch/R-manual/R-patched/library/base/html/Startup.html I am not sure I have much more information that I have found useful.
>>>
>>> Thanks
>>> R
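A minimal sketch of the check described in this thread, assuming only base R. The helper name `user_lib_status` is hypothetical (it is not part of R); it simply reports, for each directory named in R_LIBS_USER, whether that directory exists, which is the condition for it to appear in .libPaths().

```r
# Hypothetical helper: list the directories named by R_LIBS_USER and
# report whether each one exists on disk.
user_lib_status <- function(path = Sys.getenv("R_LIBS_USER")) {
    # R_LIBS_USER may hold several directories separated by the
    # platform's path separator (":" on Unix-alikes, ";" on Windows).
    dirs <- strsplit(path, .Platform$path.sep, fixed = TRUE)[[1]]
    dirs <- path.expand(dirs[nzchar(dirs)])
    data.frame(dir = dirs, exists = dir.exists(dirs),
               stringsAsFactors = FALSE)
}

user_lib_status()

# Not run: if 'exists' is FALSE, creating the directory and restarting
# R should make it show up in .libPaths().
# dir.create(Sys.getenv("R_LIBS_USER"), recursive = TRUE)
```

Calling `user_lib_status()` with no argument inspects the running session; passing a path explicitly makes it easy to try on other values.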
Re: [R] How to add make option to package compilation?
On 09/15/2017 08:57 AM, Michael Dewey wrote:
> In line
>
> On 15/09/2017 13:30, Martin Møller Skarbiniks Pedersen wrote:
>> On 15 September 2017 at 14:13, Duncan Murdoch wrote:
>>> On 15/09/2017 8:11 AM, Martin Møller Skarbiniks Pedersen wrote:
>>>> Hi,
>>>>
>>>> I am installing a lot of packages to a new R installation and it takes a long time. However the machine has 4 CPUs and most of the packages are written in C/C++.
>>>>
>>>> So is it possible to add a -j4 flag to the make command when I use the install.packages() function? That will probably speed up the package installation process 390%.
>>>
>>> See the Ncpus argument in ?install.packages.
>>
>> Thanks.
>>
>> However it looks like Ncpus=4 tries to compile four R packages at the same time, using one CPU for each package.

The variable MAKE is defined in ${R_HOME}/etc/Renviron, and can be overridden in ~/.Renviron

    MAKE=make -j

There is further discussion in https://cran.r-project.org/doc/manuals/r-release/R-admin.html#Configuration-variables and ?Renviron.

One could configure a source installation to always compile with make -j, something like

    ./configure MAKE="make -j"

Martin

> But you said you had lots to install so would that not speed things up too?
>
>> From the documentation:
>>
>> "Ncpus: the number of parallel processes to use for a parallel install of more than one source package. Values greater than one are supported if the ‘make’ command specified by ‘Sys.getenv("MAKE", "make")’ accepts argument ‘-k -j Ncpus’"
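As a hedged illustration of the MAKE setting discussed in this thread, the sketch below sets MAKE for the current session only instead of editing ~/.Renviron. It assumes that the installed etc/Renviron keeps an already-set MAKE value, which I have not verified on every platform. The helper `make_with_jobs` and the package name `somePackage` are hypothetical.

```r
# Hypothetical helper: build a value for MAKE that asks make to run
# `jobs` compile jobs in parallel.
make_with_jobs <- function(jobs = 2L) {
    stopifnot(is.numeric(jobs), length(jobs) == 1L, jobs >= 1)
    sprintf("make -j%d", as.integer(jobs))
}

# Remember the current value so it can be restored afterwards.
old <- Sys.getenv("MAKE", unset = NA)
Sys.setenv(MAKE = make_with_jobs(4L))
Sys.getenv("MAKE")

# Not run: child processes started by install.packages() inherit the
# environment of the running R session.
# install.packages("somePackage", type = "source")

# Restore the previous state.
if (is.na(old)) Sys.unsetenv("MAKE") else Sys.setenv(MAKE = old)
```

This is complementary to Ncpus: Ncpus parallelises across packages, while a -j value in MAKE parallelises compilation within one package.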
Re: [R] Error in readRDS(dest) (was Re: Error with installed.packages with R 3.4.0 on Windows)
On 05/31/2017 04:38 AM, Patrick Connolly wrote:
> On Tue, 23-May-2017 at 12:20PM +0200, Martin Maechler wrote:
>
> [...]
>
> |> Given the above stack trace.
> |> It may be easier to just do
> |>
> |>     debugonce(available.packages)
> |>     install.packages("withr")
> |>
> |> and then inside available.packages, (using 'n') step to the
> |> point _before_ the tryCatch(...) call happens; there, e.g. use
> |>
> |>     ls.str()
> |>
> |> which gives an str() of all your local objects, notably 'dest'
> |> and 'method'.
> |> but you can also try other things once inside
> |> available.packages().
>
> I couldn't see any differences between R-3.3.3 (which works) and R-3.4.0 (which doesn't) until I got to here, a few lines before the download.file line:
>
>     Browse[2]>
>     debug: dest <- file.path(tempdir(), paste0("repos_", URLencode(repos, TRUE), ".rds"))
>     Browse[2]>
>
> When I check out those directories in a terminal, there's a big difference:
>
> With R-3.4.0
>
>     ~ > ll /tmp/RtmpFUhtpY
>     total 4
>     drwxr-xr-x 2 hrapgc hrapgc 4096 May 31 10:45 downloaded_packages/
>     -rw-r--r-- 1 hrapgc hrapgc    0 May 31 10:56 repos_http%3A%2F%2Fcran.stat.auckland.ac.nz%2Fsrc%2Fcontrib.rds

The file repos_http%3A%2F%2Fcran.stat.auckland.ac.nz%2Fsrc%2Fcontrib.rds was likely created earlier in your R session. Likely the download a few lines down

    download.file(url = paste0(repos, "/PACKAGES.rds"), destfile = dest,
                  method = method, cacheOK = FALSE, quiet = TRUE, mode = "wb")

'succeeded' but created a zero-length file. You could try to troubleshoot this with something like the following, downloading to a temporary location

    dest = tempfile()
    url = "http://cran.stat.auckland.ac.nz/src/contrib/PACKAGES.rds"
    download.file(url, dest)
    file.size(dest)

If this succeeds (it should download a file of several hundred KB), then try adding the options method, cacheOK, quiet, mode to the download.file() call. 'method' can be determined when you are in available.packages while debugging; if R says that it is missing, then it will be assigned, in download.file, to either getOption("download.file.method") or (if the option is NULL or "auto") "libcurl".

If the download 'succeeds' but the temporary file created is 0 bytes, then it would be good to share the problematic command with us.

Martin Morgan

> With R-3.3.3
>
>     ~ > ll /tmp/RtmpkPgL3A
>     total 380
>     drwxr-xr-x 2 hrapgc hrapgc   4096 May 31 11:01 downloaded_packages/
>     -rw-r--r-- 1 hrapgc hrapgc   8214 May 31 11:01 libloc_185_3165c7f52d5fdf96.rds
>     -rw-r--r-- 1 hrapgc hrapgc 372263 May 31 11:01 repos_http%3A%2F%2Fcran.stat.auckland.ac.nz%2Fsrc%2Fcontrib.rds
>
> So, if I could figure out what makes *that* difference I could get somewhere. I see there's considerably extra code in the newer of the two versions of available.packages() but being a bear with a small brain, I can't figure out what differences should be expected.
>
> I have no idea what populates those 'dest' directories.
>
> TIA
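The zero-length cache file described in this thread can be detected without touching the network. The sketch below is an assumption-laden illustration: `check_index_file` is a hypothetical helper name, and the two temporary files merely stand in for the 'dest' file that available.packages() writes.

```r
# Hypothetical helper: classify a cached repository index file.
check_index_file <- function(dest) {
    if (!file.exists(dest)) return("missing")
    if (file.size(dest) == 0) return("empty")
    # A non-empty file can still be corrupt, so try to read it.
    ok <- tryCatch({ readRDS(dest); TRUE }, error = function(e) FALSE)
    if (ok) "readable" else "unreadable"
}

# Local stand-ins: one empty file, one valid .rds file.
empty <- tempfile(fileext = ".rds")
file.create(empty)
good <- tempfile(fileext = ".rds")
saveRDS(matrix(1:4, 2), good)

check_index_file(empty)
check_index_file(good)
```

While stopped in the debugger inside available.packages(), calling the helper on 'dest' would show whether the cached file is the empty one that makes readRDS(dest) fail.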
Re: [R] Error with installed.packages with R 3.4.0 on Windows
On 05/22/2017 05:10 AM, Patrick Connolly wrote:
> On Fri, 28-Apr-2017 at 07:04PM +0200, peter dalgaard wrote:
>
> |> > On 28 Apr 2017, at 12:08 , Duncan Murdoch wrote:
> |> >
> |> > On 28/04/2017 4:45 AM, Thierry Onkelinx wrote:
> |> >> Dear Peter,
> |> >>
> |> >> It actually breaks install.packages(). So it is not that innocent.
> |> >
> |> > I don't think he meant that it is harmless, he meant that the fix is easy, and is in place in R-patched and R-devel. You should use R-patched and you won't have the problem.
> |>
> |> Read more carefully: I said that the _fix_ is harmless for this case, but might not be so in general.
> |>
> |> -pd
>
> Apparently it isn't harmless.
>
>     install.packages("withr")
>     Error in readRDS(dest) : error reading from connection

That seems like a plain-old network connectivity issue, or perhaps an issue with the CRAN mirror you're using. Can you debug on your end, e.g.,

    options(error=recover)
    install.packages("withr")
    ...

then select the 'frame' where the error occurs, look around

    ls()

find the value of 'dest', and, e.g., try to open dest in your browser.

Martin Morgan

>     sessionInfo()
>     R version 3.4.0 Patched (2017-05-19 r72713)
>     Platform: x86_64-pc-linux-gnu (64-bit)
>     Running under: Ubuntu 14.04.5 LTS
>
>     Matrix products: default
>     BLAS: /home/hrapgc/local/R-patched/lib/libRblas.so
>     LAPACK: /home/hrapgc/local/R-patched/lib/libRlapack.so
>
>     locale:
>      [1] LC_CTYPE=en_NZ.UTF-8       LC_NUMERIC=C
>      [3] LC_TIME=en_NZ.UTF-8        LC_COLLATE=en_NZ.UTF-8
>      [5] LC_MONETARY=en_NZ.UTF-8    LC_MESSAGES=en_NZ.UTF-8
>      [7] LC_PAPER=en_NZ.UTF-8       LC_NAME=C
>      [9] LC_ADDRESS=C               LC_TELEPHONE=C
>     [11] LC_MEASUREMENT=en_NZ.UTF-8 LC_IDENTIFICATION=C
>
>     attached base packages:
>     [1] grDevices utils stats graphics methods base
>
>     other attached packages:
>     [1] lattice_0.20-35
>
>     loaded via a namespace (and not attached):
>     [1] compiler_3.4.0 tools_3.4.0 grid_3.4.0
>
> Has anyone a workaround?
Re: [R] renameSeqlevels
Rsamtools and GenomicAlignments are Bioconductor packages so ask on the Bioconductor support site

    https://support.bioconductor.org

You cannot rename the seqlevels in the bam file; you could rename the seqlevels in the object(s) you have created from the bam file.

Martin

On 02/14/2017 09:17 AM, Teresa Tavella wrote:
> Dear all,
>
> I would like to ask if it is possible to change the seqnames of a bam file by giving a character vector to the function renameSeqlevels. This is because in order to use the function summarizeOverlaps or count/find, the seqnames have to match.
>
> From the bamfile below I have extracted the locus annotations from the seqnames (i.e. ERCC2, NC_001133.9...etc) and I have created a list (same length as the seqlevels of the bam file).
>
> *bamfile*
>
>     GAlignments object with 6 alignments and 0 metadata columns:
>                                                                seqnames
>     [1] DQ459430_gene=ERCC2_loc:ERCC2|1-1061|+_exons:1-1061_segs:1-1061
>     [2] DQ459430_gene=ERCC2_loc:ERCC2|1-1061|+_exons:1-1061_segs:1-1061
>     [3] DQ459430_gene=ERCC2_loc:ERCC2|1-1061|+_exons:1-1061_segs:1-1061
>     [4] DQ459430_gene=ERCC2_loc:ERCC2|1-1061|+_exons:1-1061_segs:1-1061
>     [5] DQ459430_gene=ERCC2_loc:ERCC2|1-1061|+_exons:1-1061_segs:1-1061
>     [6] DQ459430_gene=ERCC2_loc:ERCC2|1-1061|+_exons:1-1061_segs:1-1061
>         strand   cigar qwidth start  end width njunc
>     [1]      + 8M2D27M     35  1025 1061    37     0
>     [2]      + 8M2D27M     35  1025 1061    37     0
>     [3]      -     36M     36  1025 1060    36     0
>     [4]      -     36M     36  1026 1061    36     0
>     [5]      +     35M     35  1027 1061    35     0
>     [6]      +     35M     35  1027 1061    35     0
>     ---
>
> *gffile*
>
>     GRanges object with 6 ranges and 12 metadata columns:
>            seqnames           ranges strand |       source type score
>     [1] NC_001133.9 [ 24837,  25070]      + | s_cerevisiae exon
>     [2] NC_001133.9 [ 25048,  25394]      + | s_cerevisiae exon
>     [3] NC_001133.9 [ 27155,  27786]      + | s_cerevisiae exon
>     [4] NC_001133.9 [ 73431,  73792]      + | s_cerevisiae exon
>     [5] NC_001133.9 [165314, 165561]      + | s_cerevisiae exon
>     [6] NC_001133.9 [165388, 165781]      + | s_cerevisiae exon
>         phase gene_id transcript_id exon_number gene_name
>     [1]       XLOC_40    TCONS_0191           1      FLO9
>     [2]       XLOC_40    TCONS_0192           1      FLO9
>     [3]       XLOC_41    TCONS_0193           1      FLO9
>     [4]       XLOC_55    TCONS_0200           1 YAL037C-A
>     [5]       XLOC_75    TCONS_0100           1   YAR010C
>     [6]       XLOC_75    TCONS_0219           1   YAR010C
>                                            oId nearest_ref class_code
>     [1]   {TRINITY_GG_normal}16_c1_g1_i1.mrna1        rna8          x
>     [2]   {TRINITY_GG_normal}16_c0_g1_i1.mrna1        rna8          x
>     [3]   {TRINITY_GG_normal}12_c0_g1_i1.mrna1        rna8          x
>     [4]    {TRINITY_GG_normal}3_c3_g1_i1.mrna1       rna31          x
>     [5] {TRINITY_GG_normal}3479_c0_g1_i1.mrna1       rna77          x
>     [6]   {TRINITY_GG_normal}24_c0_g1_i1.mrna1       rna77          x
>         tss_id
>     [1]  TSS42
>     [2]  TSS43
>     [3]  TSS44
>     [4]  TSS71
>     [5] TSS118
>     [6] TSS118
>     ---
>
> Is it possible to replace the seqlevels names with the list? I have tried:
>
>     bamfile1 <- renameSeqlevels(seqlevels(bamfile), listx)
>
> Thank you for any advice,
>
> Kind regards,
> Teresa
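A small base-R sketch of how the old-to-new name mapping discussed in this thread could be built, offered as an illustration rather than a tested Bioconductor workflow. The helper `short_name` is hypothetical, and the regular expression assumes that the gene symbol sits between "_gene=" and "_loc:" and contains no underscore. The final call is left commented out because it needs the Bioconductor packages and the asker's object; note that it is applied to the object itself, not to seqlevels(bamfile).

```r
# A long sequence name as shown in the GAlignments output of the question.
long_names <- c(
    "DQ459430_gene=ERCC2_loc:ERCC2|1-1061|+_exons:1-1061_segs:1-1061"
)

# Hypothetical helper: extract the gene symbol that follows "_gene=".
short_name <- function(x) sub("^.*_gene=([^_]+)_loc:.*$", "\\1", x)

# Named character vector: names are the old levels, values the new ones.
new_levels <- setNames(short_name(long_names), long_names)
new_levels

# Not run (requires Bioconductor and the GAlignments object 'bamfile'):
# bamfile1 <- renameSeqlevels(bamfile, new_levels)
```

In practice `long_names` would come from `seqlevels(bamfile)`, so that every level of the object has an entry in the mapping.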
Re: [R] Using a mock of an S4 class
On 02/01/2017 02:46 PM, Ramiro Barrantes wrote:
> Hello,
>
> I have a function that applies to an S4 object which contains a slot called @analysis:
>
>     calculation <- function(myObject) {
>         tmp <- myObject@analysis
>         result <- ...operations on analysis...
>         return(result)
>     }
>
> I am writing a unit test for this function. So I was hoping to create a mock object but I can't figure out how to do it:
>
>     test_that("test calculation function", {
>         mockMyObject <- mock(?)  # I am not sure what to put here
>         r <- calculation(mockMyObject)
>         expect_equal(r, 0.83625)
>     })
>
> How can I create a mock S4 object??

I don't know of a convenient way to create a mock with functionality like mocks in other languages. But here's a class

    .A = setClass("A", contains="integer")

This creates an instance that might be used as a mock

    mock = .A()   # same as new("A")

but maybe you have an initialize method (initialize methods are very tricky to get correct, and many people avoid them, using plain-old-functions to form an API around object creation; the plain-old-function finishes by calling the constructor .A() or new("A")) that has side effects that are inappropriate for your test, mimicked here with stop()

    setMethod("initialize", "A", function(.Object, ...) stop("oops"))

our initial attempts are thwarted

    > .A()
    Error in initialize(value, ...) : oops

but we could reach into our bag of hacks and try

    mock = .Call(methods:::C_new_object, getClassDef("A"))

You would still need to populate slots / data used in your test, e.g.,

    slot(mock, ".Data") = 1:4

This is robust to any validity method, since the validity method is not invoked on direct slot assignment

    setValidity("A", function(object) {
        if (all(object > 0)) TRUE else "oops2"
    })

    slot(mock, ".Data") = 0:4   # still works

So something like

    mockS4object = function(class, ..., where=topenv(parent.frame())) {
        obj <- .Call(
            methods:::C_new_object, getClassDef(class, where=where)
        )

        args = list(...)
        for (nm in names(args))
            slot(obj, nm) = args[[nm]]

        obj
    }

    mockS4object("A", .Data=1:4)

Mock objects typically have useful testing properties, like returning the number of times a slot (field) is accessed. Unfortunately, I don't have anything to offer for that.

Martin

> Thanks in advance,
> Ramiro
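To connect the mockS4object() idea from this thread back to the original question, here is a hedged, self-contained sketch. The class `MyObject`, its numeric `analysis` slot, and the body of `calculation()` are all hypothetical stand-ins, since the real class was not shown. It also relies on `methods:::C_new_object`, which is internal to R and could change between versions.

```r
library(methods)

# Hypothetical class standing in for the asker's object; only the
# 'analysis' slot is modelled.
setClass("MyObject", representation(analysis = "numeric"))

# An initialize method whose side effect is unwanted in a unit test.
setMethod("initialize", "MyObject", function(.Object, ...) stop("oops"))

# Hypothetical function under test.
calculation <- function(myObject) mean(myObject@analysis)

# Create an instance without running initialize(), then fill the slots.
mockS4object <- function(class, ..., where = topenv(parent.frame())) {
    obj <- .Call(methods:::C_new_object, getClassDef(class, where = where))
    args <- list(...)
    for (nm in names(args)) slot(obj, nm) <- args[[nm]]
    obj
}

mock <- mockS4object("MyObject", analysis = c(0.5, 1))
calculation(mock)
```

The same `mock` object could then be passed to `calculation()` inside a `test_that()` block and compared against the expected value with a tolerance-aware expectation.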
Re: [R] Error In DESeq installation
On 10/23/2016 10:13 PM, Yogesh Gupta wrote:
> Dear All,
>
> I am getting an error in DESeq installation in R.
>
>     package ‘DESeq’ is not available (for R version 3.3.1)
>
>     source("http://www.Bioconductor.org/biocLite.R")
>     Bioconductor version 3.4 (BiocInstaller 1.24.0), ?biocLite for help
>
>     biocLite("BiocUpgrade")
>     Error: Bioconductor version 3.4 cannot be upgraded with R version 3.3.1
>
> Can you suggest how I can resolve it?

Ask questions about Bioconductor packages on the Bioconductor support forum

    https://support.bioconductor.org

DESeq was replaced by DESeq2, but is still available; provide (on the Bioconductor support site) the complete output of the installation attempt and sessionInfo().

'BiocUpgrade' is to update to a more recent version of Bioconductor. There is a 'devel' version that is more recent than 3.4, but it requires R-devel.

Martin

> Thanks
> Yogesh
>
> *Yogesh Gupta*
> *Postdoctoral Researcher*
> *Department of Biological Science*
> *Seoul National University*
> *Seoul, South Korea*
> web) http://biosci.snu.ac.kr/jiyounglee
> *Cell No. +82-10-6453-0716*
Re: [R] Faster Subsetting
On 09/28/2016 02:53 PM, Hervé Pagès wrote:
> Hi,
>
> I'm surprised nobody suggested split(). Splitting the data.frame upfront is faster than repeatedly subsetting it:
>
>     tmp <- data.frame(id = rep(1:2, each = 10), foo = rnorm(20))
>     idList <- unique(tmp$id)
>
>     system.time(for (i in idList) tmp[which(tmp$id == i),])
>     #   user  system elapsed
>     # 16.286   0.000  16.305
>
>     system.time(split(tmp, tmp$id))
>     #   user  system elapsed
>     #  5.637   0.004   5.647

an odd speed-up is to provide (non-sequential) row names, e.g.,

    > system.time(split(tmp, tmp$id))
       user  system elapsed
      4.472   0.648   5.122
    > row.names(tmp) = rev(seq_len(nrow(tmp)))
    > system.time(split(tmp, tmp$id))
       user  system elapsed
      0.588   0.000   0.587

for reasons explained here

    http://stackoverflow.com/questions/39545400/why-is-split-inefficient-on-large-data-frames-with-many-groups/39548316#39548316

Martin

> Cheers,
> H.
>
> On 09/28/2016 09:09 AM, Doran, Harold wrote:
>> I have an extremely large data frame (~13 million rows) that resembles the structure of the object tmp below in the reproducible code. In my real data, the variable 'id' may or may not be ordered, but I think that is irrelevant.
>>
>> I have a process that requires subsetting the data by id and then running each smaller data frame through a set of functions. One example below uses indexing and the other uses an explicit call to subset(); both return the same result, but indexing is faster.
>>
>> Problem is in my real data, indexing must parse through millions of rows to evaluate the condition and this is expensive and a bottleneck in my code. I'm curious if anyone can recommend an improvement that would somehow be less expensive and faster?
>>
>> Thank you
>> Harold
>>
>>     tmp <- data.frame(id = rep(1:200, each = 10), foo = rnorm(2000))
>>
>>     idList <- unique(tmp$id)
>>
>>     ### Fast, but not fast enough
>>     system.time(replicate(500, tmp[which(tmp$id == idList[1]),]))
>>
>>     ### Not fast at all, a big bottleneck
>>     system.time(replicate(500, subset(tmp, id == idList[1])))
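For readers who want to see the split-once pattern from this thread end to end, here is a minimal sketch on the small data frame from the original question. The per-group function (a mean of `foo`) is only a placeholder assumption for whatever processing each subset really needs.

```r
set.seed(1)
tmp <- data.frame(id = rep(1:200, each = 10), foo = rnorm(2000))

# Split once, then work on each piece; tmp$id is scanned a single time
# instead of once per id.
pieces <- split(tmp, tmp$id)

# Placeholder per-group computation.
means <- vapply(pieces, function(d) mean(d$foo), numeric(1))

length(pieces)
head(means, 3)
```

The elements of `pieces` are named by id, so a single group can still be retrieved directly, for example with `pieces[["17"]]`.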
Re: [R] src/Makevars ignored ?
On 09/26/2016 07:46 AM, Eric Deveaud wrote:
> Hello,
>
> as far as I understood the R library generic compilation mechanism, compilation of C/C++ sources is controlled
>
> 1) at system level by the contents of RHOME/etc/Makeconf
> 2) at user level by the content of ~/.R/Makevars
> 3) at package level by the content of src/Makevars
>
> Problem I have is that src/Makevars is ignored.
>
> See the following example. R is compiled and uses the following CC and CFLAGS definitions
>
>     bigmess:epactsR/src > R CMD config CC
>     gcc -std=gnu99
>     bigmess:epactsR/src > R CMD config CFLAGS
>     -Wall -g
>
> so building C sources leads to the following
>
>     bigmess:epactsR/src > R CMD SHLIB index.c
>     gcc -std=gnu99 -I/local/gensoft2/adm/lib64/R/include -DNDEBUG -I/usr/local/include -fpic -Wall -g -c index.c -o index.o
>
> Normal, it uses the definitions from RHOME/etc/Makeconf.
>
> When I set up a ~/.R/Makevars that overrides the CC and CFLAGS definitions:
>
>     bigmess:epactsR/src > cat ~/.R/Makevars
>     CC=gcc
>     CFLAGS=-O3
>     bigmess:epactsR/src > R CMD SHLIB index.c
>     gcc -I/local/gensoft2/adm/lib64/R/include -DNDEBUG -I/usr/local/include -fpic -O3 -c index.c -o index.o
>     gcc -std=gnu99 -shared -L/usr/local/lib64 -o index.so index.o
>
> OK, CC and CFLAGS are honored and set according to ~/.R/Makevars.
>
> But when I try to use src/Makevars, it is ignored
>
>     bigmess:epactsR/src > cat ~/.R/Makevars
>     cat: /home/edeveaud/.R/Makevars: No such file or directory
>     bigmess:epactsR/src > cat ./Makevars
>     CC = gcc
>     CFLAGS=-O3
>     bigmess:epactsR/src > R CMD SHLIB index.c
>     gcc -std=gnu99 -I/local/gensoft2/adm/lib64/R/include -DNDEBUG -I/usr/local/include -fpic -Wall -g -c index.c -o index.o
>
> What have I missed, or is there something wrong?

Use PKG_CFLAGS instead of CFLAGS; CC cannot be changed in Makevars. See

    https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Using-Makevars

Martin Morgan

> PS I tested the same behaviour with various versions of R from R/2.15 to R/3.3
>
> best regards
>
> Eric
Re: [R] makePSOCKcluster launches different version of R
On 08/05/2016 12:07 PM, Guido Kraemer wrote:
> Hi everyone,
>
> we are running R on a Linux cluster with several R versions installed in parallel.
>
> If I run:
>
>     library(parallel)
>     cl <- makePSOCKcluster(
>         rep('nodeX', 24),
>         homogeneous = FALSE,
>         rscript = '/usr/local/apps/R/R-3.2.2/bin/Rscript'
>     )

from ?makePSOCKcluster

    'homogeneous' Logical. Are all the hosts running identical setups, so 'Rscript' can be launched using the same path on each? Otherwise 'Rscript' has to be in the default path on the workers.

    'rscript' The path to 'Rscript' on the workers, used if 'homogeneous' is true. Defaults to the full path on the master.

so homogeneous = FALSE and rscript = ... are incompatible. From your description it seems like you mean homogeneous = TRUE.

Martin

> then still R-3.0.0 gets launched on nodeX. Version 3.0.0 is the default R version, which is started when I just type R in the terminal without any further configuration.
>
> Cheers,
>
> Guido
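A hedged sketch of the homogeneous = TRUE combination discussed in this thread, run against two local workers so that it does not need a cluster. The choice of two workers on the local machine is an assumption made purely for illustration; on a real cluster the first argument would be the host names and `rscript` the path of the desired R version on the workers.

```r
library(parallel)

# Path to the Rscript that belongs to the running R session.
rscript <- file.path(R.home("bin"), "Rscript")

# Two local workers, started with an explicit Rscript path.
cl <- makePSOCKcluster(2, homogeneous = TRUE, rscript = rscript)

# Ask every worker which R version it is running.
versions <- unlist(clusterCall(cl, function() R.version.string))
stopCluster(cl)

versions
```

Checking `R.version.string` on the workers right after creating the cluster is a cheap way to confirm that the intended R version was launched.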
Re: [R] Get the location of a numeric element in a list
On 06/28/2016 03:03 AM, Mohammad Tanvir Ahamed via R-help wrote:
> Can anyone please help me. I will apply this to a very large list, about 400k vectors in a list, and the vector sizes are unequal and large.
>
> Example:
>
> Input:
>
>     a <- c(1,3,6,9,25,100)
>     b <- c(10,7,20,2,25)
>     c <- c(1,7,5,15,25,300,1000)
>     d <- list(a,b,c)
>
> Expected outcome:
>
>     # When looking for 1 in d
>     c(1,3)
>
>     # When looking for 7 in d
>     c(2,3)
>
>     # When looking for 25 in d
>     c(1,2,3)
>
>     # When looking for 50 in d
>     NULL or 0

Make a vector of queries

    queries = c(1, 7, 25, 50)

Create a factor of unlist(d), using queries as levels. Create a vector rep(seq_along(d), lengths(d)), and split it into groups defined by f

    f = factor(unlist(d, use.names=FALSE), levels=queries)
    split(rep(seq_along(d), lengths(d)), f)

Martin Morgan

> Thanks in advance !!
>
> Tanvir Ahamed
> Göteborg, Sweden | mashra...@yahoo.com
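The factor-and-split recipe from this thread, written out as a runnable sketch on the example data. The third vector is named `cc` here rather than `c`, a small change of my own to avoid shadowing the `c()` function; everything else follows the reply.

```r
a <- c(1, 3, 6, 9, 25, 100)
b <- c(10, 7, 20, 2, 25)
cc <- c(1, 7, 5, 15, 25, 300, 1000)
d <- list(a, b, cc)

queries <- c(1, 7, 25, 50)

# One factor entry per element of the flattened list; values that are
# not among the queries become NA and are dropped by split().
f <- factor(unlist(d, use.names = FALSE), levels = queries)

# For every flattened element, the index of the vector it came from.
idx <- rep(seq_along(d), lengths(d))

hits <- split(idx, f)
hits
```

For this input the result holds 1 and 3 for the query 1, 2 and 3 for the query 7, all three indices for 25, and an empty integer vector for 50, since unused factor levels are kept by split().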
Re: [R] Warning when running R - can't install packages either
Hi Jakub,

This is really a separate question. It is not really end-user related, and should be asked on the R-devel mailing list. Nonetheless, some answers below.

On 05/13/2016 03:55 PM, Jakub Jirutka wrote:
> Hi,
>
> I’m maintainer of the R package in Alpine Linux.
>
> I read in multiple places that some packages need the R_HOME variable set to the location where R is installed, so I’ve added it to the system-wide profile. Is this correct, or misinformation?

R_HOME is set when R starts

    ~$ env|grep R_HOME
    ~$ R --vanilla -e "Sys.getenv('R_HOME')"
    > Sys.getenv('R_HOME')
    [1] "/home/mtmorgan/bin/R-3-3-branch"

and (after reading the documentation in ?R_HOME in the R help system)

    ~$ R RHOME
    /home/mtmorgan/bin/R-3-3-branch

so there is no need to set it in a system-wide profile.

It is sometimes referenced inside an R package source tree that uses C or other compiled code in a Makevars file, as described in the 'Writing R Extensions' manual

    https://cran.r-project.org/doc/manuals/r-release/R-exts.html

e.g., the section on configure and cleanup

    https://cran.r-project.org/doc/manuals/r-release/R-exts.html#Configure-and-cleanup

In these circumstances it has been set by the R process that is compiling the source code.

> What system dependencies does R need to compile modules from CRAN? On Alpine the following dependencies are needed to build R: bzip2-dev curl-dev gfortran lapack-dev pcre-dev perl readline-dev xz-dev zlib-dev. Are all of these dependencies needed for compiling modules?

As you say, those look like dependencies required to build R itself.

Individual packages may have dependencies on these or other system libraries, but many packages do not have system dependencies. It is up to the package maintainer to ensure that appropriate checks are made to discover the system resource; there are probably dozens or even hundreds of system dependencies amongst all of the CRAN packages. Typically the task of satisfying those dependencies is left to the user (or to those creating distributions of R packages, e.g., https://cran.r-project.org/bin/linux/debian/)

Martin Morgan

> Jakub
>
> On 13. May 2016, at 11:31, Martin Morgan wrote:
>> On 05/12/2016 10:25 PM, Alba Pompeo wrote:
>>> Martin Morgan, I tried an HTTP mirror and it worked.
>>> What could be the problem and how to fix?
>>> Also, should I ignore the warning about ignoring environment value of R_HOME?
>>
>> It depends on why you set the value in your environment in the first place; maybe you were trying to use a particular installation of R, but setting R_HOME is not the way to do that (I use an alias, e.g., R-3.3='~/bin/R-3-3-branch/bin/R --no-save --no-restore --silent')
>>
>> Martin
>>
>>> Thanks.
>>>
>>> On Thu, May 12, 2016 at 5:59 PM, Tom Hopper wrote:
>>>> setInternet2() first thing after launching R might fix that.
>>>>
>>>> On May 12, 2016, at 07:45, Alba Pompeo wrote:
>>>>> Hello.
>>>>>
>>>>> I've tried to run R, but I receive many warnings and can't do simple stuff such as installing packages.
>>>>>
>>>>> Here's the full log when I run it.
>>>>>
>>>>> http://pastebin.com/raw/2BkNpTte
>>>>>
>>>>> Does anyone know what could be wrong here?
>>>>>
>>>>> Thanks a lot.
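A minimal sketch, using only base R, of the point made in this thread that R_HOME is available inside a running session without being exported in a system-wide profile. The variable names `home_env` and `home_fun` are my own.

```r
# R_HOME as seen from inside a running R session.
home_env <- Sys.getenv("R_HOME")

# The same location as reported by R itself.
home_fun <- R.home()

home_env
home_fun

# The reported home directory should exist on disk.
dir.exists(home_fun)
```

Package build scripts that need the location can call `R RHOME` from the shell or `R.home()` from R, rather than relying on a value set by the user.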
Re: [R] Warning when running R - can't install packages either
On 05/12/2016 10:25 PM, Alba Pompeo wrote:
> Martin Morgan, I tried an HTTP mirror and it worked.
> What could be the problem and how to fix?
> Also, should I ignore the warning about ignoring environment value of R_HOME?

It depends on why you set the value in your environment in the first place; maybe you were trying to use a particular installation of R, but setting R_HOME is not the way to do that (I use an alias, e.g., R-3.3='~/bin/R-3-3-branch/bin/R --no-save --no-restore --silent')

Martin

> Thanks.
>
> On Thu, May 12, 2016 at 5:59 PM, Tom Hopper wrote:
>> setInternet2() first thing after launching R might fix that.
>>
>> On May 12, 2016, at 07:45, Alba Pompeo wrote:
>>> Hello.
>>>
>>> I've tried to run R, but I receive many warnings and can't do simple stuff such as installing packages.
>>>
>>> Here's the full log when I run it.
>>>
>>> http://pastebin.com/raw/2BkNpTte
>>>
>>> Does anyone know what could be wrong here?
>>>
>>> Thanks a lot.
Re: [R] Warning when running R - can't install packages either
On 05/12/2016 10:25 PM, Alba Pompeo wrote: Martin Morgan, I tried an HTTP mirror and it worked. What could be the problem and how to fix? The problem is in the warning message 1: In download.file(url, destfile = f, quiet = TRUE) : URL 'https://cran.r-project.org/CRAN_mirrors.csv': status was 'Problem with the SSL CA cert (path? access rights?)' and an easier way to reproduce / troubleshoot the problem is download.file("https://cran.r-project.org/CRAN_mirrors.csv";, tempfile()) The details of this process are described in ?download.file. My guess would be that you have 'libcurl' available > capabilities()["libcurl"] libcurl TRUE that it supports https (mine does, in the protocol attribute): > libcurlVersion() [1] "7.35.0" attr(,"ssl_version") [1] "OpenSSL/1.0.1f" attr(,"libssh_version") [1] "" attr(,"protocols") [1] "dict" "file" "ftp""ftps" "gopher" "http" "https" "imap" [9] "imaps" "ldap" "ldaps" "pop3" "pop3s" "rtmp" "rtsp" "smtp" [17] "smtps" "telnet" "tftp" and that you have outdated or other CA certificates problem, with some hints for troubleshooting in the first and subsequent paragraphs of the 'Secure URL' section. Martin Morgan Also, should I ignore the warning about ignoring environment value of R_HOME? Thanks. On Thu, May 12, 2016 at 5:59 PM, Tom Hopper wrote: setInternet2() first thing after launching R might fix that. On May 12, 2016, at 07:45, Alba Pompeo wrote: Hello. I've tried to run R, but I receive many warnings and can't do simple stuff such as installing packages. Here's the full log when I run it. http://pastebin.com/raw/2BkNpTte Does anyone know what could be wrong here? Thanks a lot. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] Warning when running R - can't install packages either
On 05/12/2016 07:45 AM, Alba Pompeo wrote:
> Hello. I've tried to run R, but I receive many warnings and can't do simple stuff such as installing packages. Here's the full log when I run it.
> http://pastebin.com/raw/2BkNpTte
> Does anyone know what could be wrong here?

Do you have any success when choosing a non-https mirror, #28 in your screenshot?

Martin Morgan

> Thanks a lot.
Re: [R] S4 non-virtual class with no slots?
On 04/22/2016 04:38 PM, Boylan, Ross wrote:
> It seems that if an S4 class has no slots it can't be instantiated because it is assumed to be virtual. Is there a way around this other than adding a do-nothing slot? A singleton would be OK, though is not essential.
>
> Problem:
>
>     EmptyFitResult <- setClass("EmptyFitResult", representation=representation())
>     # also tried it without the second argument. same result.
>     > e <- EmptyFitResult()
>     Error in new("EmptyFitResult", ...) :
>       trying to generate an object from a virtual class ("EmptyFitResult")
>
> This in R 3.1.1.
>
> Context: I fit simulated data; in some simulations none survive to the second stage of fitting. So I just need a way to record that this happened, in a way that integrates with my other non-null results.

A not too artificial solution is to create a base class, with derived classes corresponding to stateless or stateful conditions

    Base = setClass("Base"); A = setClass("A", contains="Base"); A()

Martin Morgan

> Thanks.
> Ross Boylan

This email message may contain legally privileged and/or confidential information. If you are not the intended recipient(s), or the employee or agent responsible for the delivery of this message to the intended recipient(s), you are hereby notified that any disclosure, copying, distribution, or use of this email message is prohibited. If you have received this message in error, please notify the sender immediately by e-mail and delete this email message from your computer. Thank you.
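Applied to the fitting context in the question, the base/derived pattern might look like the sketch below. The class names `FitResult`, `EmptyFitResult` and `StageTwoFit` and the generic `survived()` are hypothetical, chosen only for illustration; only the idea of a virtual base with stateless and stateful derived classes comes from the reply.

```r
library(methods)

# Hypothetical class names; the base class is declared virtual explicitly.
setClass("FitResult", representation("VIRTUAL"))

# Stateless result: no slots of its own, but it can be instantiated because
# it is a non-virtual class extending the virtual base.
setClass("EmptyFitResult", contains = "FitResult")

# Stateful result carrying the fitted estimates.
setClass("StageTwoFit", contains = "FitResult", slots = c(estimate = "numeric"))

# Hypothetical generic reporting whether a simulation reached stage two.
setGeneric("survived", function(fit) standardGeneric("survived"))
setMethod("survived", "EmptyFitResult", function(fit) FALSE)
setMethod("survived", "StageTwoFit", function(fit) TRUE)

fits <- list(new("EmptyFitResult"), new("StageTwoFit", estimate = c(1.2, 0.4)))
vapply(fits, survived, logical(1))
```

Because both result types share the base class, code that handles a list of fits can dispatch on the class instead of testing for a special "no result" value.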
Re: [R] what is the faster way to search for a pattern in a few million entries data frame ?
On 04/10/2016 03:27 PM, Fabien Tarrade wrote:
> Hi Duncan,
>
>> Didn't you post the same question yesterday? Perhaps nobody answered because your question is unanswerable.
>
> Sorry, I got an email saying that my message was waiting for approval, and when I looked at the forum I didn't see my message. This is why I sent it again, and this time I checked that the format of my message was text only. Sorry for the noise.
>
>> You need to describe what the strings are like and what the patterns are like if you want advice on speeding things up.
>
> My strings are 1-grams up to 5-grams (sequences of 1 word up to 5 words), and I am searching for the frequency in my DF of the strings starting with a sequence of a few words. I guess these days it is standard to use DFs with millions of entries, so I was wondering how people do that in the fastest way.

I did this to generate and search 40 million unique strings

    > grams <- as.character(1:4e7)        ## a long time passes...
    > system.time(grep("^91", grams))     ## similar times to grepl
       user  system elapsed
     10.384   0.168  10.543

Is that the basic task you're trying to accomplish? grep(l) goes quickly to C, so I don't think data.table or other will be markedly faster if you're looking for an arbitrary regular expression (use fixed=TRUE if looking for an exact match).

If you're looking for strings that start with a pattern, then in R-3.3.0 there is

    > system.time(res0 <- startsWith(grams, "91"))
       user  system elapsed
      0.658   0.012   0.669

which returns the same result as grepl

    > identical(res0, res1 <- grepl("^91", grams))
    [1] TRUE

One can also parallelize the already vectorized grepl function with parallel::pvec, with some opportunity for gain (compared to grepl) on non-Windows

    > system.time(res2 <- pvec(seq_along(grams), function(i) grepl("^91", grams[i]), mc.cores=8))
       user  system elapsed
     24.996   1.709   3.974
    > identical(res0, res2)
    [1] TRUE

I think anything else would require pre-processing of some kind, and then some more detail about what your data looks like is required.

Martin Morgan

> Thanks
> Cheers
> Fabien
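The equivalence of the prefix tests above can be checked on a much smaller vector. This is a sketch only: the vector size is an arbitrary stand-in for the 40 million strings, the `substr()` variant is an additional assumption for R versions without `startsWith()`, and no timings are claimed.

```r
# Small stand-in for the 40-million-string example (size chosen for speed).
grams <- as.character(1:100000)

by_regex  <- grepl("^91", grams)            # regular expression anchored at the start
by_substr <- substr(grams, 1, 2) == "91"    # plain prefix comparison, any R version
by_prefix <- startsWith(grams, "91")        # requires R >= 3.3.0

# All three approaches select the same strings.
stopifnot(identical(by_regex, by_substr), identical(by_regex, by_prefix))

sum(by_prefix)    # number of strings starting with "91"
```

Wrapping each line in `system.time()` on the real data is the simplest way to see which variant matters for a given workload.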
Re: [R] Ask if an object will respond to a function or method
On 03/31/2016 04:00 PM, Paul Johnson wrote:
> In the rockchalk package, I want to provide functions for regression objects that are "well behaved." If an object responds to the methods that lm or glm objects can handle, like coef(), nobs(), and summary(), I want to be able to handle the same thing.
>
> It is more difficult than expected to ask a given fitted model object "do you respond to these functions: coef(), nobs(), summary()." How would you do it?
>
> I tried this with the methods() function but learned that all methods that a class can perform are not listed. I'll demonstrate with a regression "zz" that is created by the example in the plm package. The coef() function succeeds on the zz object, but coef is not listed in the list of methods that the function can carry out.
>
>     > library(plm)
>     > example(plm)
>     > class(zz)
>     [1] "plm"        "panelmodel"
>     > methods(class = "plm")
>      [1] ercomp          fixef           has.intercept   model.matrix
>      [5] pFtest          plmtest         plot            pmodel.response
>      [9] pooltest        predict         residuals       summary
>     [13] vcovBK          vcovDC          vcovG           vcovHC
>     [17] vcovNW          vcovSCC
>     see '?methods' for accessing help and source code
>     > methods(class = "panelmodel")
>      [1] deviance      df.residual   fitted        has.intercept index
>      [6] nobs          pbgtest       pbsytest      pcdtest       pdim
>     [11] pdwtest       phtest        print         pwartest      pwfdtest
>     [16] pwtest        residuals     terms         update        vcov
>     see '?methods' for accessing help and source code
>     > coef(zz)
>        log(pcap)      log(pc)     log(emp)        unemp
>     -0.026149654  0.292006925  0.768159473 -0.005297741
>
> I don't understand why coef(zz) succeeds but coef is not listed as a method.

coef(zz) finds stats:::coef.default, which happens to do the right thing for zz but also 'works' (returns without an error) for things that don't have coefficients, e.g., coef(data.frame()). stats:::coef.default is

    > stats:::coef.default
    function (object, ...)
    object$coefficients

Maybe fail on use, rather than trying to guess up-front that the object is fully appropriate?

Martin Morgan

> Right now, I'm contemplating this:
>
>     zz1 <- try(coef(zz))
>     if (inherits(zz1, "try-error")) stop("Your model has no coef method")
>
> This seems like a bad workaround because I have to actually run the function in order to find out if the function exists. That might be time consuming for some summary() methods.
>
> pj
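One way to read the "fail on use" suggestion is sketched below. The helper `require_coef()` is hypothetical and not part of rockchalk or plm; it assumes that a NULL result from `coef()` should be treated the same as an error, which covers the `coef.default()` case described in the reply.

```r
# Hypothetical helper: call coef() and treat an error or a NULL result as
# "this object does not provide coefficients".
require_coef <- function(model) {
    cf <- tryCatch(coef(model), error = function(e) NULL)
    if (is.null(cf))
        stop("object of class '", class(model)[1], "' does not provide coefficients")
    cf
}

fit <- lm(dist ~ speed, data = cars)
require_coef(fit)                      # the two fitted coefficients

# coef.default() silently returns NULL here, so the helper signals an error
res <- try(require_coef(data.frame()), silent = TRUE)
inherits(res, "try-error")
```

This still calls `coef()` once, so it does not address the concern about expensive `summary()` methods; it only makes the failure explicit at the point of use.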
Re: [R] Persistent state in a function?
Use a local environment as a place to store state. Update with <<- and resolve symbol references through lexical scope. E.g.,

    persist <- local({
        last <- NULL                # initialize
        function(value) {
            if (!missing(value))
                last <<- value      # update with <<-
            last                    # use
        }
    })

and in action

    > persist("foo")
    [1] "foo"
    > persist()
    [1] "foo"
    > persist("bar")
    [1] "bar"
    > persist()
    [1] "bar"

A variant is to use a 'factory' function

    factory <- function(init) {
        stopifnot(!missing(init))
        last <- init
        function(value) {
            if (!missing(value))
                last <<- value
            last
        }
    }

and

    > p1 = factory("foo")
    > p2 = factory("bar")
    > c(p1(), p2())
    [1] "foo" "bar"
    > c(p1(), p2("foo"))
    [1] "foo" "foo"
    > c(p1(), p2())
    [1] "foo" "foo"

The 'bank account' exercise in section 10.7 of RShowDoc("R-intro") illustrates this.

Martin

On 03/19/2016 12:45 PM, Boris Steipe wrote:
> Dear all -
>
> I need to have a function maintain a persistent lookup table of results for an expensive calculation, a named vector or hash. I know that I can just keep the table in the global environment. One problem with this approach is that the function should be able to delete/recalculate the table and I don't like side-effects in the global environment. This table really should be private. What I don't know is:
>
> -A- how can I keep the table in an environment that is private to the function but persistent for the session?
> -B- how can I store and reload such table?
> -C- most importantly: is that the right strategy to initialize and maintain state in a function in the first place?
>
> For illustration ...
>
> ---
> myDist <- function(a, b) {
>     # retrieve or calculate distances
>     if (!exists("Vals")) {
>         Vals <<- numeric()    # the lookup table for distance values
>                               # here, created in the global env.
>     }
>     key <- sprintf("X%d.%d", a, b)
>     thisDist <- Vals[key]
>     if (is.na(thisDist)) {    # Hasn't been calculated yet ...
>         cat("Calculating ... ")
>         thisDist <- sqrt(a^2 + b^2)    # calculate with some expensive function ...
>         Vals[key] <<- thisDist         # store in global table
>     }
>     return(thisDist)
> }
>
> # run this
> set.seed(112358)
> for (i in 1:10) {
>     x <- sample(1:3, 2)
>     print(sprintf("d(%d, %d) = %f", x[1], x[2], myDist(x[1], x[2])))
> }
>
> Thanks!
> Boris
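Applying the closure idea to the distance lookup from the question might look like the sketch below. The `reset` argument and the `n_computed` counter are illustrative additions, not part of either message, and `sqrt(a^2 + b^2)` remains a stand-in for the expensive calculation.

```r
# The lookup table lives in the environment created by local(), not in the
# global environment.
myDist <- local({
    cache <- numeric()        # private lookup table, persists between calls
    n_computed <- 0L          # how many times the expensive step actually ran
    function(a, b, reset = FALSE) {
        if (reset) {          # delete the table so values are recalculated
            cache <<- numeric()
            n_computed <<- 0L
        }
        key <- sprintf("X%d.%d", a, b)
        if (is.na(cache[key])) {              # not calculated yet
            cache[key] <<- sqrt(a^2 + b^2)    # stand-in for the expensive function
            n_computed <<- n_computed + 1L
        }
        unname(cache[key])
    }
})

myDist(3, 4)    # calculated and stored
myDist(3, 4)    # looked up
environment(myDist)$n_computed
```

For question -B-, one possibility (an assumption, not something the thread confirms) is to write `environment(myDist)$cache` to disk with `saveRDS()` and assign the result of `readRDS()` back into that environment in a later session.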
Re: [R] regex - extracting src url
On 03/22/2016 12:44 AM, Omar André Gonzáles Díaz wrote:
> Hi, I have a DF with a column with "html", like this:
>
> <IMG SRC="https://ad.doubleclick.net/ddm/trackimp/N344006.1960500FACEBOOKAD/B9589414.130145906;dc_trk_aid=303019819;dc_trk_cid=69763238;ord=[timestamp];dc_lat=;dc_rdid=;tag_for_child_directed_treatment=?" BORDER="0" HEIGHT="1" WIDTH="1" ALT="Advertisement">
>
> I need to get this:
>
> https://ad.doubleclick.net/ddm/trackimp/N344006.1960500FACEBOOKAD/B9589414.130145906;dc_trk_aid=303019819;dc_trk_cid=69763238;ord=[timestamp];dc_lat=;dc_rdid=;tag_for_child_directed_treatment=?
>
> I've got this so far:
>
> https://ad.doubleclick.net/ddm/trackimp/N344006.1960500FACEBOOKAD/B9589414.130145906;dc_trk_aid=303019819;dc_trk_cid=69763238;ord=[timestamp];dc_lat=;dc_rdid=;tag_for_child_directed_treatment=?\" BORDER=\"0\" HEIGHT=\"1\" WIDTH=\"1\" ALT=\"Advertisement
>
> This is the code I've used:
>
>     carreras_normal$Impression.Tag..image. <- gsub("","\\1",carreras_normal$Impression.Tag..image., ignore.case = T)
>
> But I still need to get rid of the part marked with asterisks:
>
> https://ad.doubleclick.net/ddm/trackimp/N344006.1960500FACEBOOKAD/B9589414.130145906;dc_trk_aid=303019819;dc_trk_cid=69763238;ord=[timestamp];dc_lat=;dc_rdid=;tag_for_child_directed_treatment=?*\" BORDER=\"0\" HEIGHT=\"1\" WIDTH=\"1\" ALT=\"Advertisement*
>
> Thank you for your help.

You're querying an xml string, so use xpath, e.g., via the XML package

    > as.character(xmlParse(y)[["//IMG/@SRC"]])
    [1] "https://ad.doubleclick.net/ddm/trackimp/N344006.1960500FACEBOOKAD/B9589414.130145906;dc_trk_aid=303019819;dc_trk_cid=69763238;ord=[timestamp];dc_lat=;dc_rdid=;tag_for_child_directed_treatment=?"

`xmlParse()` translates the character string into an XML document. `[[` subsets the document to extract a single element. "//IMG/@SRC" follows the xpath specification (this section https://www.w3.org/TR/xpath-31/#abbrev of the specification provides a quick guide) to find, starting from the 'root' of the document, a node, at any depth, labeled IMG containing an attribute labeled SRC.

A variation, if there were several IMG tags to be extracted, would be

    xpathSApply(xmlParse(y), "//IMG/@SRC", as.character)

> Omar Gonzáles.
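For comparison, the regular-expression route the question started with can be made to work with a capture group. This is not the xpath approach recommended in the reply, only a base-R sketch; the shortened `tag` string and its `example.org` URL are hypothetical stand-ins for the real column values, and the pattern assumes the attribute is written as `SRC="..."` with double quotes.

```r
# Hypothetical, shortened tag; the real column holds longer tracking URLs.
tag <- '<IMG SRC="https://example.org/trackimp;ord=[timestamp]?" BORDER="0" HEIGHT="1" WIDTH="1" ALT="Advertisement">'

# Capture everything between SRC=" and the next double quote, and replace
# the whole string by that captured group.
src <- sub('.*SRC="([^"]*)".*', "\\1", tag, ignore.case = TRUE)
src
```

`sub()` is vectorised, so the same call can be applied to a whole data frame column; strings without a matching `SRC` attribute are returned unchanged, which is one reason a real parser is the more robust choice.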
Re: [R] R problem : Error: protect(): protection stack overflow
On 03/14/2016 06:39 PM, Mohammad Tanvir Ahamed via R-help wrote:
> Hi, I got an error while running on a big data set. The error is illustrated by the following sample.

This is an error in the package, and should be reported to the maintainer. Discover the maintainer with the command

    maintainer("impute")

> ## Load data
> mdata <- as.matrix(read.table('https://gubox.box.com/shared/static/qh4spcxe2ba5ymzjs0ynh8n8s08af7m0.txt', header = TRUE, check.names = FALSE, sep = '\t'))
>
> ## Install and load library
> source("https://bioconductor.org/biocLite.R")
> biocLite("impute")
> library(impute)
>
> ## sets a limit on the number of nested expressions
> options(expressions = 50)
>
> ## Apply k-nearest neighbors for missing value imputation
> res <- impute.knn(mdata)
> Error: protect(): protection stack overflow
>
> If anybody has a solution or suggestion, please share. Thanks.
>
> Tanvir Ahamed Göteborg, Sweden | mashra...@yahoo.com
Re: [R] Is there "orphan code" in seq.default?
On 08/15/2015 02:01 PM, David Winsemius wrote:
> I was looking at the code in seq.default and saw code that I think would throw an error if it were ever executed, although it will not because there is first a test to see if one of its arguments is missing. Near the end of the function body is this code:
>
>     else if (missing(by)) {
>         if (missing(to))
>             to <- from + length.out - 1L
>         if (missing(from))
>             from <- to - length.out + 1L
>         if (length.out > 2L)
>             if (from == to)
>                 rep.int(from, length.out)
>             else as.vector(c(from, from + seq_len(length.out - 2L) * by, to))
>
> Notice that the last call to `else` would be returning a value calculated with 'by' which was already established as missing.

Missing arguments can have default values

    > f = function(by="sea") if (missing(by)) by
    > f()
    [1] "sea"

which is the case for seq.default

    > args(seq.default)
    function (from = 1, to = 1, by = ((to - from)/(length.out - 1)),
        length.out = NULL, along.with = NULL, ...)

Martin Morgan

-- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
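The interplay of `missing()` and a default that depends on other arguments can be seen in a small function shaped like `seq.default()`. The function `span()` below is hypothetical and only mirrors the argument list quoted in the reply.

```r
# Hypothetical function mirroring the shape of seq.default()'s arguments:
# `by` has a default expressed in terms of the other arguments.
span <- function(from = 1, to = 1, by = (to - from) / (length.out - 1),
                 length.out = NULL) {
    # missing(by) reports whether the caller supplied `by`; the default
    # expression is still evaluated, lazily, when `by` is used.
    list(by_missing = missing(by), by = by)
}

str(span(0, 1, length.out = 5))    # by_missing is TRUE, by is 0.25
str(span(0, 1, by = 0.5))          # by_missing is FALSE, by is 0.5
```

So a branch guarded by `missing(by)` can still use `by` safely, provided the arguments its default refers to are available at that point.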
Re: [R] writing binary data from RCurl and postForm
On 08/05/2015 11:52 AM, Greg Donald wrote:
> I'm using RCurl with postForm() to post to a URL that responds with a PDF. I cannot figure out how to write the resulting PDF data to a file without corruption.
>
>     result = postForm(url, binary=TRUE)
>
> Both this:
>
>     capture.output(result, file='/tmp/export.pdf')
>
> and this:
>
>     f = file('/tmp/export.pdf', 'wb')
>     write(result, f)
>     close(f)
>
> result in a corrupted PDF.
>
> I also tried postForm without binary=TRUE but that halts execution with an "embedded nul in string" error.
>
> I also tried writeBin() but that complains about my result not being a vector.

I think that is because the value returned from postForm has an attribute; remove it by casting the return to a vector

    fl <- tempfile(fileext=".pdf")
    writeBin(as.vector(postForm(url, binary=TRUE)), fl)

The httr package might also be a good bet

    writeBin(content(POST(url)), fl)

> I can use curl on the command line and this works fine, but I need to get this working in R. Any help would be greatly appreciated.
>
> Thanks.
>
> -- Greg Donald
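The effect of `as.vector()` can be tried without a network connection. In this sketch the `payload` object is a simulated stand-in for the value returned by `postForm(url, binary = TRUE)`; the attribute name and the byte content are assumptions made for illustration.

```r
# Simulated response: raw bytes carrying an attribute, so no server is needed.
payload <- structure(charToRaw("%PDF-1.4 example bytes"),
                     "Content-Type" = "application/pdf")

fl <- tempfile(fileext = ".pdf")
writeBin(as.vector(payload), fl)    # as.vector() drops the attribute

# Read the file back and confirm the bytes survived unchanged.
roundtrip <- readBin(fl, what = "raw", n = file.size(fl))
identical(roundtrip, charToRaw("%PDF-1.4 example bytes"))
```

Comparing the bytes read back against the original is a quick way to rule out corruption introduced by text-mode functions such as `write()` or `capture.output()`.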
Re: [R] Release schedule (was (no subject) )
On 08/05/2015 10:08 AM, Jeff Newmiller wrote:
> New versions are released when they are ready. This is volunteer-driven software.

From https://developer.r-project.org/ :

    The overall release schedule is to have annual x.y.0 releases in Spring, with patch releases happening on an as-needed basis. It is intended to have a final patch release of the previous version shortly before the next major release.

> ---
> Jeff Newmiller
> ---
> Sent from my phone. Please excuse my brevity.
>
> On August 5, 2015 5:55:21 AM EDT, "Djossè Parfait" wrote:
>> Good morning, I would like to know how often per year is a new full version release of R. Thanks
>>
>> -- Djossè Parfait BODJRENOU Chef de la Division Centralisation et Analyse des Données Statistiques /DPP/MESFTPRIJ
Re: [R] Error when compiling R-2.5.1 / *** [d-p-q-r-tests.Rout] Fehler 1
On 07/31/2015 10:48 PM, Joerg Kirschner wrote:
> Hi everyone, I am new to Linux and R - but I managed to build R-2.5.1 from source to use it in Genepattern. Genepattern does only support R-2.5.1 which I could not find anywhere for installation via apt-get or in the Ubuntu Software-Centre (I am using Ubuntu 14.04 (Trusty Tahr) 32-bit)

Are you sure you want to do this? R 2.5.1 is from 2007, which is a very long time ago. It seems like GenePattern is not restricted to R-2.5.1,

http://www.broadinstitute.org/cancer/software/genepattern/administrators-guide#using-different-versions-of-r

and if their default distribution uses it, then I'm not sure I'd recommend using GenePattern for new analysis! (Maybe you're trying to re-do a previous analysis?) Since GenePattern modules that use R typically wrap individual CRAN or Bioconductor (http://bioconductor.org) packages, maybe you can take out the middleman?

Martin Morgan

> But after doing make check I get
>
>     comparing 'method-dispatch.Rout' to './method-dispatch.Rout.save' ... OK
>     running code in 'd-p-q-r-tests.R' ...make[3]: *** [d-p-q-r-tests.Rout] Error 1
>     make[3]: Leaving directory '/home/karin/Downloads/R-2.5.1/tests'
>     make[2]: *** [test-Specific] Error 2
>     make[2]: Leaving directory '/home/karin/Downloads/R-2.5.1/tests'
>     make[1]: *** [test-all-basics] Error 1
>     make[1]: Leaving directory '/home/karin/Downloads/R-2.5.1/tests'
>     make: *** [check] Error 2
>
> but I can make install and use R for simple plots etc. afterwards - still I am worried something is wrong, can you give some advice. A closer look at the error gives
>
>     > ## PR#7099 : pf() with large df1 or df2:
>     > nu <- 2^seq(25,34, 0.5)
>     > y <- 1e9*(pf(1,1,nu) - 0.68268949)
>     > stopifnot(All.eq(pf(1,1,Inf), 0.68268949213708596),
>     +           diff(y) > 0, # i.e. pf(1,1, *) is monotone increasing
>     +           All.eq(y [1], -5.07420372386491),
>     +           All.eq(y[19], 2.12300110824515))
>     Error: All.eq(y[1], -5.07420372386491) is not TRUE
>     Execution halted
>
> As I understand so far some errors are critical some are not - can you please give some advice on the error above? Can I still use R installed with that error? What do I need to solve the error?
>
> Thanks, Joerg
Re: [R] S4 / operator "[" : Compatibility issue between lme4 and kml
On 06/05/2015 10:52 AM, Martin Maechler wrote:
> Christophe Genolini on Fri, 5 Jun 2015 00:36:42 -0700 writes:
>
>> Hi all,
>> There is a compatibility issue between the package 'lme4' and my package 'kml'. I define the "[" operator. It works just fine in my package (1). If I try to use the lme4 package, then it no longer works (2). Moreover, it has some kind of strange behavior (3). Do you know what is wrong? Any idea of how I can correct that?
>> Here is a reproducible example, and the same code with the result follows.
>> Thanks for your help
>> Christophe
>
> [ ... I'm providing slightly different code below ]
>
>> --- 8< - Execution of the previous code ---
>>
>> library(kml)
>> Loading required package: clv
>> Loading required package: cluster
>> Loading required package: class
>> Loading required package: longitudinalData
>> Loading required package: rgl
>> Loading required package: misc3d
>>
>> dn <- gald(1)
>>
>> ###
>> ### (1) the "[" operator works just fine
>> dn["traj"]
>>       t0   t1    t2    t3    t4   t5   t6    t7    t8    t9   t10
>> i1 -3.11 4.32  2.17  1.82  4.90 7.34 0.83 -2.70  5.36  4.96  3.16
>> i2 -7.11 1.40 -2.40 -2.96  4.31 0.50 1.25  0.52 -0.04  7.55  5.50
>> i3  2.80 6.23  6.08  2.87  2.58 2.88 6.58 -2.38  2.30 -1.74 -3.23
>> i4  2.24 0.91  6.50 10.92 11.32 7.79 7.78 10.69  9.15  1.07 -0.51
>>
>> ###
>> ### (2) using 'lme4', it no longer works
>> library(lme4)
>> Loading required package: Matrix
>> Loading required package: Rcpp
>> dn["traj"]
>> Error in x[i, j] : error in evaluating the argument 'j' in selecting a method for function '[': Error: argument "j" is missing, with no default
>>
>> ###
>> ### (3) If I define the "[" again, it does not work the first time I call it, but it works the second time!
>> setMethod("[",
>> +     signature=signature(x="ClusterLongData", i="character", j="ANY",drop="ANY"),
>> +     definition=function (x, i, j="missing", ..., drop = TRUE){

Your file has two definitions of

    setMethod("[", c("ClusterLongData", ...

I deleted the first one. The second definition had

    signature=signature(x="ClusterLongData", i="character", j="ANY",drop="ANY"),

whereas probably you mean to say that you'll handle

    signature=signature(x="ClusterLongData", i="character", j="missing", drop="ANY")

The next line says

    definition=function (x, i, j="missing", ..., drop = TRUE){

which provides a default value for 'j' when j is not provided by the user. Thus later when you say

    x[i, j]

you are performing dn["traj", "missing"] when probably you meant

    x[i, , drop=drop]

Making these changes, so the definition is

    setMethod(
        "[",
        signature=signature(x="ClusterLongData", i="character", j="missing", drop="ANY"),
        definition=function (x, i, j, ..., drop = TRUE){
            if (is.numeric(i)) {
                stop("[ClusterLongData:getteur]: to get a clusters list, use ['ci']")
            }else{}
            if (i %in% c("criterionValues", "criterionValuesAsMatrix")){
                j <- x['criterionActif']
            }else{}
            if (i %in% c(CRITERION_NAMES, "criterionActif", CLUSTER_NAMES,
                         "criterionValues", "criterionValuesAsMatrix", "sorted",
                         "initializationMethod")) {
                x <- as(x, "ListPartition")
            }else{
                x <- as(x, "LongData")
            }
            x[i, , drop=drop]
        })

allows operations to work correctly.

    > library(kml)
    Loading required package: clv
    Loading required package: cluster
    Loading required package: class
    Loading required package: longitudinalData
    Loading required package: rgl
    Loading required package: misc3d
    > library(Matrix)
    > x = gald(1)["traj"]
    > x
          t0    t1    t2    t3    t4    t5    t6    t7    t8    t9   t10
    i1 -3.18 -1.19 -1.17  1.56 -0.70  1.78 -0.95 -2.00 -5.05  1.05  2.84
    i2  3.51  1.72  6.97  6.09  7.81  8.33  9.54 14.38 16.14 12.82 13.86
    i3  9.60 11.59  9.09  6.31  9.24  7.69  4.26 -0.80  2.70  1.63  1.21
    i4 -0.54  3.80  6.05 10.41 12.60 12.32 10.33 11.05  7.89  5.21  0.67

It's hard to tell whether this is an issue with the methods package, or just that Matrix offered a better nearest 'method' than those provided by kml / longitudinalData.

>> +         x <- as(x, "LongData")
>> +         return(x[i, j])
>> +     }
>> + )
>> [1] "["
>>
>> ### Not working the first time I use it
>> dn["traj"]
>> Error in dn["traj"] : argument "j" is missing, with no default
>>
>> ### But working the second time
>> dn["traj"]
>>       t0   t1    t2    t3    t4   t5   t6    t7    t8    t9   t10
>> i1 -3.11 4.32  2.17  1.82  4.90 7.34 0.83 -2.70  5.36  4.96  3.16
>> i2 -7.11 1.40 -2.40 -2.96  4.31 0.50 1.25  0.52 -0.04  7.55  5.50
>> i3  2.80 6.23  6.08  2.87  2.58 2.88 6.58 -2.38  2.30 -1.74 -3.23
>> i4  2.24 0.91  6.50 10.92 11.32 7.79 7.78 10.69  9.15  1.07 -0.5
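The point about declaring `j = "missing"` in the signature, rather than giving `j` a default value, can be reproduced on a toy class. The class `Traj` and its slots are hypothetical and unrelated to kml; this is a sketch of the dispatch pattern only.

```r
library(methods)

# Hypothetical toy class standing in for ClusterLongData.
setClass("Traj", slots = c(values = "matrix", label = "character"))

# j = "missing" in the signature: the method applies to x["name"], and the
# formal argument `j` is not given a default value.
setMethod("[", signature(x = "Traj", i = "character", j = "missing"),
    function(x, i, j, ..., drop = TRUE) {
        switch(i,
            values = x@values,
            label = x@label,
            stop("unknown field: ", i))
    })

tr <- new("Traj", values = matrix(1:4, nrow = 2), label = "demo")
tr["label"]
dim(tr["values"])
```

Because the method is registered only for calls where `j` is absent, it does not compete with methods that other packages define for two-index subsetting.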
Re: [R] is.na for S4 object
On 06/04/2015 10:08 AM, cgenolin wrote:
> Hi the list,
> I have a variable y that is either NA or some S4 object. I would like to know which case I am in, but it seems that is.na does not work with S4 objects; I get a warning:
>
> --- 8<
> setClass("myClass",slots=c(x="numeric"))
> if(runif(1)>0.5){a <- new("myClass")}else{a <- NA}
> is.na(a)
> --- 8<
>
> Any solution?

getGeneric("is.na") shows that it's an S4 generic, so implement a method

    setMethod("is.na", "myClass", function(x) FALSE)

Martin

> Thanks
> Christophe
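Put together with the class from the question, the suggested method can be checked as follows. The helper `is_missing_result()` is a hypothetical addition showing an alternative test based on `is()`; it is not part of the original reply.

```r
library(methods)

setClass("myClass", slots = c(x = "numeric"))

# An object of this class is never a missing value.
setMethod("is.na", "myClass", function(x) FALSE)

is.na(new("myClass"))    # FALSE, without a warning
is.na(NA)                # TRUE, the default behaviour is unchanged

# Hypothetical alternative: test for the class first, then for a single NA.
is_missing_result <- function(y) !is(y, "myClass") && length(y) == 1L && is.na(y)
is_missing_result(NA)
is_missing_result(new("myClass"))
```

The `is()`-based check avoids relying on `is.na()` for the S4 object at all, which may be preferable when the class is defined in someone else's package.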
Re: [R] Can't seem to install packages
On 05/28/2015 08:21 AM, Duncan Murdoch wrote: On 28/05/2015 6:10 AM, Claire Rioualen wrote: Hello, I can't seem to install R packages, since it seemed there were some permission problems I "chmoded" /usr/share/R/ and /usr/lib/R/. However, there are still errors in the process. Here's my config: > sessionInfo() R version 3.1.1 (2014-07-10) Platform: x86_64-pc-linux-gnu (64-bit) locale: [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C [3] LC_TIME=en_US.UTF-8LC_COLLATE=en_US.UTF-8 [5] LC_MONETARY=en_US.UTF-8LC_MESSAGES=en_US.UTF-8 [7] LC_PAPER=en_US.UTF-8 LC_NAME=C [9] LC_ADDRESS=C LC_TELEPHONE=C [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] ggplot2_1.0.1BiocInstaller_1.16.5 loaded via a namespace (and not attached): [1] colorspace_1.2-6 digest_0.6.8 grid_3.1.1 gtable_0.1.2 [5] magrittr_1.5 MASS_7.3-40 munsell_0.4.2plyr_1.8.2 [9] proto_0.3-10 Rcpp_0.11.6 reshape2_1.4.1 scales_0.2.4 [13] stringi_0.4-1stringr_1.0.0tcltk_3.1.1 tools_3.1.1 And here are some packages I tried to install: *> install.packages("XML")* Installing package into ���/packages/rsat/R-scripts/Rpackages��� (as ���lib��� is unspecified) trying URL 'http://ftp.igh.cnrs.fr/pub/CRAN/src/contrib/XML_3.98-1.1.tar.gz' Content type 'text/html' length 1582216 bytes (1.5 Mb) opened URL == downloaded 1.5 Mb * installing *source* package ���XML��� ... ** package ���XML��� successfully unpacked and MD5 sums checked checking for gcc... gcc checking for C compiler default output file name... rm: cannot remove 'a.out.dSYM': Is a directory a.out checking whether the C compiler works... yes checking whether we are cross compiling... no checking for suffix of executables... checking for suffix of object files... o checking whether we are using the GNU C compiler... yes checking whether gcc accepts -g... yes checking for gcc option to accept ISO C89... none needed checking how to run the C preprocessor... 
gcc -E checking for sed... /bin/sed checking for pkg-config... /usr/bin/pkg-config checking for xml2-config... no Cannot find xml2-config ERROR: configuration failed for package 'XML' * removing '/packages/rsat/R-scripts/Rpackages/XML' this is a missing system dependency, requiring the libxml2 'dev' headers. On my linux this is sudo apt-get install libxml2-dev likely you'll also end up needing curl via libcurl4-openssl-dev or similar The downloaded source packages are in '/tmp/RtmphODjkn/downloaded_packages' Warning message: In install.packages("XML") : installation of package 'XML' had non-zero exit status > install.packages("Biostrings") Installing package into '/packages/rsat/R-scripts/Rpackages' (as 'lib' is unspecified) Warning message: package 'Biostrings' is not available (for R version 3.1.1) > biocLite("Biostrings") Yes, Bioconductor versions packages differently from CRAN (we have twice-yearly releases and stable 'release' and 'devel' branches). Following the instructions for package installation at http://bioconductor.org/packages/Biostrings but... [...] io_utils.c:16:18: fatal error: zlib.h: No such file or directory #include <zlib.h> ^ this seems like a relatively basic header to be missing, installable from zlib1g-dev, but I wonder if you're taking a mis-step earlier, e.g., trying to install on a cluster node that is configured for software use but not installation? Also the instructions here to install R http://cran.r-project.org/bin/linux/ would likely include these basic dependencies 'out of the box'. Martin compilation terminated. /usr/lib/R/etc/Makeconf:128: recipe for target 'io_utils.o' failed make: *** [io_utils.o] Error 1 ERROR: compilation failed for package 'Biostrings' * removing '/packages/rsat/R-scripts/Rpackages/Biostrings' The downloaded source packages are in '/tmp/RtmphODjkn/downloaded_packages' Warning message: In install.packages(pkgs = pkgs, lib = lib, repos = repos, ...) 
: installation of package 'Biostrings' had non-zero exit status I've used R on several machines before and never had such problems. Thanks for any clue! It's hard to read your message (I think it was posted in HTML), but I think those are all valid errors in building those packages. You appear to be missing some of their dependencies. This is not likely related to permissions. Duncan Murdoch __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Computational Biology / Fred Hutchinson Cancer Research C
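Both failures above come from missing system-level pieces rather than from R itself. A minimal sketch in R, assuming a Linux machine where the configure helpers are expected on the PATH, checks for them before calling install.packages(). The function name check_build_tools and the list of tools are illustrative assumptions, not part of any package; header files such as zlib.h are not covered by this check.

```r
# Hypothetical helper: report which build-time helper programs are on the PATH.
# Sys.which() returns "" for programs that cannot be found.
check_build_tools <- function(tools = c("xml2-config", "curl-config", "pkg-config")) {
  paths <- Sys.which(tools)
  data.frame(
    tool  = tools,
    path  = unname(paths),
    found = unname(nzchar(paths)),
    row.names = NULL,
    stringsAsFactors = FALSE
  )
}

status <- check_build_tools()
print(status)
```

A FALSE in the found column for xml2-config would suggest installing the libxml2 development package before retrying the XML install.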
Re: [R] Help manipulating 23andme data in R - reproducing relationship results
On 05/17/2015 01:52 PM, Lyle Warren wrote: Thanks Jeff, Bert! You are right - definitely out of my skill area.. I've now found some help on the bioconductor mailing list. I'm not sure that you've asked in the right place https://support.bioconductor.org see also http://www.vincebuffalo.com/2012/03/12/23andme-gwascat.html which is a little dated and maybe not relevant to your question. A little tangentially, see also https://support.bioconductor.org/p/67444/ Martin Morgan On 18 May 2015 at 03:04, Bert Gunter wrote: (No response necessary) What struck me about this post was the apparent mismatch: the OP seemed not to have a clue where to begin. Maybe he somehow has been assigned or chose a task for which his skills and background are inadequate. This is not really a criticism: if someone told me to make a dining room set, my reply would be: "Either find someone else or see you in about a year after which I may have learned enough to attempt the task. " So maybe the OP should give up looking for internet advice altogether and find someone local to work with? And, of course, apologies if I have misinterpreted. Cheers, Bert Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374 "Data is not information. Information is not knowledge. And knowledge is certainly not wisdom." Clifford Stoll On Sun, May 17, 2015 at 9:52 AM, Jeff Newmiller wrote: This is a very domain-specific question (genetic data analysis), not so much a question about how to use R, so does not seem on topic here. I also suspect that the company 23andme may use some proprietary algorithms, so "replicating their results" could be a tall order. You might start with the CRAN "Statistical Genetics" task view, and a textbook on the subject. The Bioconductor project may also be a useful resource. --- Jeff Newmiller, Research Engineer (Solar/Batteries/Software/Embedded Controllers) --- Sent from my phone. Please excuse my brevity.
On May 16, 2015 6:53:46 AM PDT, Lyle Warren wrote: Hi, I'm trying to replicate 23andMe's parentage test results within R, using the 23andme raw data. Does anyone know a simple way to do this? I have read the data with gwascat and it seems to be in there fine. Thanks for any help you can give! Cheers, Lyle -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
Re: [R] some general advice sought on the use of gctorture()
On 04/24/2015 06:49 AM, Franckx Laurent wrote: Dear all I have bumped into the dreaded 'segfault' error type when running some C++ code using .Call(). segfaults often involve invalid memory access at the C level that are best discovered via valgrind or similar rather than gctorture. A good way to spot these is to (a) come up with a _minimal_ reproducible script test.R that takes just a few seconds to run and that tickles, at least some times, the segfault (b) make sure that your package is compiled without optimizations and with debugging symbols, e.g., in ~/.R/Makevars add the lines CFLAGS="-ggdb -O0" CXXFLAGS="-ggdb -O0" (c) run the code under 'valgrind' R -d valgrind -f test.R Look especially for 'invalid read' or 'invalid write' messages, and isolate _your_ code in the callback that the message produces. There is a 'worked example' at http://bioconductor.org/developers/how-to/c-debugging/#case-study Of course this might lead to nothing, and then you'll be back to your original question about using gctorture or other strategies. Martin Morgan I have already undertaken several attempts to debug the C++ code with gdb(), but until now I have been unable to pinpoint the origin of the problem. There are two elements that I think are puzzling (a) this .Call() has worked fine for about three years, for a variety of data (b) the actual crash occurs at random points during the execution of the function (well, random from a human eye's point of view). From what I understand in the "R extensions" manual, the actual problem may have been around for a while before the actual call to the C++ code. As recommended in the manual, I am now using gctorture() to try to pinpoint the origins of the problem. I can, alas, only confirm that gctorture() has an enormous impact on execution time, even for operations that are normally executed within the blink of an eye. From what I have seen until now, executing all the R code before the crash with gctorture(TRUE) could take months. 
I suppose then that the best way to proceed would be to proceed backward from the point where the crash occurs when gctorture(FALSE). I have tried to find some concrete examples of good practices in the use of gctorture() to identify memory problems in R, but most of what I have found on the web is simply a copy of the help page. Does anybody know more concrete and elaborated examples that could give an indication on how to best proceed further? Laurent Franckx, PhD Senior researcher sustainable mobility VITO NV | Boeretang 200 | 2400 Mol Tel. ++ 32 14 33 58 22| mob. +32 479 25 59 07 | Skype: laurent.franckx | laurent.fran...@vito.be | Twitter @LaurentFranckx VITO Disclaimer: http://www.vito.be/e-maildisclaimer -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
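One way to keep gctorture() affordable is to switch it on only around the narrow region under suspicion, then move that region backward step by step. The following is a minimal sketch of that idea; the wrapper name with_gctorture is an assumption made for illustration, and the expression being wrapped stands in for the real .Call().

```r
# Hypothetical wrapper: evaluate one expression with GC torture enabled,
# restoring the previous setting afterwards even if the expression fails.
with_gctorture <- function(expr) {
  old <- gctorture(TRUE)               # returns the previous setting
  on.exit(gctorture(old), add = TRUE)  # always restore on exit
  force(expr)                          # lazy argument is evaluated here, under torture
}

# Stand-in for the suspect call; in practice this would be the .Call() wrapper.
res <- with_gctorture(sum(as.numeric(1:10)))
print(res)
```

Only the wrapped expression pays the cost of a collection at every allocation; everything before it runs at normal speed.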
Re: [R] Returning to parent function
On 03/16/2015 05:05 PM, Saptarshi Guha wrote: Example was complicated, but here is a simpler form continueIfTrue <- function(mm=return()){ eval(mm) } telemStats <- function(){ y <- substitute(return()) continueIfTrue(y) print("I would not like this message to be printed") } telemStats() Ideally, calling telemStats() should return to the prompt and the print in telemStats should not appear here's one way to implement your original example -- signal and handle, via tryCatch(), a custom condition created (modelled after simpleCondition()) as an S3 class with linear inheritance. X <- function() { print("I'm saying...") signalCondition(structure(list(), class=c("my", "condition"))) print("X") } Y <- function(){ tryCatch(XParent(), my=function(...) NULL) print("hello") } XParent <- function(){ X() print("H") } leading to > Y() [1] "I'm saying..." [1] "hello" callCC() is tricky for me to grasp, but I'll write Y to accept an argument X, which will be a function. It'll call XParent with that function, and XParent will use the function. Y <- function(X){ XParent(X) print("hello") } XParent <- function(X){ X("fun") print("H") } then we've got > Y(X) Error in XParent(X) (from tmp.R!4361C1Y#2) : object 'X' not found > Y(function(x) print("X")) [1] "X" [1] "H" [1] "hello" but more interestingly the long jump to the top (where callCC was invoked) > callCC(function(X) { Y(X) }) [1] "fun" or in a function y <- function() { value <- callCC(function(X) { Y(X) }) print(value) print("done") } Hope that helps and is not too misleading. Excellent question. Martin On Mon, Mar 16, 2015 at 4:02 PM, David Winsemius wrote: On Mar 16, 2015, at 3:08 PM, Saptarshi Guha wrote: Hello, I would like a function X to return to the place that called the function XParent that called this function X. Y calls XParent Y = function(){ XParent() print("hello") } XParent calls X XParent = function(){ X() print("H") } X returns to the point just after the call to XParent. 
Hence print("H") is not called, but instead "hello" is printed. ?sys.call # my second reading of your question makes me think this wasn't what was requested. ?return # this would do what was asked for XParent = function(){ + return(sys.call()) + print("H") + } Y() [1] "hello" # Success # now to show that a value could be returned if desired Y = function(){ + print(XParent()) + print("hello") + } XParent = function(){ + return(sys.call()) + print("H") + } Y() XParent() [1] "hello" X returns to the point just after the call to XParent. Hence print("H") is not called, but instead "hello" is printed. An example of what I'm going for is this continueIfTrue <- function(filterExp, grpname, subname,n=1){ y <- substitute(filterExp) res <- isn(eval(y, envir=parent.frame()),FALSE) ## if res is FALSE, I would like to return from telemStats } telemStats <- function(a,b){ b <- c(10,12) continueIfTrue( {length(b) >=10 }, "progStats","00") print("Since the above condition failed, I would not like this message to be printed") } I'm afraid there were too many undefined objects to make much sense of that example. I looked into callCC and signals but don't think I understood correctly. Any hints would be appreciated Kind Regards Saptarshi -- David Winsemius Alameda, CA, USA -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
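The custom-condition approach from the reply can be applied to the continueIfTrue / telemStats example. This is only a sketch under assumptions: the condition class "skipRest" and the helper skipRest() are invented for illustration, and telemStats here wraps its own body in tryCatch(), which the original poster may or may not find acceptable.

```r
# Hypothetical helper: signal a custom condition of class "skipRest".
skipRest <- function() {
  cond <- structure(
    class = c("skipRest", "condition"),
    list(message = "skip the rest of the caller", call = NULL)
  )
  signalCondition(cond)
  invisible(NULL)
}

# Signals the condition when the check fails; otherwise does nothing.
continueIfTrue <- function(ok) {
  if (!isTRUE(ok)) skipRest()
  invisible(ok)
}

# The tryCatch() handler for "skipRest" unwinds out of the body,
# so the code after continueIfTrue() is never reached.
telemStats <- function(b) {
  tryCatch({
    continueIfTrue(length(b) >= 10)
    "ran to the end"
  }, skipRest = function(cond) "returned early")
}

short <- telemStats(c(10, 12))
long  <- telemStats(1:10)
print(c(short, long))
```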
Re: [R] Checking whether specific packages (from bioconductor) are installed when loading a package
On 03/11/2015 01:36 AM, Søren Højsgaard wrote: Dear all, My package 'gRbase' uses three packages from Bioconductor and these are not automatically installed when gRbase is installed. My instructions (on the package webpage) to users are therefore to run: Treat Bioconductor packages as any other, listing them in Depends: or Imports: or Suggests: as described in 'Writing R Extensions'. CRAN builds packages with access to the Bioconductor repository. Your CRAN users call chooseBioCmirror() and setRepositories() before using install.packages(), and Bioc dependencies are installed like any other dependency. source("http://bioconductor.org/biocLite.R"); biocLite(c("graph","RBGL","Rgraphviz")) When loading gRbase, it is checked whether these Bioconductor packages are available, but I would like to add a message about how to install the packages if they are not. This functionality is provided by Depends: and Imports:, so is not relevant for packages listed in this part of your DESCRIPTION file. You're only asking for advice on packages that are in Suggests:. It does not matter that these are Bioconductor packages or CRAN packages or ... the packages in Suggests: are not, by default, installed when your package is installed (see the 'dependencies' argument to install.packages()). Does this go into .onAttach or .onLoad or elsewhere? Or not at all. If the package belongs in Suggests: and provides some special functionality not needed by the package most of the time (else it would be in Imports: [most likely] or Depends:) then there will be some few points in the code where the package is used and you need to alert the user to the special condition they've encountered. You'll want to fully specify the package and function to be used RBGL::transitive.closure(...) 
(full specification provides similar advantage to Import:'ing a symbol into your package, avoiding symbol look-up along the search() path, potentially getting a function transitive.closure() defined by the user or a package different from RBGL). If RBGL is not available, the above code will fail, and the user will be told that "there is no package called 'RBGL'". One common strategy for nicer messages is to if (!requireNamespace("RBGL")) stop("your more tailored message") in the few code chunks before your use of RBGL::transitive.closure(). requireNamespace() loads but does not attach the RBGL package, so the symbols are available when fully qualified RBGL:: but the package does not interfere with the user search() path. Which I guess brings us to your question, and the answer is probably that if after the above you were to still wish to add a message at package start-up, then the right place would be .onLoad(), so that users of your package, as well as users of packages that Import: (load) but do not Depend: (attach) on your package, will see the message. Also, this belongs on the R-devel mailing list. Hope that's helpful, Martin Thanks in advance Søren -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
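The requireNamespace() pattern described above can be wrapped once and reused at each point where a Suggests: package is needed. A minimal sketch, assuming a helper named use_suggested (an invented name) and demonstrated with the base 'stats' package so that it runs anywhere; in gRbase the call would name RBGL and transitive.closure instead.

```r
# Hypothetical helper: call an exported function from an optional package,
# failing with a tailored message if the package is not installed.
use_suggested <- function(pkg, fun, ...) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop("Package '", pkg, "' is required for this feature; ",
         "please install it and try again.", call. = FALSE)
  }
  # Loaded but not attached: look the function up in the package's exports.
  getExportedValue(pkg, fun)(...)
}

# Works because 'stats' is always available.
med <- use_suggested("stats", "median", c(1, 2, 3))

# A missing package produces the tailored message instead of the default one.
msg <- tryCatch(
  use_suggested("notAnInstalledPackage", "f"),
  error = function(e) conditionMessage(e)
)
print(med)
print(msg)
```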
Re: [R] Installing GO.db Package in R
On 03/05/2015 01:21 AM, Zaynab Mousavian wrote: Hi all, I have tried to install GO.db package in R, but the following error is please ask questions about Bioconductor packages on the Bioconductor support forum https://support.bioconductor.org. Please also review answers to your question on Biostars first. You are using an old version of R / Bioconductor but a new version of RSQLite. Use either a current version of R / Bioconductor or an old version of RSQLite, as explained here https://support.bioconductor.org/p/63555. If you are having trouble installing a current version of R on linux, indicate your OS and how you are currently installing R. Be sure to follow the relevant directions from, e.g., http://cran.r-project.org/. Perhaps the R-SIG-Debian archives and mailing list have additional hints https://stat.ethz.ch/pipermail/r-sig-debian/. Martin given to me: biocLite(c("GO.db")) BioC_mirror: http://bioconductor.org Using Bioconductor version 2.13 (BiocInstaller 1.12.1), R version 3.0.2. Installing package(s) 'GO.db' trying URL 'http://bioconductor.org/packages/2.13/data/annotation/src/contrib/GO.db_2.10.1.tar.gz' Content type 'application/x-gzip' length 26094175 bytes (24.9 Mb) opened URL == downloaded 24.9 Mb * installing *source* package 'GO.db' ... ** R ** inst ** preparing package for lazy loading ** help *** installing help indices ** building package indices ** testing if installed package can be loaded Error : .onLoad failed in loadNamespace() for 'GO.db', details: call: match.arg(synchronous, c("off", "normal", "full")) error: 'arg' must be NULL or a character vector Error: loading failed Execution halted ERROR: loading failed * removing '/home/zmousavian/R/x86_64-pc-linux-gnu-library/3.0/GO.db' The downloaded source packages are in '/tmp/RtmpBDs1Tq/downloaded_packages' Warning messages: 1: In install.packages(pkgs = pkgs, lib = lib, repos = repos, ...) 
: installation of package 'GO.db' had non-zero exit status 2: installed directory not writable, cannot update packages 'colorspace','lattice', 'mgcv', 'survival' Can anyone help me to install it? Regards -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
Re: [R] character type and memory usage
On 01/16/2015 10:21 PM, Mike Miller wrote: First, a very easy question: What is the difference between using what="character" and what=character() in scan()? What is the reason for the character() syntax? I am working with some character vectors that are up to about 27.5 million elements long. The elements are always unique. Specifically, these are names of genetic markers. This is how much memory those names take up: snps <- scan("SNPs.txt", what=character()) Read 27446736 items object.size(snps) 1756363648 bytes object.size(snps)/length(snps) 63.9917128215173 bytes As you can see, that's about 1.76 GB of memory for the vector at an average of 64 bytes per element. The longest string is only 14 bytes, though. The file takes up 313 MB. Using 64 bytes per element instead of 14 bytes per element is costing me a total of 1,372,336,800 bytes. In a different example where the longest string is 4 characters, the elements each use 8 bytes. So it looks like I'm stuck with either 8 bytes or 64 bytes. Is that true? There is no way to modify that? Hi Mike -- R represents the atomic vector types as so-called S-expressions, which in addition to the actual data contain information about whether they have been referenced by one or more symbols etc.; you can get a sense of this with > x <- 1:5 > .Internal(inspect(x)) @4c732940 13 INTSXP g0c3 [NAM(1)] (len=5, tl=0) 1,2,3,4,5 where the number after @ is the memory location, INTSXP indicates that the type of data is an integer, etc. So a vector requires memory for the S-expression, and for the actual data. 
A character vector is represented by an S-expression for the vector itself, and an S-expression for each element of the vector, and of course the data itself > y <- c("a", "a", "b") > .Internal(inspect(y)) @4ce72090 16 STRSXP g0c3 [NAM(1)] (len=3, tl=0) @137ccd8 09 CHARSXP g0c1 [gp=0x61] [ASCII] [cached] "a" @137ccd8 09 CHARSXP g0c1 [gp=0x61] [ASCII] [cached] "a" @15a6698 09 CHARSXP g0c1 [gp=0x61] [ASCII] [cached] "b" The large S-expression overhead is recouped by long (in the nchar() sense) or re-used strings, but that's not the case for your data. There is no way around this in base R. There are general-purpose solutions like the data.table package, or retaining your large data in a data base (like SQLite) that you interface from within R using e.g., sqldf or dplyr to do as much data reduction in the data base (and out of R) as possible. In your particular case the Bioconductor Biostrings package BStringSet() might be relevant http://bioconductor.org/packages/release/bioc/html/Biostrings.html This will consume memory more along the lines of 1 byte per character + 1 byte per string, and is of particular relevance because you are likely doing other genetic operations for which the Bioconductor project has relevant packages (see especially the GenomicRanges package). If your work is not particularly domain-specific, data.table would be a good bet (it also has an implementation for working with overlapping ranges, which is a very common task with SNPs). A lot of SNP data management is really relational, for which the SQL representation (and dplyr, for me) is the obvious choice. Bioconductor would be the choice if there is to be extensive domain-specific work. I am involved in the Bioconductor project, so not exactly impartial. Martin By the way... It turns out that 99.72% of those character strings are of the form paste("rs", Int) where Int is an integer of no more than 9 digits. 
So if I use only those markers, drop the "rs" off, and load them as integers, I see a huge improvement: snps <- scan("SNPs_rs.txt", what=integer()) Read 27369706 items object.size(snps) 109478864 bytes object.size(snps)/length(snps) 4.0146146985 bytes That saves 93.8% of the memory by dropping 0.28% of the markers and encoding as integers instead of strings. I might end up doing this by encoding the other characters as negative integers. Mike -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
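The saving from integer encoding can be reproduced on a small synthetic vector. This is a sketch only: the identifiers are generated rather than read from SNPs.txt, and the exact byte counts will differ from the figures quoted above.

```r
# Synthetic marker names of the form "rs<integer>", all unique.
ids <- sprintf("rs%d", seq_len(50000L))

# Keep only names matching the rs + up-to-9-digits pattern, then strip the prefix.
is_rs  <- grepl("^rs[0-9]{1,9}$", ids)
rs_int <- as.integer(sub("^rs", "", ids[is_rs]))

size_chr <- as.numeric(object.size(ids))     # character vector: per-element overhead
size_int <- as.numeric(object.size(rs_int))  # integer vector: 4 bytes per element

print(c(character = size_chr, integer = size_int))

# The original names can be rebuilt on demand.
rebuilt <- sprintf("rs%d", rs_int)
```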
Re: [R] Fwd: which is faster "for" or "apply"
h cidx <- sapply(df, is.character) # index of columns to coerce df[cidx] <- lapply(df[cidx], as.numeric) which seems to be reasonably correct, expressive, compact, and speedy. Martin Morgan Ô__ c/ /'_;kmezhoud (*) \(*) ⴽⴰⵔⵉⵎ ⵎⴻⵣⵀⵓⴷ http://bioinformatics.tn/ On Wed, Dec 31, 2014 at 8:54 AM, Berend Hasselman wrote: On 31-12-2014, at 08:40, Karim Mezhoud wrote: Hi All, I would like to choose between these two data frame conversions. Which is faster? for(i in 1:ncol(DataFrame)){ DataFrame[,i] <- as.numeric(DataFrame[,i]) } OR DataFrame <- as.data.frame(apply(DataFrame,2 ,function(x) as.numeric(x))) Try it and use system.time. Berend Thanks Karim Ô__ c/ /'_;kmezhoud (*) \(*) ⴽⴰⵔⵉⵎ ⵎⴻⵣⵀⵓⴷ http://bioinformatics.tn/ -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
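The lapply() idiom from the reply, together with Berend's suggestion to measure with system.time(), can be tried on a toy data frame. A minimal sketch; the data frame and its column names are made up for illustration, and the timing on such a small input is not meaningful in itself.

```r
# Toy data: two character columns holding numbers, one integer column.
df <- data.frame(
  a = c("1", "2", "3"),
  b = c("4.5", "5.5", "6.5"),
  n = 1:3,
  stringsAsFactors = FALSE
)

# Coerce only the character columns; other columns are left untouched.
cidx <- vapply(df, is.character, logical(1))
converted <- df
converted[cidx] <- lapply(converted[cidx], as.numeric)

# Rough timing of the column-wise coercion, repeated to get a measurable figure.
timing <- system.time(for (k in 1:1000) lapply(df[cidx], as.numeric))

str(converted)
print(timing["elapsed"])
```

Unlike apply(), which first turns the whole data frame into a matrix, this works column by column and keeps the data frame structure.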
Re: [R] RCurl much faster than base R
On 12/05/2014 08:12 AM, Alex Gutteridge wrote: I'm trying to debug a curious network issue, I wonder if anyone can help me as I (and my local sysadmin) am stumped: This base R command takes ~1 minute to complete: readLines(url("http://bioconductor.org/biocLite.R")) (biocLite.R is a couple of KB in size) Using RCurl (and so libcurl under the hood) is instantaneous (<1s): library(RCurl) getURL("http://bioconductor.org/biocLite.R") I've not set it to use any proxies (which was my first thought) unless libcurl autodetects them somehow... And the speed is similarly fast using wget or curl on the command line. It just seems to be the base R commands which are slow (including install.packages etc...). Does anyone have hints on how to debug this (if not an answer directly)? Hi Alex -- maybe not surprisingly, both approaches are approximately equally speedy for me, at least on average. For what it's worth - there is no need to use url(), just readLines("http://...") It would help to - provide the output of sessionInfo() - verify or otherwise that the problem is restricted to particular urls - work through a simple example where the test say 'works' when accessing a local http server (e.g., on the same machine and in a directory "mydir", python -m SimpleHTTPServer 1 in one terminal, then readLines("http://localhost:1/") but fails after some increasingly remote point, e.g., accessing a url outside your institution firewall hence indicating a firewall issue. Maybe at the end of this exercise the only insight will be that the R and curl implementations differ (a known known!). Also if this is really a problem with installing Bioconductor packages rather than a general R question, then https://support.bioconductor.org is a better place to post. 
If the problem is restricted to bioconductor.org, then: (a) for your sys.admin, the url is redirected (via DNS, not http:) to Amazon Cloud Front and from there to a regional Amazon data center; I'm not sure what the significance of this might be, e.g., the admin might have throttled download speeds from certain ip address ranges; and (b) if you're in Europe or elsewhere, you're trying to install Bioconductor packages, and the regional data center is not fast enough (it should be responsive, at least when the url has been seen 'recently'), then configure R to use a local mirror from http://bioconductor.org/about/mirrors/, e.g., chooseBioCmirror() Martin Morgan Bioconductor AlexG -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
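To work outward from a local source to increasingly remote ones, as suggested above, it may help to time the same readLines() call against each source in turn. A minimal sketch, assuming a helper named time_fetch (an invented name); it is demonstrated on a temporary local file so that it runs without network access, and the same function accepts an http:// URL.

```r
# Hypothetical helper: elapsed seconds for n repeated reads of one source.
# 'src' can be a file path or a URL accepted by readLines().
time_fetch <- function(src, n = 3) {
  vapply(seq_len(n), function(i) {
    unname(system.time(readLines(src, warn = FALSE))["elapsed"])
  }, numeric(1))
}

# Local baseline: a small temporary file.
tmp <- tempfile(fileext = ".R")
writeLines(c("x <- 1", "y <- 2"), tmp)
local_times <- time_fetch(tmp)
print(local_times)

# Remote comparison (not run here because it needs network access):
# remote_times <- time_fetch("http://bioconductor.org/biocLite.R")
```

Comparing the vectors for a local file, a local http server, and a remote URL would show at which step the delay appears.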
Re: [R] need help with withRestarts ?
On 12/06/2014 02:53 PM, ce wrote: Dear all, Let's say I have this script, below. tryCatch indeed catches the error but exits; I want the function to continue and stay in the loop. I found very few examples of withRestarts on the internet to figure it out. Could you help me how to do it? myfunc <- function() { while(1) { x <- runif(1) if ( x > 0.3 ) a <- x/2 else a <- x/"b" print(a) Sys.sleep(1) } } Hi -- Modify your function so that the code that you'd like to restart after is surrounded with withRestarts(), and with a handler that performs the action you'd like, so myfunc <- function() { while(TRUE) { x <- runif(1) withRestarts({ if ( x > 0.3 ) a <- x/2 else a <- x/"b" print(a) }, restartLoop = function() { message("restarting") NULL }) Sys.sleep(1) } } Instead of using tryCatch(), which returns to the top level context to evaluate the handlers, use withCallingHandlers(), which retains the calling context. Write a handler that invokes the restart withCallingHandlers({ myfunc() }, error = function(e) { message("error") invokeRestart("restartLoop") }) It's interesting that tryCatch is usually used with errors (because errors are hard to recover from), and withCallingHandlers are usually used with warnings (because warnings can usually be recovered from), but tryCatch() and withCallingHandlers() can be used with any condition. Martin tryCatch({ myfunc() }, warning = function(w) { print("warning") }, error = function(e) { print("error") }, finally = { print("end") } ) -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. 
PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
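The same withRestarts() / withCallingHandlers() pairing can be tried on a loop that terminates, which makes the behaviour easier to inspect than the infinite loop above. A minimal sketch; the function name process_all and the restart name skipItem are assumptions chosen for illustration.

```r
# Each item is processed inside withRestarts(); the "skipItem" restart
# supplies NA as the result for an item that failed.
process_all <- function(inputs) {
  results <- vector("list", length(inputs))
  for (i in seq_along(inputs)) {
    results[[i]] <- withRestarts(
      inputs[[i]] / 2,                 # errors when the input is not numeric
      skipItem = function() NA_real_
    )
  }
  results
}

# The calling handler runs while the failing frame is still on the stack,
# so it can invoke the restart and let the loop carry on.
out <- withCallingHandlers(
  process_all(list(4, "b", 10)),
  error = function(e) {
    message("error: ", conditionMessage(e))
    invokeRestart("skipItem")
  }
)
str(out)
```

With tryCatch() in place of withCallingHandlers() the stack would already be unwound when the handler runs, so the restart would no longer be available.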
Re: [R] recoding genetic information using gsub
On 12/5/2014 11:24 AM, Kate Ignatius wrote:
> I have genetic information for several thousand individuals:
>
> A/T
> T/G
> C/G
> etc
>
> For some individuals there are some genotypes that are like this: A/, C/, T/, G/ or even just / which represents missing, and I want to change these to the following:
>
> A/  A/.
> C/  C/.
> G/  G/.
> T/  T/.
> /   ./.
> /A  ./A
> /C  ./C
> /G  ./G
> /T  ./T
>
> I've tried to use gsub with a command like the following:
>
> gsub("A/","[A/.]", GT[,6])

Hi Kate -- a different approach is to create a 'map' (named character vector) describing what you want in terms of what you have; the number of possible genotypes is not large.

http://stackoverflow.com/questions/15912210/replace-a-list-of-values-by-another-in-r/15912309#15912309

Martin

> but if genotypes aren't like the above, the command will change it to look something like:
>
> A/.T
> T/.G
> C/.G
>
> Is there any way to be more specific in gsub?
>
> Thanks!

--
Dr. Martin Morgan, PhD
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
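The 'map' idea suggested in this reply could look roughly like the sketch below. The vector `genotypes` is made-up example data standing in for the poster's `GT[,6]` column, and `map`, `recoded` and `recoded2` are hypothetical names. The second variant is not from the thread; it answers the "more specific gsub" question by anchoring the pattern with `^` and `$` so that complete genotypes are left alone.

```r
# Made-up example data standing in for one genotype column.
genotypes <- c("A/T", "A/", "/", "/G", "C/G", "T/")
bases <- c("A", "C", "G", "T")

# Named character vector: names are what we have, values are what we want.
map <- c(
    setNames(paste0(bases, "/."), paste0(bases, "/")),
    setNames(paste0("./", bases), paste0("/", bases)),
    "/" = "./."
)

# Replace only the entries that appear in the map; leave the rest untouched.
recoded <- genotypes
incomplete <- genotypes %in% names(map)
recoded[incomplete] <- unname(map[genotypes[incomplete]])
recoded

# Alternative: anchored patterns, so only a trailing or leading "/" matches.
recoded2 <- sub("^/", "./", sub("/$", "/.", genotypes))
recoded2
```

The map makes the full set of allowed substitutions explicit, which is convenient when the number of possible genotypes is small; the anchored patterns are shorter but rely on every incomplete genotype having exactly this shape.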
Re: [R] Profiling a C/C++ library from R
On 12/02/2014 01:43 PM, Charles Novaes de Santana wrote:
> Dear all,
>
> I am running a C++ library (a .so file) from R code. I am using the function dyn.load("lib.so") to load the library.
>
> Do you know a way to profile my C library from R? Or should I compile my C library as an executable and profile it using the typical C profilers?
>
> Thanks in advance for any help!

Hi Charles

Section 3.4 of RShowDoc("R-exts") discusses some options; I've had luck with operf & friends. Remember to compile without optimizations and with debugging information, -ggdb -O0.

(I think this is appropriate for the R-devel mailing list http://www.r-project.org/posting-guide.html#which_list)

Martin Morgan

> Best,
>
> Charles
Re: [R] cyclic dependency when building a package
On 11/30/2014 07:15 AM, Glenn Schultz wrote:
> Hi All,
>
> I am working on a package BondLab for the analysis of fixed income securities. Building the package results in the following:
>
> Error in loadNamespace(package, c(which.lib.loc, lib.loc)) :
>   cyclic namespace dependency detected when loading ‘BondLab’, already loading ‘BondLab’
>
> It occurs when I set the generic for the function mortgagecashflow. Further, if a function uses mortgagecashflow, setting its generic similarly causes the above error. Other generics do not throw this error, so I am quite sure it is mortgagecashflow.
>
> The package and the code can be found on github. https://github.com/glennmschultz/BondLab.git
>
> I have been trying to figure this out for a couple of months to no avail. If anyone has familiarity with this issue I would certainly appreciate any help.

Hi Glenn --

The root of the problem is that you are defining both a generic and a plain-old-function named MortgageCashFlow -- one or the other and you're fine.

R CMD INSTALL pkgA, where pkgA contains a single R file R/test.R

    setGeneric("foo", function(x, ...) standardGeneric("foo"))
    foo <- function(x, ...) {}

also generates this; maybe you meant something like

    .foo <- function(x, ...) {}
    setGeneric("foo", function(x, ...) standardGeneric("foo"), useAsDefault=.foo)

or simply reversing the order of the declarations

    foo <- function(x, ...) {}
    setGeneric("foo", function(x, ...) standardGeneric("foo"))

?

Martin Morgan

> Thanks,
> Glenn
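The "reverse the order" suggestion can be checked interactively, outside a package. This is a small sketch under that assumption; `describe` is a hypothetical name used in place of the poster's function, and the return values are placeholders that only show which definition was dispatched.

```r
library(methods)

# A plain function defined first ...
describe <- function(x, ...) "default"

# ... is picked up as the default method when the generic is created afterwards.
setGeneric("describe", function(x, ...) standardGeneric("describe"))

# A method for a specific class sits alongside that default.
setMethod("describe", "numeric", function(x, ...) "numeric")

describe("a")  # no character method, so the former plain function is used
describe(1)    # dispatches to the numeric method
```

Defined in this order there is only ever one object called `describe` at the end: the generic, with the earlier plain function kept as its default method.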
Re: [R] Gender balance in R
On 11/25/2014 04:11 AM, Scott Kostyshak wrote: On Mon, Nov 24, 2014 at 12:34 PM, Sarah Goslee wrote: I took a look at apparent gender among list participants a few years ago: https://stat.ethz.ch/pipermail/r-help/2011-June/280272.html Same general thing: very few regular participants on the list were women. I don't see any sign that that has changed in the last three years. The bar to participation in the R-help list is much, much lower than that to become a developer. I plotted the gender of posters on r-help over time. The plot is here: https://twitter.com/scottkosty/status/449933971644633088 The code to reproduce that plot is here: https://github.com/scottkosty/genderAnalysis The R file there will call devtools::install_github to install a package from Github used for guessing the gender based on the first name (https://github.com/scottkosty/gender). It would be great to include in your package the script that scraped author names from R-help archives (I guess that's what you did?). Presumably it easily applies to other mailing lists hosted at the same location (R-devel, further along the ladder from user to developer, and Bioconductor / Bioc-devel, in a different domain and perhaps confounded with a different 'feel' to the list). Also the R community is definitely international, so finding more versatile gender-assignment approaches seems important. it might be interesting to ask about participation in mailing list forums versus other, and in particular the recent Bioconductor transition from mailing list to 'StackOverflow' style support forum (https://support.bioconductor.org) -- on the one hand the 'gamification' elements might seem to only entrench male participation, while on the other we have already seen increased (quantifiable) and broader (subjective) participation from the Bioconductor community. 
I'd be happy to make support site usage data available, and am interested in collaborating in an academically well-founded analysis of this data; any interested parties please feel free to contact me off-list. Martin Morgan Bioconductor Note also on that tweet that Gabriela de Queiroz posted it, who is the founder of R-ladies; and that David Smith showed interest in discussing the topic. So there is definitely demand for some data analysis and discussion on the topic. It would be interesting to look at the stats for CRAN packages as well. The very low percentage of regular female participants is one of the things that keeps me active on this list: to demonstrate that it's not only men who use R and participate in the community. Thank you for that! Scott -- Scott Kostyshak Economics PhD Candidate Princeton University (If you decide to do the stats for 2014, be aware that I've been out on medical leave for the past two months, so the numbers are even lower than usual.) Sarah On Mon, Nov 24, 2014 at 10:10 AM, Maarten Blaauw wrote: Hi there, I can't help to notice that the gender balance among R developers and ordinary members is extremely skewed (as it is with open source software in general). Have a look at http://www.r-project.org/foundation/memberlist.html - at most a handful of women are listed among the 'supporting members', and none at all among the 29 'ordinary members'. On the other hand I personally know many happy R users of both genders. My questions are thus: Should R developers (and users) be worried that the 'other half' is excluded? If so, how could female R users/developers be persuaded to become more visible (e.g. added as supporting or ordinary members)? Thanks, Maarten -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
Re: [R] Reading FCS files with flowCore package
On 11/24/2014 11:38 AM, William Dunlap wrote:
> If help files used the mustWork=TRUE argument to system.file() this sort of problem would become more apparent to the user. It would give a clear error message from system.file() instead of a mysterious error about file "" not being valid or, worse, a hang from an input command waiting for the user to type something into standard input (because scan() and others treat file="" the same as file=stdin()).

or to change the default to mustWork=TRUE, since there are not many use cases for querying a non-existent system file? (One irony I've stumbled across in my own code is to misspell 'mustWork', e.g., system.file("foo", mustwork=TRUE), which happily returns "".)

Martin

> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
>
> On Mon, Nov 24, 2014 at 10:36 AM, Martin Morgan <mtmor...@fredhutch.org> wrote:
>> On 11/24/2014 06:18 AM, Luigi wrote:
>>> Dear all,
>>> I would like to use the R's Bioconductor package flowCore to do flow cytometry
>>
>> Please address questions about Bioconductor packages to the Bioconductor support site
>>
>> https://support.bioconductor.org
>>
>> and...
>>
>>> analysis. I generated a FCS file using the file>export function of the FACSDiva Software Version 8 from a BD LSRII machine. I then used the functions:
>>>
>>> file.name <- system.file("extdata", "cd cells_FMO 8_003.fcs", package="flowCore")
>>
>> system.file() is used to access files installed in R packages, but probably you want to access your own file. Try
>>
>> file.name = file.choose()
>>
>> and selecting the file that you want to input. Verify that the path is correct by displaying the result
>>
>> file.name
>>
>> Martin
>>
>>> x <- read.FCS(file.name, transformation = FALSE)
>>>
>>> as shown in the flowCore: data structure package... vignette (20 May 2014) as available from the internet. However the result is an error:
>>>
>>> >Error in read.FCS(file.name, transformation = FALSE) : ' ' is not a valid file
>>>
>>> I then used the function:
>>>
>>> isFCSfile("cd cells_FMO 8_003.fcs")
>>>
>>> where cd cells_FMO 8_003.fcs is the name of the file. As expected I obtained the following message:
>>>
>>> >cd cells_FMO 8_003.fcs FALSE
>>>
>>> meaning I reckon that the file is not a FCS. Since I am completely new to this kind of analysis but I would not like to use flowJo, could anybody tell me how to load the FCS files?
>>>
>>> In the rest of the file I am pasting the beginning of the cd cells_FMO 8_003.fcs file for further reference (I can't attach the whole thing or even attaching the file because it is too big). From its gibberish I reckon that the encoding is probably wrong: I was expecting a flatfile after all not ASCII. Would the problem be how the run was exported? FlowJo however recognizes the files...
>>>
>>> Best regards,
>>> Luigi
>>>
>>> ==
>>> FCS3.0 25619271933 1192532 0 0 $BEGINANALYSIS0$ENDANALYSIS0$BEGINSTEXT0$ENDSTEXT0$BEGINDATA 1933$ENDDATA1192532 $FIL180444.fcs$SYSWindows 7 6.1$TOT29765 $PAR10$MODEL$BYTEORD4,3,2,1$DATATYPEF$NEXTDATA0CREATORBD FACSDiva Software Version 8.0TUBE NAMEFMO 8$SRCcd cellsEXPERIMENT NAMEExperiment_001GUID4171c2f1-427b-4cc5-bf86-39bb76803c48$DATE 31-OCT-2014$BTIM16:07:12$ETIM16:09:25SETTINGSCytometerWINDOW EXTENSION0.00EXPORT USER NAMELuigiMarongiuEXPORT TIME31-OCT-2014-16:07:11FSC ASF0.78AUTOBSTRUE$INST $TIMESTEP0.01SPILL 3,405-450/50-A,405-655/8-A,405-525/50-A,1,0.0028442147740618787,0.0923076944711957,0,1,0,0.3425525014147933,0.08630456626553264,1 APPLY COMPENSATIONTRUETHRESHOLDFSC,5000$P1NTime$P1R262144$P1B32$P1E 0,0$P1G0.01P1BS0P1MS0$P2NFSC-A$P2R262144$P2B32$P2E0,0$P2V450 $P2G1.0P2DISPLAYLINP2BS-1P2MS0$P3NFSC-H$P3R262144$P3B32$P3E0,0 $P3V450$P3G1.0P3DISPLAYLINP3BS-1P3MS0$P4NFSC-W$P4R262144$P4B32 $P4E0,0$P4V450$P4G1.0P4BS-1P4MS0$P5NSSC-A$P5R262144$P5B32$P5E 0,0$P5V319$P5G1.0P5DISPLAYLINP5BS-1P5MS0$P6NSSC-H$P6R262144$P6B 32$P6E0,0$P6V319$P6G1.0P6DISPLAYLINP6BS-1P6MS0$P7NSSC-W$P7R 262144$P7B32$P7E0,0$P7V319$P7G1.0P7BS-1P7MS0$P8N405-450/50-A$P8S cd8 - pac blue$P8R262144$P8B32$P8E0,0$P8V450$P8G1.0P8DISPLA
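The system.file() behaviour discussed in this thread is easy to reproduce with a package that ships with R. The following is a minimal sketch: the file name `no-such-file.fcs` is a made-up placeholder, `stats` is used only because it is always installed, and the temporary file stands in for a user's own data file.

```r
# Default behaviour: a file that is not part of the package yields "".
missing_path <- system.file("extdata", "no-such-file.fcs", package = "stats")
missing_path

# With mustWork = TRUE the failure is reported at the point where it happens.
strict <- tryCatch(
    system.file("extdata", "no-such-file.fcs", package = "stats",
                mustWork = TRUE),
    error = function(e) "error"
)
strict

# A user's own file is addressed by its path, not through system.file().
own_file <- tempfile(fileext = ".fcs")
file.create(own_file)
file.exists(own_file)
```

Passing the empty string on to a reader function is what produced the "is not a valid file" message in the original post; checking the path with file.exists() before reading makes that kind of problem visible earlier.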
Re: [R] Reading FCS files with flowCore package
On 11/24/2014 06:18 AM, Luigi wrote:
> Dear all,
> I would like to use the R's Bioconductor package flowCore to do flow cytometry

Please address questions about Bioconductor packages to the Bioconductor support site

https://support.bioconductor.org

and...

> analysis. I generated a FCS file using the file>export function of the FACSDiva Software Version 8 from a BD LSRII machine. I then used the functions:
>
> file.name <- system.file("extdata", "cd cells_FMO 8_003.fcs", package="flowCore")

system.file() is used to access files installed in R packages, but probably you want to access your own file. Try

    file.name = file.choose()

and selecting the file that you want to input. Verify that the path is correct by displaying the result

    file.name

Martin

> x <- read.FCS(file.name, transformation = FALSE)
>
> as shown in the flowCore: data structure package... vignette (20 May 2014) as available from the internet. However the result is an error:
>
> >Error in read.FCS(file.name, transformation = FALSE) : ' ' is not a valid file
>
> I then used the function:
>
> isFCSfile("cd cells_FMO 8_003.fcs")
>
> where cd cells_FMO 8_003.fcs is the name of the file. As expected I obtained the following message:
>
> >cd cells_FMO 8_003.fcs FALSE
>
> meaning I reckon that the file is not a FCS. Since I am completely new to this kind of analysis but I would not like to use flowJo, could anybody tell me how to load the FCS files?
>
> In the rest of the file I am pasting the beginning of the cd cells_FMO 8_003.fcs file for further reference (I can't attach the whole thing or even attaching the file because it is too big). From its gibberish I reckon that the encoding is probably wrong: I was expecting a flatfile after all not ASCII. Would the problem be how the run was exported? FlowJo however recognizes the files...
>
> Best regards,
> Luigi
>
> ==
> FCS3.0 25619271933 1192532 0 0 $BEGINANALYSIS0$ENDANALYSIS0$BEGINSTEXT0$ENDSTEXT0$BEGINDATA1933 $ENDDATA1192532 $FIL180444.fcs$SYSWindows 7 6.1$TOT29765 $PAR10$MODEL$BYTEORD4,3,2,1$DATATYPEF$NEXTDATA0CREATORBD FACSDiva Software Version 8.0TUBE NAMEFMO 8$SRCcd cellsEXPERIMENT NAMEExperiment_001GUID4171c2f1-427b-4cc5-bf86-39bb76803c48$DATE31-OCT-2014 $BTIM16:07:12$ETIM16:09:25SETTINGSCytometerWINDOW EXTENSION0.00EXPORT USER NAMELuigiMarongiuEXPORT TIME31-OCT-2014-16:07:11FSC ASF0.78AUTOBSTRUE$INST $TIMESTEP0.01SPILL 3,405-450/50-A,405-655/8-A,405-525/50-A,1,0.0028442147740618787,0.0923076944711957,0,1,0,0.3425525014147933,0.08630456626553264,1 APPLY COMPENSATIONTRUETHRESHOLDFSC,5000$P1NTime$P1R262144$P1B32$P1E0,0 $P1G0.01P1BS0P1MS0$P2NFSC-A$P2R262144$P2B32$P2E0,0$P2V450$P2G 1.0P2DISPLAYLINP2BS-1P2MS0$P3NFSC-H$P3R262144$P3B32$P3E0,0$P3V 450$P3G1.0P3DISPLAYLINP3BS-1P3MS0$P4NFSC-W$P4R262144$P4B32$P4E 0,0$P4V450$P4G1.0P4BS-1P4MS0$P5NSSC-A$P5R262144$P5B32$P5E0,0 $P5V319$P5G1.0P5DISPLAYLINP5BS-1P5MS0$P6NSSC-H$P6R262144$P6B32 $P6E0,0$P6V319$P6G1.0P6DISPLAYLINP6BS-1P6MS0$P7NSSC-W$P7R262144 $P7B32$P7E0,0$P7V319$P7G1.0P7BS-1P7MS0$P8N405-450/50-A$P8Scd8 - pac blue$P8R262144$P8B32$P8E0,0$P8V450$P8G1.0P8DISPLAYLOGP8BS-1P8MS 0$P9N405-655/8-A$P9Scd45ra - q655$P9R262144$P9B32$P9E0,0$P9V450$P9G1.0P9DISPLAYLOGP9BS-1P9MS 0$P10N405-525/50-A$P10Sld - acqua$P10R262144$P10B32$P10E0,0$P10V450$P10G1.0P10DISPLAYLOGP10BS -1P10MS0CST BEADS EXPIREDFalse
> [binary event data omitted]
> etc.
Re: [R] Problem on annotation of Deseq2 on reportingtools
On 11/16/2014 10:25 AM, jarod...@libero.it wrote:
> Dear all!,
>
> I use this code:
>
> dds <- DESeq(ddHTSeq)
> res <- results(dds)
> #reporting
> library(ReportingTools)
> library("org.Hs.eg.db")
> des2Report <- HTMLReport(shortName ='RNAseq_analysis_DESeq2.html',title ='RNA-seq analysis of differential expression using DESeq2 ',reportDirectory = "./Reports")
> #publish(dds,des2Report,pvalueCutoff=0.05,annotation.db="org,Hs.eg.db")
> publish(dds,des2Report,pvalueCutoff=0.01,annotation.db="org.Hs.egENSEMBL2EG",factor=colData(dds)$condition,categorySize=5)
> finish(des2Report)
>
> and I have this error:
>
> Error in results(object, resultName) :
>   'contrast', as a character vector of length 3, should have the form:
>   contrast = c('factorName','numeratorLevel','denominatorLevel'),
>   see the manual page of ?results for more information
>
> is.factor(colData(dds)$condition)
> [1] TRUE
>
> What can I do?

Please ask questions about Bioconductor packages on the Bioconductor support site

https://support.bioconductor.org

Martin

> sessionInfo()
> R version 3.1.1 (2014-07-10)
> Platform: x86_64-pc-linux-gnu (64-bit)
>
> locale:
>  [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8
>  [4] LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8
>  [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C
> [10] LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel stats graphics grDevices utils datasets methods base
>
> other attached packages:
>  [1] pvclust_1.2-2        gplots_2.13.0             genefilter_1.44.0
>  [4] ReportingTools_2.2.0 knitr_1.6                 org.Hs.eg.db_2.10.1
>  [7] RSQLite_0.11.4       DBI_0.2-7                 annotate_1.40.1
> [10] AnnotationDbi_1.24.0 Biobase_2.22.0            biomaRt_2.18.0
> [13] DESeq2_1.4.5         RcppArmadillo_0.4.300.8.0 Rcpp_0.11.2
> [16] GenomicRanges_1.14.4 XVector_0.2.0             IRanges_1.20.7
> [19] BiocGenerics_0.8.0
>
> loaded via a namespace (and not attached):
>  [1] AnnotationForge_1.4.4 Biostrings_2.30.1  biovizBase_1.10.8
>  [4] bitops_1.0-6          BSgenome_1.30.0    Category_2.28.0
>  [7] caTools_1.17          cluster_1.15.3     colorspace_1.2-4
> [10] dichromat_2.0-0       digest_0.6.4       edgeR_3.4.2
> [13] evaluate_0.5.5        formatR_0.10       Formula_1.1-1
> [16] gdata_2.13.3          geneplotter_1.40.0 GenomicFeatures_1.14.5
> [19] ggbio_1.10.16         ggplot2_1.0.0      GO.db_2.10.1
> [22] GOstats_2.28.0        graph_1.40.1       grid_3.1.1
> [25] gridExtra_0.9.1       GSEABase_1.24.0    gtable_0.1.2
> [28] gtools_3.4.1          Hmisc_3.14-4       hwriter_1.3
> [31] KernSmooth_2.23-13    lattice_0.20-29    latticeExtra_0.6-26
> [34] limma_3.18.13         locfit_1.5-9.1     MASS_7.3-34
> [37] Matrix_1.1-4          munsell_0.4.2      PFAM.db_2.10.1
> [40] plyr_1.8.1            proto_0.3-10       RBGL_1.38.0
> [43] RColorBrewer_1.0-5    RCurl_1.95-4.1     reshape2_1.4
> [46] R.methodsS3_1.6.1     R.oo_1.18.0        Rsamtools_1.14.3
> [49] rtracklayer_1.22.7    R.utils_1.32.4     scales_0.2.4
> [52] splines_3.1.1         stats4_3.1.1       stringr_0.6.2
> [55] survival_2.37-7       tools_3.1.1        VariantAnnotation_1.8.13
> [58] XML_3.98-1.1          xtable_1.7-3       zlibbioc_1.8.0
Re: [R] snow/Rmpi without MPI.spawn?
On 09/03/2014 10:24 PM, Leek, Jim wrote:
> Thanks for the tips. I'll take a look around for 'for' loops in the morning.
>
> I think the example you provided worked for OpenMPI. (The default on our machine is MPICH2, but it gave the same error about calling spawn.) Anyway, with OpenMPI I got this:
>
> # salloc -n 12 orterun -n 1 R -f spawn.R
> library(Rmpi)
> ## Recent Rmpi bug -- should be mpi.universe.size()
> nWorkers <- mpi.universe.size()

(the '## Recent Rmpi bug' comment should have been removed, it's a holdover from when the script was written several years ago)

> nslaves = 4
> mpi.spawn.Rslaves(nslaves)

The argument needs to be named

    mpi.spawn.Rslaves(nslaves=4)

otherwise R matches unnamed arguments by position, and '4' is associated with the 'Rscript' argument.

Martin

> Reported: 2 (out of 2) daemons - 4 (out of 4) procs
>
> Then it hung there. So things spawned anyway, which is progress. I'm just not sure whether that is expected behavior for parSapply or not.
>
> Jim
>
> -----Original Message-----
> From: Martin Morgan [mailto:mtmor...@fhcrc.org]
> Sent: Wednesday, September 03, 2014 5:08 PM
> To: Leek, Jim; r-help@r-project.org
> Subject: Re: [R] snow/Rmpi without MPI.spawn?
>
> On 09/03/2014 03:25 PM, Jim Leek wrote:
>> I'm a programmer at a high-performance computing center. I'm not very familiar with R, but I have used MPI from C, C++, and Python. I have to run an R code provided by a guy who knows R, but not MPI. So, this fellow used the R snow library to parallelize his R code (theoretically, I'm not actually sure what he did.) I need to get this code running on our machines.
>>
>> However, Rmpi and snow seem to require mpi spawn, which our computing center doesn't support. I even tried building Rmpi with MPICH1 instead of 2, because Rmpi has that option, but it still tries to use spawn.
>>
>> I can launch plenty of processes, but I have to launch them all at once at the beginning. Is there any way to convince Rmpi to just use those processes rather than trying to spawn its own? I haven't found any documentation on this issue, although I would've thought it would be quite common.
>
> This script
>
> spawn.R
> ===
> # salloc -n 12 orterun -n 1 R -f spawn.R
> library(Rmpi)
> ## Recent Rmpi bug -- should be mpi.universe.size()
> nWorkers <- mpi.universe.size()
> mpi.spawn.Rslaves(nslaves=nWorkers)
> mpiRank <- function(i) c(i=i, rank=mpi.comm.rank())
> mpi.parSapply(seq_len(2*nWorkers), mpiRank)
> mpi.close.Rslaves()
> mpi.quit()
>
> can be run like the comment suggests
>
> salloc -n 12 orterun -n 1 R -f spawn.R
>
> which uses slurm (or whatever job manager) to allocate resources for 12 tasks and spawn within that allocation. Maybe that's 'good enough' -- spawning within the assigned allocation? Likely this requires minimal modification of the current code.
>
> More extensive is to revise the manager/worker-style code to something more like single instruction, multiple data
>
> simd.R
> ==
> ## salloc -n 4 orterun R --slave -f simd.R
> sink("/dev/null") # don't capture output -- more care needed here
> library(Rmpi)
> TAGS = list(FROM_WORKER=1L)
> .comm = 0L
> ## shared `work', here just determine rank and host
> work = c(rank=mpi.comm.rank(.comm), host=system("hostname", intern=TRUE))
> if (mpi.comm.rank(.comm) == 0) {
>     ## manager
>     mpi.barrier(.comm)
>     nWorkers = mpi.comm.size(.comm)
>     res = vector("list", nWorkers)
>     for (i in seq_len(nWorkers - 1L)) {
>         res[[i]] <- mpi.recv.Robj(mpi.any.source(), TAGS$FROM_WORKER, comm=.comm)
>     }
>     res[[nWorkers]] = work
>     sink() # start capturing output
>     print(do.call(rbind, res))
> } else {
>     ## worker
>     mpi.barrier(.comm)
>     mpi.send.Robj(work, 0L, TAGS$FROM_WORKER, comm=.comm)
> }
> mpi.quit()
>
> but this likely requires some serious code revision; if going this route then http://r-pbd.org/ might be helpful (and from a similar HPC environment).
>
> It's always worth asking whether the code is written to be efficient in R -- a typical 'mistake' is to write R-level explicit 'for' loops that "copy-and-append" results, along the lines of
>
> len <- 10
> result <- NULL
> for (i in seq_len(len))
>     ## some complicated calculation, then...
>     result <- c(result, sqrt(i))
>
> whereas it's much better to "pre-allocate and fill"
>
> result <- numeric(len)
> for (i in seq_len(len))
>     result[[i]] = sqrt(i)
>
> or
>
> lapply(seq_len(len), sqrt)
>
> and very much better still to 'vectorize'
>
> result <- sqrt(seq_len(len))
>
> (timings for me are about 1 minute for "copy-and-append", .2 s for "pre-allocate and fill", and .002s for "vectorize"). Pushing back on the guy providing the code (grep for "for" loops, and look for that copy-and-append pattern) might save
Re: [R] snow/Rmpi without MPI.spawn?
On 09/03/2014 03:25 PM, Jim Leek wrote:
> I'm a programmer at a high-performance computing center. I'm not very familiar with R, but I have used MPI from C, C++, and Python. I have to run an R code provided by a guy who knows R, but not MPI. So, this fellow used the R snow library to parallelize his R code (theoretically, I'm not actually sure what he did.) I need to get this code running on our machines.
>
> However, Rmpi and snow seem to require mpi spawn, which our computing center doesn't support. I even tried building Rmpi with MPICH1 instead of 2, because Rmpi has that option, but it still tries to use spawn.
>
> I can launch plenty of processes, but I have to launch them all at once at the beginning. Is there any way to convince Rmpi to just use those processes rather than trying to spawn its own? I haven't found any documentation on this issue, although I would've thought it would be quite common.

This script

spawn.R
===
    # salloc -n 12 orterun -n 1 R -f spawn.R
    library(Rmpi)
    ## Recent Rmpi bug -- should be mpi.universe.size()
    nWorkers <- mpi.universe.size()
    mpi.spawn.Rslaves(nslaves=nWorkers)
    mpiRank <- function(i) c(i=i, rank=mpi.comm.rank())
    mpi.parSapply(seq_len(2*nWorkers), mpiRank)
    mpi.close.Rslaves()
    mpi.quit()

can be run like the comment suggests

    salloc -n 12 orterun -n 1 R -f spawn.R

which uses slurm (or whatever job manager) to allocate resources for 12 tasks and spawn within that allocation. Maybe that's 'good enough' -- spawning within the assigned allocation? Likely this requires minimal modification of the current code.

More extensive is to revise the manager/worker-style code to something more like single instruction, multiple data

simd.R
==
    ## salloc -n 4 orterun R --slave -f simd.R
    sink("/dev/null") # don't capture output -- more care needed here
    library(Rmpi)
    TAGS = list(FROM_WORKER=1L)
    .comm = 0L
    ## shared `work', here just determine rank and host
    work = c(rank=mpi.comm.rank(.comm), host=system("hostname", intern=TRUE))
    if (mpi.comm.rank(.comm) == 0) {
        ## manager
        mpi.barrier(.comm)
        nWorkers = mpi.comm.size(.comm)
        res = vector("list", nWorkers)
        for (i in seq_len(nWorkers - 1L)) {
            res[[i]] <- mpi.recv.Robj(mpi.any.source(), TAGS$FROM_WORKER, comm=.comm)
        }
        res[[nWorkers]] = work
        sink() # start capturing output
        print(do.call(rbind, res))
    } else {
        ## worker
        mpi.barrier(.comm)
        mpi.send.Robj(work, 0L, TAGS$FROM_WORKER, comm=.comm)
    }
    mpi.quit()

but this likely requires some serious code revision; if going this route then http://r-pbd.org/ might be helpful (and from a similar HPC environment).

It's always worth asking whether the code is written to be efficient in R -- a typical 'mistake' is to write R-level explicit 'for' loops that "copy-and-append" results, along the lines of

    len <- 10
    result <- NULL
    for (i in seq_len(len))
        ## some complicated calculation, then...
        result <- c(result, sqrt(i))

whereas it's much better to "pre-allocate and fill"

    result <- numeric(len)
    for (i in seq_len(len))
        result[[i]] = sqrt(i)

or

    lapply(seq_len(len), sqrt)

and very much better still to 'vectorize'

    result <- sqrt(seq_len(len))

(timings for me are about 1 minute for "copy-and-append", .2 s for "pre-allocate and fill", and .002s for "vectorize"). Pushing back on the guy providing the code (grep for "for" loops, and look for that copy-and-append pattern) might save you from having to use parallel evaluation at all.

Martin

> Thanks,
> Jim
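The three looping styles compared in this reply can be put side by side in one script to confirm that they compute the same thing. This is a sketch only: `len` is an arbitrary size chosen here, and the variable names are hypothetical. No timings are claimed; wrapping each block in system.time() would show the differences on a given machine.

```r
len <- 10000L

# "copy-and-append": the result vector is reallocated on every iteration.
append_result <- NULL
for (i in seq_len(len))
    append_result <- c(append_result, sqrt(i))

# "pre-allocate and fill": allocate once, then assign by index.
fill_result <- numeric(len)
for (i in seq_len(len))
    fill_result[[i]] <- sqrt(i)

# "vectorize": one call operating on the whole vector.
vector_result <- sqrt(seq_len(len))

# All three approaches give the same values.
all.equal(append_result, fill_result)
all.equal(fill_result, vector_result)
```

If the vectorized form is available for the real calculation, it is usually also the shortest to write, which is one more reason to look for it before reaching for parallel evaluation.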
Re: [R] How should I do GO enrichment of differential expressed miRNA?
On 08/28/2014 11:47 PM, my1stbox wrote:
> Hi all,
> First, I carried out GO enrichment on predicted/validated target genes of those miRNAs using the GOstats package. Then I found myself at a dead end. So what is the good practice? Is it possible to do GO enrichment directly on miRNAs? Are they included in the GO database?

The Bioconductor mailing list

http://bioconductor.org/help/mailing-list/mailform/

is a more appropriate forum for discussion of Bioconductor packages (like topGO). It's better to be more specific about what your question / problem is; 'dead end' might mean that you had technical problems, or that you managed to get results but that they were unsatisfactory for some specific reason, or...

Martin

> Regards,
> Allen
Re: [R] dev-lang/R-3.1.0: biocLite("vsn") removes all files in /
On 05/19/2014 01:09 AM, Henric Winell wrote:
> On 2014-05-18 20:43, peter dalgaard wrote:
>> On 18 May 2014, at 07:38 , Jeff Newmiller wrote:
>>> Then you had best not do it again if you don't like that result.
>>>
>>> 1) This is not the right mailing list for issues having to do with bioconductor. Please go to the bioconductor mailing list for that.
>>
>> Hmm, this is one case where I'd really rather not scare people off R-help. Non-BioC users do use BioC packages from time to time and what Juergen did is what the BioConductor web pages tell new users to do (probably minus the as-root bit). A warn-off on R-help seems entirely warranted. Good to see that Martin Morgan is taking the issue very seriously in his post below.
>
> As a non-BioC user using BioC packages I've always wondered why the standard R functionality isn't enough. Can someone, please, tell me why 'biocLite()' should be used? I've always succeeded installing BioC packages using the standard R tools (as indicated by Uwe in his reply).

Conversely, I've always succeeded in installing CRAN and Bioc packages via

    source("http://bioconductor.org/biocLite.R")
    biocLite(...)

and am more-or-less flummoxed by the extra steps I'm asked to perform (to choose and set repositories) when I take that rare foray into install.packages()-land!

One point is that http://bioconductor.org actually points to an Amazon CloudFront address, which means that the content comes from a geographically proximate and reliable location (this makes choice of repository mostly irrelevant for normal users, just point to bioconductor.org).

Bioconductor has a repository and release schedule that differs from R (Bioconductor has a 'devel' branch to which new packages and updates are introduced, and a stable 'release' branch emitted once every 6 months to which bug fixes but not new features are introduced).
A consequence of the mismatch between R and Bioconductor release schedules is that the Bioconductor version identified by Uwe's method is sometimes not the most recent 'release' available. For instance, R 3.1.1 will likely be introduced some months before the next Bioc release. After the Bioc release, 3.1.1 users will be pointed to an out-of-date version of Bioconductor.

A consequence of the distinct 'devel' branch is that Uwe's method sometimes points only to the 'release' repository, whereas Bioconductor developers and users wanting leading-edge features wish to access the Bioconductor 'devel' repository. For instance, the next Bioc release will be available for R 3.1.x, so Bioconductor developers and leading-edge users need to be able to install the devel version of Bioconductor packages into the same version (though perhaps different instance or at least library location) of R that currently supports the release version.

An indirect consequence of the structured release is that Bioconductor packages generally have more extensive dependencies with one another, both explicitly via the usual package mechanisms and implicitly because the repository, release structure, and Bioconductor community interactions favor re-use of data representations and analysis concepts across packages. There is thus a higher premium on knowing that packages are from the same release, and that all packages are current within the release.

These days, the main purpose of source("http://bioconductor.org/biocLite.R") is to install and attach the 'BiocInstaller' package. In a new installation, the script installs the most recent version of the BiocInstaller package relevant to the version of R in use, regardless of the relative times of R and Bioconductor release cycles.
The BiocInstaller package serves as the primary way to identify the version of Bioconductor in use > library(BiocInstaller) Bioconductor version 2.14 (BiocInstaller 1.14.2), ?biocLite for help Since new features are often appealing to users, but at the same time require an updated version of Bioconductor, the source() command evaluated in an out-of-date R will nudge users to upgrade, e.g., in R-2.15.3 > source("http://bioconductor.org/biocLite.R";) A new version of Bioconductor is available after installing the most recent version of R; see http://bioconductor.org/install The biocLite() function is provided by BiocInstaller. This is a wrapper around install.packages, but with the repository chosen according to the version of Bioconductor in use, rather than to the version relevant at the time of the release of R. biocLite also nudges users to remain current within a release, by default checking for out-of-date packages and asking if the user would like to update > biocLite() BioC_mirror: http://bioconductor.org Using Bioconductor version 2.14 (BiocInstaller 1.14.2), R version 3.1.0. Old packages: '
Re: [R] dev-lang/R-3.1.0: biocLite("vsn") removes all files in /
This would be very bad and certainly unintended if it were the responsibility of biocLite. Can we communicate off-list about this? In particular can you report noquote(readLines("http://bioconductor.org/biocLite.R";)) ? Martin Morgan On 05/17/2014 10:16 PM, Juergen Rose wrote: I had the following files in /: root@caiman:/root(8)# ll / total 160301 drwxr-xr-x 2 root root 4096 May 16 12:23 bin/ drwxr-xr-x 6 root root 3072 May 14 13:58 boot/ -rw-r--r-- 1 root root 38673 May 14 14:22 boot_local-d.log lrwxrwxrwx 1 root root11 Jan 22 2011 data -> data_caiman/ drwxr-xr-x 7 root root 4096 Mar 9 22:29 data_caiman/ lrwxrwxrwx 1 root root23 Dec 29 13:43 data_impala -> /net/impala/data_impala/ lrwxrwxrwx 1 root root21 Jan 27 08:13 data_lynx2 -> /net/lynx2/data_lynx2/ drwxr-xr-x 21 root root 4040 May 14 14:40 dev/ drwxr-xr-x 160 root root 12288 May 17 17:14 etc/ -rw--- 1 root root 15687 Dec 26 13:42 grub.cfg_old lrwxrwxrwx 1 root root11 Jan 23 2011 home -> home_caiman/ drwxr-xr-x 5 root root 4096 Dec 26 11:31 home_caiman/ lrwxrwxrwx 1 root root23 Dec 29 13:43 home_impala -> /net/impala/home_impala/ lrwxrwxrwx 1 root root21 Jan 27 08:13 home_lynx2 -> /net/lynx2/home_lynx2/ lrwxrwxrwx 1 root root 5 Mar 30 04:25 lib -> lib64/ drwxr-xr-x 3 root root 4096 May 14 04:31 lib32/ drwxr-xr-x 17 root root 12288 May 16 12:23 lib64/ -rw-r--r-- 1 root root 1797418 May 14 14:22 login.log drwx-- 2 root root 16384 Jan 20 2011 lost+found/ drwxr-xr-x 2 root root 0 May 14 14:21 misc/ drwxr-xr-x 10 root root 4096 Nov 4 2013 mnt/ drwxr-xr-x 4 root root 0 May 17 17:38 net/ drwxr-xr-x 13 root root 4096 Feb 13 13:25 opt/ dr-xr-xr-x 270 root root 0 May 14 14:21 proc/ drwx-- 36 root root 4096 May 17 15:00 root/ drwxr-xr-x 30 root root 840 May 16 18:21 run/ drwxr-xr-x 2 root root 12288 May 16 12:23 sbin/ -rw-r--r-- 1 root root 162191459 Jan 13 2011 stage3-amd64-20110113.tar.bz2 dr-xr-xr-x 12 root root 0 May 14 14:21 sys/ drwxrwxrwt 16 root root 1648 May 17 17:14 tmp/ drwxr-xr-x 19 root root 4096 May 6 
04:40 usr/ drwxr-xr-x 16 root root 4096 Dec 26 11:17 var/ Then I did as root: R source("http://bioconductor.org/biocLite.R";) biocLite("vsn") Save workspace image? [y/n/c]: n root@caiman:/root(15)# ll / total 93 drwxr-xr-x 2 root root 4096 May 16 12:23 bin/ drwxr-xr-x 6 root root 3072 May 14 13:58 boot/ drwxr-xr-x 7 root root 4096 Mar 9 22:29 data_caiman/ drwxr-xr-x 21 root root 4040 May 14 14:40 dev/ drwxr-xr-x 160 root root 12288 May 17 17:14 etc/ drwxr-xr-x 5 root root 4096 Dec 26 11:31 home_caiman/ drwxr-xr-x 3 root root 4096 May 14 04:31 lib32/ drwxr-xr-x 17 root root 12288 May 16 12:23 lib64/ drwx-- 2 root root 16384 Jan 20 2011 lost+found/ drwxr-xr-x 2 root root 0 May 14 14:21 misc/ drwxr-xr-x 10 root root 4096 Nov 4 2013 mnt/ drwxr-xr-x 2 root root 0 May 17 17:38 net/ drwxr-xr-x 13 root root 4096 Feb 13 13:25 opt/ dr-xr-xr-x 272 root root 0 May 14 14:21 proc/ drwx-- 36 root root 4096 May 17 15:00 root/ drwxr-xr-x 30 root root 840 May 16 18:21 run/ drwxr-xr-x 2 root root 12288 May 16 12:23 sbin/ dr-xr-xr-x 12 root root 0 May 17 17:38 sys/ drwxrwxrwt 19 root root 1752 May 17 18:33 tmp/ drwxr-xr-x 19 root root 4096 May 6 04:40 usr/ drwxr-xr-x 16 root root 4096 Dec 26 11:17 var/ I.e., all not directory files in / disappeared. This happens on two systems. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] flowDensity package
On 04/09/2014 07:23 AM, Raghu wrote: I am unable to install flowDensity package from bioconductor in R version 3.0 or 3.1. did anyone have the same problems with this. Please ask questions about Bioconductor packages on the Bioconductor mailing list http://bioconductor.org/help/mailing-list/ but as far as I can tell flowDensity is not a Bioconductor package! http://bioconductor.org/packages/release/BiocViews.html#___Software Don't forget to provide the output of the R command sessionInfo() to let us know about your operating system and R version. Martin Thanks, Raghu __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] memory use of copies
Hi Ross -- On 01/23/2014 05:53 PM, Ross Boylan wrote: [Apologies if a duplicate; we are having mail problems.] I am trying to understand the circumstances under which R makes a copy of an object, as opposed to simply referring to it. I'm talking about what goes on under the hood, not the user semantics. I'm doing things that take a lot of memory, and am trying to minimize my use. I thought that R was clever so that copies were created lazily. For example, if a is matrix, then b <- a b & a referred to to the same object underneath, so that a complete duplicate (deep copy) wasn't made until it was necessary, e.g., b[3, 1] <- 4 would duplicate the contents of a to b, and then overwrite them. Compiling your R with --enable-memory-profiling gives access to the tracemem() function, showing that your understanding above is correct > b = matrix(0, 3, 2) > tracemem(b) [1] "<0x7054020>" > a = b## no copy > b[3, 1] = 2 ## copy tracemem[0x7054020 -> 0x7053fc8]: > b = matrix(0, 3, 2) > tracemem(b) > tracemem(b) [1] "<0x680e258>" > b[3, 1] = 2 ## no copy > The same is apparent using .Internal(inspect()), where the first information @7053ec0 is the address of the data. The other relevant part is the 'NAM()' field, which indicates whether there are 0, 1 or (have been) at least 2 symbols referring to the data. NAM() increments from 1 (no duplication on modify required) on original creation to 2 when a = b (duplicate on modify) > b = matrix(0, 3, 2) > .Internal(inspect(b)) @7053ec0 14 REALSXP g0c4 [NAM(1),ATT] (len=6, tl=0) 0,0,0,0,0,... ATTRIB: @7057528 02 LISTSXP g0c0 [] TAG: @21c5fb8 01 SYMSXP g0c0 [LCK,gp=0x4000] "dim" (has value) @7056858 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 3,2 > b[3, 1] = 2 > .Internal(inspect(b)) @7053ec0 14 REALSXP g0c4 [NAM(1),ATT] (len=6, tl=0) 0,0,2,0,0,... 
ATTRIB: @7057528 02 LISTSXP g0c0 [] TAG: @21c5fb8 01 SYMSXP g0c0 [LCK,gp=0x4000] "dim" (has value) @7056858 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 3,2 > a = b > .Internal(inspect(b)) ## data address unchanced @7053ec0 14 REALSXP g0c4 [NAM(2),ATT] (len=6, tl=0) 0,0,0,0,0,... ATTRIB: @7057528 02 LISTSXP g0c0 [] TAG: @21c5fb8 01 SYMSXP g0c0 [LCK,gp=0x4000] "dim" (has value) @7056858 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 3,2 > b[3, 1] = 2 > .Internal(inspect(b)) ## data address changed @7232910 14 REALSXP g0c4 [NAM(1),ATT] (len=6, tl=0) 0,0,2,0,0,... ATTRIB: @7239d28 02 LISTSXP g0c0 [] TAG: @21c5fb8 01 SYMSXP g0c0 [LCK,gp=0x4000] "dim" (has value) @7237b48 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 3,2 The following log, from R 3.0.1, does not seem to act that way; I get the same amount of memory used whether I copy the same object repeatedly or create new objects of the same size. Can anyone explain what is going on? Am I just wrong that copies are initially shallow? Or perhaps that behavior only applies for function arguments? Or doesn't apply for class slots or reference class variables? 
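The copy-on-modify behaviour asked about here can be checked with a short script. This is only a minimal sketch, not part of the original exchange: it assumes an R build with memory profiling enabled (the guard on capabilities("profmem") skips the tracing otherwise), and the exact tracemem output, including addresses, will vary by R version and front end.

```r
# Sketch: assignment shares data, the first modification duplicates it.
b <- matrix(0, 3, 2)
a <- b                      # no data copied here; 'a' and 'b' share one object

tracing <- capabilities("profmem")   # TRUE if R was built with memory profiling
if (tracing) invisible(tracemem(b))  # report duplications of the object behind 'b'

b[3, 1] <- 2                # first write after sharing: a duplication is reported
b[1, 1] <- 5                # 'b' now owns its data; usually no further copy

if (tracing) untracemem(b)

# User-level semantics are the same whether or not a copy was deferred
print(a[3, 1])              # 'a' is unchanged
print(b[3, 1])
```

Whatever happens under the hood, 'a' keeps its original values; tracemem only shows when the deferred duplication actually occurs.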
> foo <- setRefClass("foo", fields=list(x="ANY")) > bar <- setClass("bar", slots=c("x")) using the approach above, we can see that creating an S4 or reference object in the way you've indicated (validity checks or other initialization might change this) does not copy the data although it is marked for duplication > x = 1:2; .Internal(inspect(x)) @7553868 13 INTSXP g0c1 [NAM(1)] (len=2, tl=0) 1,2 > .Internal(inspect(foo(x=x)$x)) @7553868 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 1,2 > .Internal(inspect(bar(x=x)@x)) @7553868 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 1,2 On the other hand, lapply is creating copies > x = 1:2; .Internal(inspect(x)) @757b5a8 13 INTSXP g0c1 [NAM(1)] (len=2, tl=0) 1,2 > .Internal(inspect(lapply(1:2, function(i) x))) @7551f88 19 VECSXP g0c2 [] (len=2, tl=0) @757b428 13 INTSXP g0c1 [] (len=2, tl=0) 1,2 @757b3f8 13 INTSXP g0c1 [] (len=2, tl=0) 1,2 One can construct a list without copies > x = 1:2; .Internal(inspect(x)) @7677c18 13 INTSXP g0c1 [NAM(1)] (len=2, tl=0) 1,2 > .Internal(inspect(list(x)[rep(1, 2)])) @767b080 19 VECSXP g0c2 [NAM(2)] (len=2, tl=0) @7677c18 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 1,2 @7677c18 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 1,2 but that (creating a list of identical elements) doesn't seem to be a likely real-world scenario and the gain is transient > x = 1:2; y = list(x)[rep(1, 4)] > .Internal(inspect(y)) @507bef8 19 VECSXP g0c3 [NAM(2)] (len=4, tl=0) @514ff98 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 1,2 @514ff98 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 1,2 @514ff98 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 1,2 @514ff98 13 INTSXP g0c1 [NAM(2)] (len=2, tl=0) 1,2 > y[[1]][1] = 2L ## everybody copied > .Internal(inspect(y)) @507bf40 19 VECSXP g0c3 [NAM(1)] (len=4, tl=0) @51502c8 13 INTSXP g0c1 [] (len=2, tl=0) 2,2 @51502f8 13 INTSXP g0c1 [] (len=2, tl=0) 1,2 @5150328 13 INTSXP g0c1 [] (len=2, tl=0) 1,2 @5150358 13 INTSXP g0c1 [] (len=2, tl=0) 1,2 Probably it is more helpful to think of r
Re: [R] Error in dispersionPlot using cummeRbund
Hi Nancy -- cummeRbund is a Bioconductor package so please ask questions about it on the Bioconductor mailing list. http://bioconductor.org/help/mailing-list/mailform/ Be sure to include the maintainer packageDescription("cummeRbund")$Maintainer in the email. You have the 'latest' version of cummeRbund for R-2.15.3; a more recent version is available when using R-3.0.2. Martin On 01/05/2014 08:12 AM, Yanxiang Shi wrote: Hi all, I'm new to RNA-seq analysis. And I'm now trying to use R to visualize the Galaxy data. I'm using the cummeRbund to deal with the data from cuffdiff in Galaxy. Here is the codes I've run: cuff= readCufflinks (dbFile = "output_database", geneFPKM = "gene_FPKM_tracking", geneDiff = "gene_differential_expression_testing", isoformFPKM = "transcript_FPKM_tracking",isoformDiff = "transcript_differential_expression_testing", TSSFPKM = "TSS_groups_FPKM_tracking", TSSDiff = "TSS_groups_differential_expression_testing", CDSFPKM = "CDS_FPKM_tracking", CDSExpDiff = "CDS_FPKM_differential_expression_testing", CDSDiff = "CDS_overloading_diffential_expression_testing", promoterFile = "promoters_differential_expression_testing", splicingFile = "splicing_differential_expression_testing", rebuild = T) cuff CuffSet instance with: 2 samples 26 genes 44 isoforms 36 TSS 0 CDS 26 promoters 36 splicing 0 relCDS disp<-dispersionPlot(genes(cuff)) disp *Error in `$<-.data.frame`(`*tmp*`, "SCALE_X", value = 1L) : replacement has 1 rows, data has 0 In addition: Warning message:In max(panels$ROW) : no non-missing arguments to max; returning -Inf* Does any one know why there's error? My cummeRbund is the latest version, R is 2.15.3, and cuffdiff v1.3.0. I've tried to search the internet for solutions but apparently it's not a problem that people discussed much. Thank you very much in advance!!! 
Nancy
Re: [R] S4; Setter function is not chaning slot value as expected
On 11/09/2013 11:31 PM, Hadley Wickham wrote: Modelling a mutable entity, i.e. an account, is really a perfect example of when to use reference classes. You might find the examples on http://adv-r.had.co.nz/OO-essentials.html give you a better feel for the strengths and weaknesses of R's different OO systems. Reference classes provide less memory copying and a more familiar programming paradigm but not necessarily fantastic performance, as illustrated here http://stackoverflow.com/questions/18677696/stack-class-in-r-something-more-concise/18678440#18678440 and I think elsewhere on this or the R-devel list (sorry not to be able to provide a more precise recollection). Martin Hadley On Sat, Nov 9, 2013 at 9:31 AM, daniel schnaider wrote: It is my first time programming with S4 and I can't get the setter fuction to actually change the value of the slot created by the constructor. I guess it has to do with local copy, global copy, etc. of the variable - but, I could't find anything relevant in documentation. Tried to copy examples from the internet, but they had the same problem. # The code setClass ("Account" , representation ( customer_id = "character", transactions = "matrix") ) Account <- function(id, t) { new("Account", customer_id = id, transactions = t) } setGeneric ("CustomerID<-", function(obj, id){standardGeneric("CustomerID<-")}) setReplaceMethod("CustomerID", "Account", function(obj, id){ obj@customer_id <- id obj }) ac <- Account("12345", matrix(c(1,2,3,4,5,6), ncol=2)) ac CustomerID <- "54321" ac #Output > ac An object of class "Account" Slot "customer_id": [1] "12345" Slot "transactions": [,1] [,2] [1,]14 [2,]25 [3,]36 # CustomerID is value has changed to 54321, but as you can see it does't > CustomerID <- "54321" > ac An object of class "Account" Slot "customer_id": [1] "12345" Slot "transactions": [,1] [,2] [1,]14 [2,]25 [3,]36 Help! 
Re: [R] S4; Setter function is not chaning slot value as expected
On 11/10/2013 03:54 AM, daniel schnaider wrote: Thanks Martin. It worked well. Two new questions related to the same subject. 1) Why create this semantic of a final argument name specifically names value? I do not know. It is a requirement of replacement methods in R in general, not just S4 methods. See section 3.4.4 of RShowDoc("R-lang"). 2) Regarding performance. When CustomerID(ac) <- "54321" runs, does it only change the slot from whatever it was to 54321, or it really create another object and change all the value of all slots, keeping technically all the other values equal and changing 54321? Copying is tricky in R. It behaves as though a copy has been made of the entire object. Whether a copy is actually made, or just marked as necessary on subsequent modification, requires deep consideration of the code. This is the way R works, not just the way S4 classes work. If instead of a single account you modelled 'Accounts', i.e., all accounts, then updating 1000 account id's would only make one copy, whereas if you model each account separately this would require 1000 copies. Martin thanks.. On Sat, Nov 9, 2013 at 4:20 PM, Martin Morgan mailto:mtmor...@fhcrc.org>> wrote: On 11/09/2013 06:31 AM, daniel schnaider wrote: It is my first time programming with S4 and I can't get the setter fuction to actually change the value of the slot created by the constructor. I guess it has to do with local copy, global copy, etc. of the variable - but, I could't find anything relevant in documentation. Tried to copy examples from the internet, but they had the same problem. 
# The code setClass ("Account" , representation ( customer_id = "character", transactions = "matrix") ) Account <- function(id, t) { new("Account", customer_id = id, transactions = t) } setGeneric ("CustomerID<-", function(obj, id){standardGeneric("__CustomerID<-")}) Replacement methods (in R in general) require that the final argument (the replacement value) be named 'value', so setGeneric("CustomerID<-", function(x, ..., value) standardGeneric("CustomerID")) setReplaceMethod("CustomerID", c("Account", "character"), function(x, , value) { x@customer_id <- value x }) use this as CustomerID(ac) <- "54321" setReplaceMethod("CustomerID", "Account", function(obj, id){ obj@customer_id <- id obj }) ac <- Account("12345", matrix(c(1,2,3,4,5,6), ncol=2)) ac CustomerID <- "54321" ac #Output > ac An object of class "Account" Slot "customer_id": [1] "12345" Slot "transactions": [,1] [,2] [1,]14 [2,]25 [3,]36 # CustomerID is value has changed to 54321, but as you can see it does't > CustomerID <- "54321" > ac An object of class "Account" Slot "customer_id": [1] "12345" Slot "transactions": [,1] [,2] [1,]14 [2,]25 [3,]36 Help! [[alternative HTML version deleted]] R-help@r-project.org <mailto:R-help@r-project.org> mailing list https://stat.ethz.ch/mailman/__listinfo/r-help <https://stat.ethz.ch/mailman/listinfo/r-help> PLEASE do read the posting guide http://www.R-project.org/__posting-guide.html <http://www.R-project.org/posting-guide.html> and provide commented, minimal, self-contained, reproducible code. -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 -- Daniel Schnaider SP Phone: +55-11-9.7575.0822 d...@scaigroup.com <mailto:d...@scaigroup.com> skype dschnaider Linked In: http://www.linkedin.com/in/danielschnaider w <http://www.arkiagroup.c
Re: [R] S4 vs S3. New Package
On 11/09/2013 11:59 AM, Rolf Turner wrote: For my take on the issue see fortune("strait jacket"). cheers, Rolf Turner P. S. I said that quite some time ago and I have seen nothing in the intervening years to change my views. Mileage varies; the Bioconductor project attains a level of interoperability and re-use (http://www.nature.com/nbt/journal/v31/n10/full/nbt.2721.html) that would be difficult with a less formal class system. R. T. On 11/10/13 04:22, daniel schnaider wrote: Hi, I am working on a new credit portfolio optimization package. My question is if it is more recommended to develop in S4 object oriented or S3. It would be more naturally to develop in object oriented paradigm, but there is many concerns regarding S4. 1) Performance of S4 could be an issue as a setter function, actually changes the whole object behind the scenes. Depending on implementation, updating S3 objects could as easily trigger copies; this is a fact of life in R. Mitigate by modelling objects in a vector (column)-oriented approach rather than the row-oriented paradigm of Java / C++ / etc. Martin Morgan 2) Documentation. It has been really hard to find examples in S4. Most books and articles consider straightforward S3 examples. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
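The vector (column)-oriented advice above can be illustrated with a small S4 class. This is a hedged sketch only: the class name 'Accounts', its 'balance' slot, and the deposit() generic are hypothetical names invented for illustration, not anything from the package under discussion. One object holds all accounts, so a bulk update is a single vectorized assignment on one object rather than one update per account object.

```r
library(methods)

# Hypothetical column-oriented class: one object holds every account,
# with each slot a vector of the same length.
setClass("Accounts",
    representation(customer_id = "character", balance = "numeric"),
    validity = function(object) {
        if (length(object@customer_id) != length(object@balance))
            "customer_id and balance must have the same length"
        else TRUE
    })

# Hypothetical generic: add 'amount' to the balances selected by index 'i'
setGeneric("deposit", function(x, i, amount) standardGeneric("deposit"))

setMethod("deposit", "Accounts", function(x, i, amount) {
    x@balance[i] <- x@balance[i] + amount   # one vectorized update
    x
})

accts <- new("Accounts",
             customer_id = sprintf("id%04d", 1:1000),
             balance = numeric(1000))

# Updating 1000 accounts modifies one object once,
# instead of modifying 1000 separate 'Account' objects.
accts <- deposit(accts, 1:1000, 10)
print(head(accts@balance))
```

The same design applies equally to S3: the saving comes from the data layout, not from the class system.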
Re: [R] S4; Setter function is not chaning slot value as expected
On 11/09/2013 06:31 AM, daniel schnaider wrote:
> It is my first time programming with S4 and I can't get the setter function
> to actually change the value of the slot created by the constructor.
>
> I guess it has to do with local copy, global copy, etc. of the variable, but
> I couldn't find anything relevant in the documentation.
>
> Tried to copy examples from the internet, but they had the same problem.
>
> # The code
> setClass("Account",
>     representation(
>         customer_id = "character",
>         transactions = "matrix")
> )
>
> Account <- function(id, t) {
>     new("Account", customer_id = id, transactions = t)
> }
>
> setGeneric("CustomerID<-",
>     function(obj, id) {standardGeneric("CustomerID<-")})

Replacement methods (in R in general) require that the final argument (the replacement value) be named 'value', so

    setGeneric("CustomerID<-",
        function(x, ..., value) standardGeneric("CustomerID<-"))

    setReplaceMethod("CustomerID", c("Account", "character"),
        function(x, ..., value)
    {
        x@customer_id <- value
        x
    })

use this as

    CustomerID(ac) <- "54321"

> setReplaceMethod("CustomerID", "Account", function(obj, id){
>     obj@customer_id <- id
>     obj
> })
>
> ac <- Account("12345", matrix(c(1,2,3,4,5,6), ncol=2))
> ac
> CustomerID <- "54321"
> ac
>
> #Output
> > ac
> An object of class "Account"
> Slot "customer_id":
> [1] "12345"
>
> Slot "transactions":
>      [,1] [,2]
> [1,]    1    4
> [2,]    2    5
> [3,]    3    6
>
> # CustomerID's value should have changed to 54321, but as you can see it doesn't
> > CustomerID <- "54321"
> > ac
> An object of class "Account"
> Slot "customer_id":
> [1] "12345"
>
> Slot "transactions":
>      [,1] [,2]
> [1,]    1    4
> [2,]    2    5
> [3,]    3    6
>
> Help!
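For reference, the pieces of this thread can be assembled into one script. This is a sketch under a few assumptions rather than the poster's final code: the getter CustomerID() is added here purely for illustration and was not part of the original post. Note that the original attempt, CustomerID <- "54321", only created an ordinary variable called CustomerID and never touched 'ac'.

```r
library(methods)

setClass("Account",
    representation(customer_id = "character", transactions = "matrix"))

Account <- function(id, t) new("Account", customer_id = id, transactions = t)

# Getter: an assumption added for illustration, not in the original post
setGeneric("CustomerID", function(x, ...) standardGeneric("CustomerID"))
setMethod("CustomerID", "Account", function(x, ...) x@customer_id)

# Replacement generic: the last argument must be named 'value'
setGeneric("CustomerID<-",
    function(x, ..., value) standardGeneric("CustomerID<-"))

setReplaceMethod("CustomerID", c("Account", "character"),
    function(x, ..., value) {
        x@customer_id <- value
        x                      # a replacement method returns the updated object
    })

ac <- Account("12345", matrix(c(1, 2, 3, 4, 5, 6), ncol = 2))
backup <- ac                   # behaves as an independent copy

CustomerID(ac) <- "54321"      # dispatches to the replacement method

print(CustomerID(ac))          # updated
print(CustomerID(backup))      # still the original id
```

The call CustomerID(ac) <- "54321" is evaluated as ac <- `CustomerID<-`(ac, value = "54321"), which is why the method must return the modified object and why 'backup' is unaffected.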
Re: [R] Inserting 17M entries into env took 18h, inserting 34M entries taking 5+ days
On 11/01/2013 08:22 AM, Magnus Thor Torfason wrote: Sure, I was attempting to be concise and boiling it down to what I saw as the root issue, but you are right, I could have taken it a step further. So here goes. I have a set of around around 20M string pairs. A given string (say, A) can either be equivalent to another string (B) or not. If A and B occur together in the same pair, they are equivalent. But equivalence is transitive, so if A and B occur together in one pair, and A and C occur together in another pair, then A and C are also equivalent. I need a way to quickly determine if any two strings from my data set are equivalent or not. Do you mean that if A,B occur together and B,C occur together, then A,B and A,C are equivalent? Here's a function that returns a unique identifier (not well tested!), allowing for transitive relations but not circularity. uid <- function(x, y) { i <- seq_along(x) # global index xy <- paste0(x, y) # make unique identifiers idx <- match(xy, xy) repeat { ## transitive look-up y_idx <- match(y[idx], x) # look up 'y' in 'x' keep <- !is.na(y_idx) if (!any(keep)) # no transitive relations, done! break x[idx[keep]] <- x[y_idx[keep]] y[idx[keep]] <- y[y_idx[keep]] ## create new index of values xy <- paste0(x, y) idx <- match(xy, xy) } idx } Values with the same index are identical. Some tests > x <- c(1, 2, 3, 4) > y <- c(2, 3, 5, 6) > uid(x, y) [1] 1 1 1 4 > i <- sample(x); uid(x[i], y[i]) [1] 1 1 3 1 > uid(as.character(x), as.character(y)) ## character() ok [1] 1 1 1 4 > uid(1:10, 1 + 1:10) [1] 1 1 1 1 1 1 1 1 1 1 > uid(integer(), integer()) integer(0) > x <- c(1, 2, 3) > y <- c(2, 3, 1) > uid(x, y) ## circular! C-c C-c I think this will scale well enough, but the worst-case scenario can be made to be log(longest chain) and copying can be reduced by using an index i and subsetting the original vector on each iteration. 
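A different way to get equivalence classes that also tolerates circular input is to propagate the smallest label along each pair until nothing changes. The following is only a sketch under stated assumptions, not the uid() function above: the name equiv_rep() is hypothetical, each pair is treated as an undirected link, and the representative of a class is its smallest key in sort() order.

```r
# Sketch: map every string to the representative of its equivalence class.
# 'a' and 'b' are parallel vectors; a[i] and b[i] are declared equivalent.
equiv_rep <- function(a, b) {
    stopifnot(length(a) == length(b))
    keys <- sort(unique(c(a, b)))
    if (length(keys) == 0L)
        return(setNames(keys, keys))
    ia <- match(a, keys)            # each pair as two integer node indexes
    ib <- match(b, keys)
    id <- seq_along(keys)           # every node starts as its own label
    repeat {
        # smallest label seen on each pair, then per node over all its pairs
        edge_min <- pmin(id[ia], id[ib])
        node_min <- tapply(c(edge_min, edge_min), c(ia, ib), min)
        node <- as.integer(names(node_min))
        new_id <- id
        new_id[node] <- pmin(id[node], as.vector(node_min))
        new_id <- new_id[new_id]    # pointer jumping: follow label of label
        if (identical(new_id, id))  # fixed point: both ends of every pair agree
            break
        id <- new_id
    }
    setNames(keys[id], keys)
}

main <- equiv_rep(c("A", "B", "D"), c("B", "C", "E"))
print(main)
print(main[["A"]] == main[["C"]])   # same class
print(main[["A"]] == main[["D"]])   # different classes

# circular input terminates
print(equiv_rep(c(1, 2, 3), c(2, 3, 1)))
```

Each pass is vectorized over all pairs, so the cost is driven by the number of passes rather than by per-element hash insertions; the returned named vector can then be used as a lookup table for equivalence tests.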
I think you could test for circularity by checking that the updated x are not a permutation of the kept x, all(x[y_idx[keep]] %in% x[keep])) Martin The way I do this currently is to designate the smallest (alphabetically) string in each known equivalence set as the "main" entry. For each pair, I therefore insert two entries into the hash table, both pointing at the mail value. So assuming the input data: A,B B,C D,E I would then have: A->A B->A C->B D->D E->D Except that I also follow each chain until I reach the end (key==value), and insert pointers to the "main" value for every value I find along the way. After doing that, I end up with: A->A B->A C->A D->D E->D And I can very quickly check equivalence, either by comparing the hash of two strings, or simply by transforming each string into its hash, and then I can use simple comparison from then on. The code for generating the final hash table is as follows: h : Empty hash table created with hash.new() d : Input data hash.deep.get : Function that iterates through the hash table until it finds a key whose value is equal to itself (until hash.get(X)==X), then returns all the values in a vector h = hash.new() for ( i in 1:nrow(d) ) { deep.a = hash.deep.get(h, d$a[i]) deep.b = hash.deep.get(h, d$b[i]) equivalents = sort(unique(c(deep.a,deep.b))) equiv.id= min(equivalents) for ( equivalent in equivalents ) { hash.put(h, equivalent, equiv.id) } } I would so much appreciate if there was a simpler and faster way to do this. Keeping my fingers crossed that one of the R-help geniuses who sees this is sufficiently interested to crack the problem Best, Magnus On 11/1/2013 1:49 PM, jim holtman wrote: It would be nice if you followed the posting guidelines and at least showed the script that was creating your entries now so that we understand the problem you are trying to solve. A bit more explanation of why you want this would be useful. 
This gets to the second part of my tag line: Tell me what you want to do, not how you want to do it. There may be other solutions to your problem. Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. On Fri, Nov 1, 2013 at 9:32 AM, Magnus Thor Torfason wrote: Pretty much what the subject says: I used an env as the basis for a Hashtable in R, based on information that this is in fact the way environments are implemented under the hood. I've been experimenting with doubling the number of entries, and so far it has seemed to be scaling more or less linearly, as expected. But as I went from 17 million entries to 34 million entries, the completion time has gone from 18 hou
Re: [R] S4 base class
On 10/17/2013 08:54 AM, Michael Meyer wrote:
> Suppose you have a base class "Base" which implements a function "Base::F"
> which works in most contexts but not in the context of "ComplicatedDerived" class
> where some preparation has to happen before this very same function can be called.
>
> You would then define
>
> void ComplicatedDerived::F(...){
>
>     preparation();
>     Base::F();
> }
>
> You can nearly duplicate this in R via
>
> setMethod("F",
>     signature(this="ComplicatedDerived"),
>     definition=function(this){
>
>         preparation(this)
>         F(as(this,"Base"))
> })
>
> but it will fail whenever F uses virtual functions (i.e. generics) which are only defined
> for derived classes of Base

With

    .A <- setClass("A", representation(a="numeric"))
    .B <- setClass("B", representation(b="numeric"), contains="A")

    setGeneric("f", function(x, ...) standardGeneric("f"))

    setMethod("f", "A", function(x, ...) {
        message("f,A-method")
        g(x, ...)   # generic with methods only for derived classes
    })

    setMethod("f", "B", function(x, ...) {
        message("f,B-method")
        callNextMethod(x, ...)   # earlier response from Duncan Murdoch
    })

    setGeneric("g", function(x, ...) standardGeneric("g"))

    setMethod("g", "B", function(x, ...) {
        message("g,B-method")
        x
    })

one has

    > f(.B())
    f,B-method
    f,A-method
    g,B-method
    An object of class "B"
    Slot "b":
    numeric(0)

    Slot "a":
    numeric(0)

?
Re: [R] how to merge GRange object?
On 10/16/2013 06:32 AM, John linux-user wrote:
> Hello everyone, I am wondering how to simply merge two GRanges objects by range field and add the value by additional vector. For example, I have two objects below

Hi -- GRanges is from a Bioconductor package, so please ask on the Bioconductor mailing list

    http://bioconductor.org/help/mailing-list/

I think you might do

    hits = findOverlaps(obj1, obj2)

to get indexes of overlapping ranges, then

    pmin(start(obj1)[queryHits(hits)], start(obj2)[subjectHits(hits)])

and pmax() on the corresponding end() values to get start and end coordinates, and construct a new GRanges from those. If you provide an easily reproducible example (e.g., constructing some sample GRanges objects 'by hand' using GRanges()) and post to the Bioconductor mailing list you'll likely get a complete answer.

Martin

> obj1
>        seqnames           ranges strand | Val
> [1] chr1_random [272531, 272571]      + |  88
> [2] chr1_random [272871, 272911]      + |  45
>
> obj2
>        seqnames           ranges strand | Val
> [1] chr1_random [272531, 272581]      + | 800
> [2] chr1_random [272850, 272911]      + | 450
>
> after merged, it should be an object as the following mergedObject and it would concern the differences in IRANGE data (e.g. 581 and 850 in obj2 above were different from those of obj1, which were 571 and 871 respectively)
>
> mergedObject
>        seqnames           ranges strand | object2Val object1Val
> [1] chr1_random [272531, 272581]      + |        800         88
> [2] chr1_random [272850, 272911]      + |        450         45
>
> On Wednesday, October 16, 2013 8:31 AM, Terry Therneau wrote:
> > On 10/16/2013 05:00 AM, r-help-requ...@r-project.org wrote:
> > > Hello, I'm trying to use coxph() function to fit a very simple Cox proportional hazards regression model (only one covariate) but the parameter space is restricted to an open set (0, 1). Can I still obtain a valid estimate by using coxph function in this scenario? If yes, how? Any suggestion would be greatly appreciated. Thanks!!!
> >
> > Easily:
> > 1. Fit the unrestricted model. If the solution is in 0-1 you are done.
> > 2. If it is outside, fix the coefficient. Say that the solution is 1.73, then the optimal solution under constraint is 1. Redo the fit adding the parameters "init=1, iter=0". This forces the program to give the loglik etc. for the fixed coefficient of 1.0.
> >
> > Terry Therneau

--
Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
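The two-step recipe quoted from Terry Therneau can be sketched roughly as follows. This assumes the survival package and its bundled 'lung' data; the covariate (ph.ecog) and the bounds are chosen here only for illustration, and iter.max is the full name of the control argument abbreviated as "iter" in the quoted message.

```r
library(survival)

# Step 1: unrestricted fit on the 'lung' data shipped with survival.
fit <- coxph(Surv(time, status) ~ ph.ecog, data = lung)
beta <- unname(coef(fit))

# Hypothetical bounds for the single coefficient (illustration only).
lower <- 0
upper <- 0.25

# Step 2: if the estimate is outside the bounds, clamp it to the nearest
# boundary and refit with zero iterations, so that coxph() evaluates the
# log-likelihood at the fixed value instead of maximising.
fixed <- min(max(beta, lower), upper)
fit_fixed <- if (fixed != beta) {
    coxph(Surv(time, status) ~ ph.ecog, data = lung,
          init = fixed, iter.max = 0)
} else {
    fit
}

# Log-likelihood at the (possibly clamped) coefficient.
ll_fixed <- fit_fixed$loglik[2]
```

Note that the original question concerned an open interval, so a boundary value is strictly speaking a limit rather than an admissible estimate; the sketch only mirrors the quoted advice.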
Re: [R] Bioconductor / AnnotationDbi: Why does a GOAllFrame contain more rows than its argument GoFrame?
On 10/02/2013 09:28 AM, Asis Hallab wrote:
> Dear Bioconductor Experts,

This will be responded to on the Bioconductor mailing list; please address any follow-ups there.

    http://bioconductor.org/help/mailing-list/

Martin

> thank you for providing such a useful tool-set. I have a question regarding the package AnnotationDbi, specifically the classes GOFrame and GOALLFrame. During a GO Enrichment Analysis I create a data frame with Arabidopsis thaliana GO annotations and from that first a GOFrame and then from this GOFrame a GOALLFrame. Checking the result with
>
>     nrow( getGOFrameData( athal.go.all.frame ) ) # The GOAllFrame
>
> and comparing it with
>
>     nrow( athal.go.frame ) # The GoFrame
>
> I realize that the GOALLFrame has more than 5 times more rows than my original GO annotation table. If I provide organism='Arabidopsis thaliana' to the constructor of GOFrame this ratio increases even further. Unfortunately I could not find any documentation on this, so I feel forced to bother you with my questions:
>
> 1) Why does GOALLFrame contain so many more annotations?
> 2) Why and from where does it retrieve the organism specific ones that are added when a model organism like 'Arabidopsis thaliana' is provided?
> 3) I suspected that all ancestors of annotated terms are added, but when I did so myself, I still got fewer GO term annotations. So do you add ancestors of the "is_a" type and possibly other relationship types like "part_of" etc.?
>
> Please let me know your answers soon. Your help will be much appreciated. Kind regards!

--
Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
Re: [R] EdgeR annotation
On 08/24/2013 04:50 AM, Robin Mjelle wrote:
> after updating R and edgeR I lost the annotations in the final Diff.Expressed matrix (toptags) when running the edgeR pipeline. How do I get the row.names from the data matrix into the topTag-matrix?
>
>     data <- read.table("KO_and_WT_Summary_miRNA_Expression.csv", row.names=1, sep="", header=T)

edgeR is a Bioconductor package, so please ask on their mailing list (no subscription required!)

    http://bioconductor.org/help/mailing-list/

Remember to provide a reproducible example (people on the list will not be able to create your 'data' object; perhaps working with the simulated data on the help page ?glmFit is a good place to start?) and to include the output of sessionInfo() so that there is no ambiguity about the software version you are using.

Martin

>     keep <- rowSums(cpm(data)>2) >=2
>     data <- data[keep, ]
>     table(keep)
>     y <- DGEList(counts=data[,1:18], genes=data[,0:1])
>     y <- calcNormFactors(y)
>     y$samples
>     plotMDS(y,main="")
>     Time=c("0.25h","0.5h","1h","2h","3h","6h","12h","24h","48h","0.25h","0.5h","1h","2h","3h","6h","12h","24h","48h")
>     Condition=c("KO","KO","KO","KO","KO","KO","KO","KO","KO","WT","WT","WT","WT","WT","WT","WT","WT","WT")
>     design <- model.matrix(~0+Time+Condition)
>     rownames(design) <- colnames(y)
>     y <- estimateGLMCommonDisp(y, design, verbose=TRUE, method="deviance",robust=TRUE, subset=NULL)
>     y <- estimateGLMTrendedDisp(y, design)
>     y <- estimateGLMTagwiseDisp(y, design)
>     fit <- glmFit(y, design)
>     lrt <- glmLRT(fit)
>     topTags(lrt)
>
>     Coefficient:  ConditionWT
>          genes      logFC    logCPM        LR PValue FDR
>     189   5128 -11.028422  7.905804  4456.297      0   0
>     188  12271 -10.582267  9.061326  5232.075      0   0
>     167 121120  -9.831894 12.475576  5957.104      0   0
>     34  255235  -9.771266 13.592968  7355.592      0   0
>     168 311906  -9.597952 13.907951 10710.111      0   0
>     166 631262  -9.592550 14.932018 11719.222      0   0
>     79      79   9.517226 11.466696  7964.269      0   0
>     169   2512  -8.946429  6.758584  2502.548      0   0
>     448   3711  -7.650068  7.764682  2914.784      0   0
>     32  260769  -7.412197 13.633352  4906.198      0   0

--
Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
Re: [R] Method dispatch in S4
On 08/09/2013 07:45 AM, Bert Gunter wrote:
> Simon: Have a look at the "proto" package for which there is a vignette. You may find it suitable for your needs and less intimidating.

Won't help much with S4, though! Some answers here

    http://stackoverflow.com/questions/5437238/which-packages-make-good-use-of-s4-objects

including, from Bioconductor, the simple class in EBImage, the advanced IRanges package and the 'toy' StudentGWAS.

Martin

> Cheers, Bert
>
> On Fri, Aug 9, 2013 at 7:40 AM, Simon Zehnder wrote:
> > Hi Martin, thank you very much for this profound answer! Your added design advice is very helpful, too! For the 'simple example': Sometimes I am still a little overwhelmed from a certain setting in the code and my ideas how I want to handle a process. But I learn from session to session. In future I will also span the lines more than 80 columns. I am used to the indent in my vim editor.
> >
> > I have one further issue: I do know, that you are one of the leading developers of the bioconductor package which uses (as far as I have read) extensively OOP in R. Is there a package you could suggest to me to learn from by reading and understanding the code? Where can I find the source code?
> >
> > Best Simon
> >
> > On Aug 8, 2013, at 10:00 PM, Martin Morgan wrote:
> > > On 08/04/2013 02:13 AM, Simon Zehnder wrote:
> > > > So, I found a solution: First in the "initialize" method of class C coerce the C object into a B object. Then call the next method in the list with the B class object. Now, in the "initialize" method of class B the object is a B object and the respective "generateSpec" method is called. Then, in the "initialize" method of C the returned object from "callNextMethod" has to be written to the C class object in .Object. See the code below.
> > > >
> > > >     setMethod("initialize", "C", function(.Object, value) {.Object@c <- value; object <- as(.Object, "B"); object <- callNextMethod(object, value); as(.Object, "B") <- object; .Object <- generateSpec(.Object); return(.Object)})
> > > >
> > > > This setting works. I do not know though, if this setting is the "usual" way such things are done in R OOP. Maybe the whole class design is disadvantageous. If anyone detects a mistaken design, I am very thankful to learn.
> > >
> > > Hi Simon -- your 'simple' example is pretty complicated, and I didn't really follow it in detail! The code is not formatted for easy reading (e.g., lines spanning no more than 80 columns) and some of it (e.g., generateSpec) might not be necessary to describe the problem you're having.
> > >
> > > A good strategy is to ensure that 'new' called with no arguments works (there are other solutions, but following this rule has helped me to keep my classes and methods simple). This is not the case for
> > >
> > >     new("A")
> > >     new("C")
> > >
> > > The reason for this strategy has to do with the way inheritance is implemented, in particular the coercion from derived to super class. Usually it is better to provide default values for arguments to initialize, and to specify arguments after a '...'. This means that your initialize methods will respect the contract set out in ?initialize, in particular the handling of unnamed arguments:
> > >
> > >     ...: data to include in the new object. Named arguments correspond to slots in the class definition. Unnamed arguments must be objects from classes that this class extends.
> > >
> > > I might have written initialize,A-method as
> > >
> > >     setMethod("initialize", "A", function(.Object, ..., value=numeric()){
> > >         .Object <- callNextMethod(.Object, ..., a=value)
> > >         generateSpec(.Object)
> > >     })
> > >
> > > Likely in a subsequent iteration I would have ended up with (using the convention that function names preceded by '.' are not exported)
> > >
> > >     .A <- setClass("A", representation(a = "numeric", specA = "numeric"))
> > >
> > >     .generateSpecA <- function(a) {
> > >         1 / a
> > >     }
> > >
> > >     A <- function(a=numeric(), ...) {
> > >         specA <- .generateSpecA(a)
> > >         .A(..., a=a, specA=specA)
> > >     }
> > >
> > >     setMethod(generateSpec, "A", function(object) {
> > >         .generateSpecA(object@a)
> > >     })
> > >
> > > ensuring that A() returns a valid object and avoiding the definition of an initialize method entirely.
> > >
> > > Martin
> > >
> > > > Best Simon
> > > >
> > > > On Aug 3, 2013, at 9:43 PM, Simon Zehnder wrote:
> > > > >     setMethod("initialize", "C", function(.Object, value) {.Object@c <- value; .Object <- callNextMethod(.Object, value); .Object <- generateSpec(.Object); return(.Object)})

-- Computational Biology / Fred Hutchinson Ca
Re: [R] Method dispatch in S4
On 08/04/2013 02:13 AM, Simon Zehnder wrote:
> So, I found a solution: First in the "initialize" method of class C coerce the C object into a B object. Then call the next method in the list with the B class object. Now, in the "initialize" method of class B the object is a B object and the respective "generateSpec" method is called. Then, in the "initialize" method of C the returned object from "callNextMethod" has to be written to the C class object in .Object. See the code below.
>
>     setMethod("initialize", "C", function(.Object, value) {.Object@c <- value; object <- as(.Object, "B"); object <- callNextMethod(object, value); as(.Object, "B") <- object; .Object <- generateSpec(.Object); return(.Object)})
>
> This setting works. I do not know though, if this setting is the "usual" way such things are done in R OOP. Maybe the whole class design is disadvantageous. If anyone detects a mistaken design, I am very thankful to learn.

Hi Simon -- your 'simple' example is pretty complicated, and I didn't really follow it in detail! The code is not formatted for easy reading (e.g., lines spanning no more than 80 columns) and some of it (e.g., generateSpec) might not be necessary to describe the problem you're having.

A good strategy is to ensure that 'new' called with no arguments works (there are other solutions, but following this rule has helped me to keep my classes and methods simple). This is not the case for

    new("A")
    new("C")

The reason for this strategy has to do with the way inheritance is implemented, in particular the coercion from derived to super class. Usually it is better to provide default values for arguments to initialize, and to specify arguments after a '...'. This means that your initialize methods will respect the contract set out in ?initialize, in particular the handling of unnamed arguments:

    ...: data to include in the new object. Named arguments correspond to slots in the class definition. Unnamed arguments must be objects from classes that this class extends.

I might have written initialize,A-method as

    setMethod("initialize", "A", function(.Object, ..., value=numeric()){
        .Object <- callNextMethod(.Object, ..., a=value)
        generateSpec(.Object)
    })

Likely in a subsequent iteration I would have ended up with (using the convention that function names preceded by '.' are not exported)

    .A <- setClass("A", representation(a = "numeric", specA = "numeric"))

    .generateSpecA <- function(a) {
        1 / a
    }

    A <- function(a=numeric(), ...) {
        specA <- .generateSpecA(a)
        .A(..., a=a, specA=specA)
    }

    setMethod(generateSpec, "A", function(object) {
        .generateSpecA(object@a)
    })

ensuring that A() returns a valid object and avoiding the definition of an initialize method entirely.

Martin

> Best Simon
>
> On Aug 3, 2013, at 9:43 PM, Simon Zehnder wrote:
> >     setMethod("initialize", "C", function(.Object, value) {.Object@c <- value; .Object <- callNextMethod(.Object, value); .Object <- generateSpec(.Object); return(.Object)})

--
Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
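A self-contained sketch of the constructor-function pattern described above, extended to a derived class. The class "B", its slot "b" and the constructor B() are assumptions added for illustration; only "A", ".A", ".generateSpecA" and A() follow the names used in the message.

```r
library(methods)

# Generator for class "A"; the leading dot marks it as not exported.
.A <- setClass("A", representation(a = "numeric", specA = "numeric"))

# Plain helper computing the derived slot from 'a'.
.generateSpecA <- function(a) 1 / a

# User-facing constructor: defaults make A() with no arguments valid.
A <- function(a = numeric(), ...) {
    .A(..., a = a, specA = .generateSpecA(a))
}

# Hypothetical derived class, added here only to show how the pattern
# composes; it is not part of the original thread.
.B <- setClass("B", representation(b = "numeric"), contains = "A")

B <- function(b = numeric(), ...) {
    # The unnamed "A" object fills the inherited slots, as described in
    # ?initialize; no initialize method is needed.
    .B(A(...), b = b)
}

emptyA <- new("A")          # works without arguments
emptyB <- new("B")          # so does the derived class
x <- B(b = 2, a = 4)        # a = 4 is passed through '...' to A()
```

Because no initialize method is defined, new() with no arguments keeps working for both classes, which is the property the message recommends preserving.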
Re: [R] Check the class of an object
On 07/23/2013 09:59 AM, Simon Zehnder wrote:
> Hi David, thanks for the reply. You are right. Using the %in% is more stable and I am going to change my code.

you said you were using S4 classes. S4 classes do not report vectors of length != 1, from ?class

    For objects which have a formal class, its name is returned by 'class' as a character vector of length one

so a first unit test could be

    stopifnot(length(class(myObject)) == 1L)

> When testing for a specific class using 'is' one has to start at the lowest level of the hierarchy and walk up the inheritance structure. Starting the checks at the root will always give TRUE. Having a structure which is quite complicated lets me move to the check I suggested in my first mail.
>
> Best Simon
>
> On Jul 23, 2013, at 6:15 PM, David Winsemius wrote:
> > On Jul 23, 2013, at 5:36 AM, Simon Zehnder wrote:
> > > Dear R-Users and R-Devels, I have a large project based on S4 classes. While writing my unit tests I found out that 'is' cannot test for a specific class, as also inherited classes can be treated as their super classes. I need to do checks for specific classes. What I do right now is sth. like
> > >
> > >     if (class(myClass) == "firstClass") {
> >
> > I would think that you would need to use `%in%` instead.
> >
> >     if( "firstClass" %in% class(myObject) ){
> >
> > Objects can have more than one class, so testing with "==" would fail in those instances.
> >
> > >     } else if (class(myClass) == "secondClass") { }
> > >
> > > Is this the usual way how classes are checked in R?
> >
> > Well, `inherits` IS the usual way.
> >
> > > I was expecting some specific method (and 'inherits' or 'extends' is not what I look for)...
> > >
> > > Best Simon
> >
> > Plain-text format is the recommended format for Rhelp
> > -- David Winsemius Alameda, CA, USA

--
Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
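To make the distinction in this thread concrete, here is a small sketch contrasting is() with an exact class test. The helper name is_exact() is made up for this example; the class names follow the ones used in the question.

```r
library(methods)

setClass("firstClass", representation(x = "numeric"))
setClass("secondClass", contains = "firstClass")

obj <- new("secondClass")

# is() follows inheritance, so an object of the derived class also passes
# the test for its super class.
inheritsFirst <- is(obj, "firstClass")

# Hypothetical helper: TRUE only when the object's own class is 'cls'.
# as.character() drops the "package" attribute that class() attaches to
# S4 class names.
is_exact <- function(x, cls) {
    length(class(x)) == 1L && as.character(class(x)) == cls
}

exactFirst  <- is_exact(obj, "firstClass")   # derived object, so not exact
exactSecond <- is_exact(obj, "secondClass")
```

For objects with an S3 class vector of length greater than one the helper simply returns FALSE, which ties in with the length-one check suggested in the reply.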
Re: [R] 'save' method for S4 class
On 07/18/2013 03:47 AM, Simon Zehnder wrote:
> Hi Christofer, I think that "save" is no generic function like "plot", "show", etc. So at first you have to determine a generic.
>
>     setGeneric("save", function(x, file_Path) standardGeneric("save"))

The implementation offered by Christofer shows write.table, and the end result is a text file rather than a binary file expected from base::save. This makes it seem inappropriate to use 'save' in this context.

Instead, it seems that what Christofer wants to implement is functionality to support write.table. ?write.table says

    'write.table' prints its required argument 'x' (after converting it to a data frame if it is not one nor a matrix)

So implementing an S3 method

    as.data.frame.MyClass <- function(x, row.names=NULL, optional=FALSE, ...) {
        x@x
    }

is all that is needed, gaining lots of flexibility by re-using the code of write.table.

    myClass = new("MyClass", x=data.frame(x=1:3, y=3:1))
    write.table(myClass, stdout())

In the case of a 'save' method producing binary output (but this is what save does already...), I think it's better practice to promote the non-generic 'save' to an S4 generic using its existing arguments; in this case it makes sense to restrict dispatch to '...', so

    setGeneric("save", signature="...")

The resulting generic is

    > getGeneric("save")
    standardGeneric for "save" defined from package ".GlobalEnv"

    function (..., list = character(), file = stop("'file' must be specified"),
        ascii = FALSE, version = NULL, envir = parent.frame(), compress = !ascii,
        compression_level, eval.promises = TRUE, precheck = TRUE)
    standardGeneric("save")
    Methods may be defined for arguments: ...
    Use showMethods("save") for currently available ones.

This means that a method might be defined as

    setMethod("save", "MyClass", function(..., list = character(),
        file = stop("'file' must be specified"), ascii = FALSE, version = NULL,
        envir = parent.frame(), compress = !ascii, compression_level,
        eval.promises = TRUE, precheck = TRUE)
    {
        ## check non-sensical or unsupported user input for 'MyClass'
        if (!is.null(version))
            stop("non-NULL 'version' not supported for 'MyClass'")
        ## ...
        ## implement save on MyClass
    })

It might be that Christofer wants to implement a 'write.table-like' (text output) or a 'save-like' (binary output) function that really does not conform to the behavior of write.table (e.g., producing output that could not be input by read.table) or save. Then I think the better approach is to implement writeMyClass (for text output) or saveMyClass (for binary output).

Martin

> Now your definition via setMethod.
>
> Best Simon
>
> On Jul 18, 2013, at 12:09 PM, Christofer Bogaso wrote:
> > Hello again, I am trying to define the 'save' method for my S4 class as below:
> >
> >     setClass("MyClass", representation( Slot1 = "data.frame" ))
> >
> >     setMethod("save", "MyClass", definition = function(x, file_Path) {
> >         write.table(x@Slot1, file = file_Path, append = FALSE, quote = TRUE,
> >                     sep = ",", eol = "\n", na = "NA", dec = ".",
> >                     row.names = FALSE, col.names = TRUE,
> >                     qmethod = c("escape", "double"), fileEncoding = "")
> >     })
> >
> > However while doing this I am getting following error:
> >
> >     Error in conformMethod(signature, mnames, fnames, f, fdef, definition) : in method for ‘save’ with signature ‘list="MyClass"’: formal arguments (list = "MyClass", file = "MyClass", ascii = "MyClass", version = "MyClass", envir = "MyClass", compress = "MyClass", compression_level = "MyClass", eval.promises = "MyClass", precheck = "MyClass") omitted in the method definition cannot be in the signature
> >
> > Can somebody point me what will be the correct approach to define 'save' method for S4 class?
> >
> > Thanks and regards,

--
Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
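The write.table route suggested in the reply can be tried end to end with a sketch like the one below. It assumes the code is run at top level (so that the S3 method defined in the global environment is found) and writes to a temporary file; the slot name "x" follows the reply rather than the "Slot1" of the original question.

```r
library(methods)

# S4 class wrapping a data frame in slot 'x', as in the reply.
setClass("MyClass", representation(x = "data.frame"))

# S3 method so that write.table() can convert the object for itself.
as.data.frame.MyClass <- function(x, row.names = NULL, optional = FALSE, ...) {
    x@x
}

myClass <- new("MyClass", x = data.frame(x = 1:3, y = 3:1))

# Write as comma-separated text and read it back to check the round trip.
out <- tempfile(fileext = ".csv")
write.table(myClass, file = out, sep = ",", row.names = FALSE)
roundtrip <- read.table(out, header = TRUE, sep = ",")
```

All the usual write.table arguments (sep, quote, na and so on) remain available to the caller, which is the flexibility the reply refers to.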
Re: [R] Setting Derived Class Slots
On 07/16/2013 06:36 AM, Steve Creamer wrote:
> Dear All, I am really struggling with oo in R. Trying to set attributes of the base class in a derived class method but the slot is only populated in the method itself, not when I try to print out the object from the console. Code is
>
>     library(RODBC)
>
>     # Define a medical event class. This is abstract (VIRTUAL)
>     setClass("Medical_Event",
>              representation(
>                  Event_Name="character",
>                  Capacity_Profile="numeric",
>                  Delay_Profile="numeric",
>                  "VIRTUAL"),
>              prototype(Event_Name="An Event",Capacity_Profile=c(.2,.2,.2,.2,.2,0,0)))
>
>     setGeneric("getDelayProfile",function(object){standardGeneric("getDelayProfile")},simpleInheritanceOnly=T)
>
>     # Now define a derived class called GP_Event
>     setClass("GP_Event",representation(Surgery_Name="character"),contains=c("Medical_Event"),prototype(Surgery_Name="Unknown"))
>
>     # Now define a derived class called OP_Appt
>     setClass("OP_Appt",representation(Clinic_Name="character"),contains=c("Medical_Event"),prototype(Clinic_Name="Unknown"))
>
>     setMethod(f="getDelayProfile",signature("OP_Appt"),definition=function(object)
>     {
>         OpTablesDB<-odbcDriverConnect("DRIVER=Microsoft Access Driver (*.mdb, *.accdb); DBQ=Z:\\srp\\Development Code\\Projects\\CancerPathwaySimulation\\Database\\CancerPathway.accdb")
>         strQuery<-"select * from op_profile"
>         odbcQuery(OpTablesDB,strQuery)
>         dfQuery<-odbcFetchRows(OpTablesDB)
>         odbcClose(OpTablesDB)
>         delay<-dfQuery$data[[1]][1:70]
>         prob<-dfQuery$data[[2]][1:70]
>         # as(object,"Medical_Event")@Delay_Profile<-prob
>         object@Delay_Profile <- prob
>         object
>     } )
>
> if I instantiate a new instance of the derived class
>
>     aTest<-new("OP_Appt")
>
> and then try and populate the attribute Delay_Profile by
>
>     getDelayProfile(aTest)
>
> the object slot seems to be populated in the method because I can print it out, viz
>
>     An object of class "OP_Appt"
>     Slot "Clinic_Name":
>     [1] "Unknown"
>     Slot "Event_Name":
>     [1] "An Event"
>     Slot "Capacity_Profile":
>     [1] 0.2 0.2 0.2 0.2 0.2 0.0 0.0
>     Slot "Delay_Profile":
>      [1] 14 21 25 29 27 49 72 71 43 65 102 134 223 358 24 14 21 25 35 31 38 43 31 23 21 26 46 54 42 26
>     [31] 34 24 25 41 48 33 30 17 18 31 24 35 35 24 16 32 36 39 46 36 26 16 27 21 30 32 33 27 7 5
>     [61] 9 10 9 11 8 6 1 11 14 10
>
> but when the method returns and I type
>
>     aTest
>
> I get
>
>     An object of class "OP_Appt"
>     Slot "Clinic_Name":
>     [1] "Unknown"
>     Slot "Event_Name":
>     [1] "An Event"
>     Slot "Capacity_Profile":
>     [1] 0.2 0.2 0.2 0.2 0.2 0.0 0.0
>     Slot "Delay_Profile":
>     numeric(0)
>
> ie the Delay_Profile slot is empty. What haven't I done - can anybody help me please?

It helps to provide a more minimal example, preferably reproducible (no data base queries needed to illustrate your problem); I'm guessing that, just as with

    f = function(l) { l$a = 1; l }
    lst = list(a=0, b=1)

one would 'update' lst with

    lst = f(lst)

and not

    f(lst)

you need to assign the return value to the original object

    aTest <- getDelayProfile(aTest)

Martin

> Many Thanks
> Steve Creamer
> --
> View this message in context: http://r.789695.n4.nabble.com/Setting-Derived-Class-Slots-tp4671683.html
> Sent from the R help mailing list archive at Nabble.com.

--
Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
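The point about assigning the return value can be reproduced without a database. The class "Event" and generic "setDelay" below are hypothetical stand-ins for the poster's classes, kept deliberately small.

```r
library(methods)

# Hypothetical minimal class standing in for the poster's OP_Appt.
setClass("Event", representation(delay = "numeric"))

setGeneric("setDelay", function(object, value) standardGeneric("setDelay"))

# The method modifies a local copy of 'object' and returns that copy.
setMethod("setDelay", "Event", function(object, value) {
    object@delay <- value
    object
})

ev <- new("Event")

# Calling the method without assigning its result leaves 'ev' untouched,
# because S4 objects (like lists) have copy semantics.
invisible(setDelay(ev, c(14, 21, 25)))
before <- ev@delay

# Assigning the returned object is what actually updates 'ev'.
ev <- setDelay(ev, c(14, 21, 25))
after <- ev@delay
```

If in-place modification is really wanted, reference classes (setRefClass) or environments are the usual tools, but that is a different design from the one in the question.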
Re: [R] suppress startup messages from default packages
On 07/15/2013 06:25 AM, Duncan Murdoch wrote:
> On 15/07/2013 8:49 AM, Andreas Leha wrote:
> > Hi Helios,
> >
> > "Helios de Rosario" writes:
> > > > Hi all,
> > > >
> > > > several packages print messages during loading. How do I avoid to see them when the packages are in the defaultPackages?
> > > >
> > > > Here is an example.
> > > >
> > > > With this in ~/.Rprofile
> > > > ,[ ~/.Rprofile ]
> > > > | old <- getOption("defaultPackages")
> > > > | options(defaultPackages = c(old, "filehash"))
> > > > | rm(old)
> > > > `
> > > >
> > > > I get as last line when starting R:
> > > > ,
> > > > | filehash: Simple key-value database (2.2-1 2012-03-12)
> > > > `
> > > >
> > > > Another package with (even more) prints during startup is tikzDevice.
> > > >
> > > > How can I avoid to get these messages?
> > >
> > > There are several options in ?library to control the messages that are displayed when loading packages. However, this does not seem be able to supress all the messages. Some messages are defined by the package authors, because they feel necessary that the user reads them.
> >
> > Thanks for your answer. When I actually call library() or require() myself I can avoid all messages. There are hacks to do that even for the very persistent messages [fn:1]. My question is how to suppress these messages, when it is not me who calls library() or require(), but when the package is loaded during R's startup through the defaultPackages option.
>
> You could try the --slave command line option on startup. If that isn't sufficient, try getting the maintainer to change the package behaviour, or do it yourself.

In a hack-y way ?setHook and ?sink seem to work

    > setHook(packageEvent("filehash", "onLoad"), function(...) sink(file(tempfile(), "w"), type="message"))
    > setHook(packageEvent("filehash", "attach"), function(...) sink(file=NULL, type="message"), "append")
    > library(filehash)
    >

Martin

> Duncan Murdoch

--
Computational Biology / Fred Hutchinson Cancer Research Center 1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109 Location: Arnold Building M1 B861 Phone: (206) 667-2793
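The hooks above rely on sink(type = "message") to divert the message stream while the package loads and to restore it afterwards. The same mechanism can be seen in isolation in the sketch below; the wrapper name quiet() is made up for this illustration and is not part of base R.

```r
# Hypothetical wrapper: evaluate 'expr' with messages diverted to a
# temporary file, then restore the normal message stream.
quiet <- function(expr) {
    con <- file(tempfile(), "w")
    sink(con, type = "message")
    # Restore stderr and close the connection even if 'expr' fails.
    on.exit({
        sink(type = "message")
        close(con)
    })
    expr   # lazily evaluated here, i.e. after the diversion is in place
}

# The message is swallowed; the value of the expression is still returned.
result <- quiet({
    message("this startup-style message is not shown")
    42
})
```

Unlike this wrapper, the two hooks in the reply never close the temporary connection, which is harmless for a one-off startup hack but worth knowing. For packages you attach yourself, suppressPackageStartupMessages(library(pkg)) is the more conventional route.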
Re: [R] Check a list of genes for a specific GO term
Please ask follow-up questions about Bioconductor packages on the Bioconductor mailing list.

    http://bioconductor.org/help/mailing-list/mailform/

If you are interested in organisms rather than chips, use the organism package, e.g., for Homo sapiens

    library(org.Hs.eg.db)
    df0 = select(org.Hs.eg.db, keys(org.Hs.eg.db), "GO")

giving

    > head(df0)
      ENTREZID         GO EVIDENCE ONTOLOGY
    1        1 GO:0003674       ND       MF
    2        1 GO:0005576      IDA       CC
    3        1 GO:0008150       ND       BP
    4       10 GO:0004060      IEA       MF
    5       10 GO:0005829      TAS       CC
    6       10 GO:0006805      TAS       BP

from which you might

    df = unique(df0[df0$ONTOLOGY == "BP", c("ENTREZID", "GO")])
    len = tapply(df$ENTREZID, df$GO, length)
    keep = len[len < 1000]

to get a vector of counts, with names being GO ids.

Remember that the GO is a directed acyclic graph, so terms are nested; you'll likely want to give some thought to what you're actually wanting. The vignettes in the AnnotationDbi and Category packages

    http://bioconductor.org/packages/release/bioc/html/AnnotationDbi.html
    http://bioconductor.org/packages/release/bioc/html/Category.html

are two useful sources of information, as is the annotation work flow

    http://bioconductor.org/help/workflows/annotation/

Martin

- Chirag Gupta wrote:
> Hi
> I think I asked the wrong question. Apologies.
>
> Actually I want all the GO BP annotations for my organism and from them I want to retain only those annotations which annotate less than a specified number of genes. (say <1000 genes)
>
> I hope I have put it clearly.
>
> sorry again.
>
> Thanks!
>
> On Sun, Jul 7, 2013 at 6:55 AM, Martin Morgan wrote:
> > In Bioconductor, install the annotation package
> >
> >     http://bioconductor.org/packages/release/BiocViews.html#___AnnotationData
> >
> > corresponding to your chip, e.g.,
> >
> >     source("http://bioconductor.org/biocLite.R")
> >     biocLite("hgu95av2.db")
> >
> > then load it and select the GO terms corresponding to your probes
> >
> >     library(hgu95av2.db)
> >     lkup <- select(hgu95av2.db, rownames(dat), "GO")
> >
> > then use standard R commands to find the probesets that have the GO id you're interested in
> >
> >     keep = lkup$GO %in% "GO:0006355"
> >     unique(lkup$PROBEID[keep])
> >
> > Ask follow-up questions about Bioconductor packages on the Bioconductor mailing list
> >
> >     http://bioconductor.org/help/mailing-list/mailform/
> >
> > Martin
> >
> > - Rui Barradas wrote:
> > > Hello,
> > >
> > > Your question is not very clear, maybe if you post a data example. To do so, use ?dput. If your data frame is named 'dat', use the following.
> > >
> > >     dput(head(dat, 50)) # paste the output of this in a post
> > >
> > > If you want to get the rownames matching a certain pattern, maybe something like the following.
> > >
> > >     idx <- grep("GO:0006355", rownames(dat))
> > >     dat[idx, ]
> > >
> > > Hope this helps,
> > >
> > > Rui Barradas
> > >
> > > Em 07-07-2013 07:01, Chirag Gupta escreveu:
> > > > Hello everyone
> > > >
> > > > I have a dataframe with rows as probeset ID and columns as samples. I want to check the rownames and find which of those probes are transcription factors. (GO:0006355 )
> > > >
> > > > Any suggestions?
> > > >
> > > > Thanks!
>
> --
> *Chirag Gupta*
> Department of Crop, Soil, and Environmental Sciences,
> 115 Plant Sciences Building, Fayetteville, Arkansas 72701
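The counting step from the reply (unique gene/term pairs, then tapply) can be tried without installing any annotation package. The data frame below is toy data invented for this sketch, with made-up identifiers and a threshold of 3 instead of 1000.

```r
# Toy stand-in for the select() result; IDs and terms are invented.
df0 <- data.frame(
    ENTREZID = c("1", "1", "10", "10", "100", "100"),
    GO       = c("GO:0000001", "GO:0000002", "GO:0000001",
                 "GO:0000003", "GO:0000001", "GO:0000001"),
    ONTOLOGY = c("BP", "BP", "BP", "MF", "BP", "BP"),
    stringsAsFactors = FALSE
)

# Keep biological-process annotations and drop duplicated gene/term pairs
# (the same pair can appear several times with different evidence codes).
df <- unique(df0[df0$ONTOLOGY == "BP", c("ENTREZID", "GO")])

# Number of distinct genes annotated to each GO term.
len <- tapply(df$ENTREZID, df$GO, length)

# Retain only terms annotating fewer genes than the chosen threshold.
keep <- len[len < 3]
```

With real annotation data the same three lines apply unchanged; only the threshold and the input data frame differ.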