Re: [R] Maximum amount of memory
On 21 Mar 2005, at 4:42 pm, [EMAIL PROTECTED] wrote:

Hi, I have a problem: I need to use the maximum amount of memory in order to perform a very tough analysis. By purchasing the suitable computer, what's the maximum amount of memory obtainable in R?

Assuming that R is happy to use 64-bit memory pointers, the limit will be your wallet. You could buy an SGI Altix and just keep buying more and more memory for it. I don't know the limit - I know that SGI have sold one machine in Japan with 13 terabytes of memory. We have two of them here with 192 GB of RAM each, but I haven't tried R on them yet - they're used for other things.

Whether such a course of action is sensible is another matter. Large memory machines rapidly become *extremely* expensive; once you have to use DIMMs larger than 1GB each, the price becomes prohibitive. Consider spending the same amount of money on employing several programmers and/or statisticians to break your problem down into smaller tasks that are tractable on smaller machines. Our 192 GB machine cost quite a lot more than 192 desktop PCs with 1GB of RAM each. In fact, the memory becomes so expensive the rest of the machine is virtually free, in comparison. :-)

If you can get away with more modest amounts of memory, then a machine like the HP DL-585 might suit you - a quad processor Opteron, which can take up to 32GB or so of memory. Fairly modest price.

Tim
--
Dr Tim Cutts
Informatics Systems Group, Wellcome Trust Sanger Institute
GPG: 1024D/E3134233 FE3D 6C73 BBD6 726A A3F5 860B 3CDD 3F56 E313 4233
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
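Before pricing hardware, it may help to check whether a given R build uses 64-bit pointers and to estimate how much memory the data would need. The following is only a rough sketch: the matrix dimensions are made-up placeholders, and the estimate ignores the working copies R makes during an analysis.

```r
# Size of a pointer in this R build: 8 bytes indicates a 64-bit build
pointer_bytes <- .Machine$sizeof.pointer

# Hypothetical problem size (an assumption, for illustration only)
n_rows <- 1e6
n_cols <- 200

# A numeric (double) matrix needs 8 bytes per cell, plus a small header
est_bytes <- n_rows * n_cols * 8
est_gib <- est_bytes / 1024^3

cat(sprintf("pointer size: %d bytes\n", pointer_bytes))
cat(sprintf("estimated matrix size: %.2f GiB\n", est_gib))

# The per-cell figure can be checked against a small real object
small <- matrix(0, nrow = 1000, ncol = 200)
print(object.size(small), units = "Mb")
```

If the estimate is several times smaller than the RAM of an ordinary machine, splitting the problem up is likely to be cheaper than buying a large-memory system.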
Re: [R] X11 Fonts sizes
On 21 Mar 2005, at 11:09 am, Wolfgang Waser wrote:

In postscript graphs (pointsize = 10, different sizes in graph adjusted via cex) I would like to use different font sizes but get the following warning message:

Warning messages:
1: X11 used font size 8 when 9 was requested
2: X11 used font size 8 when 7 was requested
3: X11 used font size 8 when 5 was requested

This is probably not an R but an X11 problem, nevertheless I would be most obliged for any help how to actually use font sizes 9, 7, and 5 and others.

X11 default fonts are usually bitmaps, and not scalable, so if you don't have the 9, 7 and 5 point versions of the font installed, it may be automatically reverting to the nearest decent point size. Have you tried changing the font to something that is scalable, like a PostScript or TrueType font? You can also change the X server's configuration so that it will attempt to scale bitmap fonts, but the results are not pretty, so I never enable that.

Tim
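One possibility, assuming the warnings come from drawing on the X11 screen device first, is to write straight to the postscript() device, whose fonts are scalable. This is a minimal sketch; the file name and the plotted values are placeholders.

```r
# Write directly to a PostScript file instead of the X11 screen device
out_file <- file.path(tempdir(), "fonts.ps")
postscript(file = out_file, pointsize = 10)

plot(1:10, main = "Base size 10 pt")
# cex scales relative to the device pointsize of 10
text(3, 8, "about 9 pt", cex = 0.9)
text(5, 6, "about 7 pt", cex = 0.7)
text(7, 4, "about 5 pt", cex = 0.5)

invisible(dev.off())
```

Because the X11 device is never opened, the bitmap font substitution does not come into play for the file output.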
Re: [R] R: install - Perl dependency problem in Debian/Damn Small Linux
On 14 Mar 2005, at 12:46 pm, Stuart Leask wrote:

(I vaguely recall a suggestion there be a r-debian list, which this would be appropriate for, but I can't find such a list if it exists. Apologies)

OS: Damn Small Linux 1.0rc1 - a tiny (50MB) debian/knoppix-based distro

I can't apt-get R.

[EMAIL PROTECTED] apt-get install -testing r-base r-base-core r-recommended
Reading Package Lists... Done
Building Dependency Tree... Done
Some packages could not be installed. This may mean that you have requested an impossible situation or if you are using the unstable distribution that some required packages have not yet been created or been moved out of Incoming.
The following information may help to resolve the situation:
The following packages have unmet dependencies:
r-base-core: Depends: perl but it is not going to be installed
E: Broken packages

PROBLEM 1: Perl is certainly installed. However, it may have various bits removed to save space. However, if I try to correct this with:

apt-get install perl

I get:

The following packages have unmet dependencies:
perl: depends perl-base (=5.6.1-8.7) but 5.8.0-18 is to be installed
E: broken packages

I've tried different feeds eg. -testing, but get the same problem. Has anyone come across this, or have any ideas?

It's the perl upgrade that's the problem, obviously. Is there any particular reason why you're using this tiny distribution rather than just using plain, normal Debian? Trying to upgrade this with full Debian packages sounds like asking for a world of pain, especially since the package versions in testing, in particular, are way ahead of what your existing system is using. It's a bit like trying to install Fedora packages on a Red Hat 7.2 box; i.e. pretty unlikely to work.

You might do better to actually compile R from scratch on your system, rather than installing the binary packages.
Tim
Re: [R] Concatenate vector into string
On 4 Mar 2005, at 11:32 am, Matthieu Cornec wrote:

Hello, I would like to convert c("a","b","c") into "abc". Anyone could help?

paste(c("a", "b", "c"), collapse="")

Tim
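For reference, the difference between the two paste() arguments matters here: sep is placed between separate arguments, while collapse joins the elements of the resulting vector into one string. A short sketch:

```r
x <- c("a", "b", "c")

# sep only separates different arguments, so a single vector is
# returned unchanged as three strings
unjoined <- paste(x, sep = "")
print(length(unjoined))

# collapse joins the elements of the vector into a single string
joined <- paste(x, collapse = "")
print(joined)
```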
[R] Is aggregate() what I need here?
I'm pretty new to R, and I've been given a script by a user who wants some help with it. I know enough about the way R works to know that this is a very inefficient way to do what the user wants (the LSB_JOBINDEX stuff is added by me so that this can work on many hundreds of input data files as LSF jobs - it's the nested loops I'm really interested in):

gtpfile<-paste("genotype_chunk.", sep="", Sys.getenv("LSB_JOBINDEX"))
snps<-read.table(gtpfile,header=TRUE)
exp<-read.table("data.TXT",header=TRUE)
# the above two files have columns for individuals and snps (or genes) for rows
# so the next two lines simply transpose these data matrices
tsnps<-t(snps)
texp<-t(exp)
sink(paste("output.", sep="", Sys.getenv("LSB_JOBINDEX")))
#loops below are hardwired for 5 gene-expression levels (some genes have two
#probes, and those are treated as separate genes for now) and 100 SNPs.
for (iexp in 1:5){
  for (isnp in 1:100){
    genotype<-factor(tsnps[,isnp])
    # make sure there is more than one genotypic class before doing ANOVA
    if (length(unique(genotype))>1) {
      expression<-texp[,iexp]
      stuff<-anova(lm(expression ~ genotype))
      qq=c(iexp,isnp)
      print (qq)
      print(stuff)
      # plot(factor(tsnps[,isnp]),texp[,iexp], xlab="Genotype", ylab="Expression level")
    }
  }
}
sink()

Eventually, on the full data set, the size of the two data sets being related to each other will be much larger - several hundred thousand rows in snps, and several hundred rows in exp. Clearly, the genotype<-factor line is being calculated repeatedly, and in the wrong place; it should be done en masse up front once the data has been read, and I'm sure both loops could actually be expressed some other way.
It seems to me that I ought to be able to calculate all the factors in a statement before the iexp loop, presumably by using apply() or aggregate(), but apply() seems to coerce the result so the results aren't factors any more, and I'm clearly missing something with regard to the way aggregate() works; I really don't understand what I'd need to set the 'by' parameter to.

I apologise if this is a hopelessly naive question, but I've got both the "Introduction to R" and Peter's introductory statistics with R book here, and I'm having some difficulty finding the information I'm after. "Introduction to R" doesn't mention the word "aggregate" once... :-)

Thanks in advance,

Tim
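One way to build all the factors once, before the loops, is lapply() over the column indices: it returns a list, so each element stays a factor rather than being coerced as with apply(). The sketch below is not the user's script; the small matrices are invented stand-ins for t(snps) and t(exp), and the results are collected in a list instead of being printed through sink().

```r
set.seed(1)
n_ind <- 20

# Hypothetical stand-in for t(snps): individuals in rows, SNPs in columns
tsnps <- cbind(
  rep(c("AA", "AB"), times = 10),
  rep(c("AA", "AB", "BB", "AB"), times = 5),
  rep(c("AB", "BB"), each = 10),
  rep("AA", n_ind)                 # only one genotypic class
)
# Hypothetical stand-in for t(exp): two expression levels
texp <- matrix(rnorm(n_ind * 2), nrow = n_ind)

# Build every genotype factor once; a list keeps them as factors
genotypes <- lapply(seq_len(ncol(tsnps)), function(j) factor(tsnps[, j]))

# Keep only SNPs with more than one genotypic class
informative <- which(sapply(genotypes, nlevels) > 1)

results <- list()
for (iexp in seq_len(ncol(texp))) {
  expression <- texp[, iexp]
  for (isnp in informative) {
    genotype <- genotypes[[isnp]]
    key <- paste(iexp, isnp, sep = ".")
    results[[key]] <- anova(lm(expression ~ genotype))
  }
}
print(names(results))
```

The ANOVA itself is still run once per gene and SNP pair; only the factor construction and the single-class check are moved out of the inner loop.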
Re: [R] A possible way to reduce basic questions
On 2 Dec 2004, at 1:23 am, Gabor Grothendieck wrote:

Jim Lemon ozemail.com.au> writes:

I have been thinking about how to reduce the number of basic questions that elicit the ...ahem... robust debate that has occurred about how to answer

The traffic on r-help could be reduced by creating a second list where more elementary questions are asked.

But how many people here would read it, and help the novices (like me) out? There is always the danger that novice lists just become write-only lists.

Tim
Re: [R] The hidden costs of GPL software?
On 18 Nov 2004, at 10:27 am, Tim Cutts wrote:

The R Intro PDF is good, but it would be nice if it were integrated better, with hyperlinks to the reference documentation, or to other parts of the introduction, for those platforms that support such things

I should correct myself here, and note that there are some cross-references within the PDF document; it's not completely devoid of them.

Tim
Re: [R] The hidden costs of GPL software?
On 17 Nov 2004, at 2:27 pm, Patrick Burns wrote: I think Ted Harding was on the mark when he said that it is the help system that needs enhancement. I can imagine a system that gets the user to the right function and then helps fill in the arguments; all of the time pointing them towards the command line rather than away from it. I think this is spot on. My situation is that I am a scientist turned system administrator, and R is a package which I am increasingly being asked to install for the use of scientists at this Institute. I am by no means a statistician; the statistics I learned in A-level maths almost 20 years ago were as far as I got, and most of that I have forgotten. But I like to have some understanding of the software packages I am asked to support, so I've been looking at R with a view to learning some of its more basic functions. It looks potentially very useful to me anyway for summarising activity on the supercomputing cluster that I run. So I'm a newbie to R, armed with only a very basic knowledge of statistics (I know the difference between a Normal and a Poisson distribution at least, and with a bit of prodding could probably remember a binomial distribution too). I'm an experienced programmer in several languages, and a PhD-level scientist. And yet I have still found R really quite hard to learn, and this is principally because the on-line help is a reference manual. I'm sure it's a fabulous resource if you're a statistician who uses R every day, but for me it's not very helpful. 
The R Intro PDF is good, but it would be nice if it were integrated better, with hyperlinks to the reference documentation, or to other parts of the introduction, for those platforms that support such things (it looks like this was intended for MacOS X, which is the version I am playing with for my own use, although the version I maintain for users is on Linux [ and would be on Alpha/Tru64 too if I could get it to pass its tests ]) but the on-line help link to the Intro on the Aqua R version brings up a blank page, so I'm using the generic PDF document instead. I think the GUI question has nothing to do with the hidden costs of the GPL, or otherwise. This is the age-old ease-of-use versus power and capability argument. I don't think a fancy GUI is necessary - the GUI aspects that have been added to R on Mac OS X are sufficient. I get the impression that the real power of R is the fact that really it's a programming language, and should probably be treated and learned as such. Quite apart from the fact that a GUI will necessarily be a somewhat restricted subset of the total functionality, and a lot slower to use once you've taken the effort to learn the software, I think there is another danger, which I have already seen in other pieces of software in the bioinformatics community. Users frequently run completely pointless analyses through the GUI wrappers we provide. The users using the command line interfaces typically do much more sensible things. If you make a piece of software trivial for a user to use without thinking about what they're doing, then the users won't think. I may not know much about statistics, but what little I do know is that understanding exactly what form of analysis or significance test is required to be meaningful is a real skill that takes a lot of experience to master. 
Having to perform that analysis with written commands means that your method is recorded, and could be published, and more importantly be checked and reproduced by other researchers. It also gives you ample time to think about what you're doing, rather than just bashing out a pretty graph which actually has no real meaning whatsoever. Any GUI to R could (and should) be able to store the command line equivalent to what it has just done, to satisfy the reproducible criterion above, but I suspect it could still lead to some pretty shoddy work being done by careless and lazy scientists, and we get enough of that already.

Tim
Re: [R] R 2.0.0 Installation Problem
On 17 Nov 2004, at 9:00 am, Uwe Ligges wrote:

David Rocke wrote:

I and my students have been having an odd problem with this release, which is that packages are disappearing. After installation the package is found with the library command, but later in the same session or in a later session, the library command returns a not found error. Then later it is back. Happening on both Windows and OS X, mostly but not entirely with Bioconductor packages.

Please tell us more details. It never happened to anybody else that packages disappear. Are the files still at the correct location? Have you installed into another library tree?

Have you installed them on a network volume? I've seen this sort of behaviour with overloaded NFS servers in the past.

Tim
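To gather the details asked for above, it may be useful to record, at the moment a package goes missing, which library trees R is searching and whether the package directory is visible in each. A minimal sketch; "stats" is only a placeholder for the name of the package that disappears.

```r
# Library trees that this R session searches, in order
lib_dirs <- .libPaths()

# Placeholder package name; substitute the one that goes missing
pkg <- "stats"

# A package is visible in a tree if its DESCRIPTION file can be read there
present <- file.exists(file.path(lib_dirs, pkg, "DESCRIPTION"))

report <- data.frame(library = lib_dirs, present = present,
                     checked_at = format(Sys.time()))
print(report)
```

If the same tree reports the package as present at one time and absent at another, the file system (for example a network mount) is a more likely culprit than R itself.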
Re: [R] R under Pocket PC
On 9 Nov 2004, at 5:14 pm, Peter Dalgaard wrote:

PalmOS would probably be right out. :-)

I think you are belittling the work done to get R running on as wide a range of platforms as it does.

? I only saw a bit of excessive pessimism in that remark.

Yes, no disrespect was intended at all. Porting to PalmOS is quite a different kettle of fish compared to porting between (say) Windows and UNIX, because the OS doesn't have a libc; if you use almost any standard C library function your binary becomes huge (because it needs to be statically linked with the C library that comes with your cross compiler). A true PalmOS port of any piece of software often requires a lot of work to replace standard C library calls with the equivalents built into the machine.

Someone may have created some sort of wrapper C library, but when I last wrote any software for Palm devices, such a thing did not exist. Admittedly, that was about two years ago, when I wrote some stuff for PalmOS 4.

Tim
Re: [R] R under Pocket PC
On 9 Nov 2004, at 12:27 pm, Prof Brian Ripley wrote:

On Tue, 9 Nov 2004, Lars Strand wrote:

Will R run under Windows Pocket PC?

We don't know! There are no binary versions of R for that platform, but perhaps you could find a suitable compiler and manage to build the sources. Outside pure mathematics it is usually very hard to establish that something cannot be done (and it can be very hard in pure mathematics, too).

Do PocketPCs generally have enough memory to run something as big as R? A standard R install requires a lot of storage space, by PDA standards...

Technically, I don't suppose there's much to stop R running on some PDAs, especially those based on Linux like the Sharp Zaurus, other than storage and memory requirements. If there's a Qt graphical interface available for R, you could even get graphics working on the Zaurus, potentially. The Windows API on Pocket PC is quite a reduced subset compared to full Windows, so you might have problems with that. PalmOS would probably be right out. :-)

Tim
Re: [R] Compiling R with Intel compilers - recommended options?
On 18 Jun 2004, at 11:58 am, Prof Brian Ripley wrote:

Fine. The issue on gcc 3.4.0 is that if you compile src/modules/lapack/dlamc.f without -ffloat-store it loops forever. I suspect you need to compile it without optimization or with the equivalent of -ffloat-store. It's a separate file in 1.9.1 to make this easier/less penalty and configure sets -ffloat-store if g77 is detected. LAPACK calls Fortran functions to try to force stores, and that worked on versions of gcc <= 3.3.3 (and of course only machines with extended registers like ix86 need the subterfuge).

From my reading of the Intel compiler documentation, the equivalent option for icc and ifort is -mp, so I'll have a go at that.

Tim
--
Dr Tim Cutts
Informatics Systems Group
Wellcome Trust Sanger Institute
Hinxton, Cambridge, CB10 1SA, UK
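The kind of calculation at stake can be illustrated with the classic machine-epsilon loop. This is a simplified analogue written in R, not the code in dlamc.f: it halves a value until adding it to 1 no longer changes the stored result. Loops of this style rely on intermediate results being rounded to double precision, which is what forcing stores to memory is meant to guarantee on machines with extended-precision registers.

```r
# Halve eps until 1 + eps/2 is indistinguishable from 1 in double precision
eps <- 1
steps <- 0
while (1 + eps / 2 > 1) {
  eps <- eps / 2
  steps <- steps + 1
}

cat("computed epsilon:", eps, "after", steps, "halvings\n")
cat("R's own value:   ", .Machine$double.eps, "\n")
```

In R each intermediate value is stored as a double, so the loop terminates; a compiled routine whose intermediates are kept in wider registers can behave differently, which is consistent with the hang described above.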
Re: [R] Compiling R with Intel compilers - recommended options?
Sorry, I was a bit sparing with details. Early in the morning, and all that. Many thanks for the swift reply!

On 18 Jun 2004, at 9:11 am, Prof Brian Ripley wrote:

What OS and hardware is this?

Hardware:
700+ machines are 800 MHz Pentium III, 1 GB RAM, Red Hat 7.2
168 machines are dual 2.8 GHz Xeon, 4 GB RAM, Red Hat 8.0

I'm compiling on the older machines to make sure there won't be library problems. The compiler versions are:

Intel(R) C++ Compiler for 32-bit applications, Version 8.0 Build 20031016Z Package ID: l_cc_p_8.0.055
Intel(R) Fortran Compiler for 32-bit applications, Version 8.0 Build 20040412Z Package ID: l_fc_pc_8.

Eventually you mention `X86 Linux' but are you using x86 Linux and if so how fast processors? On a 3GHz machine make check takes a couple of minutes using gcc (and in our experience Portland Group is faster than gcc -- we have not tried Intel as local advice suggests it is slower than PG). So 12 hours is a bit excessive.

:-) At least now I know. Thanks!

Which version of R? R 1.9.0 does this with gcc 3.4.0 on Linux ix86, at the first use of LAPACK.

Ah. This is indeed 1.9.0, since that is what was asked for.

So if you are not trying 1.9.1beta (to be released on Monday), please do so. After that, make check does give output so where exactly does it hang, and if you interrupt that check, how far has it got in the output file?

With 1.9.0, I never get an output file. On interrupting the make check, it says:

make[4]: *** Deleting file `base-Ex.Rout'

and hangs again. I have to background it and kill it. The same thing happens with 1.9.1.

I think I'll use gcc/g77 as suggested by others to get a working install for now, and then I will look into a better optimised version at a more leisurely pace.

Regards,

Tim
[R] Compiling R with Intel compilers - recommended options?
Hi, I'm a sysadmin who's been tasked with installing R on our 1000-node compute cluster. I have licences for the Intel C and FORTRAN compilers, so I'm using the following to compile:

CFLAGS="-O2 -axWK" FFLAGS=$CFLAGS CXXFLAGS=$CFLAGS CC=icc F77=ifort CXX=icc FPICFLAGS=-fpic ./configure --without-x --without-tcltk

The compilation seems to go OK, with a few warnings. What concerns me is when I run the make check - how long should this take? I've had one running for over 12 hours now, and it's still on:

running code in 'base-Ex.R' ...

and hasn't produced any output since; it just sits there burning CPU.

I guess my question is: I'm sure someone has successfully compiled R on X86 Linux using the Intel compilers before - what options did you use to make it work? And how long should this check phase take?

Many thanks...

Tim