Re: [R] Maximum amount of memory

2005-03-21 Thread Tim Cutts
On 21 Mar 2005, at 4:42 pm, [EMAIL PROTECTED] wrote:
Hi,
I have a problem: I need to use the maximum amount of memory in order to
perform a very tough analysis. By purchasing a suitable computer, what's
the maximum amount of memory obtainable in R?
Assuming that R is happy to use 64-bit memory pointers, the limit will 
be your wallet.  You could buy an SGI Altix and just keep buying more 
and more memory for it.  I don't know the limit - I know that SGI have 
sold one machine in Japan with 13 terabytes of memory.  We have two of 
them here with 192 GB of RAM each, but I haven't tried R on them yet - 
they're used for other things.
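
A minimal sketch for checking whether a given R build uses 64-bit pointers, and for estimating how much memory a dense numeric matrix needs. The helper matrix_gb and the dimensions passed to it are hypothetical, chosen only for illustration; the estimate ignores object headers and any temporary copies R makes.

```r
# 8 on a 64-bit build of R, 4 on a 32-bit build.
pointer_bytes <- .Machine$sizeof.pointer
pointer_bytes

# Rough size in GB of a dense numeric matrix: 8 bytes per double.
# (Hypothetical helper, not part of R.)
matrix_gb <- function(nrow, ncol) nrow * ncol * 8 / 1024^3

# Hypothetical dimensions for a large analysis.
matrix_gb(500000, 500)
```

An analysis usually needs several times the size of its largest object, because intermediate copies are made along the way.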

Whether such a course of action is sensible is another matter.  Large 
memory machines rapidly become *extremely* expensive; once you have to 
use DIMMs larger than 1GB each, the price becomes prohibitive.  
Consider spending the same amount of money on employing several 
programmers and/or statisticians to break your problem down into 
smaller tasks that are tractable on smaller machines.

Our 192 GB machine cost quite a lot more than 192 desktop PCs with 1GB 
of RAM each.  In fact, the memory becomes so expensive the rest of the 
machine is virtually free, in comparison.  :-)

If you can get away with more modest amounts of memory, then a machine 
like the HP DL-585 might suit you - a quad processor Opteron, which can 
take up to 32GB or so of memory.  Fairly modest price.

Tim
--
Dr Tim Cutts
Informatics Systems Group, Wellcome Trust Sanger Institute
GPG: 1024D/E3134233 FE3D 6C73 BBD6 726A A3F5  860B 3CDD 3F56 E313 4233
__
R-help@stat.math.ethz.ch mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html


Re: [R] X11 Fonts sizes

2005-03-21 Thread Tim Cutts
On 21 Mar 2005, at 11:09 am, Wolfgang Waser wrote:
In postscript graphs (pointsize = 10, different sizes in graph adjusted via
cex) I would like to use different font sizes but get the following warning
message:

Warning messages:
1: X11 used font size 8 when 9 was requested
2: X11 used font size 8 when 7 was requested
3: X11 used font size 8 when 5 was requested
This is probably not an R but an X11 problem; nevertheless I would be most
obliged for any help on how to actually use font sizes 9, 7, and 5 and
others.
X11 default fonts are usually bitmaps, and not scalable, so if you 
don't have the 9, 7 and 5 point versions of the font installed, it may 
be automatically reverting to the nearest decent point size.  Have you 
tried changing the font to something that is scalable, like a 
PostScript or TrueType font?
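
If the end product is a PostScript file, one way to sidestep the X server's fonts altogether is to draw directly on R's postscript() device, which uses scalable fonts. This is only a sketch: the output path and the plotted data are made up, and the cex values are chosen so that a pointsize of 10 gives text at roughly 9, 7 and 5 points.

```r
# Write straight to a PostScript file; no X11 window is opened, so the
# X server's bitmap font sizes do not come into play.
out <- tempfile(fileext = ".ps")   # hypothetical output path
postscript(out, pointsize = 10)
plot(1:10, main = "Title", cex.main = 0.9, cex.axis = 0.7)
text(5, 5, "small label", cex = 0.5)
dev.off()
```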

You can also change the X server's configuration so that it will 
attempt to scale bitmap fonts, but the results are not pretty, so I 
never enable that.

Tim


Re: [R] R: install - Perl dependency problem in Debian/Damn Small Linux

2005-03-15 Thread Tim Cutts
On 14 Mar 2005, at 12:46 pm, Stuart Leask wrote:
(I vaguely recall a suggestion that there be an r-debian list, which this
would be appropriate for, but I can't find such a list if it exists.
Apologies.)

OS: Damn Small Linux 1.0rc1 - a tiny (50MB) debian/knoppix-based distro
I can't apt-get R.
[EMAIL PROTECTED] apt-get  install -testing r-base r-base-core r-recommended
Reading Package Lists... Done
Building Dependency Tree... Done
Some packages could not be installed. This may mean that you have
requested an impossible situation or if you are using the unstable
distribution that some required packages have not yet been created
or been moved out of Incoming.
The following information may help to resolve the situation:

The following packages have unmet dependencies:
  r-base-core: Depends: perl but it is not going to be installed
E: Broken packages
PROBLEM 1: Perl is certainly installed. However, it may have various bits
removed to save space.

However, if I try to correct this with:
apt-get install perl
I get:
The following packages have unmet dependencies:
  perl: depends perl-base (=5.6.1-8.7) but 5.8.0-18 is to be installed
E: broken packages
I've tried different feeds eg. -testing, but get the same problem.
Has anyone come across this, or have any ideas?
It's the perl upgrade that's the problem, obviously.  Is there any 
particular reason why you're using this tiny distribution rather than 
just using plain, normal Debian?

Trying to upgrade this with full Debian packages sounds like asking for 
a world of pain, especially since the package versions in testing, in 
particular, are way ahead of what your existing system is using.  It's 
a bit like trying to install Fedora packages on a Red Hat 7.2 box; i.e. 
pretty unlikely to work.

You might do better to actually compile R from scratch on your system, 
rather than installing the binary packages.

Tim


Re: [R] Concatenate vector into string

2005-03-04 Thread Tim Cutts
On 4 Mar 2005, at 11:32 am, Matthieu Cornec wrote:
Hello,
I would like to convert c("a","b","c") into "abc".
Anyone could help?
paste(c("a", "b", "c"), collapse="")
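
The difference between the sep and collapse arguments of paste() can be seen in a short sketch (the vector x is just the example from the question):

```r
x <- c("a", "b", "c")

# sep only goes between the separate arguments of paste(); with a single
# vector argument the result is still a vector of three strings.
paste(x, sep = "")

# collapse joins the elements of the result into one string: "abc"
paste(x, collapse = "")
```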
Tim


[R] Is aggregate() what I need here?

2005-03-03 Thread Tim Cutts
I'm pretty new to R, and I've been given a script by a user who wants 
some help with it.  I know enough about the way R works to know that 
this is a very inefficient way to do what the user wants (the 
LSB_JOBINDEX stuff is added by me so that this can work on many 
hundreds of input data files as LSF jobs - it's the nested loops I'm 
really interested in):

gtpfile<-paste("genotype_chunk.", sep="", Sys.getenv("LSB_JOBINDEX"))
snps<-read.table(gtpfile,header=TRUE)
exp<-read.table("data.TXT",header=TRUE)
# the above two files have columns for individuals and snps (or genes) for rows
# so the next two lines simply transpose these data matrices
tsnps<-t(snps)
texp<-t(exp)
sink(paste("output.", sep="", Sys.getenv("LSB_JOBINDEX")))
# loops below are hardwired for 5 gene-expression levels (some genes have two
# probes, and those are treated as separate genes for now) and 100 SNPs.
for (iexp in 1:5){
   for (isnp in 1:100){
      genotype<-factor(tsnps[,isnp])
      # make sure there is more than one genotypic class before doing ANOVA
      if (length(unique(genotype))>1)  {
         expression<-texp[,iexp]
         stuff<-anova(lm(expression ~ genotype))
         qq=c(iexp,isnp)
         print (qq)
         print(stuff)
         # plot(factor(tsnps[,isnp]),texp[,iexp], xlab="Genotype", ylab="Expression level")
      }
   }
}
sink()

Eventually, on the full data set, the size of the two data sets being 
related to each other will be much larger - several hundred thousand 
rows in snps, and several hundred rows in exp.

Clearly, the genotype<-factor line is being calculated repeatedly, and 
in the wrong place; it should be done en masse up front once the data 
has been read, and I'm sure both loops could actually be expressed some 
other way.

It seems to me that I ought to be able to calculate all the factors in 
a statement before the iexp loop, presumably by using apply() or 
aggregate(), but apply() seems to coerce the result so the results 
aren't factors any more, and I'm clearly missing something with regard 
to the way aggregate() works; I really don't understand what I'd need 
to set the 'by' parameter to.
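
One possible way to build all the factors up front is lapply() over the columns of a data frame, which returns a list and so keeps each element a factor. The sketch below is not a tested replacement for the script above: the toy matrices stand in for tsnps and texp, and the names genotypes, expr_values and results are made up for illustration.

```r
# Toy stand-ins for the transposed matrices (rows = individuals,
# columns = SNPs or expression probes). All values are invented.
tsnps <- cbind(snp1 = rep(c("AA", "AB", "BB"), length.out = 10),
               snp2 = rep(c("AA", "AB"), each = 5),
               snp3 = rep("AA", 10))          # monomorphic, must be skipped
set.seed(1)
texp <- matrix(rnorm(20), nrow = 10,
               dimnames = list(NULL, c("gene1", "gene2")))

# lapply() returns a list, so every column stays a factor; apply() would
# simplify the result back into a character matrix.
genotypes <- lapply(as.data.frame(tsnps, stringsAsFactors = FALSE), factor)

# Keep only SNPs with more than one genotypic class.
genotypes <- Filter(function(g) nlevels(g) > 1, genotypes)

# One ANOVA table per (gene, SNP) pair, stored in a nested list.
results <- lapply(as.data.frame(texp), function(expr_values) {
  lapply(genotypes, function(genotype) anova(lm(expr_values ~ genotype)))
})

results$gene1$snp1
```

The factors are now computed once per SNP rather than once per (gene, SNP) pair, and the results are kept as R objects instead of being printed through sink(). Whether holding everything in memory is practical for several hundred thousand SNPs is a separate question.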

I apologise if this is a hopelessly naive question, but I've got both 
the "Introduction to R" and Peter's introductory statistics with R book 
here, and I'm having some difficulty finding the information I'm after. 
"Introduction to R" doesn't mention the word "aggregate" once... :-)

Thanks in advance,
Tim


Re: [R] A possible way to reduce basic questions

2004-12-02 Thread Tim Cutts
On 2 Dec 2004, at 1:23 am, Gabor Grothendieck wrote:
Jim Lemon  ozemail.com.au> writes:
I have been thinking about how to reduce the number of basic questions that
elicit the ...ahem... robust debate that has occurred about how to answer

The traffic on r-help could be reduced by creating a second list where
more elementary questions are asked.
But how many people here would read it, and help the novices (like me) 
out?  There is always the danger that novice lists just become 
write-only lists.

Tim


Re: [R] The hidden costs of GPL software?

2004-11-18 Thread Tim Cutts
On 18 Nov 2004, at 10:27 am, Tim Cutts wrote:
The R Intro PDF is good, but it would be nice if it were integrated 
better, with hyperlinks to the reference documentation, or to other 
parts of the introduction, for those platforms that support such things
I should correct myself here, and note that there are some 
cross-references within the PDF document, it's not completely devoid of 
them.

Tim


Re: [R] The hidden costs of GPL software?

2004-11-18 Thread Tim Cutts
On 17 Nov 2004, at 2:27 pm, Patrick Burns wrote:
I think Ted Harding was on  the mark when he said that it is the help
system that needs enhancement.  I can imagine a system that gets the
user to the right function and then helps fill in the arguments; all of the
time pointing them towards the command line rather than away from it.
I think this is spot on.  My situation is that I am a scientist turned 
system administrator, and R is a package which I am increasingly being 
asked to install for the use of scientists at this Institute.  I am by 
no means a statistician;  the statistics I learned in A-level maths 
almost 20 years ago were as far as I got, and most of that I have 
forgotten.  But I like to have some understanding of the software 
packages I am asked to support, so I've been looking at R with a view 
to learning some of its more basic functions.  It looks potentially 
very useful to me anyway for summarising activity on the supercomputing 
cluster that I run.

So I'm a newbie to R, armed with only a very basic knowledge of 
statistics (I know the difference between a Normal and a Poisson 
distribution at least, and with a bit of prodding could probably 
remember a binomial distribution too).  I'm an experienced programmer 
in several languages, and a PhD-level scientist.

And yet I have still found R really quite hard to learn, and this is 
principally because the on-line help is a reference manual.  I'm sure 
it's a fabulous resource if you're a statistician who uses R every day, 
but for me it's not very helpful.

The R Intro PDF is good, but it would be nice if it were integrated 
better, with hyperlinks to the reference documentation, or to other 
parts of the introduction, for those platforms that support such things 
(it looks like this was intended for MacOS X, which is the version I am 
playing with for my own use, although the version I maintain for users 
is on Linux [ and would be on Alpha/Tru64 too if I could get it to pass 
its tests ]) but the on-line help link to the Intro on the Aqua R 
version brings up a blank page, so I'm using the generic PDF document 
instead.

I think the GUI question has nothing to do with the hidden costs of the 
GPL, or otherwise.  This is the age-old ease-of-use versus power and 
capability argument.

I don't think a fancy GUI is necessary - the GUI aspects that have been 
added to R on Mac OS X are sufficient.  I get the impression that the 
real power of R is the fact that really it's a programming language, 
and should probably be treated and learned as such.  Quite apart from 
the fact that a GUI will necessarily be a somewhat restricted subset of 
the total functionality, and a lot slower to use once you've taken the 
effort to learn the software, I think there is another danger, which I 
have already seen in other pieces of software in the bioinformatics 
community.  Users frequently run completely pointless analyses through 
the GUI wrappers we provide.  The users using the command line 
interfaces typically do much more sensible things.

If you make a piece of software trivial for a user to use without 
thinking about what they're doing, then the users won't think.  I may 
not know much about statistics, but what little I do know is that 
understanding exactly what form of analysis or significance test is 
required to be meaningful is a real skill that takes a lot of 
experience to master.   Having to perform that analysis with written 
commands means that your method is recorded, and could be published, 
and more importantly be checked and reproduced by other researchers.  
It also gives you ample time to think about what you're doing, rather 
than just bashing out a pretty graph which actually has no real meaning 
whatsoever.

Any GUI to R could (and should) be able to store the command line 
equivalent to what it has just done, to satisfy the reproducible 
criterion above, but I suspect it could still lead to some pretty 
shoddy work being done by careless and lazy scientists, and we get 
enough of that already.

Tim


Re: [R] R 2.0.0 Installation Problem

2004-11-17 Thread Tim Cutts
On 17 Nov 2004, at 9:00 am, Uwe Ligges wrote:
David Rocke wrote:
I and my students have been having an odd problem with this release,
which is that packages are disappearing. After installation the
package is found with the library command, but later in the same
session or in a later session, the library command returns a not found
error. Then later it is back. Happening on both Windows and OS X,
mostly but not entirely with Bioconductor packages.
Please tell us more details. It never happened to anybody else that 
packages disappear. Are the files still at the correct location? Have 
you installed into another library tree?
Have you installed them on a network volume?  I've seen this sort of 
behaviour with overloaded NFS servers in the past.
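
A quick way to see where R is looking for packages, and whether a package's files are still there, is sketched below. It assumes a current version of R (find.package() may not be available in older releases), and "stats" is only a stand-in for whichever package goes missing.

```r
# Library trees R searches, in order; a network (NFS) path here would be
# a suspect.
.libPaths()

pkg <- "stats"   # stand-in; substitute the package that goes missing

# Directory the package would be loaded from (an error if it is not found).
pkg_path <- find.package(pkg)
pkg_path

# Are the package's files still present at that location?
file.exists(file.path(pkg_path, "DESCRIPTION"))
```

Running this both when the package loads and when it fails should show whether the library tree itself is changing or the files are temporarily unreachable.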

Tim


Re: [R] R under Pocket PC

2004-11-10 Thread Tim Cutts
On 9 Nov 2004, at 5:14 pm, Peter Dalgaard wrote:
PalmOS would probably be right out.  :-)
I think you are belittling the work done to get R running on as wide a
range of platforms as it does.
?
I only saw a bit of excessive pessimism in that remark.
Yes, no disrespect was intended at all.  Porting to PalmOS is quite a 
different kettle of fish compared to porting between (say) Windows and 
UNIX, because the OS doesn't have a libc; if you use almost any 
standard C library function your binary becomes huge (because it 
needs to be statically linked with the C library that comes with your 
cross compiler).  A true PalmOS port of any piece of software often 
requires a lot of work to replace standard C library calls with the 
equivalents built into the machine.  Someone may have created some sort 
of wrapper C library, but when I last wrote any software for Palm 
devices, such a thing did not exist.

Admittedly, that was about two years ago, when I wrote some stuff for 
PalmOS 4.

Tim


Re: [R] R under Pocket PC

2004-11-09 Thread Tim Cutts
On 9 Nov 2004, at 12:27 pm, Prof Brian Ripley wrote:
On Tue, 9 Nov 2004, Lars Strand wrote:
Will R run under Windows Pocket PC?
We don't know!
There are no binary versions of R for that platform, but perhaps you could
find a suitable compiler and manage to build the sources.

Outside pure mathematics it is usually very hard to establish that
something cannot be done (and it can be very hard in pure mathematics,
too).
Do PocketPCs generally have enough memory to run something as big as R? 
 A standard R install requires a lot of storage space, by PDA 
standards...

Technically, I don't suppose there's much to stop R running on some 
PDAs, especially those based on Linux like the Sharp Zaurus, other than 
storage and memory requirements.  If there's a Qt graphical interface 
available for R, you could even get graphics working on the Zaurus, 
potentially.  The Windows API on Pocket PC is quite a reduced subset 
compared to full Windows, so you might have problems with that.  PalmOS 
would probably be right out.  :-)

Tim


Re: [R] Compiling R with Intel compilers - recommended options?

2004-06-18 Thread Tim Cutts
On 18 Jun 2004, at 11:58 am, Prof Brian Ripley wrote:
Fine.  The issue on gcc 3.4.0 is that if you compile
src/modules/lapack/dlamc.f without -ffloat-store it loops forever.
I suspect you need to compile it without optimization or with the
equivalent of -ffloat-store.  It's a separate file in 1.9.1 to make this
easier/less penalty and configure sets -ffloat-store if g77 is detected.
LAPACK calls Fortran functions to try to force stores, and that worked on
versions of gcc <= 3.3.3 (and of course only machines with extended
registers like ix86 need the subterfuge).
From my reading of the Intel compiler documentation, the equivalent 
option for icc and ifort is -mp, so I'll have a go at that.

Tim
--
Dr Tim Cutts
Informatics Systems Group
Wellcome Trust Sanger Institute
Hinxton, Cambridge, CB10 1SA, UK


Re: [R] Compiling R with Intel compilers - recommended options?

2004-06-18 Thread Tim Cutts
Sorry, I was a bit sparing with details.  Early in the morning, and all 
that.

Many thanks for the swift reply!
On 18 Jun 2004, at 9:11 am, Prof Brian Ripley wrote:
What OS and hardware is this?
Hardware:  700+ machines are 800 MHz Pentium III, 1 GB RAM, Red Hat 7.2
   168 machines are dual 2.8 GHz Xeon, 4 GB RAM, Red Hat 8.0
I'm compiling on the older machines to make sure there won't be library 
problems.

The compiler versions are:
Intel(R) C++ Compiler for 32-bit applications, Version 8.0   Build 20031016Z Package ID: l_cc_p_8.0.055

Intel(R) Fortran Compiler for 32-bit applications, Version 8.0   Build 20040412Z Package ID: l_fc_pc_8.


Eventually you mention `X86 Linux' but are
you using x86 Linux and if so how fast processors?  On a 3GHz machine make
check takes a couple of minutes using gcc (and in our experience Portland
Group is faster than gcc -- we have not tried Intel as local advice
suggests it is slower than PG).
So 12 hours is a bit excessive.  :-)  At least now I know.  Thanks!
Which version of R?  R 1.9.0 does this with gcc 3.4.0 on Linux ix86, at
the first use of LAPACK.
Ah.  This is indeed 1.9.0, since that is what was asked for.
So if you are not trying 1.9.1beta (to be released on Monday), please do
so.  After that, make check does give output so where exactly does it hang,
and if you interrupt that check, how far has it got in the output file?
With 1.9.0, I never get an output file.  On interrupting the make 
check, it says:

make[4]: *** Deleting file `base-Ex.Rout'
and hangs again.  I have to background it and kill it.
The same thing happens with 1.9.1.  I think I'll use gcc/g77 as 
suggested by others to get a working install for now, and then I will 
look into a better optimised version at a more leisurely pace.

Regards,
Tim


[R] Compiling R with Intel compilers - recommended options?

2004-06-18 Thread Tim Cutts
Hi,
I'm a sysadmin who's been tasked with installing R on our 1000-node 
compute cluster.
I have licences for the Intel C and FORTRAN compilers, so I'm using the 
following to compile:

CFLAGS="-O2 -axWK"
FFLAGS=$CFLAGS
CXXFLAGS=$CFLAGS
CC=icc
F77=ifort
CXX=icc
FPICFLAGS=-fpic
./configure --without-x --without-tcltk
The compilation seems to go OK, with a few warnings.  What concerns me 
is when I run the make check - how long should this take?  I've had one 
running for over 12 hours now, and it's still on:

running code in 'base-Ex.R' ...
and hasn't produced any output since; it just sits there burning CPU.  
I guess my question is:  I'm sure someone has successfully compiled R 
on X86 Linux using the Intel compilers before - what options did you 
use to make it work?  And how long should this check phase take?

Many thanks...
Tim