Re: [R] specifications windows pc

2008-11-28 Thread seanpor

Good morning Ruud,

What sort of tasks are you going to be doing in R?  Some tasks will be
faster on a single-core "extreme"-type processor, while other tasks can
benefit from a multi-core processor (which typically runs at a slower clock
speed than an extreme single-core part).  If you're working with large
matrices, then an optimized BLAS can help.
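
As a rough way to compare candidate machines (or BLAS builds) on matrix-heavy
work, one could time a fixed matrix product.  This is only a sketch: the size
n is an arbitrary choice, and asking extSoftVersion() for the BLAS assumes a
reasonably recent R (it was not available in R 2.x).

```r
# Time a fixed dense matrix product; most of the work is done by the BLAS.
set.seed(1)
n <- 500                                  # arbitrary size, adjust to taste
m <- matrix(rnorm(n * n), nrow = n)
timing <- system.time(p <- crossprod(m))  # computes t(m) %*% m
print(timing["elapsed"])

# Report which BLAS library this R session is linked against
# (assumes a recent R; on older versions this element may be missing).
print(extSoftVersion()["BLAS"])
```

Running the same script on each machine gives a like-for-like number for this
one kind of task; it says nothing about tasks that are not BLAS-bound.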

Do the problems you'll be working on require more than 1500 MB of RAM?  If so
then you should consider looking at a 64-bit Linux on a 64-bit CPU.

The more performance you're looking for - the more work you have to do to
get it!

As an aside - I don't know whether AMD or Intel processors are faster -
clock-speed for clock-speed or / bang-for-buck... doing R-ish tasks (int /
float etc)

Kind Regards,
Sean


R.H. Koning wrote:
 
 Hello, I am about to order a new workstation at my university that will be
 used for R (and other research related tasks). I would appreciate any
 feedback on the specifications of a very fast machine. The machine should
 run windows (XP probably better than vista). Which chip, memory size and
 specification, etc should I be looking for? Thanks, Ruud
 
 
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/specifications-windows-pc-tp20730325p20733228.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] How to create a string containing '\/' to be used with SED?

2008-11-26 Thread seanpor

Good morning,

You do not need to quote a forward slash / in R, but you do need to quote a
backslash when you're inputting it... so to get a string which actually
contains blah\/blah... you need to use blah\\/blah

http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-does-backslash-behave-strangely-inside-strings_003f

Unless this is a very, very big file you shouldn't need to go out to sed, as
gsub() should work adequately... and probably quicker and cleaner.  So
something along the lines of... (UNTESTED!!! since I don't have a
reproducible example)

tmp1 <- readLines(configurationFile)
tmp1 <- gsub("^instance .*", paste("instance = ", data$instancePath, "/",
data$newInstance, sep = ""), tmp1)


I'm working on 50 MB text files, and doing all sorts of manipulations and I
do it all inside R under Windows XP...  reading a 50 MB text file across the
100mb network and doing a gsub() on most lines takes an elapsed 16 seconds
on this office desktop.
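
A self-contained sketch of the in-R approach, for anyone who wants to try it.
The file contents and the `data` list are hypothetical stand-ins for the
poster's objects; since the poster's instancePath already ends in a slash, no
extra "/" is pasted in here.

```r
# Hypothetical stand-ins for the poster's objects.
data <- list(instancePath = "/home/TSPFiles/TSPLIB/",
             newInstance  = "berlin52.tsp")
configurationFile <- tempfile(fileext = ".ini")
writeLines(c("seed = 42",
             "instance = /home/TSPFiles/TSPLIB/old.tsp",
             "runs = 10"), configurationFile)

# Read the file, rewrite the line starting with "instance ", write it back.
cfg <- readLines(configurationFile)
new_line <- paste("instance = ", data$instancePath, data$newInstance, sep = "")
cfg <- sub("^instance .*", new_line, cfg)
writeLines(cfg, configurationFile)

print(readLines(configurationFile))
```

Forward slashes need no escaping in the replacement string.  Backslashes would
(e.g. in Windows paths), because sub() and gsub() treat a backslash in the
replacement specially.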

hth...

Regards,
Sean


ikarus wrote:
 
 Hi guys,
 I've been struggling to find a solution to the following issue:
 I need to change strings in .ini files that are given in input to a
 program whose output is processed by R. The strings to be changed looks
 like: 
 instance = /home/TSPFiles/TSPLIB/berlin52.tsp 
 
 I normally use Sed for this kind of things. So, inside R I'd like to write
 something like:
 
  command <- paste("sed -i 's/^instance .*/instance = ", data$instancePath,
data$newInstance, "/' ", configurationFile, sep = "")
  system(command)
 
 This will overwrite the line starting with instance  using instance =
 the_new_instance
 In the example I gave, data$instancePath = /home/TSPFiles/TSPLIB/ and 
 data$newInstance = berlin52.tsp
 
 The problem is that I need to pass the above path string to sed in the
 form:
 \/home\/TSPFiles\/TSPLIB\/ 
 
 However, I couldn't find a way to create such a string in R. I tried in
 several different ways,
 but it always complains saying that '\/' is an unrecognized escape!
  
 Any suggestion? 
 
 Thanks!
 

-- 
View this message in context: 
http://www.nabble.com/How-to-create-a-string-containing-%27%5C-%27-to-be-used-with-SED--tp20694319p20696613.html


Re: [R] memory limit

2008-11-26 Thread seanpor

Good afternoon,

The short answer is yes, the long answer is it depends.

It all depends on what you want to do with the data, I'm working with
dataframes of a couple of million lines, on this plain desktop machine and
for my purposes it works fine.  I read in text files, manipulate them,
convert them into dataframes, do some basic descriptive stats and tests on
them, a couple of columns at a time, all quick and simple in R.  There are
some packages which are set up to handle very large datasets, e.g. biglm
[1].
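
To get a feel for whether a dataset of this size fits comfortably in memory,
one can build a dummy data frame of the same shape and measure it.  The column
names and types below are made up for illustration; real data with character
columns or many more variables will be larger.

```r
# A hypothetical one-million-row data frame: one integer and two numeric columns.
set.seed(1)
n  <- 1e6
df <- data.frame(id = seq_len(n), x = rnorm(n), y = rnorm(n))

# Approximate memory used by the object itself.
print(format(object.size(df), units = "Mb"))

# Basic descriptive statistics, a couple of columns at a time.
print(summary(df[, c("x", "y")]))
```

Bear in mind that many operations make temporary copies, so peak memory use
during an analysis can be several times the size of the object.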

If you're using algorithms which require vast quantities of memory, then as
the previous emails in this thread suggest, you might need R running on
64-bit.

If you're working with a problem which is embarrassingly parallel[2], then
there are a variety of solutions - if you're in between then the solutions
are much more data dependent.

The flip question: how long would it take you to get up and running with the
functionality (tried and tested in R) you require if you're going to be
re-working things in C++?

I suggest that you have a look at R, possibly using a subset of your full
set to start with - you'll be amazed how quickly you can get up and running.

As suggested at the start of this email... it depends...

Best Regards,
Sean O'Riordain
Dublin

[1] http://cran.r-project.org/web/packages/biglm/index.html
[2] http://en.wikipedia.org/wiki/Embarrassingly_parallel


iwalters wrote:
 
 I'm currently working with very large datasets that consist out of
 1,000,000 + rows.  Is it at all possible to use R for datasets this size
 or should I rather consider C++/Java.
 
 
 

-- 
View this message in context: 
http://www.nabble.com/increasing-memory-limit-in-Windows-Server-2008-64-bit-tp20675880p20700590.html


Re: [R] How to create a string containing '\/' to be used with SED?

2008-11-26 Thread seanpor

What is the error message?  I can say

> fred <- "blah1\\/blah2\\/blah3"

and then the string looks like...

> cat("#", fred, '#\n', sep='')
#blah1\/blah2\/blah3#

If you just ask R to print it then it looks like...
> fred
[1] "blah1\\/blah2\\/blah3"


When you're playing with strings and regular expressions, it's vital to
understand the backslash quoting mechanism...
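
A short sketch of the same point, plus one way of producing the sed-style
escaped path from the original question.  The variable names are only
illustrative.

```r
# In source code each backslash is typed twice; the string stores it once.
s <- "blah1\\/blah2\\/blah3"
print(nchar(s))   # number of characters actually stored
cat(s, "\n")      # shows the string as stored: single backslashes
print(s)          # shows the escaped form, as you would type it

# Escape every "/" in a path as "\/" for use inside a sed expression.
# In the replacement, "\\\\" is two backslashes in the string, which
# gsub() interprets as one literal backslash.
path    <- "/home/TSPFiles/TSPLIB/"
escaped <- gsub("/", "\\\\/", path)
cat(escaped, "\n")
```

Alternatively, sed accepts other delimiters (e.g. s|old|new|), which avoids
having to escape the slashes at all.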

Best regards,
Sean


ikarus wrote:
 
 I still can't create a string containing \/  (e.g., a <-
 "..\\/path\\/file"
 doesn't work, R complains and the \\ are removed), ... snip
 
-- 
View this message in context: 
http://www.nabble.com/How-to-create-a-string-containing-%27%5C-%27-to-be-used-with-SED--tp20694319p20713699.html


Re: [R] exponential of a matrix

2008-11-11 Thread seanpor

Good morning,

Try expm() in the Matrix package by Douglas Bates and Martin Maechler
http://www.stats.bris.ac.uk/R/web/packages/Matrix/index.html

Note that there is a revised version of that paper, refer:
Cleve Moler and Charles Van Loan (2003) Nineteen dubious ways to compute the
exponential of a matrix, twenty-five years later. SIAM Review 45, 1, 3–49.
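
A minimal check of expm() against a case with a known closed form.  The
example matrix is my own choice: the exponential of [[0, -theta], [theta, 0]]
is the rotation matrix through angle theta.

```r
library(Matrix)   # a recommended package, normally installed alongside R

theta <- pi / 3
# Filled column by column, so Q is [[0, -theta], [theta, 0]].
Q <- Matrix(c(0, theta, -theta, 0), nrow = 2, sparse = FALSE)

E <- as.matrix(expm(Q))

# Closed-form result: rotation through angle theta.
R <- matrix(c(cos(theta), sin(theta), -sin(theta), cos(theta)), nrow = 2)

print(E)
print(all.equal(E, R, check.attributes = FALSE))
```

Note that this is the matrix exponential, not exp(Q), which exponentiates
each element separately.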

Regards,
Sean O'Riordain
[EMAIL PROTECTED]


Peter Dalgaard wrote:
 
 Terry Therneau wrote:
  Is the matrix exponential available in some package?
 
 Multiple. At least Matrix and msm. One of Jim Lindsey's too, but I think 
 that's one of the more dubious ones.
 
  The canonical reference is "Nineteen dubious ways to take the
 exponential of a matrix".  (Love that title)
 
 Yes, it's a classic. As I recall it, the paper misses one point, though: 
 You often want a fast way of computing exp(tQ) (or exp(tQ)%*%v or 
 u%*%exp(tQ)) for multiple values of t, and it is mainly about finding 
 exp(Q) as accurately as possible.
 
  Terry T.
 
 
 
 -- 
 O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
   (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45) 35327918
 ~~ - ([EMAIL PROTECTED])  FAX: (+45) 35327907
 
 
 

-- 
View this message in context: 
http://www.nabble.com/exponential-of-a-matrix-tp20449726p20454590.html


Re: [R] Pros and Cons of R

2008-05-23 Thread seanpor


Neil Shephard wrote:
 
 Another pro to consider is the cost, you can obtain R for free,
 SAS/S-Plus/Stata all have licenses of some sort that require purchasing.
 
 Neil
 

Which has the side effect of *not* restricting how many machines are
available for use or where; e.g. I was running a big process a couple of
different times with different scenarios, so I just fired up a few unused
machines and had them all running in parallel for the afternoon - no
installation issues as I was able to run it off the network drive (Windows
as it happens).  If I was licence restricted this would not have been
possible.

Similarly I can do analyses at home on any machine or even if I'm visiting
somewhere else!

Regards
Sean

-- 
View this message in context: 
http://www.nabble.com/Pros-and-Cons-of-R-tp17407521p17424335.html


Re: [R] How to plot wind direction and strength field

2008-04-29 Thread seanpor

Jenny,

Have a look at the R Newsletter Volume 3/2, October 2003
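
In the meantime, a base-graphics sketch using arrows().  This is not taken
from the newsletter article; the grid and the wind components below are made
up, and I'm assuming the u and v components are matrices indexed [lon, lat].

```r
# Hypothetical 6 x 5 grid; u and v are matrices indexed [lon, lat].
lon <- 0:5
lat <- 0:4
u <- outer(lon, lat, function(x, y) -(y - 2))     # west-east component
v <- outer(lon, lat, function(x, y)  (x - 2.5))   # south-north component

# expand.grid() varies lon fastest, which matches as.vector(u).
grid  <- expand.grid(lon = lon, lat = lat)
speed <- sqrt(as.vector(u)^2 + as.vector(v)^2)

# Scale so the longest arrow is a bit shorter than one grid cell (spacing 1).
scale <- 0.9 / max(speed)
x1 <- grid$lon + scale * as.vector(u)
y1 <- grid$lat + scale * as.vector(v)

# One arrow per grid point: direction from (u, v), length proportional to speed.
plot(grid$lon, grid$lat, type = "n", xlab = "lon", ylab = "lat", asp = 1)
arrows(grid$lon, grid$lat, x1, y1, length = 0.05)
```

arrows() warns about zero-length arrows, so grid points with no wind may need
to be dropped or drawn as points instead.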

Regards,
Sean


Jenny Barnes wrote:
 
 Dear R-help community,
 
 I have searched through the archives and not been able to find any advice 
 on how to plot a wind field with one arrow per grid square with the arrow 
 pointing in the direction of the wind and its size proportional to the 
 wind strength.
 
 I have the wind speed data in arrays of [lon,lat,uwind] and 
 [lon,lat,vwind] so it is broken down into u and v components. How do I 
 plot it though?!?!
 
 Any suggestions very welcome indeed - I seem to have hit a brick wall.
 
 All the best,
 
 Jenny
 
 ~~
 Jennifer Barnes
 PhD student: long range drought prediction 
 Climate Extremes Group
 Department of Space and Climate Physics
 University College London
 Holmbury St Mary 
 Dorking, Surrey, RH5 6NT
 Web: http://climate.mssl.ucl.ac.uk
 
 
 

-- 
View this message in context: 
http://www.nabble.com/generic-question--%3E-Genomics-with-R-tp16954827p16958167.html


Re: [R] select rows from data based on a vector of char strings

2008-04-24 Thread seanpor

or using the %in% operator...

?"%in%"

data[data$label %in% flist,]
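
Spelled out on the data from the original question (rebuilt by hand here, so
the data frame construction is my own):

```r
data <- data.frame(label = c("news", "fun", "milk", "food"),
                   freq1 = c(54, 37, 19, 3),
                   freq2 = c(35, 21, 7, 3),
                   stringsAsFactors = FALSE)
flist <- c("fun", "food")

# %in% asks, for each label, whether it appears anywhere in flist.
keep <- data$label %in% flist
selected <- data[keep, ]
print(selected)

# By contrast, == compares element by element and recycles flist,
# i.e. it compares against c("fun", "food", "fun", "food").
print(data$label == flist)
```

That recycling is why == gives unexpected results with a vector on the right
hand side: it is a position-by-position comparison, not a membership test.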

regards,
Sean


Applejus wrote:
 
 Hi,
 
 You are right the == doesn't work, but there's a workaround using regular
 expressions:
 
 flist <- "fun|food"
 grep(flist, data$label)
 
 will give you the vector [2 4] which are the numbers of the rows of
 interest!
 
 
 Dirkheld wrote:
 
 Hi,
 
 I have loaded a dataset in R :
 data = 
 
 label   freq1   freq2
 news    54      35
 fun     37      21
 milk    19      7
 food    3       3
  etc
 
 And I have a vector
 flist <- c("fun", "food")
 
 Now I want to use the vector 'flist' for selecting these values from
 'data' so that I get the following dataset :
 label   freq1   freq2
 fun  37  21
 food 3   3
 
 When I do 'data$label==flist[1]' I get 'F T F F', so it works for one
 item in the char vector flist.
 But, when I do 'data$label==flist' I get 'F F F F' while I expected 'F T
 F T'. It seems that I can't perform this action with a vector of
 charstrings? 
 
 Is there an other way to do so?
 
 
 
 

-- 
View this message in context: 
http://www.nabble.com/select-rows-from-data-based-on-a-vector-of-char-strings-tp16832735p16848199.html


Re: [R] Documentation General Comments

2008-04-23 Thread seanpor

Good morning,

Firstly I'd like to say that I'm a huge fan of R and I think it's a great
system.

Part of the problem in searching for information is knowing what buzzwords /
keywords to use.  I was recently caught out like this as I didn't see my
problem as a cumulative sum (keyword=cumsum) only as referencing one line of
a dataframe from another.  Academic papers and certain webpages add special
classification keywords to the text of a page to help.  Searching is a
general problem - not just within R - ask any archivist or librarian!

A partial solution is to have disambiguation pages,
e.g. http://en.wikipedia.org/wiki/Comma
Is it reasonable to have help pages with no specific R / package item behind
them, only a "See Also" section?  Does somebody have access to the most
frequent RSiteSearch() terms?

It would probably help to increase the number of "See Also" details - for
example when I run into a problem the first thing I do is try to recreate it
as a reproducible toy problem which I could send to this list (which
incidentally is actually a great way of figuring out the solution to the
problem without having to bother the list!!!).  To do this I invariably want
to generate some random numbers, and I can never remember the names runif()
or rnorm() so I say help.search("random") which doesn't actually reference
either runif() or rnorm() directly so I look at ?RNG which leads me to
rnorm() - and already knowing that this is what I'm looking for I'm ok - but
if somebody didn't already know this it is not obvious.
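
One thing that sometimes helps when you half-remember a name is apropos(),
which searches the names of visible objects by regular expression.  The
patterns below are just examples, and this only helps once you can guess part
of the name.

```r
# Functions whose names start with "cum" (cumsum, cumprod, ...).
cum_funs <- apropos("^cum")
print(cum_funs)

# Random number generators in stats follow an r<distribution> naming pattern.
rng_funs <- apropos("^r(unif|norm|binom|pois)$")
print(rng_funs)
```

It does not solve the keyword problem itself: if you don't think of your
problem as a "cumulative sum", no pattern will lead you to cumsum().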

I appreciate that there is always a difficult balance when writing
documentation between having enough and too much.  Just looking at the core
documentation for R-2.6.2 (and ignoring the many many additional packages)
The introduction to R is 100 pages of PDF and the reference manual runs to
1,576 pages of PDF.  Adding more information as many of us want would make
the reference manual even more unwieldy and far too big to print out to
peruse, which gives rise to a market for books which take over where the
introduction manual leaves off...

Part of the difficulty that we encounter is that sometimes our difficulties
are pure R, and other times the difficulty is statistical or mathematical -
more often than not the problem is between the two... and frequently those
of us asking the question don't actually know where on the spectrum it is...

Q: Could there be ways other than submitting a bug / patch to help improve
R?

Q: Should this discussion be on r-devel or r-help?


Best Regards,
Sean




The root of the problem is that R is a voluntary/cooperative project  
and those who develop and maintain R are (generously) contributing their
time and
probably have little-to-no time left over to devote to the  
improvement of the documentation.
snip...

This is why the documentation tends to be opaque in the first place.   
The people who build R are so clever and understand so much that
they cannot put themselves in the shoes of those of us who are
not so blessed with intelligence and erudition.  So they (often)
write terse cryptic instructions which (often) depend on background
knowledge that many of us lack.  That background knowledge can
of course be found ***if you know where to look***
--- or even if you don't, given that you are prepared to put in  
sufficient time and effort searching ***and*** are clever at
searching.  It's that last requirement that leaves *me* out in the cold.

snip...

-- 
View this message in context: 
http://www.nabble.com/Documentation-General-Comments-tp16821085p16833353.html


[R] running balance down a dataframe referring back to previous row

2008-03-19 Thread seanpor

Good morning, I've searched high and low and I've tried many different ways
of doing this, but I can't seem to get it to work.

I'm looking for a way of vectorising a running balance; i.e. the value in
the first row of the dataframe is zero, and following rows add to this
running balance.  This is easy to write in a loop, but I can't seem to get
it working in vectorised code.  Hopefully the example below will explain
what I'm trying to do...

#
# create a dummy dataframe
txns <- data.frame(LETTERS)
set.seed(123)
# randomly specify debit / credit columns
txns$drcr <- sample(c('d','c'), nrow(txns), replace=TRUE)
txns$dr <- 0
txns$cr <- 0
# give values to the debits / credits...
txns[txns$drcr == 'd', 'dr'] <- runif(nrow(txns[txns$drcr == 'd',]), min=0, max=1)
txns[txns$drcr == 'c', 'cr'] <- runif(nrow(txns[txns$drcr == 'c',]), min=0, max=1)

# reset the initial dr/cr value to zero...
txns[1,'dr'] <- 0
txns[1,'cr'] <- 0

# initialize the entire running balance column to zero
txns$rbal <- 0

# set up a row index starting at row 2 so that we only operate on these rows...
r0 <- c(2:nrow(txns))

# set up a row index offset by 1 so that we can access the running balance
# from the previous line...
r1 <- c(2:nrow(txns)) - 1

# calculate the running balance using vectorized code; unfortunately this
# doesn't work...
txns[r0,'rbal'] <- txns[r1,'rbal'] + txns[r0,'dr'] - txns[r0,'cr']

# calculate the running balance using a loop
txns$running.bal <- 0
for (i in (2:nrow(txns))) {
    txns[i,'running.bal'] <- txns[(i-1), 'running.bal'] + txns[i, 'dr'] - txns[i, 'cr']
}
txns
#

I was hoping that rbal and running.bal would be the same... evidently not...
I've even tried --vanilla...

Is there a specified order in which vectorized dataframe calculations are
carried out? Top to bottom or unspecified?  Does it work off a copy and then
replace the old column?  Do I just have to use a loop for this?
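
For reference, a running balance like this is a cumulative sum, so cumsum()
gives a vectorised version.  The sketch below rebuilds a similar dummy data
frame (the `id` and `amount` names are my own) and checks cumsum() against
the loop.

```r
set.seed(123)
txns <- data.frame(id = LETTERS)
txns$drcr <- sample(c("d", "c"), nrow(txns), replace = TRUE)
amount  <- runif(nrow(txns))
txns$dr <- ifelse(txns$drcr == "d", amount, 0)
txns$cr <- ifelse(txns$drcr == "c", amount, 0)
txns[1, c("dr", "cr")] <- 0          # first row starts the balance at zero

# Vectorised running balance: cumulative sum of the net movement per row.
txns$rbal <- cumsum(txns$dr - txns$cr)

# The same thing with an explicit loop, for comparison.
loop_bal <- numeric(nrow(txns))
for (i in 2:nrow(txns)) {
    loop_bal[i] <- loop_bal[i - 1] + txns$dr[i] - txns$cr[i]
}

print(all.equal(txns$rbal, loop_bal))
```

The indexed assignment in the original attempt fails because the whole right
hand side is evaluated before anything is assigned: every previous-row balance
it reads is still the initial zero, so each row only gets its own dr - cr.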

   
platform       i386-pc-mingw32
arch           i386
os             mingw32
system         i386, mingw32
status
major          2
minor          6.2
year           2008
month          02
day            08
svn rev        44383
language       R
version.string R version 2.6.2 (2008-02-08)


Many thanks in advance,

Best regards,
Sean O'Riordain

-- 
View this message in context: 
http://www.nabble.com/running-balance-down-a-dataframe-referring-back-to-previous-row-tp16142263p16142263.html


Re: [R] spliting strings ...

2007-12-13 Thread seanpor

Good afternoon Monica,

Relying on regular expressions, substituting nothing ("") for everything
starting with a space until the end of the line (i.e. up to the dollar sign
anchor)

str1 <- sub(" .*$", "", str)
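
Run on the vector from the question, alongside a strsplit() version without
the explicit loop.  The names `words`, `first_sub` and `first_split` are
illustrative; I've avoided `str` as a variable name since it is also a base
function.

```r
words <- c("aaa bbb", "cc", "d eee aa", "mmm o n")

# Delete everything from the first space to the end of each string.
first_sub <- sub(" .*$", "", words)

# Alternative: split on spaces and take the first piece of each element.
first_split <- vapply(strsplit(words, " ", fixed = TRUE),
                      function(w) w[1], character(1))

print(first_sub)
print(identical(first_sub, first_split))
```

Strings with no space at all (like "cc") are left unchanged by sub(), since
the pattern simply does not match.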

Regards,
Sean


Monica Pisica wrote:
 
 
 Hi everyone,
  
 I have a vector of strings, each string made up by different number of
 words. I want to get a new vector which has only the first word of each
 string in the first vector. I came up with this:
  
 str <- c('aaa bbb', 'cc', 'd eee aa', 'mmm o n')
 str1 <- rep(1, length(str))
 for (i in 1:length(str)) {
  str1[i] <- strsplit(str, " ")[[i]][1]
 }
  str1
 'aaa'   'cc'   'd'  'mmm'
  
 Now, is there any way to do this simpler?
  
 Thanks,
  
 Monica
  
 
 
 

-- 
View this message in context: 
http://www.nabble.com/spliting-strings-...-tp14316255p14316361.html


Re: [R] Re placing values job

2007-11-29 Thread seanpor

FYI, on my machine match() runs *much* faster...

> t0 <- Sys.time(); for (i in 1:reps) { match(Y,X) }; print(Sys.time() - t0)
Time difference of 0.1570001 secs
> t0 <- Sys.time(); for (i in 1:reps) { sapply(Y,function(Y){which(Y==X)}) }; print(Sys.time() - t0)
Time difference of 6.093 secs
> 6.09/.157
[1] 38.78981
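
A self-contained version of the comparison, using system.time() and first
checking that both approaches give the same answer.  The value of `reps` here
is a guess, as the original value isn't shown, and timings will of course
differ by machine and R version.

```r
X <- c(2, 6, 1, 7, 4, 3, 5)
Y <- c(1, 1, 6, 4, 6, 1, 4, 1, 2, 3, 6, 6, 1, 2, 4, 4, 5, 4, 1, 7, 6, 6, 4, 4, 7, 1, 2)
reps <- 1000   # hypothetical repetition count

# Both give, for each element of Y, its position in X.
idx_match  <- match(Y, X)
idx_sapply <- sapply(Y, function(y) which(y == X))
print(all(idx_match == idx_sapply))

# Time each approach over the same number of repetitions.
t_match  <- system.time(for (i in seq_len(reps)) match(Y, X))
t_sapply <- system.time(for (i in seq_len(reps)) sapply(Y, function(y) which(y == X)))
print(t_match["elapsed"])
print(t_sapply["elapsed"])
```

The two are only interchangeable when every value of Y occurs exactly once in
X, as here: match() returns NA for a missing value and the first position for
a duplicated one, whereas which() returns zero or several indices.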

Regards,
Sean


Peter Dalgaard wrote:
 
 Ingmar Visser wrote:
 does this do what you want?

 sapply(y,function(y){which(y==x)})
   
 Maybe, but match(Y,X) would be more to the point.
 
 hth, Ingmar

 On 28 Nov 2007, at 15:53, Serguei Kaniovski wrote:

   
 Hallo,

 I have two vectors of different lengths which contain the same set of
 values:

 X <- c(2,6,1,7,4,3,5)
 Y <- c(1,1,6,4,6,1,4,1,2,3,6,6,1,2,4,4,5,4,1,7,6,6,4,4,7,1,2)

 How can I replace the values in Y with the index (!) of the
 corresponding values in X. So 2 appears in X in the first coordinate,
 so all 2's in Y should be replaced by 1, etc.

 Thank you for your help,
 Serguei

 
 Austrian Institute of Economic Research (WIFO)

 P.O.Box 91  Tel.: +43-1-7982601-231
 1103 Vienna, AustriaFax: +43-1-7989386

 Mail: [EMAIL PROTECTED]
 http://www.wifo.ac.at/Serguei.Kaniovski
 [[alternative HTML version deleted]]

 

 Ingmar Visser
 Department of Psychology, University of Amsterdam
 Roetersstraat 15
 1018 WB Amsterdam
 The Netherlands
 t: +31-20-5256723



  [[alternative HTML version deleted]]

   
 

   
 
 
 -- 
O__   Peter Dalgaard Øster Farimagsgade 5, Entr.B
   c/ /'_ --- Dept. of Biostatistics PO Box 2099, 1014 Cph. K
  (*) \(*) -- University of Copenhagen   Denmark  Ph:  (+45)
 35327918
 ~~ - ([EMAIL PROTECTED])  FAX: (+45)
 35327907
 
 
 

-- 
View this message in context: 
http://www.nabble.com/Replacing-values-job-tf4889131.html#a14021232