date:20110802

2011-08-02 Thread Thaler, Thorn, LAUSANNE, Applied Mathematics


On Aug 2, 2011, at 08:02 , Rolf Turner wrote:

 
 
 Why does R think these numbers ***are*** equal?
 
 In a somewhat bizarre set of circumstances I calculated
 
x0 <- 0.03580067
x1 <- 0.03474075
y0 <- 0.4918823
y1 <- 0.4474461
dx <- x1 - x0
dy <- y1 - y0
xx <- (x0 + x1)/2
yy <- (y0 + y1)/2
chk <- yy*dx - xx*dy + x0*dy - y0*dx
 
 If you think about it ***very*** carefully ( :-) ) you'll see that ``chk'' 
 ought to be zero.
 
 Blow me down, R gets 0.  Exactly.  To as many significant digits/decimal
 places as I can get it to print out.
 
 But I wrote a wee function in C to do the *same* calculation and
 dyn.load()-ed it and called it with .C().  And I got -1.248844e-19.
 
 This is of course zero, to all floating point arithmetic intents and
 purposes.  But if I name the result returned by my call to .C() ``xxx''
 and ask
 
xxx >= 0
 
 I get FALSE whereas ``chk >= 0'' returns TRUE (as does ``chk <= 0'', of
 course).  (And inside my C function, the comparison ``xxx >= 0'' yields
 ``false'' as well.)
 
 I was vaguely thinking that raw R arithmetic would be equivalent to C
 arithmetic.  (Isn't R written in C?)
 
 Can someone explain to me how it is that R (magically) gets it exactly
 right, whereas a call to .C() gives the sort of ``approximately right''
 answer that one might usually expect?  I know that R Core is ***good***
 but even they can't make C do infinite precision arithmetic. :-)
 
 This is really just idle curiosity --- I realize that this phenomenon is
 one that I'll simply have to live with.  But if I can get some deeper
 insight as to why it occurs, well, that would be nice.

I think the long and the short of it is that R lost a couple of bits of 
precision that C retained. This sort of thing happens if R stores things into 
64 bit floating point objects while C keeps them in 80 bit CPU registers. In 
general, floating point calculations do not obey the laws of math, for example 
the associative law (i.e., (a+b)-c ?= a+(b-c), especially if b and c are large 
and nearly equal), so any reordering of expressions by the compiler may give a 
slightly different result.
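
A small sketch of the point about associativity, using a standard textbook example rather than the numbers from the question. The tolerance in the last line is an arbitrary choice for illustration, not something prescribed in this thread.

```r
# Addition is not associative in double precision.
lhs <- (0.1 + 0.2) + 0.3
rhs <- 0.1 + (0.2 + 0.3)
lhs == rhs                      # FALSE
isTRUE(all.equal(lhs, rhs))     # TRUE: equal up to a small tolerance

# The value reported from the .C() call in the question.
xxx <- -1.248844e-19
xxx >= 0                        # FALSE: it is a tiny negative number
abs(xxx) < sqrt(.Machine$double.eps)   # TRUE: zero for practical purposes
```

Comparing against a tolerance like this, rather than against exactly 0, is the usual way to live with such differences.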

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com
Døden skal tape! --- Nordahl Grieg

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] Plotting problems directional or rose plots

2011-08-02 Thread Jim Lemon


On 08/02/2011 01:38 AM, kitty wrote:

Hi again,

I have tried playing around with the code given to me by Alan and Jim. Thank
you for the code, but unfortunately I can't seem to get either of them to
work... Alan's does not work with the sample data and Jim's is giving the
error:

Error in radial.grid(labels = labels, label.pos = label.pos, radlab =
radlab,  :
   could not find function boxed.labels

I have also tried Rose plots in the (heR.Misc) library to no avail.

Sorry, does anyone know how to get the plots I need?



Hi kitty,
Oops, I forgot that the code calls boxed.labels, a function in the 
plotrix package. Install that and it should work.


Jim


Re: [R] Environment of a LM created in a function

Dear Peter,

Thanks for your concise answer, it works perfectly. 

By the way, I fully agree that data or df are not good names for
data.frames. I was aware of that and usually avoid those names
(not consistently, though, I have to admit; it is too tempting ;). However,
if one uses those evil names, one cannot expect to receive meaningful
error messages. Thus, I was not astonished by the peculiar error message
itself (in fact I was well aware that it has to do with the bad naming
and the fact that data is, above all, a function), and I suspected the
error to be due to environment issues.

I tried the workaround with passing the very same data argument
explicitly to update:

 update(models[[1]], . ~ ., data = dat)

which worked but left an impression of redundancy and, even worse, of
error proneness: what happens if the name of the data frame is changed
earlier?

Finally, your suggestion with 

 update(models[[1]], . ~ ., data = model.frame(models[[1]]))

solved all the issues (and I was wondering why I did not try it out
myself, so obviously I was not seeing the wood for the trees). So,
thanks a lot for your help.
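
A minimal sketch of the situation, assuming a hypothetical helper fit_models() that fits the model inside a function, so that the data frame name stored in the call (dat) does not exist where update() is later evaluated. The data and all names other than update() and model.frame() are made up for illustration.

```r
# Hypothetical helper: fits a model inside a function, so the call stored in
# the fitted object refers to the local name `dat`.
fit_models <- function(dat) {
  list(lm(y ~ x, data = dat))
}

set.seed(1)
d <- data.frame(x = 1:10, y = rnorm(10))
models <- fit_models(d)

# Re-supply the data from the fitted object itself rather than by name.
refit <- update(models[[1]], . ~ ., data = model.frame(models[[1]]))

all.equal(coef(refit), coef(models[[1]]))   # TRUE
```

As far as I can tell this only covers variables that are already in the model frame; adding a new term would still need the original data.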

Have a nice day.

KR,

-Thorn


Re: [R] if function problems

2011-08-02 Thread Petr PIKAL

Hi

another possibility is to use the fact that logical values are coerced
to 0/1 in arithmetic

> (x < 0)*x
[1] -3 -2 -1  0  0  0  0

Regards
Petr

 
 In addition to what David said:
 
 On Mon, Aug 1, 2011 at 6:57 PM, zoe_zhang 1987.zhan...@gmail.com 
wrote:
  Dear All,
  Sorry to bother
  I want to write a function in R using if
  Say I have a dataset x,
  if x[i]<0, then x[i]=x[i],
  if x[i]>0, then x[i]=0
 
  for example, x=-3:3,
  then using the function, x becomes [-3,-2,-1,0,0,0,0]
 
  I write the codes as follows,
 
  gjr=function(x)
  {lena=length(x)
  for(i in 1:lenx)
  if (x[i]<0) return (x[i])
  if (x[i]>0) return (0)
  x}
 
  but then, doing
  gjr(x)
  it only comes out with one number
 
  Does anyone have any suggestions?
 
 You define `lena`, but then use `lenx` in `for (i in 1:lenx)` in your
 function ... I guess this might have something to do with it.
 
 You shouldn't use a for loop, though, and just follow David's advice
 by using logical indexing, or the `ifelse` function, i.e.:
 
 R> ifelse(x < 0, x, 0)
 
 HTH,
 -steve
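
For comparison, a sketch of a corrected loop version next to the vectorised alternatives mentioned in this thread. pmin() is not mentioned in the thread and is added here only as a further base R option.

```r
x <- -3:3

# Corrected loop version of the function from the question: the loop runs
# over every element, and the whole vector is returned at the end.
gjr <- function(x) {
  for (i in seq_along(x)) {
    if (x[i] > 0) x[i] <- 0
  }
  x
}

gjr(x)                  # -3 -2 -1  0  0  0  0
ifelse(x < 0, x, 0)     # same values, vectorised
pmin(x, 0)              # same values again
```

The original function returned a single number because return() ends the function at the first element it is called for.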
 
 -- 
 Steve Lianoglou
 Graduate Student: Computational Systems Biology
  | Memorial Sloan-Kettering Cancer Center
  | Weill Medical College of Cornell University
 Contact Info: http://cbio.mskcc.org/~lianos/contact
 

Re: [R] ivreg and structural change

2011-08-02 Thread Achim Zeileis


On Mon, 1 Aug 2011, Claudio Shikida (??) wrote:


Hello,

I am looking for some help with this question: how could I test structural
breaks in an instrumental variables model?


In principle, most of the tests used in the standard linear regression model 
can also be transferred to the IV case. However, many of the functions in 
strucchange do not do this. A notable exception is the function gefp(), 
see its manual page for references. This allows you to do something like


  gefp(y ~ x1 + x2 | z1 + z2, fit = ivreg, data = d)

etc.


For example, I was trying to do something with my model with three time
series.

tax_ivreg <- ivreg(l_y ~ l_x2 + l_x1+ dl_y | lag(l_x2, -1)+lag(l_x2, -2)+
lag(l_x1, -1)+lag(l_x1, -2)+lag(l_y, -1)+lag(l_y, -2), data=tax1)
summary(tax_ivreg)


I guess that this does not do what you want it to do. I would guess that 
this essentially yields a standard linear regression because the lag() is 
not correctly processed. If you want to use ivreg(), you need to set up 
the lagged variables by hand in advance. Alternatively, you can use 
dynlm() from the dynlm package which allows you to use lag() or the 
simpler L() function in the formula together with zoo data.


For an example of how to set up the lagged variables by hand, you can look 
at the manual page of breakpoints(), especially the seatbelt data example.
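
A minimal sketch of setting up lagged variables by hand, in base R only. The helper lag_by(), the simulated data and the shortened formula are assumptions made for illustration; they are not taken from the breakpoints() manual page or from the original model.

```r
# Hypothetical helper: shift a vector back by k periods, padding with NA.
lag_by <- function(v, k) c(rep(NA, k), head(v, -k))

set.seed(1)
n <- 50
tax1 <- data.frame(l_y = rnorm(n), l_x1 = rnorm(n), l_x2 = rnorm(n))  # made-up data

tax1$l_x1_lag1 <- lag_by(tax1$l_x1, 1)
tax1$l_x1_lag2 <- lag_by(tax1$l_x1, 2)
tax1$l_x2_lag1 <- lag_by(tax1$l_x2, 1)
tax1$l_x2_lag2 <- lag_by(tax1$l_x2, 2)

tax1 <- na.omit(tax1)   # drop the first two rows, which have no lags

# The lagged columns can then be used as plain variables, e.g. (not run,
# needs the AER package):
# ivreg(l_y ~ l_x2 + l_x1 | l_x1_lag1 + l_x1_lag2 + l_x2_lag1 + l_x2_lag2,
#       data = tax1)
```

Because the lags are ordinary columns, there is no dependence on how lag() is handled inside the formula.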


hth,
Z


## after estimating it, something weird happened with the several tests in
package strucchange. For example:

cusum <- efp(l_y ~ l_x2 + l_x1+ dl_y | lag(l_x2, -1)+lag(l_x2, -2)+
lag(l_x1, -1)+lag(l_x1, -2)+lag(l_y, -1)+lag(l_y, -2), data=tax1,
type="OLS-CUSUM")
sctest(cusum)
plot(cusum)
coef(cusum, breaks=2)

## And:

cusum <- efp(tax_ivreg, data=tax1, type="OLS-CUSUM")
sctest(cusum)
plot(cusum)
coef(cusum, breaks=2)

## 1. The plot of the two above were very different and
## 2. When I ask for the breaks, instead of the dates, it returned me a line
of the summary of the estimated tax_ivreg

Any help would be very appreciated.

Thanks

Claudio




--
http://www.shikida.net  and http://works.bepress.com/claudio_shikida/


Re: [R] if function problems

2011-08-02 Thread zoe_zhang

Thank you for the addition, Steve. I followed David's suggestion and finally
got the answer.
It was my carelessness: I should have written lena instead of lenx.
I also tried your code and it worked well. I appreciate your help. I have
learnt a lot from this forum.

Cheers,
Zoe

--
View this message in context: 
http://r.789695.n4.nabble.com/if-function-problems-tp3710995p3711340.html
Sent from the R help mailing list archive at Nabble.com.


[R] R CMD check problem

2011-08-02 Thread Baidya Nath Mandal

Dear friends,

I am building an R package called *mypackage*. I followed every possible
step (to my understanding) for the same. I got the following problem while
doing *R CMD check mypackage*.

* installing *source* package 'mypackage' ...
** libs
cygwin warning:
  MS-DOS style path detected: C:/PROGRA~1/R/R-213~1.0/etc/i386/Makeconf
  Preferred POSIX equivalent is:
/cygdrive/c/PROGRA~1/R/R-213~1.0/etc/i386/Makeconf
  CYGWIN environment variable option nodosfilewarning turns off this
warning.
  Consult the user's guide for more details about POSIX paths:
http://cygwin.com/cygwin-ug-net/using.html#using-pathnames
ERROR: compilation failed for package 'mypackage'
* removing 'C:/Rpackages/mypackage.Rcheck/mypackage'.

What I understood from above is that it is something with the PATH variable. I
had set the following PATH variable:
C:\Rtools\bin;C:\Rtools\MinGW\bin;C:\Program Files\R\R-2.13.0\bin;C:\Program Files\MiKTeX 2.9\miktex\bin;%SystemRoot%\system32;%SystemRoot%;%SystemRoot%\System32\Wbem;%SYSTEMROOT%\System32\WindowsPowerShell\v1.0\;C:\Program Files\HTML Help Workshop


Can anybody suggest what possibly could have gone wrong?

Thanks,
BN Mandal


[R] How to 'mute' a function (like confint())

2011-08-02 Thread Remko Duursma

Dear R-helpers,

I am using confint() within a function, and I want to turn off the message
it prints:

x <- rnorm(100)
y <- x^1.1+rnorm(100)
nlsfit <- nls(y ~ g0*x^g1, start=list(g0=1,g1=1))

> confint(nlsfit)
Waiting for profiling to be done...
        2.5%    97.5%
g0 0.4484198 1.143761
g1 1.0380479 2.370057


I cannot find any way to turn off 'Waiting for...'.

I tried 

options(max.print=0)

and even

sink(tempfile())
confint(nlsfit)
sink()

This suppresses the printing of the table, but not the cat()-ing of the
'Waiting for...'.

But it keeps writing this message; is there any way to mute it, for this
function and more generally?


thanks,
Remko


--
View this message in context: 
http://r.789695.n4.nabble.com/How-to-mute-a-function-like-confint-tp3711537p3711537.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Fitting ELISA measurements unknowns to 4 parameter logistic model

2011-08-02 Thread assaywiz

Try http://www.myassays.com/four-parameter-fit.assay

It’s free, requires no install and pre-configured for ELISAs.  Just paste
and go 

AW


--
View this message in context: 
http://r.789695.n4.nabble.com/Fitting-ELISA-measurements-unknowns-to-4-parameter-logistic-model-tp3252381p3711676.html
Sent from the R help mailing list archive at Nabble.com.


[R] Clean up a scatterplot with too much data

2011-08-02 Thread DimmestLemming

I'm working with a lot of data right now, but I'm new to R, and not very good
with it, hence my request for help. What type of graph could I use to
straighten out things like...

http://r.789695.n4.nabble.com/file/n3711389/Untitled.png 

...this?

I want to see general frequencies. Should I use something like a 3D
histogram, or is there an easier way like, say, shading? I'm sure these are
both possible, but I don't know which is easiest or how to implement either
of them.

Thanks!

--
View this message in context: 
http://r.789695.n4.nabble.com/Clean-up-a-scatterplot-with-too-much-data-tp3711389p3711389.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] How to 'mute' a function (like confint())

2011-08-02 Thread Prof Brian Ripley


See ?suppressMessages
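
A small sketch of how this can look in practice. The functions noisy() and chatty() are stand-ins invented for illustration; whether a given function reports progress via message() or via cat() depends on the function and the R version, so which of the two approaches is needed for confint() is an assumption to check.

```r
# Stand-in for a function that reports progress via message().
noisy <- function() {
  message("Waiting for profiling to be done...")
  42
}
res <- suppressMessages(noisy())   # nothing is printed; res is 42

# Stand-in for a function that prints with cat() instead; suppressMessages()
# does not catch this, but capture.output() does.
chatty <- function() {
  cat("Waiting for profiling to be done...\n")
  42
}
invisible(capture.output(res2 <- chatty()))
```

In both cases the return value is kept and only the side output is discarded.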


--
Brian D. Ripley,                  rip...@stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595


Re: [R] Clean up a scatterplot with too much data

Hi,

One solution could be to subsample the data, or jitter the data (give it
some random noise). A more elegant solution, imho, is to use a 2d
histogram (3d histogram is not a good alternative, I think it is much
better to use color instead of a third dimension). I don't think this is
easy to make using the standard plot system in R, but ggplot2 handles it
nicely. This would involve you needing to learn ggplot2, but I would
highly recommend that anyways :). An example of the plot I have in mind
can be seen at:

http://had.co.nz/ggplot2/stat_bin2d.html

Just scroll down a bit for some examples.
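
For readers who want to try the 2d-histogram idea before learning ggplot2, here is a rough base R sketch of the same binning. It is not the ggplot2 stat_bin2d implementation; the simulated data, the 30 x 30 grid and the grey palette are arbitrary choices for illustration.

```r
set.seed(42)
n <- 10000
x <- rnorm(n)
y <- x + rnorm(n)          # made-up, heavily overplotted data

# Count points per cell of a 30 x 30 grid.
counts <- table(cut(x, breaks = 30), cut(y, breaks = 30))

# Shade cells by count (darker = more points); only drawn interactively.
if (interactive()) {
  image(unclass(counts), col = rev(gray.colors(20)),
        xlab = "x (binned)", ylab = "y (binned)")
}
```

The shading then shows the general frequencies that a plain scatterplot hides.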

cheers,
Paul


http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770

Re: [R] R-help Digest, Vol 102, Issue 2

2011-08-02 Thread fraenzi . korner

We are on vacation until 20 August and will not be answering e-mails. In
urgent cases, please contact Stefanie von Felten
steffi.vonfel...@oikostat.ch


Re: [R] Reorganize(stack data) a dataframe inducing names

2011-08-02 Thread Francesca

Works perfectly. Thanks.
f.

On 1 August 2011 18:22, jim holtman jholt...@gmail.com wrote:

 Try this:  had to add extra names to your data since it was not clear
 how it was organized.  Next time use 'dput' to enclose data.

 > x <- read.table(textConnection(" index time  key date   values
 + 13732  27965 DATA.Q211.SUM.Index 04/08/11 1.42
 + 13733  27974 DATA.Q211.SUM.Index 05/10/11 1.45
 + 13734  27984 DATA.Q211.SUM.Index 06/01/11 1.22
 + 13746  28615 DATA.Q211.TDS.Index 04/07/11 1.35
 + 13747  28624 DATA.Q211.TDS.Index 05/20/11 1.40
 + 13754  29262 DATA.Q211.UBS.Index 05/02/11 1.30
 + 13755  29272 DATA.Q211.UBS.Index 05/03/11 1.48
 + 13761  29915 DATA.Q211.UCM.Index 04/28/11 1.43
 + 13768  30565 DATA.Q211.VDE.Index 05/02/11 1.48
 + 13775  31215 DATA.Q211.WF.Index 04/14/11 1.44
 + 13776  31225 DATA.Q211.WF.Index 05/12/11 1.42
 + 13789  31865 DATA.Q211.WPC.Index 04/01/11 1.40
 + 13790  31875 DATA.Q211.WPC.Index 04/08/11 1.42
 + 13791  31883 DATA.Q211.WPC.Index 05/10/11 1.43
 + 13804  32515 DATA.Q211.XTB.Index 04/29/11 1.50
 + 13805  32525 DATA.Q211.XTB.Index 05/30/11 1.40
 + 13806  32532 DATA.Q211.XTB.Index 06/28/11 1.43")
 + , header = TRUE
 + , as.is = TRUE
 + )
 > closeAllConnections()
 > x
    index  time                 key     date values
 1  13732 27965 DATA.Q211.SUM.Index 04/08/11   1.42
 2  13733 27974 DATA.Q211.SUM.Index 05/10/11   1.45
 3  13734 27984 DATA.Q211.SUM.Index 06/01/11   1.22
 4  13746 28615 DATA.Q211.TDS.Index 04/07/11   1.35
 5  13747 28624 DATA.Q211.TDS.Index 05/20/11   1.40
 6  13754 29262 DATA.Q211.UBS.Index 05/02/11   1.30
 7  13755 29272 DATA.Q211.UBS.Index 05/03/11   1.48
 8  13761 29915 DATA.Q211.UCM.Index 04/28/11   1.43
 9  13768 30565 DATA.Q211.VDE.Index 05/02/11   1.48
 10 13775 31215  DATA.Q211.WF.Index 04/14/11   1.44
 11 13776 31225  DATA.Q211.WF.Index 05/12/11   1.42
 12 13789 31865 DATA.Q211.WPC.Index 04/01/11   1.40
 13 13790 31875 DATA.Q211.WPC.Index 04/08/11   1.42
 14 13791 31883 DATA.Q211.WPC.Index 05/10/11   1.43
 15 13804 32515 DATA.Q211.XTB.Index 04/29/11   1.50
 16 13805 32525 DATA.Q211.XTB.Index 05/30/11   1.40
 17 13806 32532 DATA.Q211.XTB.Index 06/28/11   1.43
 > # get index of first occurrence of 'key' column
 > indx <- !duplicated(x$key)
 > x[indx,]
    index  time                 key     date values
 1  13732 27965 DATA.Q211.SUM.Index 04/08/11   1.42
 4  13746 28615 DATA.Q211.TDS.Index 04/07/11   1.35
 6  13754 29262 DATA.Q211.UBS.Index 05/02/11   1.30
 8  13761 29915 DATA.Q211.UCM.Index 04/28/11   1.43
 9  13768 30565 DATA.Q211.VDE.Index 05/02/11   1.48
 10 13775 31215  DATA.Q211.WF.Index 04/14/11   1.44
 12 13789 31865 DATA.Q211.WPC.Index 04/01/11   1.40
 15 13804 32515 DATA.Q211.XTB.Index 04/29/11   1.50
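
The same idea on a tiny, self-contained data frame. The column values are shortened and made up for illustration; the fromLast variant is an addition not discussed in the thread.

```r
# Small made-up data frame with a repeated key column.
x <- data.frame(
  key    = c("SUM", "SUM", "TDS", "TDS", "UBS"),
  values = c(1.42, 1.45, 1.35, 1.40, 1.30),
  stringsAsFactors = FALSE
)

first <- x[!duplicated(x$key), ]                    # keeps rows 1, 3 and 5
last  <- x[!duplicated(x$key, fromLast = TRUE), ]   # keeps rows 2, 4 and 5
```

duplicated() marks every repeat of a value that has already been seen, so negating it leaves exactly the first occurrence of each key.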
 
 



 On Mon, Aug 1, 2011 at 11:13 AM, Francesca francesca.panco...@gmail.com
 wrote:
  Dear Contributors
  thanks for any help you can provide. I searched the threads
  but I could not find any query that satisfied my needs.
  This is my database:
   index time values
  13732  27965 DATA.Q211.SUM.Index 04/08/11 1.42
  13733  27974 DATA.Q211.SUM.Index 05/10/11 1.45
  13734  27984 DATA.Q211.SUM.Index 06/01/11 1.22
  13746  28615 DATA.Q211.TDS.Index 04/07/11 1.35
  13747  28624 DATA.Q211.TDS.Index 05/20/11 1.40
  13754  29262 DATA.Q211.UBS.Index 05/02/11 1.30
  13755  29272 DATA.Q211.UBS.Index 05/03/11 1.48
  13761  29915 DATA.Q211.UCM.Index 04/28/11 1.43
  13768  30565 DATA.Q211.VDE.Index 05/02/11 1.48
  13775  31215 DATA.Q211.WF.Index 04/14/11 1.44
  13776  31225 DATA.Q211.WF.Index 05/12/11 1.42
  13789  31865 DATA.Q211.WPC.Index 04/01/11 1.40
  13790  31875 DATA.Q211.WPC.Index 04/08/11 1.42
  13791  31883 DATA.Q211.WPC.Index 05/10/11 1.43
  13804  32515 DATA.Q211.XTB.Index 04/29/11 1.50
  13805  32525 DATA.Q211.XTB.Index 05/30/11 1.40
  13806  32532 DATA.Q211.XTB.Index 06/28/11 1.43
 
  I need to select only the rows of this database that correspond to each
  of the first occurrences of the string represented in column
  index. In the example shown I would like to obtain a new
  data.frame which is
 
  index time values
  13732  27965 DATA.Q211.SUM.Index 04/08/11 1.42
  13746  28615 DATA.Q211.TDS.Index 04/07/11 1.35
  13754  29262 DATA.Q211.UBS.Index 05/02/11 1.30
  13761  29915 DATA.Q211.UCM.Index 04/28/11 1.43
  13768  30565 DATA.Q211.VDE.Index 05/02/11 1.48
  13775  31215 DATA.Q211.WF.Index 04/14/11 1.44
  13789  31865 DATA.Q211.WPC.Index 04/01/11 1.40
  13804  32515 DATA.Q211.XTB.Index 04/29/11 1.50
 
  As you can see, it is not the whole string to change,
  rather a

Re: [R] Clean up a scatterplot with too much data

2011-08-02 Thread Karl Ove Hufthammer

DimmestLemming wrote:

 I'm working with a lot of data right now, but I'm new to R, and not very
 good with it, hence my request for help. What type of graph could I use to
 straighten out things like...
 
 http://r.789695.n4.nabble.com/file/n3711389/Untitled.png

Three nice alternatives:

example(smoothScatter)
example(sunflowerplot)
library(hexbin)
example(hexbinplot)

(And do remove the outliers before plotting.)

-- 
Karl Ove Hufthammer


Re: [R] R CMD check problem

2011-08-02 Thread Duncan Murdoch


On 11-08-02 5:26 AM, Baidya Nath Mandal wrote:

Dear friends,

I am building an R package called *mypackage*. I followed every possible
steps (to my understanding) for the same. I got following problem while
doing *R CMD check mypackage*.

* installing *source* package 'mypackage' ...
** libs
cygwin warning:
   MS-DOS style path detected: C:/PROGRA~1/R/R-213~1.0/etc/i386/Makeconf
   Preferred POSIX equivalent is:
/cygdrive/c/PROGRA~1/R/R-213~1.0/etc/i386/Makeconf
   CYGWIN environment variable option nodosfilewarning turns off this
warning.
   Consult the user's guide for more details about POSIX paths:
 http://cygwin.com/cygwin-ug-net/using.html#using-pathnames


I believe that warning is ignorable, but you can turn it off using

set CYGWIN=nodosfilewarning

It probably didn't cause the error below.
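
If the check is launched from within an R session rather than from the Windows command prompt, the same variable can presumably be set from R. This is only a sketch: it assumes the setting is inherited by processes started from that session (for example via system()), and it has no effect on a separate command window.

```r
# Set the Cygwin option for this R session and its child processes.
Sys.setenv(CYGWIN = "nodosfilewarning")
Sys.getenv("CYGWIN")   # "nodosfilewarning"
```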


ERROR: compilation failed for package 'mypackage'


I don't know what did cause that error, but it's likely something in 
your src directory of the package.  What do you have there?


Duncan Murdoch


* removing 'C:/Rpackages/mypackage.Rcheck/mypackage'.

What I understood from above is that it is something with PATH variable. I
had set the following PATH variable:
C:\Rtools\bin;C:\Rtools\MinGW\bin;C:\Program Files\R\R-2.13.0\bin;C:\Program Files\MiKTeX 2.9\miktex\bin;%SystemRoot%\system32;%SystemRoot%;%SystemRoot%\System32\Wbem;%SYSTEMROOT%\System32\WindowsPowerShell\v1.0\;C:\Program Files\HTML Help Workshop


Can anybody suggest what possibly could have gone wrong?

Thanks,
BN Mandal


Re: [R] R CMD check problem

2011-08-02 Thread Joshua Wiley

The cygwin warning should not be fatal.  Is that what made you think there's a 
problem with your path?  Can you upload mypackage online?  Two options would be 
GitHub, which hosts that sort of thing, or a tarball on any file 
hosting service.  I (and possibly others more skilled) would be happy to try it 
on my system if I had it.

You should also be able to see exactly where in the build process it failed 
from the log.

Cheers,

Josh


Re: [R] Plotting question

2011-08-02 Thread Karl Ove Hufthammer

Andrew McCulloch wrote:

 I use R to draw my graphs. I have 100 points on a simple xy-plot. The
 points are distinguished by a third variable which is categorical with 10
 levels. I have been plotting x against y and using gray scales to
 distinguish the level of the categorical variable for each point. It looks
 ok to me but a journal reviewer says this is not any use. I cannot afford
 to pay for colour prints. Any ideas on what is the best way to distinguish
 10 groups on an xy scatter plot?

How about having *10* scatterplots + an identical grid in each plot? Try

  example(coplot)

for an idea of how it could look (ignore the marginal plots). Of course, do 
use the lattice or the ggplot2 package, not the coplot function.
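
A sketch of the ten-panel idea with lattice, which needs no colour. The data frame d and its columns are made up for illustration; the 5 x 2 layout is one way to arrange ten groups.

```r
library(lattice)

set.seed(1)
# Made-up data: 100 points in 10 groups of 10.
d <- data.frame(
  x = runif(100),
  y = runif(100),
  g = factor(rep(paste("group", 1:10), each = 10))
)

# One panel per group, same axes and reference grid in every panel.
p <- xyplot(y ~ x | g, data = d, layout = c(5, 2), as.table = TRUE,
            type = c("g", "p"), col = "black", pch = 16)
# print(p) draws the plot on the current device.
```

Because every panel shares the same scales and grid, the groups can be compared by position alone.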

Too bad you have 10 groups and not 9 (or 12), BTW ... :-/

-- 
Karl Ove Hufthammer


Re: [R] Is R the right choice for simulating first passage times of random walks?

2011-08-02 Thread Paul Menzel

Dear Dennis and Steve,


On Sunday, 31.07.2011, at 23:32 -0400, Steve Lianoglou wrote:

[…]

 How about trying to write this `f4` function below using the
 Rcpp/inline combo. The C/C++ you will need to write looks to be quite
 trivial, let's change f4 to accept an x argument as a vector:
 
 I've defined f4 in the same way as Dennis did:
 
 f4 <- function()
 {
   x <- sample(c(-1L,1L), 1)
 
   if (x >= 0 ) {return(1)} else {
     csum <- x
     len <- 1
     while(csum < 0) {
       csum <- csum + sample(c(-1, 1), 1)
       len <- len + 1
     } }
   len
 }
 
 Now, let's do some inline/c++ mojo:
 
 library(inline)
 inc <- '
 #include <stdio.h>
 #include <stdlib.h>
 #include <time.h>
 '
 
 fxx <- cxxfunction(includes=inc, plugin="Rcpp", body="
   int len = 1;
   int x = ((rand() % 2 ) == 0) ? 1 : -1;
   int csum = x;
 
   while (csum < 0) {
     x = ((rand() % 2 ) == 0) ? 1 : -1;
     len++;
     csum = csum + x;
   }
 
   return wrap(len);
 ")
 
 Assuming I've faithfully translated this into c++, the timings aren't
 all that comparable.
 
 Doing 500 replicates with the pure R version:
 
 set.seed(123)
 system.time(out <- replicate(500, f4()))
    user  system elapsed
  31.525   0.120  32.510
 
 Doing 10,000 replicates using the fxx function doesn't even break a sweat:
 
 system.time(outxx <- replicate(10000, fxx()))
    user  system elapsed
   0.371   0.001   0.373
 
 range(out)
 [1]       1 1994308
 
 range(outxx)
 [1]        1 11909394

thank you very much for your suggestions.

This is indeed a nice speed.

1. I first had that implemented in FORTRAN (and Python) too, but turned
to R for two reasons. First, I also wanted to use other distributions
later on and thought that this would be easier with R and that R would
have them implemented as fast as possible. Secondly, I thought that R
would also run faster given the right vectorization and using
`cumsum()`. But I guess it is difficult to find a good model that
exploits the advantages of R.

In particular, looking at `top` while running this example, the CPU is at
100 % but only 40 MB of the 2 GB of memory are used. So with another data
structure maybe the calculations could be done on more walks at once.

2. It is indeed possible that the walk never returns to zero, so I
should make sure, that I abort the while loop after a certain length.

3. Looking at the data types I am wondering if some integer overflow(?)
could happen. I could make the length variable unsigned I suppose [1].
But still `csum` could go from `-len` to 0, and for the normal random
walk unsigned should not be a problem either, except that the logic/checks
have to be adapted.

For integrated random walks, `ccsum += csum`, `ccsum` would go from
about -(len**2)/2 up to 0. So later on I should probably use the 64 bit
data type (unsigned) `long` for `ccsum`, `csum` and `length` to avoid those
problems. Memory does not seem to be a problem. Also I need to add an
additional check for the height and length in the while loop like the
following.

(csum < 0) && (csum > -ULONG_MAX) && (len <= ULONG_MAX)

So I came up with the following, and to use unsigned I only consider that
the random walk stays positive instead of negative.

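--- 8< --- code --- >8 ---
library(inline)
inc <- '
#include <climits>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
'

f9 <- cxxfunction(includes=inc, plugin="Rcpp", body='
  unsigned long len = 1;

  // first step goes the other way: the walk is finished after one step
  if ((rand() % 2 ) == 0) {
    return wrap(len);
  }

  unsigned long x = 1;

  // walk until the position is back at zero
  for (unsigned long csum = x; csum > 0;
       csum = ((rand() % 2 ) == 0) ? csum + 1 : csum - 1) {
    len++;
    // abort before either counter can overflow
    if ((csum == ULONG_MAX) || (len == ULONG_MAX)) {
      return wrap(len);
    }
  }

  return wrap(len);
')
--- 8< --- code --- >8 ---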

I do not know if the compiler would have optimized it that way anyway
and if there is any difference (besides the overflow checks).

> set.seed(1); system.time( z9_1 <- replicate(1000, f9()) )
   user  system elapsed 
  0.076   0.004   0.084 
> range(z9_1)
[1]       1 1449034
> length(z9_1)
[1] 1000
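
For comparison, the vectorized idea from point 1, together with the abort from point 2, could be sketched in plain R roughly as follows. This is only a sketch under assumptions: `first_return`, `chunk` and `max_len` are made-up names, and it uses R's own random number generator rather than `rand()`, so the numbers will not match the C++ runs above.

```r
# Hypothetical helper: number of steps until a +/-1 random walk is first
# at zero or above, simulated in vectorized chunks with a hard cap.
first_return <- function(chunk = 10000L, max_len = 1e7) {
  pos  <- 0L   # position at the end of the previous chunk
  done <- 0    # number of steps simulated so far
  while (done < max_len) {
    steps <- sample(c(-1L, 1L), chunk, replace = TRUE)
    path  <- pos + cumsum(steps)          # positions within this chunk
    hit   <- which(path >= 0L)
    if (length(hit) > 0L) return(done + hit[1L])
    pos  <- path[chunk]
    done <- done + chunk
  }
  NA_real_     # the walk did not come back within max_len steps
}

set.seed(1)
z <- replicate(100, first_return(chunk = 1000L, max_len = 1e5))
range(z, na.rm = TRUE)
```

Walks that hit the cap come back as `NA`, so they can be counted separately instead of blocking the loop forever.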


Thanks,

Paul


[1] 
https://secure.wikimedia.org/wikipedia/en/wiki/Integer_(computer_science)#Common_integral_data_types


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

[R] Memory limit in Aggregate()

2011-08-02 Thread Guillaume

Dear all,
I am trying to aggregate a table (divided in two lists here), but get a
memory error.
Here is the code I'm running : 

sessionInfo()

print(paste("memory.limit() ", memory.limit()))
print(paste("memory.size() ", memory.size()))
print(paste("memory.size(TRUE) ", memory.size(TRUE)))

print(paste("size listX ", object.size(listX)))
print(paste("size listBy ", object.size(listBy)))
print(paste("length ", object.size(nrow(listX))))

tableAgg <- aggregate(x   = listX
                    , by  = listBy
                    , FUN = max)


It returns :

R version 2.9.0 Patched (2009-05-09 r48513) 
i386-pc-mingw32 
locale:
LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] RODBC_1.3-2 HarpTools_1.4 HarpReport_1.9
loaded via a namespace (and not attached):
[1] tools_2.9.0
[1] "memory.limit()  4095"
[1] "memory.size()  31.92"
[1] "memory.size(TRUE)  166.94"
[1] "size listX  218312"
[1] "size listBy  408552"
[1] "length  9083"
Error in vector("list", prod(extent)) : 
  cannot allocate vector of length 1224643220

(The last line is translated from the French error message "impossible
d'allouer un vecteur de longueur 1224643220".)

Why would R create such a long vector (my original lists are far smaller),
and is there a way to avoid this error?

Thank you for your help,

Guillaume

--
View this message in context: 
http://r.789695.n4.nabble.com/Memory-limit-in-Aggregate-tp3711819p3711819.html
Sent from the R help mailing list archive at Nabble.com.


[R] efficient way to reduce running time

2011-08-02 Thread Kathie

Dear R users,

Would you please tell me how to avoid the for loop below?

I think there might be a better way to reduce running time.

--
## y1 and y2 are n*1 vectors

for (k in 1:n){
    ymax <- max( y1[k], y2[k] )

    i <- 0:ymax

    sums <- -lgamma(y1[k]-i+1)-lgamma(i+1)-lgamma(y2[k]-i+1)

    maxsums <- max(sums)

    sums <- sums - maxsums

    lsum <- log( sum(exp(sums)) ) + maxsums

    logbp[k] <- y1[k]  + y2[k]  + lsum
}



Any suggestion will be greatly appreciated.

Regards,

Kathryn Lord 

--
View this message in context: 
http://r.789695.n4.nabble.com/efficient-way-to-reduce-running-time-tp3711985p3711985.html
Sent from the R help mailing list archive at Nabble.com.


[R] Standard Deviation of a matrix

2011-08-02 Thread chakri

Hello,

My R knowledge could not take me any further, so this request !

I have a matrix of dimensions (1185 X 1185). I want to calculate standard
deviation of entire matrix. 
sd function of {stats} calculates standard deviation for each row/column,
giving 1 X 1185 matrix as result. I would like to have 1 X 1 matrix as
result.

Any ideas, how to do this ?

TIA
Chakri 

--
View this message in context: 
http://r.789695.n4.nabble.com/Standard-Deviation-of-a-matrix-tp3711991p3711991.html
Sent from the R help mailing list archive at Nabble.com.


[R] Using Function

2011-08-02 Thread Silvano


Hi,

I have some simple statistics to calculate for a large 
number of variables.

I created a simple function to apply to the variables.
I would like the variable name to be filled in automatically.
I tried the following function, but it is not working.

desc = function(x){
media = mean(x, na.rm=T)
desvio = sd(x, na.rm=T)
cv = desvio/media*100
saida = cbind(media, desvio, cv)
colnames(saida) = c(NULL, 'Média', 'Desvio', 'CV')

rownames(saida) = c(x)
saida
}

desc(Idade)

      Média    Desvio  CV
Idade 44.04961 16.9388 38.4539

How can I get the variable name placed as the first 
element?


My objective is get something like:

rbind(
desc(Altura),
desc(Idade),
desc(IMC),
desc(FC),
desc(CIRCABD),
desc(GLICOSE),
desc(UREIA),
desc(CREATINA),
desc(CTOTAL),
desc(CHDL),
desc(CLDL),
desc(CVLDL),
desc(TRIG),
desc(URICO),
desc(SAQRS),
desc(SOKOLOW_LYON),
desc(CORNELL),
desc(QRS_dur),
desc(Interv_QT)
)

Thanks a lot,

--
Silvano Cesar da Costa
Departamento de Estatística
Universidade Estadual de Londrina
Fone: 3371-4346


Re: [R] Standard Deviation of a matrix

 Hi!

The sample below should give you what you want:

M = matrix(runif(100), 10, 10)
sd(as.numeric(M))

So the as.numeric command is the key. It transforms the matrix to a 1D
vector. Or alternatively without using as.numeric:

M = matrix(runif(100), 10, 10)
M
dim(M) = 100
M
sd(M)

Here I use dim() to replace the matrix dimensions with a single length of
100, which turns M into a plain vector.
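
As a quick check, the different ways of flattening the matrix can be compared against the textbook formula. This is only an illustrative sketch: the seed and the names `s1` to `s4` and `M2` are arbitrary, and `c(M)` is one more way to drop the dimensions.

```r
set.seed(42)
M <- matrix(runif(100), 10, 10)

s1 <- sd(as.numeric(M))            # flatten with as.numeric()
s2 <- sd(c(M))                     # c() also drops the dim attribute
M2 <- M
dim(M2) <- NULL                    # remove the dimensions on a copy
s3 <- sd(M2)
s4 <- sqrt(sum((M - mean(M))^2) / (length(M) - 1))   # by hand

c(s1, s2, s3, s4)                  # all four should agree
```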

cheers,
Paul

On 08/02/2011 11:07 AM, chakri wrote:
 Hello,

 My R knowledge could not take me any further, so this request !

 I have a matrix of dimensions (1185 X 1185). I want to calculate standard
 deviation of entire matrix. 
 sd function of {stats} calculates standard deviation for each row/column,
 giving 1 X 1185 matrix as result. I would like to have 1 X 1 matrix as
 result.

 Any ideas, how to do this ?

 TIA
 Chakri 



-- 
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494

http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770


Re: [R] Errors, driving me nuts

 On 08/01/2011 08:47 PM, Matt Curcio wrote:
 Greetings all,
 I am getting this error that is driving me nuts... (not a long trip, haha)

 I have a set of files and in these files I want to calculate ttests on
 rows 'compareA' and 'compareB' (these will change over time there I
 want a variable here). Also these files are in many different
 directories so I want a way filter out the junk...  Anyway I don't
 believe that this is related to my errors but I mention it none the
 less.

 files_to_test <- list.files (pattern = "kegg.combine")
 for (i in 1:length (files_to_test)) {
 +raw_data <- read.table (files_to_test[i], header=TRUE, sep=" ")
 +tmpA <- raw_data[,compareA]
 +tmpB <- raw_data[,compareB]
 +tt <- t.test (tmpA, tmpB, var.equal=TRUE)
 +tt_pvalue[i] <- tt$p.value
 + }
 Error in tt_pvalue[i] <- tt$p.value : object 'tt_pvalue' not found
 # I tried setting up a vector...
 # as.vector(tt_pvalue, mode="any") ### but NO GO
...an awesome alternative is to use ldply from the plyr package:

library(plyr)
files_to_test <- list.files (pattern = "kegg.combine")
tt_pvalue <- ldply(files_to_test, function(fname) {
    # fname is the current file name, so no loop index is needed
    raw_data <- read.table (fname, header=TRUE, sep=" ")
    tmpA <- raw_data[,compareA]
    tmpB <- raw_data[,compareB]
    tt <- t.test (tmpA, tmpB, var.equal=TRUE)
    return(data.frame(fname = fname, pvalue = tt$p.value))
}, .progress = "text")

This saves you some bookkeeping (no need to create tt_pvalue in advance
and keep track of the iterator (i)) and you get a nice progress bar
(good when loops take long). ldply (and other plyr functions) are what I
use most when processing large amounts of information.
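
If you would rather keep the plain for loop, the error goes away once tt_pvalue exists before the loop starts. Below is a minimal base R sketch; the two toy files, the temporary directory and the column names "a" and "b" are made up so that the example runs on its own.

```r
# Assumed toy data: two small files written to a temporary directory
tmp <- tempfile("kegg")
dir.create(tmp)
set.seed(1)
for (k in 1:2) {
  write.table(data.frame(a = rnorm(10), b = rnorm(10, mean = 1)),
              file.path(tmp, paste0("kegg.combine.", k, ".txt")),
              row.names = FALSE)
}

files_to_test <- list.files(tmp, pattern = "kegg.combine", full.names = TRUE)
compareA <- "a"
compareB <- "b"

# Create the result vector first; indexing a missing object is what failed
tt_pvalue <- numeric(length(files_to_test))
for (i in seq_along(files_to_test)) {
  raw_data <- read.table(files_to_test[i], header = TRUE)
  tt <- t.test(raw_data[, compareA], raw_data[, compareB], var.equal = TRUE)
  tt_pvalue[i] <- tt$p.value
}
tt_pvalue
```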

cheers,
Paul

 file.name = paste("ttest.results.", compareA, compareB, "")
 setwd(save_to)
 write.table(tt_pvalue, file=file.name, sep="\t" )
 Error in inherits(x, "data.frame") : object 'tt_pvalue' not found
 # No idea??

 What is going wrong??
 M


 Matt Curcio
 M: 401-316-5358
 E: matt.curcio...@gmail.com



-- 
Paul Hiemstra, Ph.D.
Global Climate Division
Royal Netherlands Meteorological Institute (KNMI)
Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39
P.O. Box 201 | 3730 AE | De Bilt
tel: +31 30 2206 494

http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770


Re: [R] Standard Deviation of a matrix

2011-08-02 Thread Petr PIKAL

Hi

  Hi!
 
 The sample below should give you what you want:
 
 M = matrix(runif(100), 10, 10)
 sd(as.numeric(M))
 
 So the as.numeric command is the key. It transforms the matrix to a 1D
 vector. Or alternatively without using as.numeric:
 
 M = matrix(runif(100), 10, 10)
 M
 dim(M) = 100

or dim(M) <- NULL

 M
 sd(M)
 
 Here I use the dim command to set the dimensions to a vector of 100 
long.
 
 cheers,
 Paul
 
 On 08/02/2011 11:07 AM, chakri wrote:
  Hello,
 
  My R knowledge could not take me any further, so this request !
 
  I have a matrix of dimensions (1185 X 1185). I want to calculate 
standard
  deviation of entire matrix. 
  sd function of {stats} calculates standard deviation for each 
row/column,
  giving 1 X 1185 matrix as result. I would like to have 1 X 1 matrix as
  result.
 
  Any ideas, how to do this ?
 
  TIA
  Chakri 
 

Re: [R] Standard Deviation of a matrix



On Aug 2, 2011, at 8:48 AM, Petr PIKAL wrote:


Hi


Hi!

The sample below should give you what you want:

M = matrix(runif(100), 10, 10)
sd(as.numeric(M))

So the as.numeric command is the key. It transforms the matrix to a  
1D

vector. Or alternatively without using as.numeric:

M = matrix(runif(100), 10, 10)
M
dim(M) = 100


or dim(M) <- NULL


shortest would surely be:

sd( c(M) )

--
David.




David Winsemius, MD
West Hartford, CT


[R] Problem Installing/Uninstalling Rattle

2011-08-02 Thread adarwish

Rattle won't install properly on my Windows 7 64 bit laptop.

Here is what I've tried:

I've followed the instructions here:
http://rattle.togaware.com/rattle-install-mswindows.html
I had R installed already.
I downloaded the GTK+ packages, unzipped the 32 bit one into c:\gtkwin32.

I put c:\gtkwin32\bin in the system variables PATH.

I launched R, installed the rattle package, called the rattle library,
called rattle().

It told me RGtk2 could not be found and asked to install it.  I let it
download it to install, but still nothing.

Restarting/reinstalling R has not helped.  And when I try
remove.packages(rattle) I get the error:

Removing package(s) from ‘C:/Users/darwish/Documents/R/win-library/2.13’
(as ‘lib’ is unspecified)
Error in match(x, table, nomatch = 0L) : 
  'match' requires vector arguments

I've restarted R before trying anything multiple times.

From what I understand, I need to clean everything off and start anew.  How
do I remove rattle so I can start fresh?  What did I do wrong in my steps?  

Thanks in advance.

--
View this message in context: 
http://r.789695.n4.nabble.com/Problem-Installing-Uninstalling-Rattle-tp3712221p3712221.html
Sent from the R help mailing list archive at Nabble.com.


[R] Functions for Sum of determinants of ranges of matrix subsets

2011-08-02 Thread john james

Dear R-help list,
I have the following problem. Suppose I have an n x n matrix, generated as 
follows:
 
z <- matrix(rnorm(n*n,0,1),nrow=n)
 
I want to write a function such that, for each i in 1:n, I remove row i and 
column i (so I am left with an (n-1) x (n-1) submatrix in each case). I then 
need the sum of the determinants of these submatrices. As an example, if n=3, 
I will have det(1st row and 1st column removed) + det(2nd row and 2nd column 
removed) + det(3rd row and 3rd column removed).
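
The procedure described above can be sketched as follows. This is only a sketch; the function name `sum_minor_dets` is made up here.

```r
# Sum over i of det(z with row i and column i removed)
sum_minor_dets <- function(z) {
  n <- nrow(z)
  stopifnot(n == ncol(z))
  sum(vapply(seq_len(n),
             function(i) det(z[-i, -i, drop = FALSE]),
             numeric(1)))
}

set.seed(1)
n <- 3
z <- matrix(rnorm(n * n, 0, 1), nrow = n)
sum_minor_dets(z)

# Easy check on a diagonal matrix: 3*4 + 2*4 + 2*3 = 26
sum_minor_dets(diag(c(2, 3, 4)))
```

`drop = FALSE` keeps the submatrix a matrix even when only one element is left (n = 2).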
 
Any help will be appreciated. Thanks
 
John

[R] execute r-code stored in a string variable

2011-08-02 Thread Kim Lillesøe

Dear all

I have a simple R question. How do I execute R-code stored in a variable?

E.g if I have a variable which contains some R-code:
c = "reg <- lm(sales$sales~sales$price)"

Is it possible to execute c,
e.g. with something like Exec(c)?
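
For reference, base R can do this with parse() and eval(). The sketch below uses a made-up `sales` data frame and stores the code in `code` rather than `c`, to avoid masking the c() function.

```r
# Assumed toy data standing in for the real sales table
sales <- data.frame(price = c(1, 2, 3, 4), sales = c(10, 8, 7, 3))

code <- "reg <- lm(sales$sales ~ sales$price)"

# parse() turns the string into an expression, eval() runs it
eval(parse(text = code))

coef(reg)
```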

I hope someone can help.

Thank you
Kim Lillesøe


Re: [R] Using Function

2011-08-02 Thread Petr PIKAL

Hi

 
 Hi,
 
 I have some simple statistics to calculate for a large 
 number of variables.
 I created a simple function to apply to variables.
 I would like the variable name to be placed automatically.
 I tried the following function but is not working.
 
 desc = function(x){
  media = mean(x, na.rm=T)
  desvio = sd(x, na.rm=T)
  cv = desvio/media*100
  saida = cbind(media, desvio, cv)
  colnames(saida) = c(NULL, 'Média', 
 'Desvio', 'CV')
  rownames(saida) = c(x)
  saida
  }

You are quite close. This seems to do what you want, presuming that your 
variables are columns of a data frame:

desc = function(x){
 media = mean(x, na.rm=T)
 desvio = sd(x, na.rm=T)
 cv = desvio/media*100
 saida = data.frame(Media=media, Desvio=desvio, CV=cv)
 saida
 }

iris4 <- iris[,1:4]

sapply(iris4, desc)
       Sepal.Length Sepal.Width Petal.Length Petal.Width
Media  5.843333     3.057333    3.758        1.199333   
Desvio 0.8280661    0.4358663   1.765298     0.7622377  
CV     14.17113     14.25642    46.97441     63.55511   

If you want to switch rows and columns, use 

t(sapply(iris4, desc))
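
If the variables are separate vectors rather than columns of a data frame, one alternative (not part of the approach above) is to pick up the name of the argument with deparse(substitute(x)). A small sketch with made-up values for Idade and Altura:

```r
desc <- function(x) {
  nome   <- deparse(substitute(x))   # name of the object that was passed in
  media  <- mean(x, na.rm = TRUE)
  desvio <- sd(x, na.rm = TRUE)
  saida  <- cbind(Media = media, Desvio = desvio, CV = desvio / media * 100)
  rownames(saida) <- nome
  saida
}

# Assumed example data
Idade  <- c(23, 45, 67, 41, NA)
Altura <- c(1.60, 1.75, 1.82, 1.68, 1.71)

res <- rbind(desc(Idade), desc(Altura))
res
```

Note that this only gives sensible row names when desc() is called directly on a named object; inside sapply() the captured name would be an internal expression instead.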

Regards
Petr


 
 desc(Idade)
 
  Média  Desvio  CV
 Idade 44.04961 16.9388 38.4539
 
 How do you get the variable name is placed as the first 
 element?
 
 My objective is get something like:
 
 rbind(
 desc(Altura),
 desc(Idade),
 desc(IMC),
 desc(FC),
 desc(CIRCABD),
 desc(GLICOSE),
 desc(UREIA),
 desc(CREATINA),
 desc(CTOTAL),
 desc(CHDL),
 desc(CLDL),
 desc(CVLDL),
 desc(TRIG),
 desc(URICO),
 desc(SAQRS),
 desc(SOKOLOW_LYON),
 desc(CORNELL),
 desc(QRS_dur),
 desc(Interv_QT)
 )
 
 Thanks a lot,
 
 --
 Silvano Cesar da Costa
 Departamento de Estatística
 Universidade Estadual de Londrina
 Fone: 3371-4346
 

Re: [R] Clean up a scatterplot with too much data

In addition to the other responses (all of which I liked), a couple of
other alternatives to consider are 2D density plots (see ?kde2d in the
MASS package, for example) or geom_tile() in the ggplot2 package,
which you can think of as a 3D histogram projected to 2D with color
corresponding to (relative) frequency, as suggested by Paul Hiemstra.
geom_tile() is a discretized, gridded version of a hexbin plot, but I
would start with the hexbin myself. I echo KOH's comment: make sure
you remove the outliers first, especially that one in the upper left
corner :)
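
To give an idea of the 2D density route, here is a rough sketch with kde2d() from MASS. The data are simulated, since the original data are not available; `minutes` and `kills` are stand-in names and the distributions are arbitrary.

```r
library(MASS)   # provides kde2d()

set.seed(1)
minutes <- rexp(5000, rate = 1 / 300)           # made-up minutes played
kills   <- rpois(5000, lambda = 0.5 * minutes)  # made-up kill counts

# Kernel density estimate on a 50 x 50 grid
dens <- kde2d(minutes, kills, n = 50)

# Colour shows where the points pile up
image(dens, xlab = "minutes played", ylab = "kills")
```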

After looking at your plot, here's my question: why would you plot
kills/minute vs. minutes played? Doesn't the first variable render the
second one moot? Wouldn't kills vs. minutes played be a more relevant
(scatter)plot? If you have information on the skill level of the
players, you could incorporate that information into the plot as well.
There are several nice ways to go if this is the case.

If kills/minute is the more appropriate measure, a univariate density
plot would make sense, or a histogram.

HTH,
Dennis

On Mon, Aug 1, 2011 at 10:26 PM, DimmestLemming nicoadams...@gmail.com wrote:
I'm working with a lot of data right now, but I'm new to R, and not very good
with it, hence my request for help. What type of graph could I use to
straighten out things like...

http://r.789695.n4.nabble.com/file/n3711389/Untitled.png

...this?

Thanks!

--
View this message in context:
http://r.789695.n4.nabble.com/Clean-up-a-scatterplot-with-too-much-data-tp3711389p3711389.html
Sent from the R help mailing list archive at Nabble.com.

Re: [R] Clean up a scatterplot with too much data

2011-08-02 Thread R. Michael Weylandt michael.weyla...@gmail.com

On 08/02/2011 01:07 PM, Dennis Murphy wrote:
In addition to the other responses (all of which I liked), a couple of
other alternatives to consider are 2D density plots (see ?kde2d in the
MASS package, for example) or geom_tile() in the ggplot2 package,
which you can think of as a 3D histogram projected to 2D with color
corresponding to (relative) frequency, as suggested by Paul Hiemstra.
geom_tile() is a discretized, gridded version of a hexbin plot, but I

When using geom_tile you need to bin the data yourself. I much prefer
using stat_bin2d which does all the work for you.

cheers,
Paul

would start with the hexbin myself. I echo KOH's comment: make sure
you remove the outliers first, especially that one in the upper left
corner :)

If kills/minute is the more appropriate measure, a univariate density
plot would make sense, or a histogram.

HTH,
Dennis

On Mon, Aug 1, 2011 at 10:26 PM, DimmestLemming nicoadams...@gmail.com
wrote:
I'm working with a lot of data right now, but I'm new to R, and not very good
with it, hence my request for help. What type of graph could I use to
straighten out things like...

http://r.789695.n4.nabble.com/file/n3711389/Untitled.png

...this?

Thanks!

--
View this message in context:
http://r.789695.n4.nabble.com/Clean-up-a-scatterplot-with-too-much-data-tp3711389p3711389.html
Sent from the R help mailing list archive at Nabble.com.

http://intamap.geo.uu.nl/~paul
http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770

Re: [R] Identifying US holidays

Now that I'm back at my computer, I'll actually suggest you do something
else entirely.

If you look at the code of holidayNYSE() or by calling listHolidays() of the
timeDate package you'll see that there are many many functions that get
every conceivable holiday directly. I'll let you pick the holidays you want,
but a simple script might be like this:

library(timeDate)

x <- seq(as.Date("2011-01-01"), as.Date("2011-12-31"), by="day")

GetHolidays <- function(x) {
    years = as.POSIXlt(x)$year+1900
    years = unique(years)
    holidays <- NULL
    for (y in years) {
        # If you don't need the if/then statements to include which years
        # something was a NYSE holiday, you should drop the loop since the
        # holiday functions are vectorized
        if (y >= 1885)
            holidays <- c(holidays, as.character(USNewYearsDay(y)))
        if (y >= 1885)
            holidays <- c(holidays, as.character(USIndependenceDay(y)))
        if (y >= 1885)
            holidays <- c(holidays, as.character(USThanksgivingDay(y)))
        if (y >= 1885)
            holidays <- c(holidays, as.character(USChristmasDay(y)))
    }
    holidays = as.Date(holidays, format="%Y-%m-%d")
    ans = x %in% holidays
    return(ans)
}

This should return a boolean vector indicating which dates fall on the
selected holidays: feel free to add/delete holidays as you wish. To get the
actual holiday dates, this should work: x[GetHolidays(x)]. If you want to
identify things by holiday, you'll only have to modify the script slightly.
Let me know if I can help further!
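
As a rough idea of the "identify things by holiday" variant, match() can return a label instead of TRUE/FALSE. To keep the sketch free of package dependencies, the four 2011 dates are typed in by hand here; in practice they would come from the timeDate functions used above.

```r
x <- seq(as.Date("2011-01-01"), as.Date("2011-12-31"), by = "day")

# Hand-entered 2011 dates, standing in for the timeDate results
hol_dates <- as.Date(c("2011-01-01", "2011-07-04", "2011-11-24", "2011-12-25"))
hol_names <- c("New Year's Day", "Independence Day",
               "Thanksgiving", "Christmas")

# NA for ordinary days, the holiday name otherwise
label <- hol_names[match(x, hol_dates)]

data.frame(date = x, holiday = label)[!is.na(label), ]
```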

Michael Weylandt

On Mon, Aug 1, 2011 at 4:57 PM, Dimitri Liakhovitski 
dimitri.liakhovit...@gmail.com wrote:

 To be specific, I only need to get rid of 2 NYSE holidays:
 Washington's Birthday and Good Friday.
 Is there a way to reduce the vector of NYSE holidays in timeDate by
 throwing out those two?
 Thank you!
 Dimitri

 On Mon, Aug 1, 2011 at 4:24 PM, R. Michael Weylandt
 michael.weyla...@gmail.com michael.weyla...@gmail.com wrote:
  Don't know if this is sufficiently slick for this list (which never fails
 to impress me with quick and elegant solutions) but I would point out to you
 that GF is the only NYSE holiday falling in March or April so it shouldn't
 be hard to discard it if desired.
 
  Michael Weylandt
 
  On Aug 1, 2011, at 4:18 PM, Dimitri Liakhovitski 
 dimitri.liakhovit...@gmail.com wrote:
 
  Just to clarify - I realize that major is subjective here. Maybe I
  should say most common.
  But maybe there is a way for me to select from a list of all NYSE
  holidays and flag only some of them?
  Just not sure how to do it...
  Thanks!
  Dimitri
 
  On Mon, Aug 1, 2011 at 3:45 PM, Dimitri Liakhovitski
  dimitri.liakhovit...@gmail.com wrote:
  Hello!
 
  I am trying to identify which ones of a vector of dates are US
  holidays. And, ideally, which is which. And I do not know (a-priori)
  which dates those should be.
  I have, for example:
   x <- seq(as.Date("2011-01-01"),as.Date("2011-12-31"),by="day")
  (x)
 
  I think chron should help me here - but maybe I am not using it
 properly:
 
  library(chron)
  is.holiday(chron) # Says that none of those dates are holidays
 
  ?is.holiday says: holidays is an object that should be listing
  holidays. But I want to figure out which of my dates are US holidays
  and don't want to provide a list of
 
  Package timeDate does almost what I need:
  library(timeDate)
  holidayNYSE(2008:2010)
  holidayNYSE()
 
  However, I don't need all the NYSE holidays (like Good Friday). Just
  the major US holidays - New Years, MLK, Memorial Day, Independence
  Day, Labor Day, Halloween, Thanksgiving, Christmas.
  Is there any way to identify major US holidays?
 
  Thanks a lot!
 
  -
  Dimitri Liakhovitski
  marketfusionanalytics.com
 
 
 
 
  --
  Dimitri Liakhovitski
  marketfusionanalytics.com
 
 



 --
 Dimitri Liakhovitski
 marketfusionanalytics.com



Re: [R] efficient way to reduce running time

Hi:

Could you please provide a reproducible example? In your code,
  (i) n is undefined;
  (ii) logbp is undefined.
A description of what you want to do and/or a reproducible example
with an expected outcome would be useful.
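
For illustration, a self-contained version of the loop might look like the sketch below. Everything about the inputs is an assumption (Poisson counts, n = 5), the helper name `log_bp_one` is made up, and the index is assumed to run to min(y1, y2) so that every lgamma() argument stays positive; the original post uses max().

```r
set.seed(1)
n  <- 5
y1 <- rpois(n, 3)   # assumed example data
y2 <- rpois(n, 4)

# One element of logbp, using the same max-shift (log-sum-exp) trick
log_bp_one <- function(a, b) {
  i <- 0:min(a, b)  # assumption: upper limit min(a, b)
  s <- -lgamma(a - i + 1) - lgamma(i + 1) - lgamma(b - i + 1)
  m <- max(s)
  a + b + m + log(sum(exp(s - m)))
}

# mapply() replaces the explicit for loop over k
logbp <- mapply(log_bp_one, y1, y2)
logbp
```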

As the bottom of each e-mail to R-help says...

PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Dennis

On Tue, Aug 2, 2011 at 4:05 AM, Kathie kathryn.lord2...@gmail.com wrote:
 Dear R users,

 Would you plz tell me how to avoid this for loop blow??

 I think there might be a better way to reduce running time.

 --
 ## y1 and y2 are n*1 vectors

         for (k in 1:n){
                 ymax <- max( y1[k], y2[k] )

                 i <- 0:ymax

                 sums <- -lgamma(y1[k]-i+1)-lgamma(i+1)-lgamma(y2[k]-i+1)

                 maxsums <- max(sums)

                 sums <- sums - maxsums

                 lsum <- log( sum(exp(sums)) ) + maxsums

                 logbp[k] <- y1[k]  + y2[k]  + lsum
         }

 

 Any suggestion will be greatly appreciated.

 Regards,

 Kathryn Lord

 --
 View this message in context: 
 http://r.789695.n4.nabble.com/efficient-way-to-reduce-running-time-tp3711985p3711985.html
 Sent from the R help mailing list archive at Nabble.com.


Re: [R] Identifying US holidays

2011-08-02 Thread Dimitri Liakhovitski

Thanks a lot, Michael - that's exactly what I was looking for!
Dimitri

On Tue, Aug 2, 2011 at 9:48 AM, R. Michael Weylandt
michael.weyla...@gmail.com michael.weyla...@gmail.com wrote:
 Now that I'm back at my computer, I'll actually suggest you do something
 else entirely.

 If you look at the code of holidayNYSE() or by calling listHolidays() of the
 timeDate package you'll see that there are many many functions that get
 every conceivable holiday directly. I'll let you pick the holidays you want,
 but a simple script might be like this:

 x <- seq(as.Date("2011-01-01"), as.Date("2011-12-31"), by="day")

 GetHolidays <- function(x) {
     years = as.POSIXlt(x)$year+1900
     years = unique(years)
     holidays <- NULL
     for (y in years) {
         # If you don't need the if/then statements to include which years
         # something was a NYSE holiday, you should drop the loop since the
         # holiday functions are vectorized
         if (y >= 1885)
             holidays <- c(holidays, as.character(USNewYearsDay(y)))
         if (y >= 1885)
             holidays <- c(holidays, as.character(USIndependenceDay(y)))
         if (y >= 1885)
             holidays <- c(holidays, as.character(USThanksgivingDay(y)))
         if (y >= 1885)
             holidays <- c(holidays, as.character(USChristmasDay(y)))
     }
     holidays = as.Date(holidays, format="%Y-%m-%d")
     ans = x %in% holidays
     return(ans)
 }

 This should return a boolean vector indicating which dates fall on the
 selected holidays: feel free to add/delete holidays as you wish. To get the
 actual holiday dates, this should work: x[GetHolidays(x)]. If you want to
 identify things by holiday, you'll only have to modify the script slightly.
 Let me know if I can help further!

 Michael Weylandt

 On Mon, Aug 1, 2011 at 4:57 PM, Dimitri Liakhovitski
 dimitri.liakhovit...@gmail.com wrote:

 To be specific, I only need to get rid of 2 NYSE holidays:
 Washington's Birthday and Good Friday.
 Is there a way to reduce the vector of NYSE holidays in timeDate by
 throwing out those two?
 Thank you!
 Dimitri

 On Mon, Aug 1, 2011 at 4:24 PM, R. Michael Weylandt
 michael.weyla...@gmail.com michael.weyla...@gmail.com wrote:
  Don't know if this is sufficiently slick for this list (which never
  fails to impress me with quick and elegant solutions) but I would point out
  to you that GF is the only NYSE holiday falling in March or April so it
  shouldn't be hard to discard it if desired.
 
  Michael Weylandt
 
  On Aug 1, 2011, at 4:18 PM, Dimitri Liakhovitski
  dimitri.liakhovit...@gmail.com wrote:
 
  Just to clarify - I realize that major is subjective here. Maybe I
  should say most common.
  But maybe there is a way for me to select from a list of all NYSE
  holidays and flag only some of them?
  Just not sure how to do it...
  Thanks!
  Dimitri
 
  On Mon, Aug 1, 2011 at 3:45 PM, Dimitri Liakhovitski
  dimitri.liakhovit...@gmail.com wrote:
  Hello!
 
  I am trying to identify which ones of a vector of dates are US
  holidays. And, ideally, which is which. And I do not know (a-priori)
  which dates those should be.
  I have, for example:
   x<-seq(as.Date("2011-01-01"),as.Date("2011-12-31"),by="day")
  (x)
 
  I think chron should help me here - but maybe I am not using it
  properly:
 
  library(chron)
  is.holiday(chron) # Says that none of those dates are holidays
 
  ?is.holiday says: holidays is an object that should be listing
  holidays. But I want to figure out which of my dates are US holidays
  and don't want to provide a list of
 
  Package timeDate does almost what I need:
  library(timeDate)
  holidayNYSE(2008:2010)
  holidayNYSE()
 
  However, I don't need all the NYSE holidays (like Good Friday). Just
  the major US holidays - New Years, MLK, Memorial Day, Independence
  Day, Labor Day, Halloween, Thanksgiving, Christmas.
  Is there any way to identify major US holidays?
 
  Thanks a lot!
 
  -
  Dimitri Liakhovitski
  marketfusionanalytics.com
 
 
 
 
  --
  Dimitri Liakhovitski
  marketfusionanalytics.com
 
 



 --
 Dimitri Liakhovitski
 marketfusionanalytics.com





-- 
Dimitri Liakhovitski
marketfusionanalytics.com


Re: [R] Functions for Sum of determinants of ranges of matrix subsets

Hi:

Try this:

> z <- matrix(rnorm(100), nrow = 10)
> sum(sapply(seq_len(nrow(z)), function(k) det(z[-k, -k])))
[1] 1421.06

where
> sapply(seq_len(nrow(z)), function(k) det(z[-k, -k]))
 [1]  432.11613   81.65449  516.95791   54.72775  804.32097 -643.35436
 [7] -411.15932  394.18780   84.13173  107.47665

HTH,
Dennis
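
A hedged sketch that wraps the one-liner above in a reusable function. The name sum_minor_dets is hypothetical, and drop = FALSE is an addition not in the original reply: it is there so that the n = 2 case still hands a 1 x 1 matrix to det() rather than a bare number.

```r
# Sum of determinants of the submatrices obtained by deleting
# row k and column k, for k = 1..n (hypothetical helper name).
sum_minor_dets <- function(z) {
  stopifnot(is.matrix(z), nrow(z) == ncol(z), nrow(z) >= 2)
  sum(vapply(seq_len(nrow(z)),
             function(k) det(z[-k, -k, drop = FALSE]),
             numeric(1)))
}

d <- diag(c(1, 2, 3))
sum_minor_dets(d)   # 2*3 + 1*3 + 1*2 = 11
```

A diagonal matrix makes the result easy to check by hand, since each determinant is just the product of the remaining diagonal entries.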

On Tue, Aug 2, 2011 at 5:18 AM, john james dnt...@yahoo.com wrote:
 Dear R-help list,
 Pls I have this problem. Suppose I have a matrix of size nxn say, generated 
 as follows

 z<-matrix(rnorm(n*n,0,1),nrow=n)

 I want to write a function such that, for each i in 1:n, I remove row i and
 column i (so I am left with an (n-1) x (n-1) submatrix in each case). I then
 need the sum of the determinants of these submatrices. As an example, if
 n=3, I will have det(1st row and 1st column removed) + det(2nd row and
 2nd column removed) + det(3rd row and 3rd column removed).

 Any help will be appreciated. Thanks

 John




Re: [R] execute r-code stored in a string variable

2011-08-02 Thread Ista Zahn

Hi Kim,
You can use

eval(parse(text = c))

Best,
Ista
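
A minimal sketch of the eval(parse(text = ...)) idiom, under two assumptions that are not in the original question: the built-in cars data set stands in for the poster's sales data, and the string is stored in a variable called code rather than c.

```r
# R code held in a character string
code <- "reg <- lm(dist ~ speed, data = cars)"

# parse() turns the string into an expression; eval() runs it
# in the calling environment, so reg is created there.
eval(parse(text = code))

class(reg)   # "lm"
```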

On Tue, Aug 2, 2011 at 8:22 AM, Kim Lillesøe k...@dataminds.dk wrote:
 Dear all

 I have a simple R question. How do I execute R-code stored in a variable?

 E.g if I have a variable which contains some R-code:
 c = "reg <- lm(sales$sales~sales$price)"

 Is it possible to execute c
 E.g like Exec(c)

 I hope someone can help.

 Thank you
 Kim Lillesøe






-- 
Ista Zahn
Graduate student
University of Rochester
Department of Clinical and Social Psychology
http://yourpsyche.org


Re: [R] Memory limit in Aggregate()


On Aug 2, 2011, at 11:45 , Guillaume wrote:

 Dear all,
 I am trying to aggregate a table (divided in two lists here), but get a
 memory error.
 Here is the code I'm running : 
 
 sessionInfo()
   
 print(paste("memory.limit() ", memory.limit()))
 print(paste("memory.size() ", memory.size()))
 print(paste("memory.size(TRUE) ", memory.size(TRUE)))
   
 print(paste("size listX ", object.size(listX)))
 print(paste("size listBy ", object.size(listBy)))
 print(paste("length ", object.size(nrow(listX))))
   
 tableAgg <- aggregate(x   = listX
   ,   by  = listBy
   ,   FUN = max)
 
 
 It returns :
 
 R version 2.9.0 Patched (2009-05-09 r48513) 
 i386-pc-mingw32 
 locale:
 LC_COLLATE=French_France.1252;LC_CTYPE=French_France.1252;LC_MONETARY=French_France.1252;LC_NUMERIC=C;LC_TIME=French_France.1252
 attached base packages:
 [1];stats;graphics;grDevices;utils;datasets;methods;base
 other attached packages:
 [1];RODBC_1.3-2;HarpTools_1.4;HarpReport_1.9
 loaded via a namespace (and not attached):
 [1];tools_2.9.0
 [1];memory.limit()  4095
 [1];memory.size()  31.92
 [1];memory.size(TRUE)  166.94
 [1];size listX  218312
 [1];size listBy  408552
 [1];length  9083
 Error in vector("list", prod(extent)) : 
  cannot allocate vector of length 1224643220
 
 (the last line is translated from the French error message "impossible
 d'allouer un vecteur de longueur 1224643220")
 
 Why would R create such a long vector (my original lists , and is there a
 way to avoid this error ?
 

It would be easier if you described your data rather than just telling us their 
size, but as far as I can see, listX has about 50K columns and listBy has 100K. 
So you are trying to form a table of the max of 50K variables over the 
cartesian product of 100K classifiers? That's basically an infinite number of 
cells.
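
To make the cell count concrete, here is a small sketch under stated assumptions. The data are made up (9083 rows and four grouping columns, matching the sizes reported in the follow-up), and the idea that this version of aggregate() reserves one slot per combination of grouping levels is inferred from the vector("list", prod(extent)) call in the error message. Collapsing the grouping columns into a single key keeps only the combinations that actually occur.

```r
set.seed(42)
n <- 9083
# Made-up grouping columns and value columns (illustrative only)
listBy <- list(a = sample(500, n, TRUE), b = sample(400, n, TRUE),
               c = sample(300, n, TRUE), d = sample(20, n, TRUE))
listX <- list(v1 = rnorm(n), v2 = rnorm(n), v3 = rnorm(n))

# Cells in the full cross of grouping levels
n_levels <- vapply(listBy, function(g) length(unique(g)), integer(1))
grid_cells <- prod(as.numeric(n_levels))

# Combinations that are actually present in the data
key <- do.call(paste, c(listBy, sep = "|"))
observed_cells <- length(unique(key))

# Aggregate on the single key: one group per observed combination
tableAgg <- aggregate(x = listX, by = list(key = key), FUN = max)
```

The full grid can run into the hundreds of millions of cells while the observed combinations can never exceed the number of rows. The original grouping columns can be recovered afterwards by splitting key on the separator.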

 Thank you for your help,
 
 Guillaume
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Memory-limit-in-Aggregate-tp3711819p3711819.html
 Sent from the R help mailing list archive at Nabble.com.
 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com
Døden skal tape! --- Nordahl Grieg


[R] Help with aggregate syntax for a multi-column function please.

2011-08-02 Thread Michael Karol

Dear R-experts:

 

I am using a function called AUC whose arguments are data, time, id, and
dv.
 
data is the name of the dataframe, 
time is the independent variable column name, 
id is the subject id and 
dv is the dependent variable.  
 
The function computes area under the curve by trapezoidal rule, for each
subject id.
 
I would like to embed this in aggregate to further subset by each Cycle,
DoseDayNominal and Drug, but I can't seem to get the aggregate syntax
correct.  All the examples I can find use single column function such as
mean, whereas this AUC function requires four arguments.
 
Could someone kindly show me the syntax?
 
This is what I've tried so far:
 
AUC.DF <- aggregate(PKdata, list(PKdata$Cycle, PKdata$DoseDayNominal,
PKdata$Drug), 
   function(x,tm,pt,conc) {AUC(x)},
tm=TimeBestEstimate, pt=Pt, conc=ConcentrationBQLzero )
 
AUC.DF <- aggregate(PKdata, list(PKdata$Cycle, PKdata$DoseDayNominal,
PKdata$Drug), 
   function(x) {AUC(x,TimeBestEstimate, Pt,
ConcentrationBQLzero )} )
 
AUC syntax is:
args(AUC)
function (data, time = TIME, id = ID, dv = DV) 
 
 
thanks

 

Regards, 

Michael

 



[R] identifying weeks (dates) that certain days (dates) fall into

2011-08-02 Thread Dimitri Liakhovitski

Hello!

I have dates for the beginning of each week, e.g.:
weekly<-data.frame(week=seq(as.Date("2010-04-01"),
as.Date("2011-12-26"),by="week"))
week  # each week starts on a Monday

I also have a vector of dates I am interested in, e.g.:
july4<-as.Date(c("2010-07-04","2011-07-04"))

I would like to flag the weeks in my weekly$week that contain those 2
individual dates.
I can only think of a very clumsy way of doing it:

myrows<-c(which(weekly$week==weekly$week[weekly$week>july4[1]][1]-7),
which(weekly$week==weekly$week[weekly$week>july4[2]][1]-7))
weekly$flag<-0
weekly$flag[myrows]<-1

It's clumsy - because actually, my vector of dates of interest (july4
above) is much longer.
Is there maybe a more elegant way of doing it?
Thank you!
-- 
Dimitri Liakhovitski
marketfusionanalytics.com


[R] matrix indexing (igraph ?)

2011-08-02 Thread Robinson, David G

I realize that matrix indexing has been addressed in various flavors, but I'm 
stumped and didn't find anything in the archives.  It's not clear if it is an 
igraph issue or a more general problem. Thanks in advance for your patience.

I am using igraph to read a gml file 
(http://www-personal.umich.edu/~mejn/netdata/football.zip
). The gml file contains vertex attributes (conference and team) that are 
provided as character/integer values.

I would like to build a matrix of dimension (length.team, length.conference) 
where the elements are zero except for 1's at the location of index [team, 
conference].

Here is a snippet of code that hopefully captures what I am trying to do:

original <- read.graph("./Data/football/football.gml", format="gml")
conf.list <- get.vertex.attribute(original, 'value', index=V(original))+1
team.list <- get.vertex.attribute(original, 'id', index=V(original))+1
temp <- matrix(0,115,12)
temp[team.list, conf.list] <- 1

Unfortunately, temp[] is filled with 1's.

However, if I try:
c.list=c(1,3,5)
t.list=c(2,4,6)
temp[t.list,c.list] <- 1

then things work as I would expect.  FWIW - I have tried 
as.integer(get.vertex.attribute(...)) with no luck.

Thanks for any suggestions.
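
A short sketch of what is probably going on, using made-up vectors rather than the football data. temp[rows, cols] selects every combination of the listed rows and columns, so once all 115 teams and all 12 conferences appear in the two vectors, every cell is hit. Indexing with a two-column matrix built by cbind() picks individual (row, column) pairs instead.

```r
team <- c(1, 2, 3, 4)   # illustrative row indices
conf <- c(2, 1, 2, 3)   # illustrative column indices

cross <- matrix(0, 4, 3)
cross[team, conf] <- 1           # every listed row x every listed column

paired <- matrix(0, 4, 3)
paired[cbind(team, conf)] <- 1   # one cell per (team, conf) pair

sum(cross)    # 12
sum(paired)   # 4
```

This also explains why the small c(1,3,5) / c(2,4,6) test looked fine: it fills a 3 x 3 block of nine cells, which is the cross product, not three individual cells.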




*
> original <- read.graph("./Data/football/football.gml", format="gml")
> conf.list <- get.vertex.attribute(original, 'value', index=V(original))+1
> team.list <- get.vertex.attribute(original, 'id', index=V(original))+1
> conf.list
  [1]  8  1  3  4  8  4  3  9  9  8  4 11  7  3  7  3  8 10  7  2 10  9  9  8 11  1  7 10 12  2  2  7  3  1  7  2  6
 [38]  1  7  3  4  8  6  7  5  1 12  3  5 12 11  9  4 12  7  2 10  5 12 11  3  7 10 11  3 10  5 12  9 11 10  7  4 12
 [75]  4  5 10  9  9  2  6  4  6 12  4  7  5 10 12  1  6  5  5  8  2 10 10 11  4  7  3  2  4  1  8  1  3  4  9  1  5
[112]  9  5 10 12
> team.list
  [1]   1   2   3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18  19  20  21  22  23  24  25  26  27
 [28]  28  29  30  31  32  33  34  35  36  37  38  39  40  41  42  43  44  45  46  47  48  49  50  51  52  53  54
 [55]  55  56  57  58  59  60  61  62  63  64  65  66  67  68  69  70  71  72  73  74  75  76  77  78  79  80  81
 [82]  82  83  84  85  86  87  88  89  90  91  92  93  94  95  96  97  98  99 100 101 102 103 104 105 106 107 108
[109] 109 110 111 112 113 114 115
> length(conf.list)
[1] 115
> length(team.list)
[1] 115
> temp <- matrix(0,115,12)
> r <- c(1,3,5)
> col <- c(2,4,6)
> temp[r,col] <- 1
> temp[1:10,]
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
 [1,]    0    1    0    1    0    1    0    0    0     0     0     0
 [2,]    0    0    0    0    0    0    0    0    0     0     0     0
 [3,]    0    1    0    1    0    1    0    0    0     0     0     0
 [4,]    0    0    0    0    0    0    0    0    0     0     0     0
 [5,]    0    1    0    1    0    1    0    0    0     0     0     0
 [6,]    0    0    0    0    0    0    0    0    0     0     0     0
 [7,]    0    0    0    0    0    0    0    0    0     0     0     0
 [8,]    0    0    0    0    0    0    0    0    0     0     0     0
 [9,]    0    0    0    0    0    0    0    0    0     0     0     0
[10,]    0    0    0    0    0    0    0    0    0     0     0     0
> temp[team.list,conf.list] <- 1
> temp[1:10,]
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10] [,11] [,12]
 [1,]    1    1    1    1    1    1    1    1    1     1     1     1
 [2,]    1    1    1    1    1    1    1    1    1     1     1     1
 [3,]    1    1    1    1    1    1    1    1    1     1     1     1
 [4,]    1    1    1    1    1    1    1    1    1     1     1     1
 [5,]    1    1    1    1    1    1    1    1    1     1     1     1
 [6,]    1    1    1    1    1    1    1    1    1     1     1     1
 [7,]    1    1    1    1    1    1    1    1    1     1     1     1
 [8,]    1    1    1    1    1    1    1    1    1     1     1     1
 [9,]    1    1    1    1    1    1    1    1    1     1     1     1
[10,]    1    1    1    1    1    1    1    1    1     1     1     1

 -


[R] merging lists within lists via time stamp

2011-08-02 Thread tomtomme

From multiple data.frames I created two lists, one with temperature, one with
gps data. With your help and lapply I managed to interpolate the timestamps
of gps and temperature data. Now I want to merge/join both lists via the
time-stamp, taking only times, where both lists have data. 
For the single data-frames that worked just fine with: 

both <- merge(gps,temp)

For the two lists of data.frames I first tried an lapply over both
lists...something like

both <- lapply(temp, gps, function(x){x <- merge

Then I found both <- merge.list(gps,temp), but this doesn't work either. It
just copies the first list, gps, into both.

Thanks for any hint, Thomas
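
One possible approach, sketched under assumptions: both lists have the same length and order, and each pair of data frames shares a time-stamp column (called time here, a made-up name, as are the other columns). Map() walks the two lists in parallel and calls merge() on each pair, which keeps only the time stamps present in both.

```r
# Made-up stand-ins for the two lists of data frames
gps <- list(
  data.frame(time = 1:4, lat = c(50.1, 50.2, 50.3, 50.4)),
  data.frame(time = 1:3, lat = c(51.1, 51.2, 51.3))
)
temp <- list(
  data.frame(time = 2:5, temp = c(10, 11, 12, 13)),
  data.frame(time = c(1, 3), temp = c(20, 21))
)

# Pairwise inner join: element i of gps with element i of temp
both <- Map(merge, gps, temp)
sapply(both, nrow)   # 3 2
```

lapply() iterates over a single list only, which is why passing two lists to it does not work; mapply(merge, gps, temp, SIMPLIFY = FALSE) is the equivalent spelling of the Map() call.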

--
View this message in context: 
http://r.789695.n4.nabble.com/merging-lists-within-lists-via-time-stamp-tp3712631p3712631.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] execute r-code stored in a string variable

2011-08-02 Thread Samuel Le

Yes, you can use:
eval(parse(text=c))

On the other hand, I would not recommend using c as a variable name, as it is 
also the name of a fundamental R function that combines values into a vector.

HTH,
Samuel

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Kim Lillesøe
Sent: 02 August 2011 13:22
To: r-help@R-project.org
Subject: [R] execute r-code stored in a string variable


Re: [R] identifying weeks (dates) that certain days (dates) fall into

The findInterval function should surely be tried in some form or  
another.


On Aug 2, 2011, at 10:36 AM, Dimitri Liakhovitski wrote:


Hello!

I have dates for the beginning of each week, e.g.:
weekly<-data.frame(week=seq(as.Date("2010-04-01"),
as.Date("2011-12-26"),by="week"))
week  # each week starts on a Monday

I also have a vector of dates I am interested in, e.g.:
july4<-as.Date(c("2010-07-04","2011-07-04"))

I would like to flag the weeks in my weekly$week that contain those 2
individual dates.


> findInterval(july4, weekly$week)
[1] 14 66   # works out of the box

This provides an index you can use with weekly$week.
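
A sketch of how that index could be turned into the requested flag column, using the data from the question. The guard for an index of 0 is an addition: findInterval() returns 0 for dates that fall before the first week start.

```r
weekly <- data.frame(week = seq(as.Date("2010-04-01"),
                                as.Date("2011-12-26"), by = "week"))
july4 <- as.Date(c("2010-07-04", "2011-07-04"))

# Row of the week whose start date is the latest one not after each date
idx <- findInterval(july4, weekly$week)   # 14 66, as reported above

weekly$flag <- 0
weekly$flag[idx[idx > 0]] <- 1            # skip dates before the first week
weekly[weekly$flag == 1, ]
```

This scales to a vector of dates of any length without a which() call per date. Note that dates after the last week start are assigned to the last row, so a range check may be wanted for those.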




--

David Winsemius, MD
West Hartford, CT


Re: [R] Help with aggregate syntax for a multi-column function please.

Michael,

The function aggregate() is not going to work for your situation.  The 
function is applied to the individual columns of the subsetted data, not 
the subsetted data frame as a whole.  The help file reads: "Then, each of 
the variables (columns) in x is split into subsets of cases (rows) of 
identical combinations of the components of by, and FUN is applied to each 
such subset with further arguments in ... passed to it."

If you can rewrite your function so that it is a function with one 
argument, the data frame alone, then using the by() function should give 
you what you need.  Here is a simple example:

df <- data.frame(a=1:5, b=2:6, i=c(1, 1, 1, 2, 2))

junk <- function(df) {
sum(df$a^2) + prod(df$b)
}

data.frame(index=sort(unique(df$i)), results=as.vector(by(df[, c("a", 
"b")], df$i, junk)))

Hope this helps.
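
The same split-and-apply idea, sketched for the AUC case with several grouping columns. Everything here is illustrative: trap_auc is a hypothetical stand-in for the poster's AUC() function, and the column names and values are made up.

```r
# Trapezoidal area under conc vs. time for one block of rows
trap_auc <- function(d) {
  d <- d[order(d$time), ]
  sum(diff(d$time) * (head(d$conc, -1) + tail(d$conc, -1)) / 2)
}

pk <- data.frame(
  Cycle = rep(c(1, 2), each = 3),
  Drug  = "A",
  Pt    = 101,
  time  = rep(c(0, 1, 2), times = 2),
  conc  = c(0, 2, 0, 0, 4, 0)
)

# One data frame per observed Cycle/Drug/Pt combination
pieces <- split(pk, list(pk$Cycle, pk$Drug, pk$Pt), drop = TRUE)
auc <- vapply(pieces, trap_auc, numeric(1))
auc
```

Because each piece is a whole data frame, the function sees the time and concentration columns together, which is exactly what aggregate() cannot provide.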

Jean


`·.,,  (((º   `·.,,  (((º   `·.,,  (((º

Jean V. Adams
Statistician
U.S. Geological Survey
Great Lakes Science Center
223 East Steinfest Road
Antigo, WI 54409  USA



From:
Michael Karol mka...@syntapharma.com
To:
r-help@r-project.org
Date:
08/02/2011 09:35 AM
Subject:
[R] Help with aggregate syntax for a multi-column function please.
Sent by:
r-help-boun...@r-project.org




Re: [R] Loops to assign a unique ID to a column

2011-08-02 Thread David L Carlson

How about this?

> indx <- unique(cbind(Dates, Groups))
> indx
     Dates        Groups
[1,] "12/10/2010" "A"
[2,] "12/10/2010" "B"
[3,] "13/10/2010" "A"
[4,] "13/10/2010" "B"
[5,] "13/10/2010" "C"

> indx <- data.frame(indx, id=1:nrow(indx))
> indx
       Dates Groups id
1 12/10/2010      A  1
2 12/10/2010      B  2
3 13/10/2010      A  3
4 13/10/2010      B  4
5 13/10/2010      C  5

> newdata <- merge(data, indx)
> newdata
       Dates Groups id
1 12/10/2010      A  1
2 12/10/2010      B  2
3 12/10/2010      B  2
4 13/10/2010      A  3
5 13/10/2010      B  4
6 13/10/2010      C  5
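
An alternative sketch that avoids both the loop and the merge, offered as one option rather than the recommended one: build a combined key per row and number the keys in order of first appearance with match(). The variable name key is made up.

```r
Dates <- c("12/10/2010", "12/10/2010", "12/10/2010",
           "13/10/2010", "13/10/2010", "13/10/2010")
Groups <- c("A", "B", "B", "A", "B", "C")
data <- data.frame(Dates, Groups)

# One key per row; unique() keeps keys in order of first appearance,
# so match() returns 1 for the first distinct key, 2 for the second, ...
key <- paste(data$Dates, data$Groups)
data$ID <- match(key, unique(key))
data$ID   # 1 2 2 3 4 5
```

Unlike merge(), this leaves the row order of data untouched.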

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Chandra Salgado Kent
Sent: Tuesday, August 02, 2011 2:12 AM
To: r-help@r-project.org
Subject: [R] Loops to assign a unique ID to a column

Dear R help,

 

I am fairly new in data management and programming in R, and am trying to
write what is probably a simple loop, but am not having any luck. I have a
dataframe with something like the following (but much bigger):

 

Dates<-c("12/10/2010","12/10/2010","12/10/2010","13/10/2010", "13/10/2010",
"13/10/2010")

Groups<-c("A","B","B","A","B","C")

data<-data.frame(Dates, Groups)

 

I would like to create a new column in the dataframe, and give each distinct
date by group a unique identifying number starting with 1, so that the
resulting column would look something like:

 

ID<-c(1,2,2,3,4,5)

 

The loop that I have started to write is something like this (but doesn't
work!):

 

data$ID<-as.number(c()) 

for(i in unique(data$Dates)){

  for(j in unique(data$Groups)){ data$ID[i,j]<-i

  i<-i+1

  }

}

 

Am I on the right track?

 

Any help on this is much appreciated!

 

Chandra



[R] vglm: warnings and errors

2011-08-02 Thread Sramkova, Anna (IEE)

Hello,

I am using multinomial logit regression for the first time, and I am trying to 
understand the warnings and errors I get.


My data consists of 200 to 600 samples with ~25 predictors (these are principal 
components). The response has three categories.

I use the function vglm from the package VGAM, called as follows: 
fit1<-vglm(fmla, data=tr, multinomial,weights=regwt, maxit=500)

regwt are Epanechnikov weights

In general, the regression works, but 

- often, one of the categories has posterior probability zero, but the 
remaining two probabilities are non-zero (although very small)

- I receive many warnings of the following type:
  
   in checkwz(wz, m = M , trace = trace, wzeps = control$wzepsilo): n elements 
replaced by 1.819e-12
   
in tfun(mu = mu, y = y, w =w, res = FALSE, eta = eta, ...: fitted values 
close to 0 or 1

   ... if I understand it correctly, these have to do with the variance of the 
predictions being too small?

- In some cases, I get an error: Error in devmu[smallmu] = smy * log(smu): NAs 
are not allowed in subscripted arguments, sometimes this error goes away when 
I decrease the size of the training set.


I would like to know if this is expected behavior for some types of data sets. 
The manual to VGAM states that multinomial is prone to numerical difficulties 
if the groups are separable and/or fitted probabilities are close to 0 or 1, 
but does not explain why. The latter could be my case.
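
A small illustration of why separable groups cause numerical trouble. This is a sketch under a loud assumption: base R's glm() with a binomial family is used as a stand-in for vglm() with multinomial, on the premise that the mechanism is the same. When the classes are perfectly separated, the likelihood keeps improving as the coefficients grow, so the fit drifts toward probabilities of exactly 0 or 1 and the working weights p(1 - p) collapse toward zero.

```r
# Made-up, perfectly separated data: y is 0 for x < 0 and 1 for x > 0
x <- c(-3, -2, -1, 1, 2, 3)
y <- c(0, 0, 0, 1, 1, 1)

# Collect the warnings instead of printing them
msgs <- character(0)
fit <- withCallingHandlers(
  glm(y ~ x, family = binomial),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

msgs                 # warnings raised during fitting
range(fitted(fit))   # fitted probabilities pushed toward 0 and 1
coef(fit)            # a very large slope
```

With about 25 predictors and a few hundred kernel-weighted samples, it seems plausible that some local fits are (nearly) separable, which would match both the "fitted values close to 0 or 1" warnings and the fact that they change with the training-set size.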

I have to run the regression on 10,000s of data sets, so I would like to find a 
setting in which things go smoothly (i.e. without errors)

I realize that this is probably more of a methodological than technical 
question, but maybe you can give some rules of thumb about a  suitable number 
of samples/predictors or point me to some literature that would help me 
understand my problems.

Thanks

Anna


Re: [R] Display/show the evaluation result of R commands automatically

2011-08-02 Thread Anthony Ching Ho Ng

R-help and Barry

Thank you for your suggestions. It works. May I ask how to do the
opposite, i.e. disable the callback, so that I can control when the
output is shown and when it is suppressed? I would like to write a pair
of functions to enable/disable the callback, along these lines:

enableOutput <- function() {
 h <- taskCallbackManager()
h$add(function(expr, value, ok, visible)
{if(!visible){print(value)};TRUE})
}

disableOutput <- function() {

}

Showing the output (and using ';' to suppress it) is the default
behavior of Matlab, which I find quite useful (no need to type the
variable name again each time to see the result of the expression). So
I am just curious to know how to do it in R.
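
One way the enable/disable pair could be written, as a sketch rather than a tested recipe for interactive use. It swaps taskCallbackManager() for the base functions addTaskCallback() and removeTaskCallback(), which let a callback be registered and removed by name; the name "autoprint" is made up.

```r
enableOutput <- function() {
  # Register the callback only once, under a fixed name
  if (!("autoprint" %in% getTaskCallbackNames())) {
    addTaskCallback(function(expr, value, ok, visible) {
      if (!visible) print(value)
      TRUE   # returning TRUE keeps the callback registered
    }, name = "autoprint")
  }
  invisible(NULL)
}

disableOutput <- function() {
  # Remove the callback by the name it was registered under
  invisible(removeTaskCallback("autoprint"))
}
```

After enableOutput(), a top-level assignment such as x <- 8 should also print 8; after disableOutput(), R goes back to its usual silent assignment.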

Best Regards,

Anthony


On 31 July 2011 20:16, Barry Rowlingson b.rowling...@lancaster.ac.uk wrote:
 h <- taskCallbackManager()
 h$add(function(expr, value, ok, visible) {if(!visible){print(value)};TRUE})


 On Sun, Jul 31, 2011 at 12:15 PM, Anthony Ching Ho Ng
 anthony.ch...@gmail.com wrote:
 Hello R-help,

 I wonder if it is possible to configure R, so that it will
 display/show the evaluation result of the R commands automatically
 (similar to the behavior of Matlab)

 i.e. If I type x <- 8

 it will print 8 in the command prompt, instead of having to type x
 explicitly to show the result, and perhaps put a ; at the end to
 suppress the output.

 i.e. x <- 8;


 The first thing I think you can do by adding a task callback manager
 to print the value if the value would otherwise be invisible:

   h <- taskCallbackManager()
   h$add(function(expr, value, ok, visible) {if(!visible){print(value)};TRUE})

  The semicolon thing would probably need rewriting bits of R at the C
 code level.

  I don't think many people would use it though. And my code above
 might break things. I don't use it.

 Barry



Re: [R] identifying weeks (dates) that certain days (dates) fall into

2011-08-02 Thread Gabor Grothendieck

On Tue, Aug 2, 2011 at 10:36 AM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:

This gives myrows:

   as.numeric(july4 - weekly[1,1]) %/% 7 + 1

-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com


Re: [R] Standard Deviation of a matrix

2011-08-02 Thread chakri

Thank you everyone for your kind input,

I forgot to add that I have decimal points in my matrix !

Enclosed are the input file (reduced to a 10 x 10 matrix), the scripts and
their output, for your suggestions:

Code 1:
library(stats)
Matrix<-read.table("test_input", head=T, sep="\t", dec=".")
SD<-sd(as.numeric(Matrix))
SD

Output 1:
> library(stats)
> Matrix<-read.table("test_input", head=T, sep="\t", dec=".")
> SD<-sd(as.numeric(Matrix))
Error in sd(as.numeric(Matrix)) : 
  (list) object cannot be coerced to type 'double'
Execution halted

Code 2:
library(stats)
Matrix<-read.table("test_input", head=T, sep="\t", dec=".")
dim(Matrix)<-1
SD<-sd(Matrix)
SD

Output 2:
> library(stats)
> Matrix<-read.table("test_input", head=T, sep="\t", dec=".")
> dim(Matrix)<-1
Error in dim(Matrix) <- 1 : 
  dims [product 1] do not match the length of object [10]
Execution halted

Code 3:
library(stats)
Matrix<-read.table("test_input", head=T, sep="\t", dec=".")
SD<-sd(c(Matrix))
SD

Output 3:
> library(stats)
> Matrix<-read.table("test_input", head=T, sep="\t", dec=".")
> SD<-sd(c(Matrix))
Error: is.atomic(x) is not TRUE
Execution halted
Execution halted

Any ideas, what am I missing here ?
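
A likely explanation, with a sketch: read.table() returns a data frame, which is a list of columns rather than a numeric matrix, and all three errors above complain about exactly that. Flattening the data frame into one numeric vector first should work. The small data frame below is made up and stands in for the test_input file.

```r
# Stand-in for the result of read.table()
Matrix <- data.frame(V1 = c(1.5, 2.5), V2 = c(3.5, 4.5))

# Two equivalent ways to get one numeric vector of all cells
sd_all  <- sd(unlist(Matrix))
sd_all2 <- sd(as.vector(as.matrix(Matrix)))

c(sd_all, sd_all2)
```

Both forms assume every column is numeric; a stray character column would turn as.matrix() into a character matrix.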

TIA
chakri
Input file:  http://r.789695.n4.nabble.com/file/n3712328/test_input
test_input 

--
View this message in context: 
http://r.789695.n4.nabble.com/Standard-Deviation-of-a-matrix-tp3711991p3712328.html
Sent from the R help mailing list archive at Nabble.com.


[R] how to get the percentile of a number in a vector

2011-08-02 Thread ראובן אברמוביץ


   I'm familiar with the quantile() function, but what if I have a specific
   number and want to know its position (percentile) within a vector? For known
   distributions (for example the normal distribution) there are pnorm and
   qnorm, but how can I do this for an arbitrary vector?


   thanks in advance
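
A minimal sketch of one way to do this with made-up data: ecdf() builds the empirical cumulative distribution function of a vector, which plays the role that pnorm() plays for the normal distribution.

```r
x <- c(12, 7, 3, 9, 15, 7, 20, 1)   # illustrative data
value <- 9

Fn <- ecdf(x)        # empirical counterpart of pnorm()
Fn(value)            # proportion of x that is <= value
mean(x <= value)     # the same proportion, computed directly
```

Multiplying the result by 100 gives the percentile. Whether ties should count as "at or below" is a matter of definition; mean(x < value) gives the strict version.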

Re: [R] Memory limit in Aggregate()

2011-08-02 Thread Guillaume

Hi Peter,
Thanks for your answer.
I made a mistake in the script I copied sorry !

The description of the object : listX has 3 column, listBy has 4 column, and
they have 9000 rows :

print(paste("ncol x ", length((listX))))
print(paste("ncol By ", length((listBy))))
print(paste("nrow ", length((listX[[1]]))))

[1];ncol x  3
[1];ncol By  4
[1];nrow  9083

It seems the large (=4) number of columns in listBy creates the
troubles...

Thanks,
Guillaume

--
View this message in context: 
http://r.789695.n4.nabble.com/Memory-limit-in-Aggregate-tp3711819p3712671.html
Sent from the R help mailing list archive at Nabble.com.


[R] My R code is not efficient

2011-08-02 Thread Kathie

Dear R users,

I have two n*1 integer vectors, y1 and y2, where n is very very large.

I'd like to compute


elbp = 4^(y1) * 5^(y2) * sum_{i=0}^{max(y1, y2)}  [{ (y1-i)! * (i)! *
(y2-i)! }^(-1)];


that is, I need to compute elbp for each (y1, y2) pair.

So I made R code like below, but I don't think it's efficient

Would you please tell me how to avoid the for loop below?



--
lbp <- numeric(n)   # preallocate the result vector
for (k in 1:n){
ymax <- max( y1[k], y2[k] )

i <- 0:ymax

sums <- -lgamma(y1[k]-i+1)-lgamma(i+1)-lgamma(y2[k]-i+1)

maxsums <- max(sums)

sums <- sums - maxsums

lsum <- log( sum(exp(sums)) ) + maxsums

lbp[k] <- y1[k]*log(4)  + y2[k]*log(5)  + lsum

}
elbp <- exp(lbp)



Any suggestion will be greatly appreciated.

Regards,

Kathryn Lord 

--
View this message in context: 
http://r.789695.n4.nabble.com/My-R-code-is-not-efficient-tp3712762p3712762.html
Sent from the R help mailing list archive at Nabble.com.


[R] Extract p value from coxme object

2011-08-02 Thread Catarina Miranda

Dear R experts;

I am trying to extract the p values from a coxme object (package coxme). I
can see the value in the model output, but I wanted to have the result with
a higher number of decimal places.
I have searched the mailing list and followed equivalent suggestions for
nlme/lme objects, but I wasn't successful.

Thanks;

Catarina


Re: [R] My R code is not efficient

2011-08-02 Thread Jeff Newmiller

?expand.grid
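
A rough sketch of one way to shorten the loop, separate from the expand.grid pointer above: evaluate the sum once per distinct (y1, y2) pair and look the results up for every row. The helper name elbp_one and the toy vectors are made up for illustration, and the sketch assumes that terms with i > min(y1, y2) count as zero because they involve the factorial of a negative integer.

```r
# Hypothetical helper (not from the original post): elbp for one (a, b) pair.
elbp_one <- function(a, b) {
  i <- 0:min(a, b)   # assumption: terms with i > min(a, b) contribute zero
  s <- -lgamma(a - i + 1) - lgamma(i + 1) - lgamma(b - i + 1)
  m <- max(s)        # log-sum-exp shift, as in the original loop
  exp(a * log(4) + b * log(5) + m + log(sum(exp(s - m))))
}

# Toy data standing in for the real (very long) vectors.
y1 <- c(1, 2, 1)
y2 <- c(1, 0, 1)

# Evaluate once per distinct pair, then look the values up for every row.
pairs <- unique(data.frame(y1 = y1, y2 = y2))
pairs$elbp <- mapply(elbp_one, pairs$y1, pairs$y2)
key <- function(a, b) paste(a, b, sep = "_")
elbp <- pairs$elbp[match(key(y1, y2), key(pairs$y1, pairs$y2))]
```

This only pays off when many (y1, y2) pairs repeat; with mostly distinct pairs it does the same work as the loop.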
---
Jeff Newmiller The . . Go Live...
DCN:jdnew...@dcn.davis.ca.us Basics: ##.#. ##.#. Live Go...
Live: OO#.. Dead: OO#.. Playing
Research Engineer (Solar/Batteries O.O#. #.O#. with
/Software/Embedded Controllers) .OO#. .OO#. rocks...1k
--- 
Sent from my phone. Please excuse my brevity.


Re: [R] how to get the percentile of a number in a vector

2011-08-02 Thread R. Michael Weylandt michael.weyla...@gmail.com



On Aug 2, 2011, at 10:14 AM, ראובן אברמוביץ wrote:



  I'm familiar with the quantile() command, but what if I have a  
specific
  number that I want to know its location in a vector? I know that  
in known
  distributions, (for example the normal distribution), there is  
pnorm and

  qnorm, but how can I do it with unknown vector?


?ecdf
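
A minimal sketch of what the ecdf() pointer amounts to, using a made-up toy vector: ecdf() returns a function, and calling that function on a number gives the proportion of the data at or below it.

```r
x <- c(3, 8, 5, 2, 9, 33, 21)   # toy data, assumed for illustration
Fn <- ecdf(x)                   # empirical cumulative distribution function
p7 <- Fn(7)                     # proportion of values <= 7
p_all <- Fn(x)                  # empirical percentile of every element of x
```

Multiply by 100 if a percentile rather than a proportion is wanted.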

--

David Winsemius, MD
West Hartford, CT


Re: [R] Standard Deviation of a matrix

2011-08-02 Thread Nordlund, Dan (DSHS/RDA)

 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of chakri
 Sent: Tuesday, August 02, 2011 6:31 AM
 To: r-help@r-project.org
 Subject: Re: [R] Standard Deviation of a matrix

 Thank you everyone for your kind input,

 I forgot to add that I have decimal points in my matrix !

 Enclosed input file (reduced to 10 X 10 matrix), scripts and output for
 your suggestions:

 Code 1:
 library(stats)
 Matrix <- read.table("test_input", head=T, sep=" ", dec=".")
 SD <- sd(as.numeric(Matrix))
 SD

First, your data attachment did not come through the list.  Second, decimals 
are not a problem.  Third, you don't have a matrix, you have a data frame 
(read.table produces data frames).  As long as all columns are numeric you 
could do something like

sd(c(as.matrix(m)))

You could also convert to a matrix on input if you really don't need a 
dataframe for different column types.
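
A small sketch of the conversion described above, with a made-up two-column input in place of the missing attachment (the text string and the names df, m and sd_all are assumptions for illustration):

```r
txt <- "a b\n1.5 3.5\n2.5 4.5"   # made-up stand-in for the input file
df <- read.table(text = txt, header = TRUE)   # read.table gives a data frame
m <- as.matrix(df)               # numeric matrix, since all columns are numeric
sd_all <- sd(as.vector(m))       # one standard deviation over all cells
```

Converting on input would simply be as.matrix(read.table(...)).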

Hope this is helpful,

Dan 

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability
Research and Data Analysis Division
Olympia, WA 98504-5204


Re: [R] how to get the percentile of a number in a vector

Would this work for you?

if you want to know where the i-th element falls percentage-wise in the
distribution of a vector:

sum(x <= x[i])/length(x)

This could be turned into a function:

pEmpirical <- function(i,x) {
if (length(i) > 1) return(apply(as.matrix(i), 1, pEmpirical,x))
r = sum(x <= x[i])/length(x)
return(r)
}

Michael Weylandt


Re: [R] SSOAP chemspider

2011-08-02 Thread Benton, Paul

Has anyone got SSOAP working on anything besides KEGG?

I just tried another 3 SOAP servers, both with the WSDL and by constructing the 
.SOAP call, without success. Again the perl and ruby interfaces worked without 
any hitches.

Paul

 library(SSOAP)
 massBank <- processWSDL("http://www.massbank.jp/api/services/MassBankAPI?wsdl")
Error in parse(text = paste(txt, collapse = "\n")) : 
  <text>:1:29: unexpected input
1: function(x, ..., obj = new( ‚
   ^
In addition: Warning message:
In processWSDL("http://www.massbank.jp/api/services/MassBankAPI?wsdl") :
  Ignoring additional serviceport ... elements


 metlin <- processWSDL("http://metlin.scripps.edu/soap/metlin.wsdl")
Error in parse(text = paste(txt, collapse = "\n")) : 
  <text>:1:29: unexpected input
1: function(x, ..., obj = new( ‚
   ^
 pubchem <- processWSDL("http://pubchem.ncbi.nlm.nih.gov/pug_soap/pug_soap.cgi?wsdl")
Error in parse(text = paste(txt, collapse = "\n")) : 
  <text>:1:29: unexpected input
1: function(x, ..., obj = new( ‚
   ^



On 20 Jul 2011, at 01:54, Benton, Paul wrote:

 Dear all,
 
 I've been trying on and off for the past few months to get SSOAP to work with 
 chemspider. First I tried the WSDL file:
 
 cs <- processWSDL("http://www.chemspider.com/MassSpecAPI.asmx?WSDL")
 Error in parse(text = paste(txt, collapse = "\n")) : 
  <text>:1:29: unexpected input
 1: function(x, ..., obj = new( ‚
   ^
 In addition: Warning message:
 In processWSDL("http://www.chemspider.com/MassSpecAPI.asmx?WSDL") :
  Ignoring additional serviceport ... elements
 
 Next I've tried using just the pure .SOAP to call the database. 
 
 s <- SOAPServer("http://www.chemspider.com/MassSpecAPI.asmx")
 csid <- .SOAP(s, "SearchByMass2", mass=89.04767, range=0.01,
action = I("http://www.chemspider.com/SearchByMass2"),
xmlns = c("http://www.chemspider.com"), .opts = list(verbose = TRUE))
 
 This seems to work and gives back a result. However, this result isn't the 
 right result. It seems to have converted the mass into 0. When I run the 
 similar program in perl I get the correct ids. So this isn't a server-side 
 problem but an SSOAP one. Any thoughts or suggestions on other packages to use?
 Further information about the SearchByMass2 method and the xml that it's 
 expecting:
 http://www.chemspider.com/MassSpecAPI.asmx?op=SearchByMass2
 
 Cheers,
 
 
 Paul
 
 
 PS Placing a fake error in the .SOAP code I can look at the xml it's sending 
 to the server:
 Browse[1]> doc
 <?xml version="1.0"?>
 <SOAP-ENV:Envelope xmlns:SOAP-ENC="http://schemas.xmlsoap.org/soap/encoding/" 
 xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/" 
 xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" 
 xmlns:xsd="http://www.w3.org/2001/XMLSchema" 
 SOAP-ENV:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
  <SOAP-ENV:Body>
    <ns:SearchByMass2 xmlns:ns="http://www.chemspider.com">
      <ns:mass>89.04767</ns:mass>
      <ns:range>0.01</ns:range>
    </ns:SearchByMass2>
  </SOAP-ENV:Body>
 </SOAP-ENV:Envelope>


Re: [R] how to get the percentile of a number in a vector

Does this help?

x <- c(3, 8, 5, 2, 9, 33, 21)

# the 43rd percentile
quantile(x, 0.43)

# the proportion of the distribution that is less than 7
mean(x < 7)

Jean


`·.,,  (((?   `·.,,  (((?   `·.,,  (((?

Jean V. Adams
Statistician
U.S. Geological Survey
Great Lakes Science Center
223 East Steinfest Road
Antigo, WI 54409  USA




Re: [R] Memory limit in Aggregate()


On Aug 2, 2011, at 17:10 , Guillaume wrote:

 Hi Peter,
 Thanks for your answer.
 I made a mistake in the script I copied sorry !
 
 The description of the object : listX has 3 column, listBy has 4 column, and

So what is the contents of listBy? If they are all factors with 100 levels, 
then you're looking at a table with 10^8 entries...
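
A back-of-the-envelope sketch of that arithmetic, with made-up grouping columns (four factors of 100 levels each, 9083 rows). It only counts cells; whether a given R version's aggregate() actually materialises all of them is not something this sketch checks.

```r
set.seed(1)
n <- 9083
# Made-up grouping columns: four factors with 100 levels each.
listBy <- lapply(1:4, function(j)
  factor(sample(1:100, n, replace = TRUE), levels = 1:100))
names(listBy) <- paste0("by", 1:4)

n_cells <- prod(sapply(listBy, nlevels))         # size of the full cross-classification
n_used  <- nrow(unique(as.data.frame(listBy)))   # combinations that actually occur
```

With these numbers n_cells is 10^8 while n_used cannot exceed the 9083 rows.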

 they have 9000 rows :
 
 print(paste("ncol x ", length(listX)))
 print(paste("ncol By ", length(listBy)))
 print(paste("nrow ", length(listX[[1]])))
 
 [1] "ncol x  3"
 [1] "ncol By  4"
 [1] "nrow  9083"
 
 It seems the large (=4) number of columns in listBy creates the
 troubles...
 
 Thanks,
 Guillaume
 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com
Døden skal tape! --- Nordahl Grieg


Re: [R] Help with aggregate syntax for a multi-column function please.

2011-08-02 Thread Thaler, Thorn, LAUSANNE, Applied Mathematics

Hi:

Another way to do this is to use one of the summarization packages.
The following uses the plyr package.

The first step is to create a function that takes a data frame as
input and outputs either a data frame or a scalar. In this case, the
function returns a scalar, but if you want to carry along additional
variables in the output, you can replace it with a data frame that
returns the set of variables you want. You don't need to return the
grouping variables, but no harm is done if you do.

# This assumes the existence of a function AUC with the arguments
#  you stated in your post. I presume it returns a scalar value; if not,
# you should modify it to return a data frame instead. It would probably
# be better to modify AUC and call it in ddply() directly, but without the
# function code there's not much one can do...
myAUC <- function(df)
   AUC(df, 'TimeBestEstimate', 'Pt','ConcentrationBQLzero')

library('plyr')
ddply(PKdata, .(Cycle, DoseDayNominal, Drug), myAUC)

This is obviously untested, so caveat emptor. Both plyr and data.table
can accept functions with multiple arguments and do the right thing.
The trick in plyr is to write a function that takes a generic input
object (e.g., a (sub)data frame) and then uses (the variables within)
it to do the necessary calculations. Generally, you want the output of
the function to be compatible with the type of output you want from
the **ply() function. In this case, ddply() means data frame input,
data frame output; alply() would mean array input and list output,
etc.
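
For comparison, a base-R sketch of the same split-apply-combine idea without plyr. The function trap_auc is a hypothetical stand-in for the poster's AUC() (whose code was not posted), and the small PKdata frame is made up; only the column names are taken from the thread.

```r
# Hypothetical stand-in for AUC(): trapezoidal rule for one group.
trap_auc <- function(time, conc) {
  o <- order(time)
  sum(diff(time[o]) * (head(conc[o], -1) + tail(conc[o], -1)) / 2)
}

# Made-up data with two groups of three time points each.
PKdata <- data.frame(
  Cycle = rep(1, 6),
  Drug = rep(c("A", "B"), each = 3),
  TimeBestEstimate = rep(c(0, 1, 2), 2),
  ConcentrationBQLzero = c(0, 2, 0, 0, 4, 0)
)

# Split into sub-data-frames per group, then apply the function to each piece.
pieces <- split(PKdata, list(PKdata$Cycle, PKdata$Drug), drop = TRUE)
auc_by_group <- sapply(pieces, function(d)
  trap_auc(d$TimeBestEstimate, d$ConcentrationBQLzero))
```

Each piece is a (sub)data frame, so the function gets whole rows rather than one column at a time, which is what aggregate() with a single-column FUN cannot offer.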

If this doesn't work, please provide a reproducible example.

HTH,
Dennis

On Tue, Aug 2, 2011 at 7:32 AM, Michael Karol mka...@syntapharma.com wrote:
 Dear R-experts:



 I am using a function called AUC whose arguments are data, time, id, and
 dv.

 data is the name of the dataframe,
 time is the independent variable column name,
 id is the subject id and
 dv is the dependent variable.

 The function computes area under the curve by trapezoidal rule, for each
 subject id.

 I would like to embed this in aggregate to further subset by each Cycle,
 DoseDayNominal and Drug, but I can't seem to get the aggregate syntax
 correct.  All the examples I can find use single column function such as
 mean, whereas this AUC function requires four arguments.

 Could someone kindly show me the syntax?

 This is what I've tried so far:

 AUC.DF <- aggregate(PKdata, list(PKdata$Cycle, PKdata$DoseDayNominal,
 PKdata$Drug),
                   function(x,tm,pt,conc) {AUC(x)},
 tm=TimeBestEstimate, pt=Pt, conc=ConcentrationBQLzero )

 AUC.DF <- aggregate(PKdata, list(PKdata$Cycle, PKdata$DoseDayNominal,
 PKdata$Drug),
                   function(x) {AUC(x,TimeBestEstimate, Pt,
 ConcentrationBQLzero )} )

 AUC syntax is:
 args(AUC)
 function (data, time = TIME, id = ID, dv = DV)


 thanks



 Regards,

 Michael





[R] lattice: index plot

Dear all,

How can I make an index plot with lattice, that is plotting a vector
simply against its particular index in the vector, i.e. something
similar to 

y <- rnorm(10)
plot(y)

I don't want to specify the x's manually, as this could become
cumbersome when having multiple panels.

I tried something like

library(lattice)
mp <- function(x, y, ...) {
  x <- 1:length(y)
  panel.xyplot(x, y, ...)
}

pp <- function(x, y, ...) {
  list(xlim = extendrange(1:length(y)), ylim = extendrange(y))
}

set.seed(123)
y <- rnorm(10)
xyplot(y ~ 1, panel = mp, prepanel = pp, xlab="Index")

but I was wondering whether there is a more straightforward way?

By the way, if I do not specify the ylim in the prepanel function the
plot is clipped, but reading Deepayan's book, p.140 :

[...], so a user-specified prepanel function is not required to return
all of these components [i.e. xlim, ylim, xat, yat, dx and dy]; any
missing component will be replaced by the corresponding default.

I'd understand that if I do not specify ylim it is calculated
automatically? Not a big thing though, but it seems to me to be
inconsistent.

Any help appreciated. 

KR,

-Thorn


Re: [R] Loops to assign a unique ID to a column

2011-08-02 Thread Bert Gunter

Whoa!

1. First and most important, there is very likely no reason you need
to do this. R can handle multiple groupings automatically in fitting
and plotting without creating artificial labels of the sort you appear
to want to create. Please read an Intro to R and/or get help to see
how.

2. The solution offered below is unnecessarily convoluted. Here is a
simpler and faster one:

z <- within(z, indx <- as.numeric(interaction(Dates, Groups,
  drop=TRUE, lex.order=TRUE)))


Explanation:

interaction() produces all possible combinations of the individual
groupings; drop=TRUE throws away any unused combinations, and
lex.order=TRUE lexicographically orders the levels as you indicated.
See ?interaction for details.
By default, the result of the above is a factor, which as.numeric()
converts to the numeric codes used in factor representations. See ?factor.
Finally, within() interprets and makes changes within z. The changed
result is then assigned back to z so that it is not lost. ?within
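
A worked sketch of the explanation above on the six rows from the original question (the frame is rebuilt here under the name z so the snippet runs on its own):

```r
z <- data.frame(
  Dates  = c("12/10/2010", "12/10/2010", "12/10/2010",
             "13/10/2010", "13/10/2010", "13/10/2010"),
  Groups = c("A", "B", "B", "A", "B", "C")
)
z <- within(z, indx <- as.numeric(interaction(Dates, Groups,
                                               drop = TRUE, lex.order = TRUE)))
z$indx   # 1 2 2 3 4 5
```

The ordering follows the sorted date strings, not calendar order, so real dates may need as.Date() first.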

Cheers,
Bert

On Tue, Aug 2, 2011 at 8:36 AM, David L Carlson dcarl...@tamu.edu wrote:
 How about this?

 indx <- unique(cbind(Dates, Groups))
 indx
     Dates        Groups
 [1,] 12/10/2010 A
 [2,] 12/10/2010 B
 [3,] 13/10/2010 A
 [4,] 13/10/2010 B
 [5,] 13/10/2010 C

 indx <- data.frame(indx, id=1:nrow(indx))
 indx
       Dates Groups id
 1 12/10/2010      A  1
 2 12/10/2010      B  2
 3 13/10/2010      A  3
 4 13/10/2010      B  4
 5 13/10/2010      C  5

 newdata <- merge(data, indx)
 newdata
       Dates Groups id
 1 12/10/2010      A  1
 2 12/10/2010      B  2
 3 12/10/2010      B  2
 4 13/10/2010      A  3
 5 13/10/2010      B  4
 6 13/10/2010      C  5

 --
 David L Carlson
 Associate Professor of Anthropology
 Texas A&M University
 College Station, TX 77843-4352


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of Chandra Salgado Kent
 Sent: Tuesday, August 02, 2011 2:12 AM
 To: r-help@r-project.org
 Subject: [R] Loops to assign a unique ID to a column

 Dear R help,



 I am fairly new in data management and programming in R, and am trying to
 write what is probably a simple loop, but am not having any luck. I have a
 dataframe with something like the following (but much bigger):



 Dates<-c("12/10/2010","12/10/2010","12/10/2010","13/10/2010", "13/10/2010",
 "13/10/2010")

 Groups<-c("A","B","B","A","B","C")

 data<-data.frame(Dates, Groups)



 I would like to create a new column in the dataframe, and give each distinct
 date by group a unique identifying number starting with 1, so that the
 resulting column would look something like:



 ID<-c(1,2,2,3,4,5)



 The loop that I have started to write is something like this (but doesn't
 work!):



 data$ID<-as.number(c())

 for(i in unique(data$Dates)){

  for(j in unique(data$Groups)){ data$ID[i,j]<-i

  i<-i+1

  }

 }



 Am I on the right track?



 Any help on this is much appreciated!



 Chandra






-- 
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics


Re: [R] identifying weeks (dates) that certain days (dates) fall into

Hi:

You could try the lubridate package:

library(lubridate)
week(weekly$week)
week(july4)
[1] 27 27

 week
function (x)
yday(x)%/%7 + 1
<environment: namespace:lubridate>

which is essentially Gabor's code :)
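
If the goal is to flag rows of weekly directly rather than compare week-of-year numbers, a base-R sketch with findInterval() may also do. It assumes weekly$week is sorted and that every date of interest falls on or after the first week start (earlier dates would give index 0 and be silently skipped).

```r
weekly <- data.frame(week = seq(as.Date("2010-04-01"),
                                as.Date("2011-12-26"), by = "week"))
july4 <- as.Date(c("2010-07-04", "2011-07-04"))

# Index of the last week start that is <= each date of interest.
rows <- findInterval(as.numeric(july4), as.numeric(weekly$week))

weekly$flag <- 0
weekly$flag[rows] <- 1
flagged <- weekly$week[weekly$flag == 1]
```

This scales to a long vector of dates without one which() call per date.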

HTH,
Dennis

On Tue, Aug 2, 2011 at 7:36 AM, Dimitri Liakhovitski
dimitri.liakhovit...@gmail.com wrote:
 Hello!

 I have dates for the beginning of each week, e.g.:
 weekly<-data.frame(week=seq(as.Date("2010-04-01"),
 as.Date("2011-12-26"),by="week"))
 week  # each week starts on a Monday

 I also have a vector of dates I am interested in, e.g.:
 july4<-as.Date(c("2010-07-04","2011-07-04"))

 I would like to flag the weeks in my weekly$week that contain those 2
 individual dates.
 I can only think of a very clumsy way of doing it:

 myrows<-c(which(weekly$week==weekly$week[weekly$week>july4[1]][1]-7),
        which(weekly$week==weekly$week[weekly$week>july4[2]][1]-7))
 weekly$flag<-0
 weekly$flag[myrows]<-1

 It's clumsy - because actually, my vector of dates of interest (july4
 above) is much longer.
 Is there maybe a more elegant way of doing it?
 Thank you!
 --
 Dimitri Liakhovitski
 marketfusionanalytics.com


Re: [R] Inserting column in between -- better way?


On Aug 1, 2011, at 20:50 , David L Carlson wrote:

 Actually Sara's method fails if the insertion is after the first or before
 the last column:
 
 x <- data.frame(A=1:3, B=1:3, C=1:3, D=1:3, E=1:3)
 newcol <- 4:6
 cbind(x[,1], newcol, x[,2:ncol(x)])
 

Sarah (sic) is on the right track, just lose the commas so that you don't drop 
to a vector:

 x <- data.frame(A=1:3, B=1:3, C=1:3, D=1:3, E=1:3)
 newcol <- 4:6
 cbind(x[1], newcol, x[2:ncol(x)])
  A newcol B C D E
1 1  4 1 1 1 1
2 2  5 2 2 2 2
3 3  6 3 3 3 3

Also notice that there is a named form of cbind

 cbind(x[1], foo=4:6, x[2:ncol(x)])
  A foo B C D E
1 1   4 1 1 1 1
2 2   5 2 2 2 2
3 3   6 3 3 3 3


and that things will work (mostly) with matrices and data frames too:

 newcol <- data.frame(x=4:6,y=6:4)
 cbind(x[1], newcol, x[2:ncol(x)])
  A x y B C D E
1 1 4 6 1 1 1 1
2 2 5 5 2 2 2 2
3 3 6 4 3 3 3 3
 cbind(x[1], as.matrix(newcol), x[2:ncol(x)])
  A x y B C D E
1 1 4 6 1 1 1 1
2 2 5 5 2 2 2 2
3 3 6 4 3 3 3 3

(The mostly bit refers to some slight oddness occurring if you cbind a matrix 
with no column names:

 cbind(x[1], cbind(4:6,7:9), x[2:ncol(x)])
  A 1 2 B C D E
1 1 4 7 1 1 1 1
2 2 5 8 2 2 2 2
3 3 6 9 3 3 3 3

)
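
For insertion at an arbitrary position, including before the first or after the last column, one possible sketch goes through append() on the list of columns. The helper name insert_cols is made up for illustration, and row names are not preserved by this route.

```r
# Hypothetical helper: insert `new` (a named list of columns) after column `after`.
insert_cols <- function(df, new, after) {
  as.data.frame(append(as.list(df), new, after = after))
}

x <- data.frame(A = 1:3, B = 1:3, C = 1:3, D = 1:3, E = 1:3)
first  <- insert_cols(x, list(foo = 4:6), after = 0)
middle <- insert_cols(x, list(foo = 4:6), after = 1)
last   <- insert_cols(x, list(foo = 4:6), after = ncol(x))
```

Here after = 0 puts the new column first and after = ncol(x) puts it last, so no special cases are needed.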
-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com
Døden skal tape! --- Nordahl Grieg


[R] density plot for weighted data

2011-08-02 Thread r student

I'm trying to create a density plot using census data, where the
weights don't sum to 1.


plot(density(oh$FINCP,weights=oh$PWGTP))


Warning message:
In density.default(oh$FINCP, weights = oh$PWGTP) :
  sum(weights) != 1  -- will not get true density


How would I go about doing this?


Thanks!


Re: [R] identifying weeks (dates) that certain days (dates) fall into

2011-08-02 Thread Dimitri Liakhovitski

Thanks a lot, everyone!
Dimitri


-- 
Dimitri Liakhovitski
marketfusionanalytics.com


Re: [R] lattice: index plot

2011-08-02 Thread Peter Ehlers


Does

 xyplot(y ~ seq_along(y), xlab = "Index")

do what you want?
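
For the multi-panel case mentioned in the question, a possible sketch is to compute the within-group index up front with ave() and condition on the group. The data frame d and its columns are made up for illustration.

```r
library(lattice)

set.seed(123)
d <- data.frame(y = rnorm(20), g = rep(c("a", "b"), each = 10))

# Index of each observation within its own group (1..10 in each panel).
d$idx <- ave(d$y, d$g, FUN = seq_along)

p <- xyplot(y ~ idx | g, data = d, xlab = "Index")
print(p)   # lattice objects need an explicit print() inside scripts
```

Because idx is an ordinary column, the default prepanel function takes care of both axis ranges.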

Peter Ehlers


Re: [R] density plot for weighted data



On Aug 2, 2011, at 12:51 PM, r student wrote:


I'm trying to create a density plot using census data, where the
weights don't sum to 1.



plot(density(oh$FINCP,weights=oh$PWGTP))



Warning message:
In density.default(oh$FINCP, weights = oh$PWGTP) :
 sum(weights) != 1  -- will not get true density


How would I go about doing this?


Wouldn't you just divide by the sum?
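
A sketch of that normalisation with made-up stand-ins for oh$FINCP and oh$PWGTP (the names income and w and the simulated values are assumptions):

```r
set.seed(42)
income <- rlnorm(200, meanlog = 10, sdlog = 1)   # stand-in for oh$FINCP
w <- sample(1:50, 200, replace = TRUE)           # stand-in for oh$PWGTP

w_norm <- w / sum(w)                 # rescale so the weights sum to 1
d <- density(income, weights = w_norm)
plot(d)
```

As far as ?density describes it, the automatic bandwidth rules do not take the weights into account, so it may be worth setting bw by hand when the weights are very uneven.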

--

David Winsemius, MD
West Hartford, CT


[R] Data frame to matrix - revisited

2011-08-02 Thread Jagz Bell

Hi,
I've tried to look through all the previous related Threads/posts but can't 
find a solution to what's probably a simple question.
 
I have a data frame comprised of three columns e.g.:
 
ID1 ID2 Value
a b 1
b d 1
c a 2
c e 1
d a 1
e d 2
 
I'd like to convert the data to a matrix i.e.:
 
    a   b   c   d   e
a   n/a 1   2   1   n/a
b   1   n/a n/a 1   n/a
c   2   n/a n/a n/a 1
d   1   1   n/a n/a 2
e   n/a n/a 1   2   n/a
 
Any help is much appreciated,
 
Jagz
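
One possible sketch of the conversion asked for above, assuming the IDs are character strings and that the target matrix is symmetric, as the example suggests. The frame df is rebuilt from the six rows shown, and NA stands in for n/a.

```r
df <- data.frame(ID1 = c("a", "b", "c", "c", "d", "e"),
                 ID2 = c("b", "d", "a", "e", "a", "d"),
                 Value = c(1, 1, 2, 1, 1, 2),
                 stringsAsFactors = FALSE)

ids <- sort(unique(c(df$ID1, df$ID2)))
m <- matrix(NA_real_, length(ids), length(ids), dimnames = list(ids, ids))

# A two-column character matrix indexes cells by row name and column name.
m[cbind(df$ID1, df$ID2)] <- df$Value
m[cbind(df$ID2, df$ID1)] <- df$Value   # mirror, since the target is symmetric
```

If each pair should only fill one triangle, drop the second assignment.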


Re: [R] Memory limit in Aggregate()

2011-08-02 Thread Guillaume

Hi Peter,

Yes I have a large number of factors in the listBy table.

Do you mean that aggregate() creates a complete cartesian product of the
"by" columns (and creates combinations of values that do not exist in the
original "by" table, before removing them when returning the aggregated
table)?


Thanks a lot,
Guillaume

--
View this message in context: 
http://r.789695.n4.nabble.com/Memory-limit-in-Aggregate-tp3711819p3713042.html
Sent from the R help mailing list archive at Nabble.com.


Re: [R] Inserting column in between -- better way?

2011-08-02 Thread Bert Gunter

Thanks for this Peter:


 Sarah (sic) is on the right track, just lose the commas so that you don't 
 drop to a vector:

 x <- data.frame(A=1:3, B=1:3, C=1:3, D=1:3, E=1:3)
 newcol <- 4:6
 cbind(x[1], newcol, x[2:ncol(x)])
  A newcol B C D E
 1 1      4 1 1 1 1
 2 2      5 2 2 2 2
 3 3      6 3 3 3 3


Am I correct in saying that this is a bit subtle: x[1] and
x[2:ncol(x)] are actually lists with vector components; so you're
cbinding lists, which retain the labels, no?

If so, it's a nice subtlety to remember, anyway.

-- Bert


-- 
Men by nature long to get on to the ultimate truths, and will often
be impatient with elementary studies or fight shy of them. If it were
possible to reach the ultimate truths without the elementary studies
usually prefixed to them, these would not be preparatory studies but
superfluous diversions.

-- Maimonides (1135-1204)

Bert Gunter
Genentech Nonclinical Biostatistics


Re: [R] density plot for weighted data



On Aug 2, 2011, at 1:11 PM, r student wrote:


Like below?

plot(density(oh$FINCP,weights=oh$PWGTP/sum(oh$PWGTP)))


I don't understand why you are asking for approval. You are the one  
with the data and know where they came from. We have none of that  
background.


--
David.

On Tue, Aug 2, 2011 at 10:06 AM, David Winsemius dwinsem...@comcast.net 
 wrote:


On Aug 2, 2011, at 12:51 PM, r student wrote:


I'm trying to create a density plot using census data, where the
weights don't sum to 1.



plot(density(oh$FINCP,weights=oh$PWGTP))



Warning message:
In density.default(oh$FINCP, weights = oh$PWGTP) :
 sum(weights) != 1  -- will not get true density


How would I go about doing this?


Wouldn't you just divide by the sum?

--

David Winsemius, MD
West Hartford, CT




David Winsemius, MD
West Hartford, CT


Re: [R] Data frame to matrix - revisited

Jagz,

Assuming that your data frame is called df, try this ...

tapply(df$Value, list(df$ID1, df$ID2), mean)
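
The tapply() call tabulates each pair only in the direction in which it appears (ID1 in rows, ID2 in columns), so on the example data it returns a 5 x 4 table rather than the symmetric 5 x 5 matrix asked for. As a sketch of one way to fill both directions in base R, assuming each unordered pair occurs at most once: the names `ids` and `m` are introduced here for illustration only.

```r
df <- data.frame(ID1   = c("a", "b", "c", "c", "d", "e"),
                 ID2   = c("b", "d", "a", "e", "a", "d"),
                 Value = c(1, 1, 2, 1, 1, 2),
                 stringsAsFactors = FALSE)

# All labels that occur in either ID column
ids <- sort(unique(c(df$ID1, df$ID2)))

# Start from an all-NA square matrix with the labels as dimnames
m <- matrix(NA_real_, nrow = length(ids), ncol = length(ids),
            dimnames = list(ids, ids))

# A two-column character matrix indexes by (row name, column name);
# fill each pair in both directions to make the result symmetric
m[cbind(df$ID1, df$ID2)] <- df$Value
m[cbind(df$ID2, df$ID1)] <- df$Value
m
```

If the ID columns are factors, convert them with as.character() first, since cbind() on factors would produce integer codes rather than labels.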

Jean


`·.,,  (((º   `·.,,  (((º   `·.,,  (((º

Jean V. Adams
Statistician
U.S. Geological Survey
Great Lakes Science Center
223 East Steinfest Road
Antigo, WI 54409  USA
715-627-4317, ext. 3125  (Office)
715-216-8014  (Cell)
715-623-6773  (FAX)
http://www.glsc.usgs.gov  (GLSC web site)
http://profile.usgs.gov/jvadams  (My homepage)
jvad...@usgs.gov  (E-mail)




From:
Jagz Bell jagzb...@yahoo.com
To:
r-help@R-project.org r-help@r-project.org
Date:
08/02/2011 12:13 PM
Subject:
[R] Data frame to matrix  - revisited
Sent by:
r-help-boun...@r-project.org



Hi,
I've tried to look through all the previous related Threads/posts but 
can't find a solution to what's probably a simple question.
 
I have a data frame comprised of three columns e.g.:
 
ID1 ID2 Value
a b 1
b d 1
c a 2
c e 1
d a 1
e d 2
 
I'd like to convert the data to a matrix i.e.:
 
    a    b    c    d    e
a   n/a  1    2    1    n/a
b   1    n/a  n/a  1    n/a
c   2    n/a  n/a  n/a  1
d   1    1    n/a  n/a  2
e   n/a  n/a  1    2    n/a
 
Any help is much appreciated,
 
Jagz


Re: [R] Inserting column in between -- better way?


On Aug 2, 2011, at 19:17 , Bert Gunter wrote:

 Thanks for this Peter:
 
 
 Sarah (sic) is on the right track, just lose the commas so that you don't 
 drop to a vector:
 
 x <- data.frame(A=1:3, B=1:3, C=1:3, D=1:3, E=1:3)
 newcol <- 4:6
 cbind(x[1], newcol, x[2:ncol(x)])
   A newcol B C D E
 1 1      4 1 1 1 1
 2 2      5 2 2 2 2
 3 3      6 3 3 3 3
 
 
 Am I correct in saying that this is a bit subtle: x[1] and
 x[2:ncol(x)] are actually lists with vector components; so you're
 cbinding lists, which retain the labels, no?

Well, to be precise they are obtained by indexing a data frame _as_ a list. The 
result of that is a data frame (always, which was the point). 

So you're cbind()-ing data frames, which is what you wanted to do all along. 
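
The difference between the two indexing styles can be checked directly. This is a small sketch reusing the example objects from the quoted code; `res` is a name introduced here for illustration.

```r
x <- data.frame(A = 1:3, B = 1:3, C = 1:3, D = 1:3, E = 1:3)
newcol <- 4:6

# List-style indexing (no comma) always returns a data frame
is.data.frame(x[1])

# Matrix-style indexing of a single column drops to a plain vector
is.data.frame(x[, 1])

# cbind() on data frames therefore keeps the column names,
# and the bare vector picks up its own name as the column label
res <- cbind(x[1], newcol, x[2:ncol(x)])
names(res)
```

Writing x[, 1, drop = FALSE] is another way to keep the data frame when the comma form is preferred.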

 
 If so, it's a nice subtlety to remember, anyway.
 
 -- Bert
 
 
 -- 
 Men by nature long to get on to the ultimate truths, and will often
 be impatient with elementary studies or fight shy of them. If it were
 possible to reach the ultimate truths without the elementary studies
 usually prefixed to them, these would not be preparatory studies but
 superfluous diversions.
 
 -- Maimonides (1135-1204)
 
 Bert Gunter
 Genentech Nonclinical Biostatistics

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com
Døden skal tape! --- Nordahl Grieg


Re: [R] Memory limit in Aggregate()


On Aug 2, 2011, at 19:09 , Guillaume wrote:

 Hi Peter,
 
 Yes I have a large number of factors in the listBy table.
 
 Do you mean that aggregate() creates a complete Cartesian product of the
 by columns (creating combinations of values that do not exist in the
 original by table, then removing them when returning the aggregated
 table)?

Hm, at least in recent versions that shouldn't happen. The meat of 
aggregate.data.frame is

ans <- lapply(split(e, grp), FUN, ...)

where grp is a numerical coding of the factor combination for each cell. That 
could conceivably contain some large values, but since it is numeric (and not a 
factor with levels, say,  0:(n1*n2*n3*n4-1)), split should not generate more 
groups than are present in data. 
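
The distinction can be illustrated with a toy example. The coding used for `grp` below is a hypothetical scheme chosen for illustration, not the actual internals of aggregate.data.frame, and the vector `v` stands in for the data being aggregated.

```r
v  <- c(10, 20, 30, 40)
f1 <- factor(c("a", "a", "b", "b"), levels = c("a", "b", "c"))
f2 <- factor(c("x", "y", "x", "x"), levels = c("x", "y", "z"))

# Numeric code for each observed combination of the two factors
grp <- (as.integer(f1) - 1) + nlevels(f1) * (as.integer(f2) - 1)

# Splitting on the numeric code yields only the combinations present in the data
by_num <- split(v, grp)
length(by_num)

# Splitting on a factor carrying every possible combination as a level
# also creates the empty groups
by_fac <- split(v, interaction(f1, f2))
length(by_fac)

sapply(by_num, sum)
```

With many by columns the second form grows with the product of the numbers of levels, which is the kind of blow-up the question was asking about.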

Some of this stuff was rewritten in Jan 2010. You might want to try a version 
which is later than yours from May 2009...

 
 
 Thanks a lot,
 Guillaume
 
 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Memory-limit-in-Aggregate-tp3711819p3713042.html
 Sent from the R help mailing list archive at Nabble.com.
 

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com
Døden skal tape! --- Nordahl Grieg


Re: [R] density plot for weighted data

2011-08-02 Thread r student

Like below?

plot(density(oh$FINCP,weights=oh$PWGTP/sum(oh$PWGTP)))





On Tue, Aug 2, 2011 at 10:06 AM, David Winsemius dwinsem...@comcast.net wrote:

 On Aug 2, 2011, at 12:51 PM, r student wrote:

 I'm trying to create a density plot using census data, where the
 weights don't sum to 1.


 plot(density(oh$FINCP,weights=oh$PWGTP))


 Warning message:
 In density.default(oh$FINCP, weights = oh$PWGTP) :
  sum(weights) != 1  -- will not get true density


 How would I go about doing this?

 Wouldn't you just divide by the sum?

 --

 David Winsemius, MD
 West Hartford, CT




[R] Extract names from vector according to their values

2011-08-02 Thread Sverre Stausland

Dear helpers,

I can create a vector with the priority of the packages that came with
R, like this:

 installed.packages()[, "Priority"] -> my.vector
 my.vector
         base          boot         class       cluster     codetools
       "base" "recommended" "recommended" "recommended" "recommended"
     compiler      datasets       foreign      graphics     grDevices
       "base"        "base" "recommended"        "base"        "base"
         grid    KernSmooth       lattice          MASS        Matrix
       "base" "recommended" "recommended" "recommended" "recommended"
      methods          mgcv          nlme          nnet         rpart
       "base" "recommended" "recommended" "recommended" "recommended"
      spatial       splines         stats        stats4      survival
"recommended"        "base"        "base"        "base" "recommended"
        tcltk         tools         utils
       "base"        "base"        "base"

How can I extract the names from this vector according to their
priority? I.e. I want to create a vector from this with the names of
the base packages, and another vector with the names of the
recommended packages.

Thank you
Sverre


Re: [R] Data frame to matrix - revisited

Hi:

Here are a couple of ways. Since your data frame does not contain a
'c' in ID2, we redefine the factor to give it all five levels rather
than the observed four:

df <- read.table(textConnection("
ID1 ID2 Value
a b 1
b d 1
c a 2
c e 1
d a 1
e d 2"), header = TRUE,
    stringsAsFactors = TRUE)  # needed in R >= 4.0 to get factors as shown
str(df)
'data.frame':   6 obs. of  3 variables:
 $ ID1  : Factor w/ 5 levels "a","b","c","d",..: 1 2 3 3 4 5
 $ ID2  : Factor w/ 4 levels "a","b","d","e": 2 3 1 4 1 3
 $ Value: int  1 1 2 1 1 2

df$ID2 <- factor(df$ID2, levels = letters[1:5])
str(df)
'data.frame':   6 obs. of  3 variables:
 $ ID1  : Factor w/ 5 levels "a","b","c","d",..: 1 2 3 3 4 5
 $ ID2  : Factor w/ 5 levels "a","b","c","d",..: 2 4 1 5 1 4
 $ Value: int  1 1 2 1 1 2

Now we're good...

# (1) xtabs:
with(df, xtabs(Value ~ ID1 + ID2) + xtabs(Value ~ ID2 + ID1))
   ID2
ID1 a b c d e
  a 0 1 2 1 0
  b 1 0 0 1 0
  c 2 0 0 0 1
  d 1 1 0 0 2
  e 0 0 1 2 0

# (2) acast() in the reshape2 package:
library('reshape2')
# note: later reshape2 releases renamed the value_var argument to value.var
v1 <- acast(df, ID1 ~ ID2, value_var = 'Value', drop = FALSE, fill = 0)
v2 <- acast(df, ID2 ~ ID1, value_var = 'Value', drop = FALSE, fill = 0)
v <- v1 + v2
v[v == 0L] <- NA
v
   a  b  c  d  e
a NA  1  2  1 NA
b  1 NA NA  1 NA
c  2 NA NA NA  1
d  1  1 NA NA  2
e NA NA  1  2 NA

HTH,
Dennis


On Tue, Aug 2, 2011 at 10:00 AM, Jagz Bell jagzb...@yahoo.com wrote:
 Hi,
 I've tried to look through all the previous related Threads/posts but can't 
 find a solution to what's probably a simple question.

 I have a data frame comprised of three columns e.g.:

 ID1 ID2 Value
 a b 1
 b d 1
 c a 2
 c e 1
 d a 1
 e d 2

 I'd like to convert the data to a matrix i.e.:

     a    b    c    d    e
 a   n/a  1    2    1    n/a
 b   1    n/a  n/a  1    n/a
 c   2    n/a  n/a  n/a  1
 d   1    1    n/a  n/a  2
 e   n/a  n/a  1    2    n/a

 Any help is much appreciated,

 Jagz


Re: [R] Extract names from vector according to their values



On Aug 2, 2011, at 2:21 PM, Sverre Stausland wrote:


Dear helpers,

I can create a vector with the priority of the packages that came with
R, like this:


installed.packages()[, "Priority"] -> my.vector
my.vector

         base          boot         class       cluster     codetools
       "base" "recommended" "recommended" "recommended" "recommended"
     compiler      datasets       foreign      graphics     grDevices
       "base"        "base" "recommended"        "base"        "base"
         grid    KernSmooth       lattice          MASS        Matrix
       "base" "recommended" "recommended" "recommended" "recommended"
      methods          mgcv          nlme          nnet         rpart
       "base" "recommended" "recommended" "recommended" "recommended"
      spatial       splines         stats        stats4      survival
"recommended"        "base"        "base"        "base" "recommended"
        tcltk         tools         utils
       "base"        "base"        "base"

How can I extract the names from this vector according to their
priority? I.e. I want to create a vector from this with the names of
the base packages, and another vector with the names of the
recommended packages.


 names( my.vector[which(my.vector=="recommended")])
 [1] "boot"       "class"      "cluster"
 [4] "codetools"  "foreign"    "KernSmooth"
 [7] "lattice"    "MASS"       "Matrix"
[10] "mgcv"       "nlme"       "nnet"
[13] "rpart"      "spatial"    "survival"

Note that some people may tell you that the form below should be
preferred because the 'which' is superfluous. It is not. The [
function returns all the NA's, for reasons that are unclear to me. It is
wiser to use `which` so that you get numerical indexing.

 names(my.vector[my.vector=="recommended"])

On my system it produces 493 items most of them NA's.
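
A likely explanation for the extra NA's: installed.packages() reports a Priority of NA for contributed packages, comparing NA with a string gives NA, and a logical index that is NA selects an NA element. The sketch below reproduces this on a toy vector; "foo" is a hypothetical contributed package, and `keep`, `with_na`, `with_which` and `with_in` are names introduced for illustration.

```r
# Toy vector: "foo" is a hypothetical contributed package with no priority
my.vector <- c(base = "base", boot = "recommended", foo = NA,
               stats = "base", MASS = "recommended")

keep <- my.vector == "recommended"
keep                       # the comparison is NA wherever my.vector is NA

# Logical indexing passes the NA through as an extra NA element
with_na <- names(my.vector[keep])
length(with_na)

# which() keeps only the positions that are TRUE
with_which <- names(my.vector[which(keep)])
with_which

# %in% never returns NA, so it can also serve as a logical index
with_in <- names(my.vector)[my.vector %in% "recommended"]
with_in
```

On a system with many contributed packages installed, most entries of the real my.vector are NA, which would account for the long result reported above.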

--

David Winsemius, MD
West Hartford, CT


Re: [R] Extract names from vector according to their values

Sverre,

Try this:

my.list - split(names(my.vector), my.vector)
my.list$base
my.list$recommended
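
As a small illustration of what split() returns here, the sketch below uses a toy vector rather than the real installed.packages() output; "foo" is a hypothetical contributed package whose priority is NA.

```r
my.vector <- c(base = "base", boot = "recommended", foo = NA,
               stats = "base", MASS = "recommended")

# Group the package names by their priority value
my.list <- split(names(my.vector), my.vector)

names(my.list)        # one list element per distinct priority
my.list$base
my.list$recommended
```

Entries whose priority is NA are dropped by split(), so contributed packages do not appear in either element.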

Jean


`·.,,  (((º   `·.,,  (((º   `·.,,  (((º

Jean V. Adams
Statistician
U.S. Geological Survey
Great Lakes Science Center
223 East Steinfest Road
Antigo, WI 54409  USA



From:
Sverre Stausland john...@fas.harvard.edu
To:
r-help@r-project.org
Date:
08/02/2011 01:24 PM
Subject:
[R] Extract names from vector according to their values
Sent by:
r-help-boun...@r-project.org



Dear helpers,

I can create a vector with the priority of the packages that came with
R, like this:

 installed.packages()[, "Priority"] -> my.vector
 my.vector
         base          boot         class       cluster     codetools
       "base" "recommended" "recommended" "recommended" "recommended"
     compiler      datasets       foreign      graphics     grDevices
       "base"        "base" "recommended"        "base"        "base"
         grid    KernSmooth       lattice          MASS        Matrix
       "base" "recommended" "recommended" "recommended" "recommended"
      methods          mgcv          nlme          nnet         rpart
       "base" "recommended" "recommended" "recommended" "recommended"
      spatial       splines         stats        stats4      survival
"recommended"        "base"        "base"        "base" "recommended"
        tcltk         tools         utils
       "base"        "base"        "base"

How can I extract the names from this vector according to their
priority? I.e. I want to create a vector from this with the names of
the base packages, and another vector with the names of the
recommended packages.

Thank you
Sverre


Re: [R] Extract names from vector according to their values