Re: [R] Sending Email with Attachment

2013-06-10 Thread Enrico Schumann
On Mon, 10 Jun 2013, Bhupendrasinh Thakre vickytha...@gmail.com writes:

 Thanks Rex for the help. So it seems that I might have to use Python or Perl
 to perform the action.

 
On Windows, you may want to look at Blat ( http://www.blat.net/ ).  You
can easily use it from R scripts via 'system'.
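
For instance, a minimal sketch (the addresses, file names and SMTP server
below are placeholders, and the exact Blat switches should be checked against
the Blat documentation):

## build a Blat command line and run it via system()
cmd <- paste("blat body.txt",                  # file containing the message body
             "-to recipient@example.com",
             "-s \"Run report\"",              # subject (check the switch name in the Blat docs)
             "-attach target_score.pdf",
             "-server smtp.example.com")       # SMTP server (check the switch name too)
system(cmd)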




 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
 Behalf Of rex
 Sent: Sunday, June 09, 2013 10:27 PM
 To: r-help@r-project.org
 Subject: Re: [R] Sending Email with Attachment

 Bhupendrasinh Thakre vickytha...@gmail.com [2013-06-09 20:03]:

library(sendmailR)

from <- "a...@outlook.com"
to <- "e...@gmail.com"
subject <- "Run at"
mailControl = list(smtpServer = "blu-m.hotmail.com")
attachment <- "type_1.pdf"
attachmentName <- "target_score.pdf"
attachmentObject <- mime_part(x = attachment, name = attachmentName)
body <- "Email Body"
bodywithAttachement <- list(body, attachmentObject)
sendmail(from = from, to = to, subject = subject,
         msg = bodywithAttachement, control = mailControl)

However it gives me following Error:

Error:

Error in socketConnection(host = server, port = port, blocking = TRUE) :
  cannot open the connection
In addition: Warning message:
In socketConnection(host = server, port = port, blocking = TRUE) :
  blu-m.hotmail.com:25 cannot be opened

 It's an unsurprising result since telnet doesn't connect either:
   
 telnet blu-m.hotmail.com 25
 Trying 65.55.121.94...

[...]

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] reshaping a data frame

2013-06-10 Thread Abhishek Pratap
Hi Guys

I am trying to cast a data frame but not aggregate the rows for the
same variable.

here is a contrived example.

**input**
temp_df <-
data.frame(names=c('foo','foo','foo'),variable=c('w','w','w'),value=c(34,65,12))
 temp_df
  names variable value
1   foo        w    34
2   foo        w    65
3   foo        w    12


###
**Want this**

names  w
foo 34
foo 65
foo 12


##
**getting this***
##
 cast(temp_df)
Aggregation requires fun.aggregate: length used as default
  names w
1   foo 3


In the real dataset, the categorical column 'variable' will have many more
categories.

Thanks!
-Abhi

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cannot load pbdMPI package after compilation

2013-06-10 Thread Prof Brian Ripley

On 10/06/2013 03:17, Pascal Oettli wrote:

Hello,

I am not sure whether it helps you, but I was able to install it.

OpenSUSE 12.3
R version 3.0.1 Patched (2013-06-09 r62918)
pbdMPI version 0.1-6
gcc version 4.7.2
OpenMPI version 1.6.3

I didn't try with the most recent version of ompi (1.6.4).


But the system used to accept that version of pbdMPI for CRAN used it, 
with gcc.


The issue here is likely to be using the Intel compiler with OpenMPI. 
This is a programming matter really off-topic for R-help (see the 
posting guide).  The first port of call for help is the package 
maintainer, then if that does not help, the R-devel list.  But very few 
R users have access to an Intel compiler, let alone one as recent as 
that, and you will be expected to use a debugger for yourself (see 
'Writing R Extensions').




Regards,
Pascal


On 07/06/13 21:42, Antoine Migeon wrote:

Hello,

I am trying to install pbdMPI.
Compilation is successful, but loading fails with a segfault.

Can anyone help me?

R version 3.0.0
pbdMPI version 0.1-6
Intel compiler version 13.1.1
OpenMPI version 1.6.4-1
CPU Intel x86_64

# R CMD INSTALL pbdMPI_0.1-6.tar.gz
..

checking for gcc... icc -std=gnu99
checking whether the C compiler works... yes
checking for C compiler default output file name... a.out
checking for suffix of executables...
checking whether we are cross compiling... no
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether icc -std=gnu99 accepts -g... yes
checking for icc -std=gnu99 option to accept ISO C89... none needed
checking for mpirun... mpirun
checking for mpiexec... mpiexec
checking for orterun... orterun
checking for sed... /bin/sed
checking for mpicc... mpicc
checking for ompi_info... ompi_info
checking for mpich2version... F
found sed, mpicc, and ompi_info ...

TMP_INC_DIRS = /opt/openmpi/1.6.4-1/intel-13.1.1/include

checking /opt/openmpi/1.6.4-1/intel-13.1.1/include ...
found /opt/openmpi/1.6.4-1/intel-13.1.1/include/mpi.h ...

TMP_LIB_DIRS = /opt/openmpi/1.6.4-1/intel-13.1.1/lib64

checking /opt/openmpi/1.6.4-1/intel-13.1.1/lib64 ...
found /opt/openmpi/1.6.4-1/intel-13.1.1/lib64/libmpi.so ...
found mpi.h and libmpi.so ...

TMP_INC = /opt/openmpi/1.6.4-1/intel-13.1.1/include
TMP_LIB = /opt/openmpi/1.6.4-1/intel-13.1.1/lib64

checking for openpty in -lutil... yes
checking for main in -lpthread... yes

*** Results of pbdMPI package configure *


TMP_INC = /opt/openmpi/1.6.4-1/intel-13.1.1/include
TMP_LIB = /opt/openmpi/1.6.4-1/intel-13.1.1/lib64
MPI_ROOT =
MPITYPE = OPENMPI
MPI_INCLUDE_PATH = /opt/openmpi/1.6.4-1/intel-13.1.1/include
MPI_LIBPATH = /opt/openmpi/1.6.4-1/intel-13.1.1/lib64
MPI_LIBS =  -lutil -lpthread
MPI_DEFS = -DMPI2
MPI_INCL2 =
PKG_CPPFLAGS = -I/opt/openmpi/1.6.4-1/intel-13.1.1/include  -DMPI2

-DOPENMPI

PKG_LIBS = -L/opt/openmpi/1.6.4-1/intel-13.1.1/lib64 -lmpi  -lutil

-lpthread
*
..
icc -std=gnu99 -I/usr/local/R/3.0.0/intel13/lib64/R/include -DNDEBUG
-I/opt/openmpi/1.6.4-1/intel-13.1.1/include  -DMPI2 -DOPENMPI -O3
-fp-model precise -pc 64 -axAVX-fpic  -O3 -fp-model precise  -pc 64
-axAVX  -c comm_errors.c -o comm_errors.o
icc -std=gnu99 -I/usr/local/R/3.0.0/intel13/lib64/R/include -DNDEBUG
-I/opt/openmpi/1.6.4-1/intel-13.1.1/include  -DMPI2 -DOPENMPI -O3
-fp-model precise -pc 64 -axAVX-fpic  -O3 -fp-model precise  -pc 64
-axAVX  -c comm_sort_double.c -o comm_sort_double.o
.
..

** testing if installed package can be loaded
sh: line 1:  2905 Segmentation fault
'/usr/local/R/3.0.0/intel13/lib64/R/bin/R' --no-save --slave 2>&1 <
/tmp/RtmpGkncGK/file1e541c57190
ERROR: loading failed

  *** caught segfault ***
address (nil), cause 'unknown'

Traceback:
  1: .Call("spmd_initialize", PACKAGE = "pbdMPI")
  2: fun(libname, pkgname)
  3: doTryCatch(return(expr), name, parentenv, handler)
  4: tryCatchOne(expr, names, parentenv, handlers[[1L]])
  5: tryCatchList(expr, classes, parentenv, handlers)
  6: tryCatch(fun(libname, pkgname), error = identity)
  7: runHook(".onLoad", env, package.lib, package)
  8: loadNamespace(package, c(which.lib.loc, lib.loc))
  9: doTryCatch(return(expr), name, parentenv, handler)
10: tryCatchOne(expr, names, parentenv, handlers[[1L]])
11: tryCatchList(expr, classes, parentenv, handlers)
12: tryCatch(expr, error = function(e) {    call <- conditionCall(e)
if (!is.null(call)) {    if (identical(call[[1L]],
quote(doTryCatch)))    call <- sys.call(-4L)    dcall <-
deparse(call)[1L]    prefix <- paste("Error in", dcall, ": ")
LONG <- 75L    msg <- conditionMessage(e)    sm <- strsplit(msg,
"\n")[[1L]]    w <- 14L + nchar(dcall, type = "w") + nchar(sm[1L],
type = "w")    if (is.na(w))    w <- 14L + nchar(dcall,
type = "b") + nchar(sm[1L], type = "b")    if (w >
LONG)    prefix <- paste0(prefix, "\n  ")}    else prefix
<- "Error : "    msg <- paste0(prefix, 

[R] problems with setClass or/and setMethod

2013-06-10 Thread andreas betz
Hello,

I am working my way through A (not so) Short introduction to S4

I created a class

setClass(Class = "Trajectories",
         representation = representation(times = "numeric", traj = "matrix"))

and tried to build a method using

setMethod(
  f = "plot",
  signature = "Trajectories",
  definition = function(X, y, ...){
    matplot(x@times, t(x@traj), xaxt = "n", type = "l", ylab = "",
            xlab = "", pch = 1)
    axis(1, at = x@times)
  }
)

R responds with an error message:

Creating a generic function for ‘plot’ from package ‘graphics’ in the
global environment
Error in conformMethod(signature, mnames, fnames, f, fdef, definition) :
  in method for ‘plot’ with signature ‘x="Trajectories"’: formal arguments
(x = "Trajectories", y = "Trajectories", ... = "Trajectories") omitted in
the method definition cannot be in the signature

Did anything change in the transition to R-3.0?

is there any other, more recent introduction to S4 classes recommended?

Thank you

for your help.

Andreas

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Identifying breakpoints/inflection points?

2013-06-10 Thread dchristop
You can try this:

library(inflection)
# you have to install package inflection first
a <- findiplist(cbind(year), cbind(piproute), 1)
a

The answer:
     [,1] [,2]   [,3]
[1,]    5   35 1986.0
[2,]    5   30 1983.5

shows that the total inflection point is between 1983 and 1986, if we treat
data as first concave and then convex, as it can be found from a simple
graph.



--
View this message in context: 
http://r.789695.n4.nabble.com/Identifying-breakpoints-inflection-points-tp2065886p4669117.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Create a package with package.skeleton

2013-06-10 Thread jpm miao
Hi,

   I am trying to build a package with package.skeleton function.

   I already have the function quadprod2.R in the current folder. After
running the program,

library(frontier)
source("quadprod2.R")
package.skeleton(name="sfa_ext")

 package.skeleton(name="sfa_ext")
Creating directories ...
Creating DESCRIPTION ...
Creating NAMESPACE ...
Creating Read-and-delete-me ...
Saving functions and data ...
Making help files ...
Done.
Further steps are described in './sfa_ext/Read-and-delete-me'.

Opening the Read-and-delete-me file by notepad, I find

* Edit the help file skeletons in 'man', possibly
  combining help files for multiple functions.
* Edit the exports in 'NAMESPACE', and add
  necessary imports.
* Put any C/C++/Fortran code in 'src'.
* If you have compiled code, add a useDynLib()
  directive to 'NAMESPACE'.
* Run R CMD build to build the package tarball.
* Run R CMD check to check the package tarball.

Read Writing R Extensions for more information.


Then it seems that I need to edit some documentation. Since I am building the
package primarily for myself, I want to spend as little time as possible
editing the documentation. I have done almost nothing with 'man' and
'NAMESPACE'. (Is that ok?)
What should I do next?
How can I run the build and check commands?
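
From the Read-and-delete-me notes above, a minimal sketch of those two steps
(run in the directory that contains 'sfa_ext'; the tarball name depends on
the Version field in DESCRIPTION, so "sfa_ext_1.0.tar.gz" is a placeholder):

## invoke the standard build/check tools from within R via system(),
## or run the same commands directly at a shell prompt
system("R CMD build sfa_ext")
system("R CMD check sfa_ext_1.0.tar.gz")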

Thanks

Miao

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] problems with setClass or/and setMethod

2013-06-10 Thread David Winsemius

On Jun 9, 2013, at 11:37 PM, andreas betz wrote:

 Hello,
 
 I am working my way through A (not so) Short introduction to S4
 
 I created a class
 
 setClass(Class = "Trajectories",
 representation = representation(times = "numeric", traj = "matrix"))
 
 and tried to build a method using
 
 setMethod(
  f = "plot",
  signature = "Trajectories",
  definition = function(X, y, ...){
    matplot(x@times, t(x@traj), xaxt = "n", type = "l", ylab = "",
 xlab = "", pch = 1)
    axis(1, at = x@times)
  }
)
 
 R responds with an error message:
 
 Creating a generic function for ‘plot’ from package ‘graphics’ in the
 global environment
 Error in conformMethod(signature, mnames, fnames, f, fdef, definition) :
  in method for ‘plot’ with signature ‘x="Trajectories"’: formal arguments
 (x = "Trajectories", y = "Trajectories", ... = "Trajectories") omitted in
 the method definition cannot be in the signature
 
 Did anything change in the transition to R-3.0?

I doubt it worked in earlier versions. There is a misprint: 'X' where there
should be an 'x'. I'm unable to explain why the 'y' is alongside the 'x' in
the argument list, since the 'definition' function does nothing with it.
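
A minimal corrected sketch along those lines (untested here; lowercase 'x'
so the method's formal arguments match the generic plot(x, y, ...)):

setClass(Class = "Trajectories",
         representation = representation(times = "numeric", traj = "matrix"))

setMethod(
  f = "plot",
  signature = "Trajectories",
  definition = function(x, y, ...) {
    ## 'x' (not 'X') so that conformMethod() accepts the definition
    matplot(x@times, t(x@traj), xaxt = "n", type = "l",
            ylab = "", xlab = "", pch = 1)
    axis(1, at = x@times)
  }
)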

 
 is there any other, more recent introduction to S4 classes recommended?
 
 Thank you
 
 for your help.
 
 Andreas
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Error Object Not Found

2013-06-10 Thread Rui Barradas

Hello,

Please quote context.
The message you get means that the package 'foreign' is installed on your
computer; you still need to load it in the R session:


library(foreign)

Hope this helps,

Rui Barradas

On 09-06-2013 23:07, Court wrote:

Hi,

I think that they are loaded.  Here is the response that I get:

  package ‘foreign’ successfully unpacked and MD5 sums checked




--
View this message in context: 
http://r.789695.n4.nabble.com/Error-Object-Not-Found-tp4669041p4669100.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to expand.grid with string elements (the half!)

2013-06-10 Thread Rolf Turner


Your question makes no sense at all.  The grid expansion
has 9 rows.  In case you hadn't noticed, 9 is an odd number
(i.e. not divisible by 2).  There are no halves.

Do not expect the list to read your mind.  Instead, ask a
meaningful question.

cheers,

Rolf Turner

On 10/06/13 17:25, Gundala Viswanath wrote:

I have the following result of expand grid:


d <- expand.grid(c("x","y","z"), c("x","y","z"))

What I want is to create a combination of strings
but only the half of the all combinations:

  Var1 Var2
1    x    x
2    y    x
3    y    y
4    z    y
5    x    z
6    z    z


What's the way to do it?


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Not sure this is something R could do but it feels like it should be.

2013-06-10 Thread Jim Lemon

On 06/09/2013 11:14 PM, Calum Polwart wrote:

...
What we are trying to do is determine the most appropriate number to
make the capsules. (Our dosing is more complex, but let's stick to
something simple. I can safely assure you that virtually no-one actually
needs 250 or 500mg as a dose of amoxicillin... that's just a dose to
get them into a therapeutic window, and I'm 99% certain 250 and 500 are
used because they are round numbers. If 337.5 more reliably got everyone in
the window without kicking anyone out of the window, that'd be a better dose
to use!) So... what I'm looking to do is model the 'theoretical dose
required' (which we know) and the dose delivered using several starting
points to get the 'best fit'. We know they need to be within 7% of each
other, but if one starting point can get 85% of doses within 5% we think
that might be better than one that only gets 50% within 5%.



Okay, I think I see what you are attempting now. You are stuck with 
fairly large dosage increments (say powers of two) and you want to have 
a base value that will be appropriate for the greatest number of 
patients. So, your range of doses can be generated with:


d * 2 ^ (0:m)

where d is some constant and m+1 is the number of doses you want to 
generate. For your amoxicillin, d=250 and m=1, so you get 250 and 500mg. 
Given this relationship (or any other one you can define), you want to 
set your base dose so that it is close to the mode of the patient 
distribution. This means that the greatest number of patients will be 
suitably dosed with your base dose. I would probably try to solve this 
by brute force, setting the base dose at the mode and then moving it up 
and down until the dose was appropriate for the largest number of patients.
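
A rough sketch of that brute-force idea (entirely made-up data; the names
'required', 'coverage' and 'candidates' are just illustrative):

set.seed(1)
required <- rlnorm(1000, meanlog = log(400), sdlog = 0.3)  # hypothetical dose requirements

coverage <- function(d, m = 1, tol = 0.07) {
  doses <- d * 2 ^ (0:m)                  # available dose strengths
  ## for each patient, use the available dose closest to the requirement
  nearest <- sapply(required, function(r) doses[which.min(abs(doses - r))])
  mean(abs(nearest - required) / required <= tol)
}

## scan candidate base doses around the centre of the requirements
candidates <- seq(200, 600, by = 2.5)
best <- candidates[which.max(sapply(candidates, coverage))]
best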


However, there are a lot of people on this list who would be more 
familiar with this sort of problem, and there may be a more elegant 
solution.


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] modify and append new rows to a data.frame using ddply

2013-06-10 Thread Santiago Guallar
Hi,

I have a data.frame that contains a variable act which records the duration (in 
seconds) of two states (wet-dry) for several individuals (identified by Ring) 
over a period of time. Since I want to work with daytime (i.e. from sunrise 
till sunset) and night time (i.e. from sunset till next sunrise), I have to 
split act from time[i] till sunset and from sunset until time[i+1], and from 
time[k] till sunrise and from sunrise until time[k+1].

Here is an example with time and act separated by a comma:

[i] 01-01-2000 20:55:00 , 360 
[i+1] 01-01-2000 21:01:00 , 30 # let's say that sunset is at 01-01-2000 21:00:00

[i+2] 01-01-2000 21:01:30 , 30

.
.
.
My goal is to get:
[i] 01-01-2000 20:55:00 , 300 # act is modified

[i+1] 01-01-2000 21:00:00 , 60 # new row with time=sunset

[i+2] 01-01-2000 21:01:00 , 30 # previously row i+1th

[i+3] 01-01-2000 21:01:30 , 30 # previously row i+2th

.
.
.
I attach a dput with a selection of my data.frame. Here is a piece of existing 
code that I am trying to adapt just for the daytime/night time change:

  require(plyr)

  xandaynight <- ddply( xan, .(Ring), function(df1){
  # index of day/night changes
  ind <- c( FALSE, diff(df$dif) == 1 )
  add <- df1[ind,]
  add$timepos <- add$dusk
  # rearrangement
  df1 <- rbind( df1, add )
  df1 <- df1[order(df1$timepos),]
  # recalculation of act
  df1$act2 <- c( diff( as.numeric(df1$timepos) ), NA )
  df1} )

This code produces an error message:
Error en diff(df$dif): error in evaluating the argument 'x' in selecting a 
method for function 'diff': Error en df$dif: object of type 'closure' is not a 
subset

Thank you for your help,

Santi

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Estimation of covariance matrices and mixing parameter by a bivariate normal-lognormal model

2013-06-10 Thread hertzogg
Dear all,
I have to create a model which is a mixture of a normal and a log-normal
distribution. To create it, I need to estimate the 2 covariance matrices and
the mixing parameter (7 parameters in total) by maximizing the log-likelihood
function. This maximization has to be performed by the nlm routine.
As I use relative data, the means are known and equal to 1.
I’ve already tried to do it in 1 dimension (with 1 set of relative data) and
it works well. However, when I introduce the 2nd set of relative data I get
illogical results for the correlation and a lot of warning messages.
To estimate the parameters, I first defined the log-likelihood function using
dmvnorm and dlnorm.rplus. Then I assigned starting values to the parameters
and finally used the nlm routine to estimate them (see the script below).
# Importing and reading the grid files. Output are 2048x2048 matrices

P <- read.ascii.grid("d:/Documents/JOINT_FREQUENCY/grid_E727_P-3000.asc",
return.header= FALSE );
V <- read.ascii.grid("d:/Documents/JOINT_FREQUENCY/grid_E727_V-3000.asc",
return.header= FALSE );

p <- c(P); # transform matrix into a vector
v <- c(V);

p <- p[!is.na(p)] # removing NA values
v <- v[!is.na(v)]

p_rel <- p/mean(p) # transforming the data to relative values
v_rel <- v/mean(v)
PV <- cbind(p_rel, v_rel) # create a matrix of vectors

L <- function(par, p_rel, v_rel) {

return (-sum(log( (1- par[7])*dmvnorm(PV, mean=c(1,1), sigma=
matrix(c(par[1], par[1]*par[2]*par[3], par[1]*par[2]*par[3], par[2] ),nrow=2,
ncol=2))+
par[7]*dlnorm.rplus(PV, meanlog=c(1,1), varlog=
matrix(c(par[4],par[4]*par[5]*par[6],par[4]*par[5]*par[6],par[5]),
nrow=2,ncol=2)))))

}
par.start <- c(0.74, 0.66, 0.40, 1.4, 1.2, 0.4, 0.5) # log-likelihood estimators
result <- nlm(L, par.start, v_rel=v_rel, p_rel=p_rel, hessian=TRUE, iterlim=200,
check.analyticals= TRUE)
There were 50 or more warnings (use warnings() to see the first 50)
1: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
NaNs produced
2: In sqrt(2 * pi * det(varlog)) : NaNs produced
3: In nlm(L, par.start, v_rel = v_rel, p_rel = p_rel, hessian = TRUE,  ... : 
NA/Inf replaced by maximum positive value
4: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
NaNs produced
5: In sqrt(2 * pi * det(varlog)) : NaNs produced
6: In nlm(L, par.start, v_rel = v_rel, p_rel = p_rel, hessian = TRUE,  ... : 
NA/Inf replaced by maximum positive value
par.hat <- result$estimate
cat("sigN_p =", par[1], "\n", "sigN_v =", par[2], "\n", "rhoN =",
par[3], "\n", "sigLN_p =", par[4], "\n", "sigLN_v =", par[5], "\n", "rhoLN =",
par[6], "\n", "mixing parameter =", par[7], "\n")
sigN_p = 0.2919377 
 sigN_v = 0.4445056 
 rhoN = 1.737904 
 sigLN_p = 2.911735 
 sigLN_v = 2.539405 
 rhoLN = 0.3580525 
 mixing parameter = 0.8112917

Does someone know what is wrong in my model, or how I should go about finding
these parameters in 2 dimensions?
Thank you very much for taking time to look at my questions.
Regards,
Gladys Hertzog




--
View this message in context: 
http://r.789695.n4.nabble.com/Estimation-of-covariance-matrices-and-mixing-parameter-by-a-bivariate-normal-lognormal-model-tp4669143.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] modify and append new rows to a data.frame using ddply

2013-06-10 Thread Berend Hasselman

On 10-06-2013, at 11:49, Santiago Guallar sgual...@yahoo.com wrote:

 Hi,
 
 I have a data.frame that contains a variable act which records the duration 
 (in seconds) of two states (wet-dry) for several individuals (identified by 
 Ring) over a period of time. Since I want to work with daytime (i.e. from 
 sunrise till sunset) and night time (i.e. from sunset till next sunrise), I 
 have to split act from time[i] till sunset and from sunset until time[i+1], 
 and from time[k] till sunrise and from sunrise until time[k+1].
 
 Here is an example with time and act separated by a comma:
 
 [i] 01-01-2000 20:55:00 , 360 
 [i+1] 01-01-2000 21:01:00 , 30 # let's say that sunset is at 01-01-2000 
 21:00:00
 
 [i+2] 01-01-2000 21:01:30 , 30
 
 .
 .
 .
 My goal is to get:
 [i] 01-01-2000 20:55:00 , 300 # act is modified
 
 [i+1] 01-01-2000 21:00:00 , 60 # new row with time=sunset
 
 [i+2] 01-01-2000 21:01:00 , 30 # previously row i+1th
 
 [i+3] 01-01-2000 21:01:30 , 30 # previously row i+2th
 
 .
 .
 .
 I attach a dput with a selection of my data.frame. Here is a piece of 
 existing code that I am trying to adapt just for the daytime/night time 
 change:
 
    require(plyr)
  
    xandaynight <- ddply( xan, .(Ring), function(df1){
    # index of day/night changes
    ind <- c( FALSE, diff(df$dif) == 1 )
    add <- df1[ind,]
    add$timepos <- add$dusk
    # rearrangement
    df1 <- rbind( df1, add )
    df1 <- df1[order(df1$timepos),]
    # recalculation of act
    df1$act2 <- c( diff( as.numeric(df1$timepos) ), NA )
    df1} )
 
 This code produces an error message:
 Error en diff(df$dif): error in evaluating the argument 'x' in selecting a 
 method for function 'diff': Error en df$dif: object of type 'closure' is not 
 a subset
 

Shouldn't the line

  ind <- c( FALSE, diff(df$dif) == 1 )

read

  ind <- c( FALSE, diff(df1$dif) == 1 )

Berend

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] cannot load pbdMPI package after compilation

2013-06-10 Thread Antoine Migeon
Thank you, I will try contact the developper.

Antoine Migeon
Université de Bourgogne
Centre de Calcul et Messagerie
Direction des Systèmes d'Information

tel : 03 80 39 52 70
Site du CCUB : http://www.u-bourgogne.fr/dsi-ccub

On 10/06/2013 08:19, Prof Brian Ripley wrote:
 On 10/06/2013 03:17, Pascal Oettli wrote:
 Hello,

 I am not sure whether it helps you, but I was able to install it.

 OpenSUSE 12.3
 R version 3.0.1 Patched (2013-06-09 r62918)
 pbdMPI version 0.1-6
 gcc version 4.7.2
 OpenMPI version 1.6.3

 I didn't try with the most recent version of ompi (1.6.4).

 But the system used to accept that version of pbdMPI for CRAN used it,
 with gcc.

 The issue here is likely to be using the Intel compiler with OpenMPI.
 This is a programming matter really off-topic for R-help (see the
 posting guide).  The first port of call for help is the package
 maintainer, then if that does not help, the R-devel list.  But very
 few R users have access to an Intel compiler, let alone one as recent
 as that, and you will be expected to use a debugger for yourself (see
 'Writing R Extensions').


 Regards,
 Pascal


 On 07/06/13 21:42, Antoine Migeon wrote:
 Hello,

 I am trying to install pbdMPI.
 Compilation is successful, but loading fails with a segfault.

 Can anyone help me?

 R version 3.0.0
 pbdMPI version 0.1-6
 Intel compiler version 13.1.1
 OpenMPI version 1.6.4-1
 CPU Intel x86_64

 # R CMD INSTALL pbdMPI_0.1-6.tar.gz
 ..
 
 checking for gcc... icc -std=gnu99
 checking whether the C compiler works... yes
 checking for C compiler default output file name... a.out
 checking for suffix of executables...
 checking whether we are cross compiling... no
 checking for suffix of object files... o
 checking whether we are using the GNU C compiler... yes
 checking whether icc -std=gnu99 accepts -g... yes
 checking for icc -std=gnu99 option to accept ISO C89... none needed
 checking for mpirun... mpirun
 checking for mpiexec... mpiexec
 checking for orterun... orterun
 checking for sed... /bin/sed
 checking for mpicc... mpicc
 checking for ompi_info... ompi_info
 checking for mpich2version... F
 found sed, mpicc, and ompi_info ...
 TMP_INC_DIRS = /opt/openmpi/1.6.4-1/intel-13.1.1/include
 checking /opt/openmpi/1.6.4-1/intel-13.1.1/include ...
 found /opt/openmpi/1.6.4-1/intel-13.1.1/include/mpi.h ...
 TMP_LIB_DIRS = /opt/openmpi/1.6.4-1/intel-13.1.1/lib64
 checking /opt/openmpi/1.6.4-1/intel-13.1.1/lib64 ...
 found /opt/openmpi/1.6.4-1/intel-13.1.1/lib64/libmpi.so ...
 found mpi.h and libmpi.so ...
 TMP_INC = /opt/openmpi/1.6.4-1/intel-13.1.1/include
 TMP_LIB = /opt/openmpi/1.6.4-1/intel-13.1.1/lib64
 checking for openpty in -lutil... yes
 checking for main in -lpthread... yes

 *** Results of pbdMPI package configure
 *

 TMP_INC = /opt/openmpi/1.6.4-1/intel-13.1.1/include
 TMP_LIB = /opt/openmpi/1.6.4-1/intel-13.1.1/lib64
 MPI_ROOT =
 MPITYPE = OPENMPI
 MPI_INCLUDE_PATH = /opt/openmpi/1.6.4-1/intel-13.1.1/include
 MPI_LIBPATH = /opt/openmpi/1.6.4-1/intel-13.1.1/lib64
 MPI_LIBS =  -lutil -lpthread
 MPI_DEFS = -DMPI2
 MPI_INCL2 =
 PKG_CPPFLAGS = -I/opt/openmpi/1.6.4-1/intel-13.1.1/include  -DMPI2
 -DOPENMPI
 PKG_LIBS = -L/opt/openmpi/1.6.4-1/intel-13.1.1/lib64 -lmpi  -lutil
 -lpthread
 *

 ..
 icc -std=gnu99 -I/usr/local/R/3.0.0/intel13/lib64/R/include -DNDEBUG
 -I/opt/openmpi/1.6.4-1/intel-13.1.1/include  -DMPI2 -DOPENMPI -O3
 -fp-model precise -pc 64 -axAVX-fpic  -O3 -fp-model precise  -pc 64
 -axAVX  -c comm_errors.c -o comm_errors.o
 icc -std=gnu99 -I/usr/local/R/3.0.0/intel13/lib64/R/include -DNDEBUG
 -I/opt/openmpi/1.6.4-1/intel-13.1.1/include  -DMPI2 -DOPENMPI -O3
 -fp-model precise -pc 64 -axAVX-fpic  -O3 -fp-model precise  -pc 64
 -axAVX  -c comm_sort_double.c -o comm_sort_double.o
 .
 ..
 
 ** testing if installed package can be loaded
 sh: line 1:  2905 Segmentation fault
 '/usr/local/R/3.0.0/intel13/lib64/R/bin/R' --no-save --slave 2>&1 < 
 /tmp/RtmpGkncGK/file1e541c57190
 ERROR: loading failed

   *** caught segfault ***
 address (nil), cause 'unknown'

 Traceback:
   1: .Call("spmd_initialize", PACKAGE = "pbdMPI")
   2: fun(libname, pkgname)
   3: doTryCatch(return(expr), name, parentenv, handler)
   4: tryCatchOne(expr, names, parentenv, handlers[[1L]])
   5: tryCatchList(expr, classes, parentenv, handlers)
   6: tryCatch(fun(libname, pkgname), error = identity)
   7: runHook(".onLoad", env, package.lib, package)
   8: loadNamespace(package, c(which.lib.loc, lib.loc))
   9: doTryCatch(return(expr), name, parentenv, handler)
 10: tryCatchOne(expr, names, parentenv, handlers[[1L]])
 11: tryCatchList(expr, classes, parentenv, handlers)
 12: tryCatch(expr, error = function(e) {    call <- conditionCall(e)
 if (!is.null(call)) {    if (identical(call[[1L]],
 quote(doTryCatch)))    call <- sys.call(-4L)    dcall <-
 deparse(call)[1L]    prefix <- paste("Error in", dcall, ": ")
 LONG <- 75L 

Re: [R] agnes() in package cluster on R 2.14.1 and R 3.0.1

2013-06-10 Thread Martin Maechler
 Hugo Varet vareth...@gmail.com
 on Sun, 9 Jun 2013 11:43:32 +0200 writes:

 Dear R users,
 I discovered something strange using the function agnes() of the cluster
 package on R 3.0.1 and on R 2.14.1. Indeed, the clusterings obtained are
 different whereas I ran exactly the same code.

hard to believe... but ..

 I quickly looked at the source code of the function and I discovered that
 there was an important change: agnes() in R 2.14.1 used a FORTRAN code
 whereas agnes() in R 3.0.1 uses a C code.

well, it does so quite a bit longer, e.g., also in R 2.15.0

 Here is one of the contingency tables between R 2.14.1 and R 3.0.1:
                      classe.agnTani.2.14.1
 classe.agnTani.3.0.1   1   2   3
                    1  74   0 229
                    2   0 235   0
                    3 120   0  15

 So, I was wondering if it was normal that the C and FORTRAN codes give
 different results?

It's not normal, and I'm pretty sure I have had many many
examples which gave identical results.

Can you provide a reproducible example, please?
If the example is too large [for dput() ], please send me the *.rda
file produced from
 save(your data, file = "the file I need")
*and* the exact call to agnes() for your data.

Thank you in advance!

Martin Maechler,
the one you could have e-mailed directly 
to using   maintainer("cluster") ...


 Best regards,
 Hugo Varet

 [[alternative HTML version deleted]]
 ^ try to avoid, please ^

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

yes indeed, please.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reshaping a data frame

2013-06-10 Thread Adams, Jean
Abhi,

In the example you give, you don't really need to reshape the data ... just
rename the column value to w.

Here's a different example with more than one category ...
tempdf <- expand.grid(names=c("foo", "bar"), variable=letters[1:3])
tempdf$value <- rnorm(dim(tempdf)[1])
tempdf
library(reshape)
cast(tempdf)

But that may not be what you want. If not, please give an example with
more than one category, showing us what you have and what you want.

Jean



On Mon, Jun 10, 2013 at 1:15 AM, Abhishek Pratap abhishek@gmail.comwrote:

 Hi Guys

 I am trying to cast a data frame but not aggregate the rows for the
 same variable.

 here is a contrived example.

 **input**
 temp_df <-
 data.frame(names=c('foo','foo','foo'),variable=c('w','w','w'),value=c(34,65,12))
  temp_df
   names variable value
 1   foo        w    34
 2   foo        w    65
 3   foo        w    12


 ###
 **Want this**
 
 names  w
 foo 34
 foo 65
 foo 12


 ##
 **getting this***
 ##
  cast(temp_df)
 Aggregation requires fun.aggregate: length used as default
   names w
 1   foo 3


 In real dataset  the categorical column 'variable' will have many more
 categorical variable.

 Thanks!
 -Abhi

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reshaping a data frame

2013-06-10 Thread John Kane
Unless I completely misunderstand what you are doing you don't need to 
aggregate, just drop the one column and rename things

newtemp <- temp_df[, c(1,3)]
names(newtemp) <- c("names", "w")
newtemp


John Kane
Kingston ON Canada


 -Original Message-
 From: abhishek@gmail.com
 Sent: Sun, 9 Jun 2013 23:15:48 -0700
 To: r-help@r-project.org
 Subject: [R] reshaping a data frame
 
 Hi Guys
 
 I am trying to cast a data frame but not aggregate the rows for the
 same variable.
 
 here is a contrived example.
 
 **input**
 temp_df <-
 data.frame(names=c('foo','foo','foo'),variable=c('w','w','w'),value=c(34,65,12))
 temp_df
   names variable value
 1   foo        w    34
 2   foo        w    65
 3   foo        w    12
 
 
 ###
 **Want this**
 
 names  w
 foo 34
 foo 65
 foo 12
 
 
 ##
 **getting this***
 ##
 cast(temp_df)
 Aggregation requires fun.aggregate: length used as default
   names w
 1   foo 3
 
 
 In real dataset  the categorical column 'variable' will have many more
 categorical variable.
 
 Thanks!
 -Abhi
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] All against all correlation matrix with GGPLOT Facet

2013-06-10 Thread John Kane
No image.  The R-help list tends to strip out a lot of attachments; a pdf or
txt usually gets through.  In any case, if I understand what you want, this
may do it.

library(ggplot2)
dat1 <- data.frame( v = rnorm(13),
                    w = rnorm(13),
                    x = rnorm(13),
                    y = rnorm(13),
                    z = rnorm(13))
plotmatrix(dat1)
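
For the facet-based layout that was asked about, a minimal sketch (reusing
'dat1' from above; the names 'nm' and 'pairs_df' are just illustrative):

nm <- names(dat1)
pairs_df <- do.call(rbind, lapply(nm, function(a)
  do.call(rbind, lapply(nm, function(b)
    data.frame(var1 = a, var2 = b, x = dat1[[a]], y = dat1[[b]])))))
ggplot(pairs_df, aes(x, y)) +
  geom_point() +                 # one panel per pair of variables
  facet_grid(var1 ~ var2)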

John Kane
Kingston ON Canada


 -Original Message-
 From: gunda...@gmail.com
 Sent: Mon, 10 Jun 2013 12:26:44 +0900
 To: r-h...@stat.math.ethz.ch
 Subject: [R] All against all correlation matrix with GGPLOT Facet
 
 I have the following data:
 
 v <- rnorm(13)
 w <- rnorm(13)
 x <- rnorm(13)
 y <- rnorm(13)
 z <- rnorm(13)
 
 
 Using GGPLOT facet, what I want to do is to create a 5*5 matrix,
 where each cells plot the correlation between
 each pair of the above data. E.g. v-v,v-w; v-x,...,z-z
 
 
 What's the way to do it?
 Attached is the image.
 
 GV.
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reshaping a data frame

2013-06-10 Thread arun
Hi,

If your dataset is similar to the one below:

set.seed(24)
temp1_df <-
data.frame(names=rep(c('foo','foo1'),each=6),variable=rep(c('w','x'),times=6),value=sample(25:40,12,replace=TRUE),stringsAsFactors=FALSE)

library(reshape2)

res <- dcast(within(temp1_df,{Seq1 <- ave(value,names,variable,FUN=seq_along)}),names+Seq1~variable,value.var="value")[,-2]
res
#  names  w  x
#1   foo 29 28
#2   foo 36 33
#3   foo 35 39
#4  foo1 29 37
#5  foo1 37 29
#6  foo1 34 30
A.K.


- Original Message -
From: Abhishek Pratap abhishek@gmail.com
To: r-help@r-project.org r-help@r-project.org
Cc: 
Sent: Monday, June 10, 2013 2:15 AM
Subject: [R] reshaping a data frame

Hi Guys

I am trying to cast a data frame but not aggregate the rows for the
same variable.

here is a contrived example.

**input**
temp_df  - 
data.frame(names=c('foo','foo','foo'),variable=c('w','w','w'),value=c(34,65,12))
 temp_df
  names variable value
1   foo        w    34
2   foo        w    65
3   foo        w    12


###
**Want this**

names  w
foo         34
foo         65
foo         12


##
**getting this***
##
 cast(temp_df)
Aggregation requires fun.aggregate: length used as default
  names w
1   foo 3


In real dataset  the categorical column 'variable' will have many more
categorical variable.

Thanks!
-Abhi

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] please check this

2013-06-10 Thread arun
Hi,
Try this:
which(duplicated(res10Percent))
# [1] 117 125 157 189 213 235 267 275 278 293 301 327 331 335 339 367 369 371 379
#[20] 413 415 417 441 459 461 477 479 505
res10PercentSub1 <- subset(res10Percent[which(duplicated(res10Percent)),],dummy==1)
  # most of the duplicated are dummy==1
res10PercentSub0 <- subset(res10Percent[which(duplicated(res10Percent)),],dummy==0)
indx1 <- as.numeric(row.names(res10PercentSub1))
indx11 <- sort(c(indx1,indx1+1))
indx0 <- as.numeric(row.names(res10PercentSub0))
indx00 <- sort(c(indx0,indx0-1))
indx10 <- sort(c(indx11,indx00))

nrow(res10Percent[-indx10,])
#[1] 452
res10PercentNew <- res10Percent[-indx10,]
nrow(subset(res10PercentNew,dummy==1))
#[1] 226
nrow(subset(res10PercentNew,dummy==0))
#[1] 226
nrow(unique(res10PercentNew))
#[1] 452
A.K.



- Original Message -
From: Cecilia Carmo cecilia.ca...@ua.pt
To: arun smartpink...@yahoo.com
Cc: 
Sent: Monday, June 10, 2013 10:19 AM
Subject: RE: please check this

But I don't want it like this. 
Once a firm is paired with another, these two firms should not be paired again.
Could you solve this?
Thanks,
Cecília



De: arun [smartpink...@yahoo.com]
Enviado: segunda-feira, 10 de Junho de 2013 15:12
Para: Cecilia Carmo
Assunto: Re: please check this

I did look into that.
If you look for the nrow() in each category, then it will be different.  It 
means that the duplicates are not pairwise, but in the whole `result`.  The 
explanation is again with the multiple matches.  So, here we selected the one 
with dummy==0 that closely matches the dimension of one dummy==1.  Suppose, the 
value of dimension with dummy==1` is `2554` and it got a match with dummy==0 
with `2580`.  Now, consider another case with dimension as `2570` with dummy==1 
(which also comes within the same split group).  Then it got a match with 
`2580' with dummy==0.  I guess it was based on the way in which it was tested.







From: Cecilia Carmo cecilia.ca...@ua.pt
To: arun smartpink...@yahoo.com
Sent: Monday, June 10, 2013 10:02 AM
Subject: please check this




When I do

res10Percent <- fun1(final3New,0.1,200)
dim(res10Percent)
#[1] 508   5
nrow(subset(res10Percent,dummy==0))
#[1] 254
nrow(subset(res10Percent,dummy==1))
#[1] 254


testingDuplicates <- unique(res10Percent)
nrow(testingDuplicates)
#[1] 480  # this should be 508; if not, there are duplicated rows, right?


Thanks
Cecilia

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] recode: how to avoid nested ifelse

2013-06-10 Thread Paul Johnson
Thanks, guys.


On Sat, Jun 8, 2013 at 2:17 PM, Neal Fultz nfu...@gmail.com wrote:

 rowSums and Reduce will have the same problems with bad data you alluded
 to earlier, eg
 cg = 1, hs = 0

 But that's something to check for with crosstabs anyway.


This wrong data thing is a distraction here.  I guess I'd have to craft 2
solutions, depending on what the researcher says. (We can't assume es = 0
or es = NA and cg = 1 is bad data. There are some people who finish college
without doing elementary school (wasn't Albert Einstein one of those?) or
high school. I once went to an eye doctor who didn't finish high school,
but nonetheless was admitted to optometrist school.)

I did not know about the Reduce function before this. If we enforce the
ordering and clean up the data in the way you imagine, it would work.

I think the pmax is the most teachable and dependably not-getting-wrongable
approach if the data is not wrong.


 Side note: you should check out the microbenchmark pkg, it's quite handy.


Perhaps the working example of microbenchmark is the best thing in this
thread! I understand the idea behind it, but it seems like I can never get
it to work right. It helps to see how you do it.


 R> require(microbenchmark)
 R> microbenchmark(
 +   f1(cg,hs,es),
 +   f2(cg,hs,es),
 +   f3(cg,hs,es),
 +   f4(cg,hs,es)
 + )
 Unit: microseconds
            expr       min         lq     median         uq       max neval
  f1(cg, hs, es) 23029.848 25279.9660 27024.9640 29996.6810 55444.112   100
  f2(cg, hs, es)   730.665   755.5750   811.7445   934.3320  6179.798   100
  f3(cg, hs, es)    85.029   101.6785   129.8605   196.2835  2820.187   100
  f4(cg, hs, es)   762.232   804.4850   843.7170  1079.0800 24869.548   100

 On Fri, Jun 07, 2013 at 08:03:26PM -0700, Joshua Wiley wrote:
  I still argue for na.rm=FALSE, but that is cute, also substantially
 faster
 
  f1 <- function(x1, x2, x3) do.call(paste0, list(x1, x2, x3))
  f2 <- function(x1, x2, x3) pmax(3*x3, 2*x2, es, 0, na.rm=FALSE)
  f3 <- function(x1, x2, x3) Reduce(`+`, list(x1, x2, x3))
  f4 <- function(x1, x2, x3) rowSums(cbind(x1, x2, x3))
 
  es <- rep(c(0, 0, 1, 0, 1, 0, 1, 1, NA, NA), 1000)
  hs <- rep(c(0, 0, 1, 0, 1, 0, 1, 0, 1, NA), 1000)
  cg <- rep(c(0, 0, 0, 0, 1, 0, 1, 0, NA, NA), 1000)
 
  system.time(replicate(1000, f1(cg, hs, es)))
  system.time(replicate(1000, f2(cg, hs, es)))
  system.time(replicate(1000, f3(cg, hs, es)))
  system.time(replicate(1000, f4(cg, hs, es)))
 
   system.time(replicate(1000, f1(cg, hs, es)))
    user  system elapsed
   22.73    0.03   22.76
   system.time(replicate(1000, f2(cg, hs, es)))
    user  system elapsed
    0.92    0.04    0.95
   system.time(replicate(1000, f3(cg, hs, es)))
    user  system elapsed
    0.19    0.02    0.20
   system.time(replicate(1000, f4(cg, hs, es)))
    user  system elapsed
    0.95    0.03    0.98
 
 
  R version 3.0.0 (2013-04-03)
  Platform: x86_64-w64-mingw32/x64 (64-bit)
 
 
 
 
  On Fri, Jun 7, 2013 at 7:25 PM, Neal Fultz nfu...@gmail.com wrote:
   I would do this to get the highest non-missing level:
  
    x <- pmax(3*cg, 2*hs, es, 0, na.rm=TRUE)
  
   rock chalk...
  
   -nfultz
  
   On Fri, Jun 07, 2013 at 06:24:50PM -0700, Joshua Wiley wrote:
   Hi Paul,
  
   Unless you have truly offended the data generating oracle*, the
   pattern: NA, 1, NA, should be a data entry error --- graduating HS
   implies graduating ES, no?  I would argue fringe cases like that
   should be corrected in the data, not through coding work arounds.
   Then you can just do:
  
    x <- do.call(paste0, list(es, hs, cg))

 table(factor(x, levels = c("000", "100", "110", "111"), labels =
  c("none", "es", "hs", "cg")))
    none   es   hs   cg
       4    1    1    2
  
   Cheers,
  
   Josh
  
   *Drawn from comments by Judea Pearl one lively session.
  
  
   On Fri, Jun 7, 2013 at 6:13 PM, Paul Johnson pauljoh...@gmail.com
 wrote:
In our Summer Stats Institute, I was asked a question that amounts
 to
reversing the effect of the contrasts function (reconstruct an
 ordinal
predictor from a set of binary columns). The best I could think of
 was to
link together several ifelse functions, and I don't think I want to
 do this
if the example became any more complicated.
   
I'm unable to remember a less error prone method :). But I expect
 you might.
   
Here's my working example code
   
## Paul Johnson pauljohn at ku.edu
## 2013-06-07
   
## We need to create an ordinal factor from these indicators
## completed elementary school
es <- c(0, 0, 1, 0, 1, 0, 1, 1)
## completed high school
hs <- c(0, 0, 1, 0, 1, 0, 1, 0)
## completed college graduate
cg <- c(0, 0, 0, 0, 1, 0, 1, 0)

ed <- ifelse(cg == 1, 3,
  ifelse(hs == 1, 2,
    ifelse(es == 1, 1, 0)))

edf <- factor(ed, levels = 0:3, labels = c("none", "es", "hs",
  "cg"))
data.frame(es, hs, cg, ed, edf)

## Looks OK, but what if there are missings?
es <- c(0, 0, 1, 

Re: [R] How to expand.grid with string elements (the half!)

2013-06-10 Thread MacQueen, Don
If you can explain why you want those particular six combinations out of the
complete set of nine, then perhaps someone can tell you how.

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 6/9/13 10:25 PM, Gundala Viswanath gunda...@gmail.com wrote:

I have the following result of expand grid:

 d <- expand.grid(c("x","y","z"), c("x","y","z"))

What I want is to create a combination of strings
but only the half of the all combinations:

  Var1 Var2
1    x    x
2    y    x
3    y    y
4    z    y
5    x    z
6    z    z


What's the way to do it?

G.V.

   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Substituting the values on the y-axis

2013-06-10 Thread diddle1990
Hello, 

I plotted a graph in R showing how salinity (in ‰, y-axis) changes with time
(in years, x-axis). However, right from the beginning on the Excel spreadsheet
the values for salinity appeared as, for example, 35000‰ instead of 35‰, which
I guessed must have been a typing error on the website from which I extracted
the data (NOAA). Thus, I now would like to substitute these values with the
corresponding smaller values, as follows:

25000, 35000 -> 25, 35   and so on.

Is there any way I can change this in R, or do I have to modify these numbers
before inputting the data into R (for example in Excel)? If so, can anybody
tell me how to do either of these?

Many thanks! 

Emanuela 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] twoby2 (Odds Ratio) for variables with 3 or more levels

2013-06-10 Thread Vlatka Matkovic Puljic
Dear all,

I am using the Epi package to calculate odds ratios in my bivariate analysis.
How can I use *twoby2* with variables that have 3 or more levels?

For example:
I have 4 level var (Age)
m=matrix(c(290, 100,232, 201, 136, 99, 182, 240), nrow=4, ncol=2)
twoby2(m)

R gives me only
Comparing : Row 1 vs. Row 2

While I would like to have the reference value in Row 1, and compare Rows 2,
3 and 4 with it.
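
One hedged possibility, assuming twoby2() accepts any 2x2 table, is to build
each comparison against Row 1 by row subsetting:

library(Epi)
m <- matrix(c(290, 100, 232, 201, 136, 99, 182, 240), nrow = 4, ncol = 2)
for (i in 2:4) {
  cat("\nRow 1 vs. Row", i, "\n")
  twoby2(m[c(1, i), ])   # twoby2() prints its comparison, as in the output above
}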


Thanks for your help!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Rcmdr seit heute nicht mehr ladbar

2013-06-10 Thread Bastian Wimmer
Just when you need it, it is dead. Since today I have been getting the
following error message when starting R Commander on the Mac:

 library(Rcmdr)
Loading required package: car
Loading required package: MASS
Loading required package: nnet
Error : .onAttach failed in attachNamespace() for 'Rcmdr', details:
  call: structure(.External(.C_dotTclObjv, objv), class = "tclObj")
  error: [tcl] invalid command name "image".

In addition: Warning message:
In fun(libname, pkgname) :
  couldn't connect to display /tmp/launch-K8nELf/org.macosforge.xquartz:0
Error: package or namespace load failed for 'Rcmdr'

I am rather annoyed. The following attempts were unsuccessful:
- reinstalled R
- reinstalled X11
- deleted all the associated folders in the process
- reinstalled the packages
Nothing. Rcmdr will not start any more.


--  
Beste Grüße,
Yours,
Bastian Wimmer M.A.

Research Associate at the Chair of Educational Psychology
University of Erlangen-Nuremberg
Dutzendteichstraße 24
90478 Nuremberg
Germany

Phone: +49 (0) 9171 83924 84
Fax: +49 (0) 3222 64968 14
Email: bastian.wim...@fau.de
Web: http://j.mp/Umkf4U (Chair of educational Psychology)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Substituting the values on the y-axis

2013-06-10 Thread Bert Gunter
Sounds like you have made no effort to learn R, e.g. by reading the
Intro to R tutorial packaged with R or other online tutorial (there
are many).

Don't you think you need to do some homework first?

-- Bert

On Mon, Jun 10, 2013 at 7:26 AM,  diddle1...@fastwebnet.it wrote:
 Hello,

 I plotted a graph on R showing how salinity (in ‰, y-axis) changes with
 time (in years, x-axis). However, right from the beginning on the Excel
 spreadsheet the values for salinity appeared as, for example, 35000‰
 instead of 35‰, which I guessed must have been a typing error for the
 website from which I extracted the data (NOAA). Thus, I now would like to
 substitute these values with the corresponding smaller value, as it
 follows:

 25000 35000 -> 25, 35   and so on.

 Is there any way I can change this on R or do I have to modify these
 numbers before inputting the data on R (for example on Excel)? If so, can
 anybody tell me how to do either of these?

 Many thanks!

 Emanuela

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Substituting the values on the y-axis

2013-06-10 Thread John Kane

Just calculate a new sequence if those percentages are in an orderly sequence. 
See ?seq
 v <- seq(25, 200, by = 10)
or perhaps the values are actually text
?substr
x <- substr(v, 1, 2)
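
Alternatively, if the salinity values are numeric and simply a factor of 1000
too large, dividing does it ('sal' here is just a placeholder for the salinity
vector):

sal <- c(25000, 35000)
sal / 1000    # 25 35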

John Kane
Kingston ON Canada


 -Original Message-
 From: diddle1...@fastwebnet.it
 Sent: Mon, 10 Jun 2013 16:26:54 +0200 (CEST)
 To: r-help@r-project.org
 Subject: [R] Substituting the values on the y-axis
 
 Hello,
 
 I plotted a graph on R showing how salinity (in ‰, y-axis) changes with
 time (in years, x-axis). However, right from the beginning on the Excel
 spreadsheet the values for salinity appeared as, for example, 35000‰
 instead of 35‰, which I guessed must have been a typing error for the
 website from which I extracted the data (NOAA). Thus, I now would like to
 substitute these values with the corresponding smaller value, as it
 follows:
 
 25000 35000 -> 25, 35   and so on.
 
 Is there any way I can change this on R or do I have to modify these
 numbers before inputting the data on R (for example on Excel)? If so, can
 anybody tell me how to do either of these?
 
 Many thanks!
 
 Emanuela
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Substituting the values on the y-axis

2013-06-10 Thread Emanuela
I did look into tutorials but I could not find exactly what I am looking for.
I have just started using R, so I am still a beginner.  If you know where I
can find it, can you please point me to it?




--
View this message in context: 
http://r.789695.n4.nabble.com/Substituting-the-values-on-the-y-axis-tp4669165p4669171.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Rcmdr seit heute nicht mehr ladbar

2013-06-10 Thread John Fox
Dear Bastian,

I'm afraid that I don't read German, but (as near as I can tell) since you say 
that you're using the most recent version of R and have X11 installed, you 
should have the software you need. Just in case, you might check the Rcmdr 
installation notes for Mac users at 
http://socserv.socsci.mcmaster.ca/jfox/Misc/Rcmdr/installation-notes.html. 
Apparently, R is having difficulty connecting to X11. I'm copying this response 
to Rob Goedman, who has often been able to help with Rcmdr issues under Mac OS 
X.

Best,
 John

---
John Fox
Senator McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada




 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
 project.org] On Behalf Of Bastian Wimmer
 Sent: Monday, June 10, 2013 5:27 AM
 To: r-help@r-project.org
 Subject: [R] Rcmdr seit heute nicht mehr ladbar
 
 Just when you need it, it is dead. Since today I have been getting the
 following error message when starting R Commander on the Mac:
 
  library(Rcmdr)
 Loading required package: car
 Loading required package: MASS
 Loading required package: nnet
 Error : .onAttach failed in attachNamespace() for 'Rcmdr',
 details:
   call: structure(.External(.C_dotTclObjv, objv), class = "tclObj")
   error: [tcl] invalid command name "image".
 
 In addition: Warning message:
 In fun(libname, pkgname) :
   couldn't connect to display /tmp/launch-
 K8nELf/org.macosforge.xquartz:0
 Error: package or namespace load failed for 'Rcmdr'
 
 I am rather annoyed. The following attempts were unsuccessful:
 - reinstalled R
 - reinstalled X11
 - deleted all the associated folders in the process
 - reinstalled the packages
 Nothing. Rcmdr will not start any more.
 
 
 --
 Beste Grüße,
 Yours,
 Bastian Wimmer M.A.
 
 Research Associate at the Chair of Educational Psychology
 University of Erlangen-Nuremberg
 Dutzendteichstraße 24
 90478 Nuremberg
 Germany
 
 Phone: +49 (0) 9171 83924 84
 Fax: +49 (0) 3222 64968 14
 Email: bastian.wim...@fau.de
 Web: http://j.mp/Umkf4U (Chair of educational Psychology)
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-
 guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] twoby2 (Odds Ratio) for variables with 3 or more levels

2013-06-10 Thread Vlatka Matkovic Puljic
Dear all,

I am using Epi package to calculate Odds ratio in my bivariate analysis.
How can I use *twoby2* with variables that have 3 or more levels?

For example:
I have 4 level var (Age)
m=matrix(c(290, 100,232, 201, 136, 99, 182, 240), nrow=4, ncol=2)
library (Epi)
twoby2(m)

R gives me only
Comparing : Row 1 vs. Row 2

While I would like to have reference value in Row 1, and compare Row 2, Row
3 and Row 4 with it.


Thanks for your help!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Substituting the values on the y-axis

2013-06-10 Thread John Kane
Hi Emanuela,

Welcome to R

It can be hard finding even relatively simple things when you are just 
starting.  You might want to have a look at 
http://www.unt.edu/rss/class/Jon/R_SC/ or 
http://www.burns-stat.com/documents/tutorials/impatient-r/ if you have not 
already seen them.  Patrick Burns's site 
http://www.introductoryr.co.uk/R_Resources_for_Beginners.html has some useful 
links.

If you are a refugee from SAS or SPSS, this paper by Bob Muenchen is very 
useful www.et.bs.ehu.es/~etptupaf/pub/R/RforSASSPSSusers.pdf

Some tricks for asking a good question on the R-help list are here:
https://github.com/hadley/devtools/wiki/Reproducibility or  
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example

In most cases it is very useful to provide some data. See ?dput in the last two 
links. A small bit of sample data in your original post would definitely have 
helped.
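
For illustration only (not part of the original message), dput() turns a small
object into plain text that readers can paste straight back into R, e.g.:

  dput(head(iris, 3))   # prints R code that recreates the first three rows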

Many or most R-help readers do not use Nabble and really hate to have to go 
there to see the context of a message.  You should always keep the important 
parts of earlier messages so that R-help readers can see what the problems and 
other suggested solutions may be.

John Kane
Kingston ON Canada


 -Original Message-
 From: diddle1...@fastwebnet.it
 Sent: Mon, 10 Jun 2013 09:08:59 -0700 (PDT)
 To: r-help@r-project.org
 Subject: Re: [R] Substituting the values on the y-axis
 
 I did look into tutorials but I could not find the exact request I am
 looking
 for. I just started using R so I am still a beginner.  If you then know
 where I can find it, can you please redirect me to it
 
 
 
 
 --
 View this message in context:
 http://r.789695.n4.nabble.com/Substituting-the-values-on-the-y-axis-tp4669165p4669171.html
 Sent from the R help mailing list archive at Nabble.com.
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to expand.grid with string elements (the half!)

2013-06-10 Thread William Dunlap
Perhaps the OP wants the unique combinations of V1 and V2, as in
  R> d <- expand.grid(V1=c("x","y","z"), V2=c("x","y","z"))
  R> d[ as.numeric(d$V1) <= as.numeric(d$V2), ]
    V1 V2
  1  x  x
  4  x  y
  5  y  y
  7  x  z
  8  y  z
  9  z  z
or
  R> V <- letters[24:26]
  R> rbind(t(combn(V, m=2)), cbind(V, V))
       V   V  
  [1,] "x" "y"
  [2,] "x" "z"
  [3,] "y" "z"
  [4,] "x" "x"
  [5,] "y" "y"
  [6,] "z" "z"

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
 Behalf
 Of Rolf Turner
 Sent: Monday, June 10, 2013 2:20 AM
 To: Gundala Viswanath
 Cc: r-h...@stat.math.ethz.ch
 Subject: Re: [R] How to expand.grid with string elements (the half!)
 
 
 Your question makes no sense at all.  The grid expansion
 has 9 rows.  In case you hadn't noticed, 9 is an odd number
 (i.e. not divisible by 2).  There are no halves.
 
 Do not expect the list to read your mind.  Instead, ask a
 meaningful question.
 
  cheers,
 
  Rolf Turner
 
 On 10/06/13 17:25, Gundala Viswanath wrote:
  I have the following result of expand grid:
 
   d <- expand.grid(c("x","y","z"), c("x","y","z"))
  What I want is to create a combination of strings
  but only the half of the all combinations:
 
      Var1 Var2
   1    x    x
   2    y    x
   3    y    y
   4    z    y
   5    x    z
   6    z    z
 
 
  What's the way to do it?
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Selecting divergent colors

2013-06-10 Thread Brian Smith
Hi,

I was trying to make a density plot with 13 samples. To distinguish each
sample, it would be good if each color is as different as possible from the
other colors. I could use the built in function, but that does not do more
than 8 colors and then goes back to recycling the cols. If I use a palette,
then it is really difficult to distinguish between the colors.

So, is there a way that I can select a large number of colors (i.e. perhaps
20) that are as different from each other as possible?

Here is my example code using the palette:

**
mat <- matrix(sample(1:1000,1000,replace=T),nrow=20,ncol=20)
snames <- paste('Sample_',1:ncol(mat),sep='')
colnames(mat) <- snames

mycols <- palette(rainbow(ncol(mat)))

for(k in 1:ncol(mat)){
  plot(density(mat[,k]),col=mycols[k],xlab='',ylab='',axes=F,main=F)
  par(new=T)
}

legend(x='topright',legend=snames,fill=mycols)



thanks!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read.csv timing

2013-06-10 Thread ivo welch
here are some small benchmarks on an i7-2600k with an SSD:

input file: 104,126 rows with 76 columns.  all numeric.

linux> time bzcat bzfile.csv.bz2 > /dev/null  -- 1.8 seconds

R> d <- read.csv( pipe( bzfile ) )   -- 6.3 seconds
R> d <- read.csv( pipe( bzfile ), colClasses="numeric")  -- 4.2 seconds

R more than doubles the time it takes to load the file to convert it
into an R data structure.  if the colClasses are not specified, then
it takes another 50% longer.


some more experiments: save in R format (gzip format) --- this
increases file size from 15MB to 20MB.  how fast is the filesystem?

linux> time gzcat file.Rdata > /dev/null  -- 0.4 seconds


the linux file system and CPU can decompress the 15MB .bz2 file in 1.8
seconds and decompress the 20MB .gz file in 0.4 seconds.  this is
surprising.  let's make sure that this is due to the .gz format.
indeed:

linux> bunzip2 bzfile.csv.bz2 ; gzip bzfile.csv
linux> time gzcat bzfile.csv.gz > /dev/null  -- 0.4 seconds


reading .gz files is much faster on my linux system than reading bz
files.  this surprises me.  I would have thought my CPU is so fast at
decompressing even bzip2 that it is almost zero, so I thought the disk
space was the primary determinant of speed, and bzip2 should have been
faster.  well, ok, maybe slower, but not by a factor of 4.


now I am thinking that maybe I should use .gz files to store my data.
but the advantages are surprisingly not as great:

R> d <- read.csv( pipe( gzfile ) )   -- 5.7 seconds
R> d <- read.csv( pipe( gzfile ), colClasses="numeric")  -- 2.6 seconds
R> d <- read.csv( gzfile( gzfile ), colClasses="numeric") -- 4.5 seconds   (surprisingly slower)

(the first and second versions are not using R's gzfile() function, but
literally running "gzcat ... |" in a pipe here.)


conclusion: a .gz file can be read from file to memory about four
times faster than a .bz file by the linux file system (outside R).
the conversion from strings in memory nto R doubles takes about as
much time as the .bz file system decompression read.  bzip2 is a more
efficient storage method than .gz, but its decompression is
considerably slower (the fact that there is less to read from disk
does not make up for the CPU decompression overhead).

saving the data in native R format essentially has no decompression
penalty and becomes close to native fast reading of .gz data.  chances
are this is because it has .gz support baked in.  gzfile does not help
with read.csv, however.
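
A minimal sketch (not from the original post) of the two storage routes being
compared; 'd' stands for any all-numeric data frame already in memory, and the
file names are made up:

  write.csv(d, file = gzfile("d.csv.gz"), row.names = FALSE)   # gzip-compressed CSV
  save(d, file = "d.RData")                                    # native R format (gzip by default)

  system.time(d1 <- read.csv(gzfile("d.csv.gz"), colClasses = "numeric"))
  system.time(load("d.RData"))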

/iaw

Ivo Welch (ivo.we...@gmail.com)


On Mon, Jun 10, 2013 at 10:09 AM, ivo welch ivo.we...@gmail.com wrote:
 Surely you know the types of the columns?  If you specify it in advance,
 read.table and relatives will be much faster.

 Duncan Murdoch

 thx, duncan.  yes, I do know the types of columns, but I did not
 realize how much faster these functions become.  on my SSD-based
 system, the speedup is about a factor of 2.  that is, read.csv on a
 bzip2 file that takes 10 seconds without colClasses takes 5 seconds
 with colClasses.  I don't know how to benchmark intermittent memory
 usage, but my guess is that with colClasses, it requires less memory,
 too.  in fact, my naive and incorrect assumption had been that
 read.csv would just read the file into a dynamic string array and then
 convert each string, and this would not take much longer than if it
 converted as it went along.  so, I had thought more memory use but
 not more time.  wrong.

 (I would add to the man (.Rd) page the sentence "Specifying colClasses
 can speed up read.csv" where it describes the option.)


 once I will figure out how to bake C into R, I may try to write a fast
 filter function for myself, but share it for others wanting to use it.

 regards,

 /iaw

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] twoby2 (Odds Ratio) for variables with 3 or more levels

2013-06-10 Thread David Winsemius

On Jun 10, 2013, at 9:27 AM, Vlatka Matkovic Puljic wrote:

 Dear all,
 
 I am using Epi package to calculate Odds ratio in my bivariate analysis.
 How can I make *twoby2 *in variables that have 3 or more levels.

I hope looking at that again you will see how odd it sounds to be requesting 
advice about how to use a program for 2 x 2 tables on data that doesn't meet 
those requirements. If you want to stay within the Epi package world, you can 
probably use the 'mh' function since it says it can handle multi-way tables (or 
you can learn to use 'glm' in the regular stats package to do either logistic 
regression or Poisson regression.)
 
 For example:
 I have 4 level var (Age)
 m=matrix(c(290, 100,232, 201, 136, 99, 182, 240), nrow=4, ncol=2)
 library (Epi)
 twoby2(m)
 
 R gives me only
 Comparing : Row 1 vs. Row 2
 
 While I would like to have reference value in Row 1, and compare Row 2, Row
 3 and Row 4 with it.

That is the default set of contrasts for 'glm' (and probably for 'mh' although 
it's not clear from the help page.)

(Epi does have its own mailing list.)
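
For what it's worth, a minimal sketch (not taken from this thread) of the glm
route, using the 4 x 2 matrix of counts from the question and assuming column 1
holds the events and column 2 the non-events:

  m <- matrix(c(290, 100, 232, 201, 136, 99, 182, 240), nrow = 4, ncol = 2)
  age <- factor(1:4)                       # row 1 becomes the reference level
  fit <- glm(m ~ age, family = binomial)   # two-column (events, non-events) response
  exp(cbind(OR = coef(fit), confint.default(fit)))[-1, ]   # ORs for rows 2-4 vs row 1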

-- 
David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Where Query in SQL

2013-06-10 Thread Sneha Bishnoi
Hey all

I am trying to use a WHERE ... IN clause in an SQL query in R.
Here is my code:

sql.select <- paste("select PERSON_NAME from UNITS where UNIT_ID in
('",cathree,"')",sep="")

where cathree is 1 variable with 16 observations as follows

UNIT_ID
1 205
2 209
3 213
4 217
5 228
6 232
7 236
8 240
9 245
10 249
11 253
12 257
13 268
14 272
15 276
16 280

But when I run this code, 0 rows are selected even though there exist 3 rows
that satisfy the above query.


Thanks
Sneha

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] twoby2 (Odds Ratio) for variables with 3 or more levels

2013-06-10 Thread James C. Whanger
You may want to consider a cumulative logit model which effectively
bifurcates an ordinal variable by utilizing the odds of being in a given
level or below (depending on your coding).
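
A minimal sketch (with made-up data, not from this thread) of a cumulative
logit / proportional-odds fit via MASS::polr; 'age' is kept as a plain factor
so the coefficients are contrasts against its first level:

  library(MASS)
  set.seed(1)
  dat <- data.frame(age = factor(sample(1:4, 200, replace = TRUE)),
                    outcome = ordered(sample(1:3, 200, replace = TRUE)))
  fit <- polr(outcome ~ age, data = dat, Hess = TRUE)
  exp(coef(fit))   # cumulative odds ratios for age levels 2-4 vs level 1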


On Mon, Jun 10, 2013 at 12:27 PM, Vlatka Matkovic Puljic vlatk...@gmail.com
 wrote:

 Dear all,

 I am using Epi package to calculate Odds ratio in my bivariate analysis.
 How can I make *twoby2 *in variables that have 3 or more levels.

 For example:
 I have 4 level var (Age)
 m=matrix(c(290, 100,232, 201, 136, 99, 182, 240), nrow=4, ncol=2)
 library (Epi)
 twoby2(m)

 R gives me only
 Comparing : Row 1 vs. Row 2

 While I would like to have reference value in Row 1, and compare Row 2, Row
 3 and Row 4 with it.


 Thanks for your help!

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
*James C. Whanger*
*
*

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Where Query in SQL

2013-06-10 Thread MacQueen, Don
Do this

cat(sql.select,'\n')

and then decide whether the query is what it should be according to
standard SQL syntax.
(If it is not, then fix it.)
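
In case it helps, a minimal sketch (with hypothetical object names, not taken
from the original post) of building the query with collapse= and inspecting it
before sending it to the database; it assumes 'cathree' is a data frame with a
UNIT_ID column and 'con' is an open RODBC channel:

  ids <- paste(cathree$UNIT_ID, collapse = ", ")   # "205, 209, 213, ..."
  sql.select <- paste0("select PERSON_NAME from UNITS where UNIT_ID in (", ids, ")")
  cat(sql.select, "\n")
  ## result <- sqlQuery(con, sql.select)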

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 6/10/13 11:47 AM, Sneha Bishnoi sneha.bish...@gmail.com wrote:

Hey all

I am trying to use where in clause in sql query in R
here is my code:

sql.select-paste(select PERSON_NAME from UNITS where UNIT_ID in
(',cathree,'),sep=)

where cathree is 1 variable with 16 observations as follows

UNIT_ID
1 205
2 209
3 213
4 217
5 228
6 232
7 236
8 240
9 245
10 249
11 253
12 257
13 268
14 272
15 276
16 280

but when i run this code, 0 rows are selected eventhough there exist 3
rows
which satisfy the above query


Thanks
Sneha

   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Selecting divergent colors

2013-06-10 Thread Adams, Jean
It will be hard to come up with 20 clearly distinguishable colors.  Check
out the website http://colorbrewer2.org/ and the R package RColorBrewer.
 It does not have a 20-color palette, but it does have some 8- to 12-color
palettes that are very nice.

library(RColorBrewer)
display.brewer.all(n=NULL, type="all", select=NULL, exact.n=TRUE)

You could use these colors in combination with line type to build up to 72
unique combinations.  For example ...

nuniq <- ncol(mat)
mycols <- rep(brewer.pal(12, "Set3"), length=nuniq)
myltys <- rep(1:6, rep(12, 6))[1:nuniq]

for(k in 1:nuniq){
  plot(density(mat[,k]), col=mycols[k], xlab='', ylab='', axes=F, main=F,
       lwd=3, lty=myltys[k])
  par(new=TRUE)
}
legend('topright', legend=snames, col=mycols, lty=myltys, lwd=3)

Jean



On Mon, Jun 10, 2013 at 12:33 PM, Brian Smith bsmith030...@gmail.comwrote:

 Hi,

 I was trying to make a density plot with 13 samples. To distinguish each
 sample, it would be good if each color is as different as possible from the
 other colors. I could use the built in function, but that does not do more
 than 8 colors and then goes back to recycling the cols. If I use a palette,
 then it is really difficult to distinguish between the colors.

 So, is there a way that I can select a large number of colors (i.e. perhaps
 20) that are as different from each other as possible?

 Here is my example code using the palette:

 **
 mat - matrix(sample(1:1000,1000,replace=T),nrow=20,ncol=20)
 snames - paste('Sample_',1:ncol(mat),sep='')
 colnames(mat) - snames

 mycols - palette(rainbow(ncol(mat)))

 for(k in 1:ncol(mat)){
   plot(density(mat[,k]),col=mycols[k],xlab='',ylab='',axes=F,main=F)
   par(new=T)
 }

 legend(x='topright',legend=snames,fill=mycols)

 

 thanks!

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] reshaping a data frame

2013-06-10 Thread Abhishek Pratap
Thanks everyone for your quick reply. I think my contrived example hid
the complexity I wanted to show by using only one variable.

@Arun: I think your example is exactly what I was looking for. Very
cool trick with 'ave' and 'seq_along' ... it just didn't occur to me.

Best,
-Abh


 On Mon, Jun 10, 2013 at 7:13 AM, arun smartpink...@yahoo.com wrote:
 Hi, if your dataset is similar to the one below:
 set.seed(24)
 temp1_df <- data.frame(names=rep(c('foo','foo1'),each=6),
                        variable=rep(c('w','x'),times=6),
                        value=sample(25:40,12,replace=TRUE),
                        stringsAsFactors=FALSE)

 library(reshape2)

 res <- dcast(within(temp1_df, {Seq1 <- ave(value, names, variable, FUN=seq_along)}),
              names+Seq1~variable, value.var="value")[,-2]
 res
 #  names  w  x
 #1   foo 29 28
 #2   foo 36 33
 #3   foo 35 39
 #4  foo1 29 37
 #5  foo1 37 29
 #6  foo1 34 30
 A.K.


 - Original Message -
 From: Abhishek Pratap abhishek@gmail.com
 To: r-help@r-project.org r-help@r-project.org
 Cc:
 Sent: Monday, June 10, 2013 2:15 AM
 Subject: [R] reshaping a data frame

 Hi Guys

 I am trying to cast a data frame but not aggregate the rows for the
 same variable.

 here is a contrived example.

 **input**
 temp_df  - 
 data.frame(names=c('foo','foo','foo'),variable=c('w','w','w'),value=c(34,65,12))
 temp_df
   names variable value
 1   foow34
 2   foow65
 3   foow12


 ###
 **Want this**
 
 names  w
 foo 34
 foo 65
 foo 12


 ##
 **getting this***
 ##
 cast(temp_df)
 Aggregation requires fun.aggregate: length used as default
   names w
 1   foo 3


 In real dataset  the categorical column 'variable' will have many more
 categorical variable.

 Thanks!
 -Abhi

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] parameters estimation of a normal-lognormal multivariate model

2013-06-10 Thread Hertzog Gladys
Dear all,

I have to create a model which is a mixture of a normal and log-normal 
distribution. To create it, I need to estimate the 2 covariance matrixes and 
the mixing parameter (total =7 parameters) by maximizing the log-likelihood 
function. This maximization has to be performed by the nlm routine.
As I use relative data, the means are known and equal to 1.

I’ve already tried to do it in 1 dimension (with 1 set of relative data) and it
works well. However, when I introduce the 2nd set of relative data I get
illogical results for the correlation and a lot of warning messages (25 in
all).

To estimate the parameters I first defined the log-likelihood function with
the 2 commands dmvnorm and dlnorm.rplus. Then I assign starting values for the
parameters and finally I use the nlm routine to estimate the parameters (see
script below).

# Importing and reading the grid files. Outputs are 2048x2048 matrices

P <- read.ascii.grid("d:/Documents/JOINT_FREQUENCY/grid_E727_P-3000.asc",
                     return.header = FALSE)
V <- read.ascii.grid("d:/Documents/JOINT_FREQUENCY/grid_E727_V-3000.asc",
                     return.header = FALSE)

p <- c(P)  # transform matrix into a vector
v <- c(V)

p <- p[!is.na(p)]  # removing NA values
v <- v[!is.na(v)]

p_rel <- p/mean(p)  # transforming the data to relative values
v_rel <- v/mean(v)
PV <- cbind(p_rel, v_rel)  # create a matrix of the two vectors

L <- function(par, p_rel, v_rel) {

  return(-sum(log(
    (1 - par[7]) * dmvnorm(PV, mean = c(1, 1),
        sigma = matrix(c(par[1]^2, par[1]*par[2]*par[3],
                         par[1]*par[2]*par[3], par[2]^2), nrow = 2, ncol = 2)) +
    par[7] * dlnorm.rplus(PV, meanlog = c(1, 1),
        varlog = matrix(c(par[4]^2, par[4]*par[5]*par[6],
                          par[4]*par[5]*par[6], par[5]^2), nrow = 2, ncol = 2))
  )))

}
par.start <- c(0.74, 0.66, 0.40, 1.4, 1.2, 0.4, 0.5)  # starting values for the estimation

result <- nlm(L, par.start, v_rel = v_rel, p_rel = p_rel, hessian = TRUE, iterlim = 200,
              check.analyticals = TRUE)
Messages d'avis :
1: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
  production de NaN
2: In sqrt(2 * pi * det(varlog)) : production de NaN
3: In nlm(L, par.start, p_rel = p_rel, v_rel = v_rel, hessian = TRUE) :
  NA/Inf replaced by maximum positive value
4: In log(eigen(sigma, symmetric = TRUE, only.values = TRUE)$values) :
  production de NaN
…. Until 25.

par.hat <- result$estimate
 
cat("sigN_p =", par[1], "\n", "sigN_v =", par[2], "\n", "rhoN =",
par[3], "\n", "sigLN_p =", par[4], "\n", "sigLN_v =", par[5], "\n", "rhoLN =",
par[6], "\n", "mixing parameter =", par[7], "\n")
 
sigN_p = 0.5403361 
 sigN_v = 0.6667375 
 rhoN = 0.6260181 
 sigLN_p = 1.705626 
 sigLN_v = 1.592832 
 rhoLN = 0.9735974 
 mixing parameter = 0.8113369
 
Does someone know what is wrong in my model or how should I do to find these 
parameters in 2 dimensions?

Thank you very much for taking time to look at my questions.

Regards,

Gladys Hertzog
Master student in environmental engineering, ETH Zurich
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Selecting divergent colors

2013-06-10 Thread Ben Tupper
Hi,

On Jun 10, 2013, at 3:46 PM, Adams, Jean wrote:

 It will be hard to come up with 20 clearly distinguishable colors.  Check
 out the website http://colorbrewer2.org/ and the R package RColorBrewer.
 It does not have a 20-color palette, but it does have some 8- to 12-color
 palettes that are very nice.
 
 library(RColorBrewer)
 display.brewer.all(n=NULL, type=all, select=NULL, exact.n=TRUE)
 

It sounds like Brian is looking for categorical coloring rather than divergent 
coloring.  The Glasbey LUT works really well in image processing for just such 
purposes.  It would be easy to use that within R for your lines.

http://www.bioss.ac.uk/people/chris/colorpaper.pdf

You might be able to snag the color table out of this collection of Java 
plugins for ImageJ software. 

http://www.dentistry.bham.ac.uk/landinig/software/morphology.zip

Within that archive is a text file called glasbey.lut which is a simple text 
file of RGB color values.
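
A minimal sketch (assuming glasbey.lut holds three whitespace-separated columns
of 0-255 R, G, B values with no header; the file path is hypothetical):

  lut <- read.table("glasbey.lut", col.names = c("r", "g", "b"))
  mycols <- rgb(lut$r, lut$g, lut$b, maxColorValue = 255)
  plot(seq_along(mycols), pch = 19, cex = 3, col = mycols)   # quick look at the palette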

Cheers,
Ben




 You could use these colors in combination with line type to build up to 72
 unique combinations.  For example ...
 
 nuniq - ncol(mat)
 mycols - rep(brewer.pal(12, Set3), length=nuniq)
 myltys - rep(1:6, rep(12, 6))[1:nuniq]
 
 for(k in 1:nuniq){
 plot(density(mat[,k]), col=mycols[k], xlab='', ylab='', axes=F, main=F,
 lwd=3, lty=myltys[k])
 par(new=TRUE)
 }
 legend('topright', legend=snames, col=mycols, lty=myltys, lwd=3)
 
 Jean
 
 
 
 On Mon, Jun 10, 2013 at 12:33 PM, Brian Smith bsmith030...@gmail.comwrote:
 
 Hi,
 
 I was trying to make a density plot with 13 samples. To distinguish each
 sample, it would be good if each color is as different as possible from the
 other colors. I could use the built in function, but that does not do more
 than 8 colors and then goes back to recycling the cols. If I use a palette,
 then it is really difficult to distinguish between the colors.
 
 So, is there a way that I can select a large number of colors (i.e. perhaps
 20) that are as different from each other as possible?
 
 Here is my example code using the palette:
 
 **
 mat - matrix(sample(1:1000,1000,replace=T),nrow=20,ncol=20)
 snames - paste('Sample_',1:ncol(mat),sep='')
 colnames(mat) - snames
 
 mycols - palette(rainbow(ncol(mat)))
 
 for(k in 1:ncol(mat)){
  plot(density(mat[,k]),col=mycols[k],xlab='',ylab='',axes=F,main=F)
  par(new=T)
 }
 
 legend(x='topright',legend=snames,fill=mycols)
 
 
 
 thanks!
 
[[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 
   [[alternative HTML version deleted]]
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fwd: Problem with ODBC connection

2013-06-10 Thread Christofer Bogaso
Any response please? Was my question not clear to the list? Please let me
know.

Thanks and regards,

-- Forwarded message --
From: Christofer Bogaso bogaso.christo...@gmail.com
Date: Sat, Jun 8, 2013 at 9:39 PM
Subject: Re: Problem with ODBC connection
To: r-help r-help@r-project.org


Hello All,

My previous post remains unanswered probably because the attachment was not
working properly.

So I am re-posting it again.

My problem is in reading an Excel-2003 file through ODBC connection using
RODBC package. Let say I have this Excel file:

http://www.2shared.com/document/HS3JeFyW/MyFile.html


I saved it in my F: drive and tried reading the contents using RODBC
connection:

 library(RODBC)
 MyData - sqlFetch(odbcConnectExcel(f:/MyFile.xls), )
 head(MyData, 30)


However, it looks like the second column (with header 's') is not read
properly.

Can somebody here explain this bizarre thing? Did I do something wrong in
reading that?

Really appreciate if someone could point out anything what might go wrong.

Thanks and regards,


On Fri, Jun 7, 2013 at 4:46 PM, Christofer Bogaso 
bogaso.christo...@gmail.com wrote:

 Hello again,

 I am having problem with ODBC connection using the RODBC package.

 I am basically trying to read the attached Excel-2003 file using RODBC
 package. Here is my code:

  head(sqlFetch(odbcConnectExcel(d:/1.xls), ), 30);
 odbcCloseAll()
Criteria  s  d fd  ffd1
 f1fd2f2 fd3 f3 F12 F13 F14 F15 F16 F17
 F18 F19 F20
 1 a NA NA NA NA 0.
 0.27755576 -0.00040332321NA  NA NA  NA
  NA  NA  NA  NA  NA  NA  NA  NA
 2 s NA  0 NA NA 0.
 0.  0.000NA  NA NA  NA
  NA  NA  NA  NA  NA  NA  NA  NA
 3 d NA  0 NA NA 0.01734723
 0.06938894  0.2775558  5.00  NA NA  NA
  NA  NA  NA  NA  NA  NA  NA  NA
 4 f NA NA NA NA NA
 NA NA -4.25  NA NA  NA  NA  NA  NA  NA  NA
  NA  NA  NA
 5 f NA  0 NA NA 0.
 0.  0.000 -1.53  NA NA  NA
  NA  NA  NA  NA  NA  NA  NA  NA
 6 f NA NA NA NA NA
 NA  0.000  0.00  NA NA  NA  NA  NA  NA  NA  NA
  NA  NA  NA
 7 f NA NA NA NA NA
 NA  0.000NA  NA NA  NA  NA  NA  NA  NA  NA
  NA  NA  NA
 8 f NA  0 NA NA NA
 NA NANA  NA NA  NA  NA  NA  NA  NA  NA
  NA  NA  NA
 9 f NA  0 NA NA NA
 NA NANA  NA NA  NA  NA  NA  NA  NA  NA
  NA  NA  NA
 10f NA NA NA NA NA
 NA NANA  NA NA  NA  NA  NA  NA  NA  NA
  NA  NA  NA
 11f NA NA NA NA NA
 NA NANA  NA NA  NA  NA  NA  NA  NA  NA
  NA  NA  NA
 12f NA NA NA NA NA
 NA NANA  NA NA  NA  NA  NA  NA  NA  NA
  NA  NA  NA
 13f NA NA NA NA NA
 NA NANA  NA NA  NA  NA  NA  NA  NA  NA
  NA  NA  NA

  Here you can see that the data in the second column could not be read at all.

 Can somebody point me if I did something wrong?

 Thanks and regards,


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Apply a PCA to other datasets

2013-06-10 Thread edelance
I have run a PCA on one data set.  I need the standard deviation of the first
two bands for my analysis.  I now want to apply the same PCA rotation I used
on the first data set to all my other data sets.  Is there any way to do this
in R?  Thanks.
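
A minimal sketch (not from the thread) of one way to do this with prcomp();
'train' and 'newdat' are hypothetical numeric data frames with the same columns:

  pca <- prcomp(train, center = TRUE, scale. = TRUE)
  apply(pca$x[, 1:2], 2, sd)                      # SD of the first two component scores
  scores_new <- predict(pca, newdata = newdat)    # same rotation applied to a new data set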




--
View this message in context: 
http://r.789695.n4.nabble.com/Apply-a-PCA-to-other-datasets-tp4669182.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Speed up or alternative to 'For' loop

2013-06-10 Thread Trevor Walker
I have a For loop that is quite slow and am wondering if there is a faster
option:

df <- data.frame(TreeID=rep(1:500,each=20), Age=rep(seq(1,20,1),500))
df$Height <- exp(-0.1 + 0.2*df$Age)
df$HeightGrowth <- NA   #intialize with NA
for (i in 2:nrow(df))
 {if(df$TreeID[i]==df$TreeID[i-1])
  {df$HeightGrowth[i] <- df$Height[i]-df$Height[i-1]
  }
 }

Trevor Walker
Email: trevordaviswal...@gmail.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Combining CSV data

2013-06-10 Thread Shreya Rawal
Hello R community,

I am trying to combine two CSV files that look like this:

File A

Row_ID_CR,   Data1,Data2,Data3
1,   aa,  bb,  cc
2,   dd,  ee,  ff


File B

Row_ID_N,   Src_Row_ID,   DataN1
1a,   1,   This is comment 1
2a,   1,   This is comment 2
3a,   2,   This is comment 1
4a,   1,   This is comment 3

And the output I am looking for, comparing the values of Row_ID_CR and
Src_Row_ID, is:

Output

ROW_ID_CR, Data1, Data2, Data3, DataComment1,      DataComment2,      DataComment3
1,         aa,    bb,    cc,    This is comment 1, This is comment 2, This is comment 3
2,         dd,    ee,    ff,    This is comment 1


I am a novice R user; I am able to replicate a left join, but I need a bit
more in the final result.


Thanks!!

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How sum all possible combinations of rows, given 4 matrices

2013-06-10 Thread Estigarribia, Bruno
It works, Arun. Thanks!
(FYI, a couple of the matrices I am dealing with have 1000+ rows, so I had
to do it on a supercomputer at work. For the curious, I am trying to find
all possible scores in a model of language mixing described in:
Title: Structured Variation in Codeswitching: Towards an Empirically Based
Typology of Bilingual Speech Patterns
Authors: Deuchar, Margaret; Muysken, Pieter; Wang, Sung-Lan
Publication Date: 2007
Journal Name: International Journal of Bilingual Education and
Bilingualism)


Bruno Estigarribia
Assistant Professor of Spanish, Department of Romance Languages and
Literatures
Research Assistant Professor of Psychology, Cognitive Science Program
Affiliate Faculty, Global Studies
Dey Hall, Room 332, CB# 3170
University of North Carolina at Chapel Hill
estig...@email.unc.edu
917-348-8162





On 5/27/13 1:54 PM, arun smartpink...@yahoo.com wrote:

Hi,
Not sure if this is what you expected:

set.seed(24)
mat1 <- matrix(sample(1:20,3*4,replace=TRUE),ncol=3)
set.seed(28)
mat2 <- matrix(sample(1:25,3*6,replace=TRUE),ncol=3)
set.seed(30)
mat3 <- matrix(sample(1:35,3*8,replace=TRUE),ncol=3)
set.seed(35)
mat4 <- matrix(sample(1:40,3*10,replace=TRUE),ncol=3)

dat1 <- expand.grid(seq(dim(mat1)[1]),seq(dim(mat2)[1]),seq(dim(mat3)[1]),seq(dim(mat4)[1]))
vec1 <- paste0("mat",1:4)
matNew <- do.call(cbind,lapply(seq_len(ncol(dat1)),function(i) get(vec1[i])[dat1[,i],]))
colnames(matNew) <- (seq(12)-1)%%3+1
datNew <- data.frame(matNew)
res <- sapply(split(colnames(datNew),gsub("\\..*","",colnames(datNew))),function(x) rowSums(datNew[,x]))

dim(res)
#[1] 19203
 head(res)
# X1 X2 X3
#[1,] 46 63 70
#[2,] 45 68 59
#[3,] 55 55 66
#[4,] 51 65 61
#[5,] 48 84 75
#[6,] 47 89 64

A.K.

- Original Message -
From: Estigarribia, Bruno estig...@email.unc.edu
To: r-help@R-project.org r-help@r-project.org
Cc: 
Sent: Monday, May 27, 2013 11:24 AM
Subject: [R] How sum all possible combinations of rows, given 4 matrices

Hello all,

I have 4 matrices with 3 columns each (different number of rows though). I
want to find a function that returns all possible 3-place vectors
corresponding to the sum by columns of picking one row from matrix 1, one
from matrix 2, one from matrix 3, and one from matrix 4. So basically, all
possible ways of picking one row from each matrix and then sum their
columns to obtain a 3-place vector.
Is there a way to use expand.grid and reduce to obtain this result? Or am
I on the wrong track?
Thank you,
Bruno
PS:I believe I have given all relevant info. I apologize in advance if my
question is ill-posed or ambiguous.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] please check this

2013-06-10 Thread arun
Hi,
Try this:
res10Percent- fun1(final3New,0.1,200)

res10PercentSub1-subset(res10Percent[duplicated(res10Percent)|duplicated(res10Percent,fromLast=TRUE),],dummy==1)
indx1-as.numeric(row.names(res10PercentSub1))

res10PercentSub2-res10PercentSub1[order(res10PercentSub1$dimension),]
indx11-as.numeric(row.names(res10PercentSub2))
names(indx11)-(seq_along(indx11)-1)%/%2+1
res10PercentSub3-res10Percent[c(indx11,indx11+1),]
res10PercentSub3$id- names(c(indx11,indx11+1))
 
res10PercentSub4-do.call(rbind,lapply(split(res10PercentSub3,res10PercentSub3$id),function(x)
 
{x1-x[-1,];x2-x1[which.max(abs(x1$dimension[1]-x1$dimension[-1]))+1,];x3-x[x$dummy==1,][which.min(abs(as.numeric(row.names(x[x$dummy==1,]))-as.numeric(row.names(x2,];rbind(x3,x2)}))

res10PercentSub0-subset(res10Percent[duplicated(res10Percent)|duplicated(res10Percent,fromLast=TRUE),],dummy==0)
indx0-as.numeric(row.names(res10PercentSub0))

res10PercentSub20-res10PercentSub0[order(res10PercentSub0$dimension),]
indx00-as.numeric(row.names(res10PercentSub20))
names(indx00)-(seq_along(indx00)-1)%/%2+1
res10PercentSub30- res10Percent[c(indx00-1,indx00),]
res10PercentSub30$id- names(c(indx00-1,indx00))
res10PercentSub40- 
do.call(rbind,lapply(split(res10PercentSub30,res10PercentSub30$id),function(x){x1-subset(x,dummy==1);
 
x2-subset(x,dummy==0);x3-x1[which.max(abs(x1$dimension-unique(x2$dimension))),];x4-x2[which.min(abs(as.numeric(row.names(x3))-as.numeric(row.names(x2,];rbind(x3,x4)}))

row.names(res10PercentSub40)-gsub(.*\\.,,row.names(res10PercentSub40))
indxNew- 
sort(as.numeric(c(row.names(res10PercentSub5),row.names(res10PercentSub40
res10PercentFinal-res10Percent[-indxNew,]
 dim(res10PercentFinal)
#[1] 454   5
 nrow(subset(res10PercentFinal,dummy==0))
#[1] 227
 nrow(subset(res10PercentFinal,dummy==1))
#[1] 227

nrow(unique(res10PercentFinal))
#[1] 454
which(duplicated(res10Percent)|duplicated(res10Percent,fromLast=TRUE))
# [1] 113 117 123 125 153 157 187 189 207 213 223 235 265 267 269 275 276 278 
279
#[20] 283 293 301 303 305 309 317 327 331 335 339 341 343 347 351 367 369 371 
379
#[39] 385 399 407 413 415 417 429 437 441 453 459 461 471 473 477 479 501 505
 res10Percent[c(113:114,117:118),]
# firm year industry dummy dimension
#113 500221723 2005   26 1  3147
#114 500601429 2005   26 0  3076
#117 500221723 2005   26 1  3147
#118 502668920 2005   26 0  3249
 
res10PercentFinal[c(113:114,117:118),]  #deleted the duplicated row and the 
accompanying pair with the maximum difference
# firm year industry dummy dimension
#113 500221723 2005   26 1  3147
#114 500601429 2005   26 0  3076
#119 500115362 2006   26 1  6239
#120 500060223 2006   26 0  6208

A.K.

row.names(res10PercentSub4)-gsub(.*\\.,,row.names(res10PercentSub4))
res10PercentSub5-res10PercentSub4[order(as.numeric(res10PercentSub4$id)),]

- Original Message -
From: Cecilia Carmo cecilia.ca...@ua.pt
To: arun smartpink...@yahoo.com
Cc: 
Sent: Monday, June 10, 2013 1:41 PM
Subject: RE: please check this

I think it could be better to eliminate that one.
If you could do it I appreciate.

Cecília


De: arun [smartpink...@yahoo.com]
Enviado: segunda-feira, 10 de Junho de 2013 18:14
Para: Cecilia Carmo
Assunto: Re: please check this

If you wanted to eliminate the duplicate rows that have the pair with the 
maximum difference, it is possible.
Just informing you.




- Original Message -
From: Cecilia Carmo cecilia.ca...@ua.pt
To: arun smartpink...@yahoo.com
Cc:
Sent: Monday, June 10, 2013 10:51 AM
Subject: RE: please check this

I think it is ok now.

Thanks
Cecília


De: arun [smartpink...@yahoo.com]
Enviado: segunda-feira, 10 de Junho de 2013 15:39
Para: Cecilia Carmo
Cc: R help
Assunto: Re: please check this

Hi,
Try this:
which(duplicated(res10Percent))
# [1] 117 125 157 189 213 235 267 275 278 293 301 327 331 335 339 367 369 371 
379
#[20] 413 415 417 441 459 461 477 479 505
res10PercentSub1-subset(res10Percent[which(duplicated(res10Percent)),],dummy==1)
  #most of the duplicated are dummy==1
res10PercentSub0-subset(res10Percent[which(duplicated(res10Percent)),],dummy==0)
indx1-as.numeric(row.names(res10PercentSub1))
indx11-sort(c(indx1,indx1+1))
indx0- as.numeric(row.names(res10PercentSub0))
indx00- sort(c(indx0,indx0-1))
indx10- sort(c(indx11,indx00))

nrow(res10Percent[-indx10,])
#[1] 452
res10PercentNew-res10Percent[-indx10,]
nrow(subset(res10PercentNew,dummy==1))
#[1] 226
nrow(subset(res10PercentNew,dummy==0))
#[1] 226
nrow(unique(res10PercentNew))
#[1] 452
A.K.



- Original Message -
From: Cecilia Carmo cecilia.ca...@ua.pt
To: arun smartpink...@yahoo.com
Cc:
Sent: Monday, June 10, 2013 10:19 AM
Subject: RE: please check this

But I don't want it like this.
Once a firm is paired with another, these two firms 

Re: [R] Speed up or alternative to 'For' loop

2013-06-10 Thread Rui Barradas

Hello,

One way to speed it up is to use a matrix instead of a data.frame. Since 
data.frames can hold data of all classes, the access to their elements 
is slow. And your data is all numeric so it can be hold in a matrix. The 
second way below gave me a speed up by a factor of 50.



system.time({
for (i in 2:nrow(df))
 {if(df$TreeID[i]==df$TreeID[i-1])
  {df$HeightGrowth[i] <- df$Height[i]-df$Height[i-1]
  }
 }
})

system.time({
df2 <- data.matrix(df)
for(i in seq_len(nrow(df2))[-1]){
    if(df2[i, "TreeID"] == df2[i - 1, "TreeID"])
        df2[i, "HeightGrowth"] <- df2[i, "Height"] - df2[i - 1, "Height"]
}
})

all.equal(df, as.data.frame(df2))  # TRUE


Hope this helps,

Rui Barradas

Em 10-06-2013 18:28, Trevor Walker escreveu:

I have a For loop that is quite slow and am wondering if there is a faster
option:

df - data.frame(TreeID=rep(1:500,each=20), Age=rep(seq(1,20,1),500))
df$Height - exp(-0.1 + 0.2*df$Age)
df$HeightGrowth - NA   #intialize with NA
for (i in 2:nrow(df))
  {if(df$TreeID[i]==df$TreeID[i-1])
   {df$HeightGrowth[i] - df$Height[i]-df$Height[i-1]
   }
  }

Trevor Walker
Email: trevordaviswal...@gmail.com

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combining CSV data

2013-06-10 Thread jim holtman
try this:

> fileA <- read.csv(text = "Row_ID_CR,   Data1,    Data2,    Data3
+ 1,   aa,  bb,  cc
+ 2,   dd,  ee,  ff", as.is = TRUE)

> fileB <- read.csv(text = "Row_ID_N,   Src_Row_ID,   DataN1
+ 1a,   1,   This is comment 1
+ 2a,   1,   This is comment 2
+ 3a,   2,   This is comment 1
+ 4a,   1,   This is comment 3", as.is = TRUE)

> # get rid of leading/trailing blanks on comments
> fileB$DataN1 <- gsub("^ *| *$", "", fileB$DataN1)

> # merge together
> result <- merge(fileA, fileB, by.x = 'Row_ID_CR', by.y = "Src_Row_ID")

> # now partition by Row_ID_CR and aggregate the comments
> result2 <- do.call(rbind,
+     lapply(split(result, result$Row_ID_CR), function(.grp){
+         cbind(.grp[1L, -c(5,6)], comment = paste(.grp$DataN1, collapse = ', '))
+     })
+ )
> result2
  Row_ID_CR Data1Data2Data3
comment
1 1aa   bb   cc This is comment
1, This is comment 2, This is comment 3
2 2dd   ee   ff
  This is comment 1




On Mon, Jun 10, 2013 at 4:38 PM, Shreya Rawal rawal.shr...@gmail.comwrote:

 Hello R community,

 I am trying to combine two CSV files that look like this:

 File A

 Row_ID_CR,   Data1,Data2,Data3
 1,   aa,  bb,  cc
 2,   dd,  ee,  ff


 File B

 Row_ID_N,   Src_Row_ID,   DataN1
 1a,   1,   This is comment 1
 2a,   1,   This is comment 2
 3a,   2,   This is comment 1
 4a,   1,   This is comment 3

 And the output I am looking for is, comparing the values of Row_ID_CR and
 Src_Row_ID

 Output

 ROW_ID_CR,Data1,Data2,Data3,DataComment1,
  DataComment2,  DataComment3
 1,  aa, bb, cc,This is
 comment1,This is comment2, This is comment 3
 2,  dd,  ee, ff,  This is
 comment1


 I am a novice R user, I am able to replicate a left join but I need a bit
 more in the final result.


 Thanks!!

 [[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Speed up or alternative to 'For' loop

2013-06-10 Thread MacQueen, Don
How about

for (ir in unique(df$TreeID)) {
  in.ir <- df$TreeID == ir
  df$HeightGrowth[in.ir] <- cumsum(df$Height[in.ir])
}

Seemed fast enough to me.

In R, it is generally good to look for ways to operate on entire vectors
or arrays, rather than element by element within them. The cumsum()
function does that in this example.

-Don


-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 6/10/13 10:28 AM, Trevor Walker trevordaviswal...@gmail.com wrote:

I have a For loop that is quite slow and am wondering if there is a faster
option:

df - data.frame(TreeID=rep(1:500,each=20), Age=rep(seq(1,20,1),500))
df$Height - exp(-0.1 + 0.2*df$Age)
df$HeightGrowth - NA   #intialize with NA
for (i in 2:nrow(df))
 {if(df$TreeID[i]==df$TreeID[i-1])
  {df$HeightGrowth[i] - df$Height[i]-df$Height[i-1]
  }
 }

Trevor Walker
Email: trevordaviswal...@gmail.com

   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] please check this

2013-06-10 Thread arun
Sorry, I forgot to paste some lines and change the names:


res10Percent- fun1(final3New,0.1,200)

res10PercentSub1-subset(res10Percent[duplicated(res10Percent)|duplicated(res10Percent,fromLast=TRUE),],dummy==1)
indx1-as.numeric(row.names(res10PercentSub1))

res10PercentSub2-res10PercentSub1[order(res10PercentSub1$dimension),]
indx11-as.numeric(row.names(res10PercentSub2))
names(indx11)-(seq_along(indx11)-1)%/%2+1
res10PercentSub3-res10Percent[c(indx11,indx11+1),]
res10PercentSub3$id- names(c(indx11,indx11+1))
res10PercentSub4-do.call(rbind,lapply(split(res10PercentSub3,res10PercentSub3$id),function(x)
 
{x1-x[-1,];x2-x1[which.max(abs(x1$dimension[1]-x1$dimension[-1]))+1,];x3-x[x$dummy==1,][which.min(abs(as.numeric(row.names(x[x$dummy==1,]))-as.numeric(row.names(x2,];rbind(x3,x2)}))
row.names(res10PercentSub4)-gsub(.*\\.,,row.names(res10PercentSub4)) 
#forgot

res10PercentSub0-subset(res10Percent[duplicated(res10Percent)|duplicated(res10Percent,fromLast=TRUE),],dummy==0)
indx0-as.numeric(row.names(res10PercentSub0))

res10PercentSub20-res10PercentSub0[order(res10PercentSub0$dimension),]
indx00-as.numeric(row.names(res10PercentSub20))
names(indx00)-(seq_along(indx00)-1)%/%2+1
res10PercentSub30- res10Percent[c(indx00-1,indx00),]
res10PercentSub30$id- names(c(indx00-1,indx00))
res10PercentSub40- 
do.call(rbind,lapply(split(res10PercentSub30,res10PercentSub30$id),function(x){x1-subset(x,dummy==1);
 
x2-subset(x,dummy==0);x3-x1[which.max(abs(x1$dimension-unique(x2$dimension))),];x4-x2[which.min(abs(as.numeric(row.names(x3))-as.numeric(row.names(x2,];rbind(x3,x4)}))

row.names(res10PercentSub40)-gsub(.*\\.,,row.names(res10PercentSub40))
indxNew- 
sort(as.numeric(c(row.names(res10PercentSub4),row.names(res10PercentSub40 
#res10PercentSub4
res10PercentFinal-res10Percent[-indxNew,]
dim(res10PercentFinal)
#[1] 454  5
nrow(subset(res10PercentFinal,dummy==0))
#[1] 227
nrow(subset(res10PercentFinal,dummy==1))
#[1] 227

nrow(unique(res10PercentFinal))

A.K.

- Original Message -
From: Cecilia Carmo cecilia.ca...@ua.pt
To: arun smartpink...@yahoo.com
Cc: 
Sent: Monday, June 10, 2013 5:48 PM
Subject: RE: please check this

Error message:

Error in row.names(res10PercentSub5) : 
  object 'res10PercentSub5' not found



De: arun [smartpink...@yahoo.com]
Enviado: segunda-feira, 10 de Junho de 2013 22:05
Para: Cecilia Carmo
Cc: R help
Assunto: Re: please check this

Hi,
Try this:
res10Percent- fun1(final3New,0.1,200)

res10PercentSub1-subset(res10Percent[duplicated(res10Percent)|duplicated(res10Percent,fromLast=TRUE),],dummy==1)
indx1-as.numeric(row.names(res10PercentSub1))

res10PercentSub2-res10PercentSub1[order(res10PercentSub1$dimension),]
indx11-as.numeric(row.names(res10PercentSub2))
names(indx11)-(seq_along(indx11)-1)%/%2+1
res10PercentSub3-res10Percent[c(indx11,indx11+1),]
res10PercentSub3$id- names(c(indx11,indx11+1))
res10PercentSub4-do.call(rbind,lapply(split(res10PercentSub3,res10PercentSub3$id),function(x)
 
{x1-x[-1,];x2-x1[which.max(abs(x1$dimension[1]-x1$dimension[-1]))+1,];x3-x[x$dummy==1,][which.min(abs(as.numeric(row.names(x[x$dummy==1,]))-as.numeric(row.names(x2,];rbind(x3,x2)}))

res10PercentSub0-subset(res10Percent[duplicated(res10Percent)|duplicated(res10Percent,fromLast=TRUE),],dummy==0)
indx0-as.numeric(row.names(res10PercentSub0))

res10PercentSub20-res10PercentSub0[order(res10PercentSub0$dimension),]
indx00-as.numeric(row.names(res10PercentSub20))
names(indx00)-(seq_along(indx00)-1)%/%2+1
res10PercentSub30- res10Percent[c(indx00-1,indx00),]
res10PercentSub30$id- names(c(indx00-1,indx00))
res10PercentSub40- 
do.call(rbind,lapply(split(res10PercentSub30,res10PercentSub30$id),function(x){x1-subset(x,dummy==1);
 
x2-subset(x,dummy==0);x3-x1[which.max(abs(x1$dimension-unique(x2$dimension))),];x4-x2[which.min(abs(as.numeric(row.names(x3))-as.numeric(row.names(x2,];rbind(x3,x4)}))

row.names(res10PercentSub40)-gsub(.*\\.,,row.names(res10PercentSub40))
indxNew- 
sort(as.numeric(c(row.names(res10PercentSub5),row.names(res10PercentSub40
res10PercentFinal-res10Percent[-indxNew,]
dim(res10PercentFinal)
#[1] 454   5
nrow(subset(res10PercentFinal,dummy==0))
#[1] 227
nrow(subset(res10PercentFinal,dummy==1))
#[1] 227

nrow(unique(res10PercentFinal))
#[1] 454
which(duplicated(res10Percent)|duplicated(res10Percent,fromLast=TRUE))
# [1] 113 117 123 125 153 157 187 189 207 213 223 235 265 267 269 275 276 278 
279
#[20] 283 293 301 303 305 309 317 327 331 335 339 341 343 347 351 367 369 371 
379
#[39] 385 399 407 413 415 417 429 437 441 453 459 461 471 473 477 479 501 505
res10Percent[c(113:114,117:118),]
#         firm year industry dummy dimension
#113 500221723 2005       26     1      3147
#114 500601429 2005       26     0      3076
#117 500221723 2005       26     1      3147
#118 502668920 2005       26     0      3249


Re: [R] help needed! RMSE

2013-06-10 Thread Ben Bolker
mansor nad nadsim88 at hotmail.com writes:

 
 i need HELPPP!! how do i calculate the RMSE value for two GEV
 models?first GEV is where the three parameters are constant.2nd GEV
 model a 4 parameter model with the location parameter is allowed to
 vary linearly with respect to time while holding the other
 parameters at constant.  is there any programming code for this?  i
 really really need help. please reply to me as soon as
 possible. thanks in advance.

  Have you read the posting guide (URL/link at the bottom of
every posting at this list)?  Can you provide a reproducible example?
It may seem perverse, but urgency (I need HELP! ... I really really
need help ... please reply to me as soon as possible ...) doesn't
actually generally improve your chances of getting help here -- it
comes across as shouting.  Providing reproducible examples not only
makes it easier for people to answer, and improving the chances
that the answers you get will be ones you really need, it also 
demonstrates evidence that you have invested some effort.

  You might want to start with this example:

  library(fExtremes)
  g1 <- gevFit(gevSim())
  sqrt(sum(g1@residuals^2))
  ?gevFit

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Speed up or alternative to 'For' loop

2013-06-10 Thread David Winsemius

On Jun 10, 2013, at 10:28 AM, Trevor Walker wrote:

 I have a For loop that is quite slow and am wondering if there is a faster
 option:
 
 df - data.frame(TreeID=rep(1:500,each=20), Age=rep(seq(1,20,1),500))
 df$Height - exp(-0.1 + 0.2*df$Age)
 df$HeightGrowth - NA   #intialize with NA
 for (i in 2:nrow(df))
 {if(df$TreeID[i]==df$TreeID[i-1])
  {df$HeightGrowth[i] - df$Height[i]-df$Height[i-1]
  }
 }
 
Avoid tests with if(){} else{}. Use vectorized code, possibly with 'ifelse', but
in this case you need a function that does calculations within groups.

The ave() function with diff() will do it compactly and efficiently:

 df <- data.frame(TreeID=rep(1:5,each=4), Age=rep(seq(1,4,1),5))
 df$Height <- exp(-0.1 + 0.2*df$Age)
 df$HeightGrowth <- NA   #intialize with NA

 df$HeightGrowth <- ave(df$Height, df$TreeID, FUN= function(vec) c(NA, diff(vec)))
 df
   TreeID Age   Height HeightGrowth
1       1   1 1.105171           NA
2       1   2 1.349859    0.2446879
3       1   3 1.648721    0.2988625
4       1   4 2.013753    0.3650314
5       2   1 1.105171           NA
6       2   2 1.349859    0.2446879
7       2   3 1.648721    0.2988625
8       2   4 2.013753    0.3650314
9       3   1 1.105171           NA
10      3   2 1.349859    0.2446879
11      3   3 1.648721    0.2988625
12      3   4 2.013753    0.3650314
13      4   1 1.105171           NA
14      4   2 1.349859    0.2446879
15      4   3 1.648721    0.2988625
16      4   4 2.013753    0.3650314
17      5   1 1.105171           NA
18      5   2 1.349859    0.2446879
19      5   3 1.648721    0.2988625
20      5   4 2.013753    0.3650314

(On my machine it was over six times as fast as the if-based code from Arun. )

-- 

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Speed up or alternative to 'For' loop

2013-06-10 Thread MacQueen, Don
Sorry, it looks like I was hasty.
Absent another dumb mistake, the following should do it.

The request was for differences, i.e., the amount of growth from one
period to the next, separately for each tree.

for (ir in unique(df$TreeID)) {
  in.ir <- df$TreeID == ir
  df$HeightGrowth[in.ir] <- c(NA, diff(df$Height[in.ir]))
}



And this gives the same result as Rui Barradas' previous response.

-Don

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 6/10/13 2:51 PM, MacQueen, Don macque...@llnl.gov wrote:

How about

for (ir in unique(df$TreeID)) {
  in.ir - df$TreeID == ir
  df$HeightGrowth[in.ir] - cumsum(df$Height[in.ir])
}

Seemed fast enough to me.

In R, it is generally good to look for ways to operate on entire vectors
or arrays, rather than element by element within them. The cumsum()
function does that in this example.

-Don


-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 6/10/13 10:28 AM, Trevor Walker trevordaviswal...@gmail.com wrote:

I have a For loop that is quite slow and am wondering if there is a
faster
option:

df - data.frame(TreeID=rep(1:500,each=20), Age=rep(seq(1,20,1),500))
df$Height - exp(-0.1 + 0.2*df$Age)
df$HeightGrowth - NA   #intialize with NA
for (i in 2:nrow(df))
 {if(df$TreeID[i]==df$TreeID[i-1])
  {df$HeightGrowth[i] - df$Height[i]-df$Height[i-1]
  }
 }

Trevor Walker
Email: trevordaviswal...@gmail.com

  [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combining CSV data

2013-06-10 Thread arun
Hi,
Try this:

dat1 <- read.table(text="
Row_ID_CR,  Data1,    Data2,    Data3
1,  aa,  bb,  cc
2,  dd,  ee,  ff
", sep=",", header=TRUE, stringsAsFactors=FALSE)

dat2 <- read.table(text="
Row_ID_N,  Src_Row_ID,  DataN1
1a,  1,  This is comment 1
2a,  1,  This is comment 2
3a,  2,  This is comment 1
4a,  1,  This is comment 3
", sep=",", header=TRUE, stringsAsFactors=FALSE)
library(stringr)
dat2$DataN1 <- str_trim(dat2$DataN1)
res <- merge(dat1, dat2, by.x=1, by.y=2)
res1 <- res[,-5]
library(plyr)
res2 <- ddply(res1, .(Row_ID_CR,Data1,Data2,Data3), summarize, DataN1=list(DataN1))
 res2
 # Row_ID_CR    Data1    Data2    Data3
#1 1   aa   bb   cc
#2 2   dd   ee   ff
#   DataN1
#1 This is comment 1, This is comment 2, This is comment 3
#2   This is comment 1



res3 <- data.frame(res2[,-5], t(apply(do.call(rbind, res2[,5]), 1, function(x)
{x[duplicated(x)] <- NA; x})))
colnames(res3)[grep("X", colnames(res3))] <-
paste0("DataComment", gsub("[[:alpha:]]", "", colnames(res3)[grep("X", colnames(res3))]))
res3
#  Row_ID_CR    Data1    Data2    Data3  DataComment1
#1 1   aa   bb   cc This is comment 1
#2 2   dd   ee   ff This is comment 1
#   DataComment2  DataComment3
#1 This is comment 2 This is comment 3
#2  NA  NA

A.K.


- Original Message -
From: Shreya Rawal rawal.shr...@gmail.com
To: r-help@r-project.org
Cc: 
Sent: Monday, June 10, 2013 4:38 PM
Subject: [R] Combining CSV data

Hello R community,

I am trying to combine two CSV files that look like this:

File A

Row_ID_CR,   Data1,    Data2,    Data3
1,                   aa,          bb,          cc
2,                   dd,          ee,          ff


File B

Row_ID_N,   Src_Row_ID,   DataN1
1a,               1,                   This is comment 1
2a,               1,                   This is comment 2
3a,               2,                   This is comment 1
4a,               1,                   This is comment 3

And the output I am looking for is, comparing the values of Row_ID_CR and
Src_Row_ID

Output

Row_ID_CR,  Data1,  Data2,  Data3,  DataComment1,       DataComment2,       DataComment3
1,          aa,     bb,     cc,     This is comment 1,  This is comment 2,  This is comment 3
2,          dd,     ee,     ff,     This is comment 1


I am a novice R user, I am able to replicate a left join but I need a bit
more in the final result.


Thanks!!

    [[alternative HTML version deleted]]



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Speed up or alternative to 'For' loop

2013-06-10 Thread MacQueen, Don
Well, speaking of hasty...

This will also do it, provided that each tree's initial height is less
than the previous tree's final height. In principle, not a safe
assumption, but might be ok depending on where the data came from.

df$delta <- c(NA, diff(df$Height))
df$delta[df$delta < 0] <- NA
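
(If that assumption is shaky for the data at hand, a sketch of a variant that
does not rely on it -- blank the first record of every tree explicitly:)

df$delta <- c(NA, diff(df$Height))
df$delta[!duplicated(df$TreeID)] <- NA   # first row of each TreeID gets NA regardless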

-Don



-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062






__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Speed up or alternative to 'For' loop

2013-06-10 Thread arun
Hi,
Some speed comparisons:


df <- data.frame(TreeID=rep(1:6000,each=20), Age=rep(seq(1,20,1),6000))
df$Height <- exp(-0.1 + 0.2*df$Age)
df1 <- df
df3 <- df
library(data.table)
dt1 <- data.table(df)
df$HeightGrowth <- NA


system.time({  # Rui's 2nd function
df2 <- data.matrix(df)
for(i in seq_len(nrow(df2))[-1]){
    if(df2[i, "TreeID"] == df2[i - 1, "TreeID"])
        df2[i, "HeightGrowth"] <- df2[i, "Height"] - df2[i - 1, "Height"]
}
})
# user  system elapsed 
 # 1.108   0.000   1.109 


system.time({for (ir in unique(df$TreeID)) {   # Don's first function
  in.ir <- df$TreeID == ir
  df$HeightGrowth[in.ir] <- c(NA, diff(df$Height[in.ir]))
}})
#  user  system elapsed 
#100.004   0.704 100.903 

system.time({df3$delta <- c(NA, diff(df3$Height))  # Don's 2nd function
df3$delta[df3$delta < 0] <- NA})  # winner
#   user  system elapsed 
 # 0.016   0.000   0.014 

system.time(df1$HeightGrowth <- ave(df1$Height, df1$TreeID, FUN = function(vec)
  c(NA, diff(vec))))  # David's
 #user  system elapsed 
 # 0.136   0.000   0.137 
 system.time(dt1[, HeightGrowth := c(NA, diff(Height)), by = TreeID])
#  user  system elapsed 
 # 0.076   0.000   0.079 


 identical(df1,as.data.frame(dt1))
#[1] TRUE
 identical(df1,df)
#[1] TRUE


head(df1,2)
#  TreeID Age   Height HeightGrowth
#1  1   1 1.105171   NA
#2  1   2 1.349859    0.2446879
head(df2,2)
# TreeID Age   Height HeightGrowth
#[1,]  1   1 1.105171   NA
#[2,]  1   2 1.349859    0.2446879
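
(For repeated timings rather than single runs, the microbenchmark package --
not used in this thread, so treat it as an assumption -- reports a distribution;
a sketch comparing the two fastest approaches above:)

library(microbenchmark)
microbenchmark(
  delta = {d <- c(NA, diff(df3$Height)); d[d < 0] <- NA},                # Don's 2nd
  ave   = ave(df1$Height, df1$TreeID, FUN = function(v) c(NA, diff(v))), # David's
  times = 20
)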

A.K.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Apply a PCA to other datasets

2013-06-10 Thread Thomas Stewart
Short answer: Yes.

Long answer: Your question does not provide specific information;
therefore, I cannot provide a specific answer.
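
(For what it is worth, if the PCA was fitted with prcomp(), a minimal sketch --
'train' and 'newdata' are hypothetical objects with the same columns:)

pca <- prcomp(train, center = TRUE, scale. = TRUE)
new_scores <- predict(pca, newdata = newdata)  # applies the stored rotation (and centering/scaling)
apply(new_scores[, 1:2], 2, sd)                # SDs of the first two components for the new data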


On Mon, Jun 10, 2013 at 1:23 PM, edelance delanceye...@gmail.com wrote:

 I have run a PCA on one data set.  I need the standard deviation of the
 first
 two bands for my analysis.  I now want to apply the same PCA rotation I
 used
 in the first one to all my other data sets.  Is there any way to do this in
 r?  Thanks.




 --
 View this message in context:
 http://r.789695.n4.nabble.com/Apply-a-PCA-to-other-datasets-tp4669182.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Combining CSV data

2013-06-10 Thread arun
HI,
I am not sure about your DataN1 column.  If there is any identifier to
differentiate the comments (in this case 1,2,3), then it will be easier to place
that in the correct column.
  My previous solution is not helpful in situations like these:
dat2 <- read.table(text="
Row_ID_N,  Src_Row_ID,  DataN1
1a,  1,  This is comment 1
2a,  1,  This is comment 2
3a,  2,  This is comment 2
4a,  1,  This is comment 3
", sep=",", header=TRUE, stringsAsFactors=FALSE)
dat3 <- read.table(text="
Row_ID_N,  Src_Row_ID,  DataN1
1a,  1,  This is comment 1
2a,  1,  This is comment 2
3a,  2,  This is comment 3
4a,  1,  This is comment 3
5a,  2,  This is comment 2
", sep=",", header=TRUE, stringsAsFactors=FALSE)


library(stringr)
library(plyr)
fun1 <- function(data1, data2){
    data2$DataN1 <- str_trim(data2$DataN1)
    res  <- merge(data1, data2, by.x=1, by.y=2)
    res1 <- res[,-5]
    res2 <- ddply(res1, .(Row_ID_CR,Data1,Data2,Data3), summarize, DataN1=list(DataN1))
    Mx1  <- max(sapply(res2[,5], length))
    res3 <- data.frame(res2[,-5], do.call(rbind, lapply(res2[,5], function(x){
                                  indx <- as.numeric(gsub("[[:alpha:]]", "", x))
                                  x[match(seq(Mx1), indx)]
                                  })), stringsAsFactors=FALSE)
    colnames(res3)[grep("X", colnames(res3))] <-
      paste0("DataComment", gsub("[[:alpha:]]", "", colnames(res3)[grep("X", colnames(res3))]))
    res3
}
fun1(dat1,dat2)
#  Row_ID_CR    Data1    Data2    Data3  DataComment1
#1 1   aa   bb   cc This is comment 1
#2 2   dd   ee   ff  NA
#   DataComment2  DataComment3
#1 This is comment 2 This is comment 3
#2 This is comment 2  NA
 fun1(dat1,dat3)
#  Row_ID_CR    Data1    Data2    Data3  DataComment1
#1 1   aa   bb   cc This is comment 1
#2 2   dd   ee   ff  NA
#   DataComment2  DataComment3
#1 This is comment 2 This is comment 3
#2 This is comment 2 This is comment 3
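
(An alternative sketch, keying directly on the comment number with
reshape2::dcast -- reshape2 is an assumption, it is not used above:)

library(reshape2)
dat2$slot <- paste0("DataComment", gsub("\\D", "", dat2$DataN1))  # pull the 1,2,3 out of the text
wide <- dcast(dat2, Src_Row_ID ~ slot, value.var = "DataN1")
merge(dat1, wide, by.x = "Row_ID_CR", by.y = "Src_Row_ID", all.x = TRUE)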


A.K.



[R] padding specific missing values with NA to allow cbind

2013-06-10 Thread Rob Forsyth
Dear list

Getting very frustrated with this simple-looking problem

 m1 <- lm(x~y, data=mydata)
 outliers <- abs(stdres(m1)) > 2
 plot(x~y, data=mydata)

I would like to plot a simple x,y scatter plot with labels giving custom 
information displayed for the outliers only, i.e. I would like to define a 
column mydata$labels for the mydata dataframe so that the command

 text(mydata$y, mydata$x, labels=mydata$labels)

will label those rows where outliers[i] = TRUE with text but is otherwise blank

The first problem I have is that due to some NAs in mydata, nrows(outliers) <
nrows(mydata) and I'm getting in a tangle trying to pad the appropriate rows of
outliers

Thanks

Rob

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Help with R loop for URL download from FRED to create US time series

2013-06-10 Thread arum
I am downloading time series data from FRED. I have a working download, but I
do not want to write out the download for all 50 states likes this:
IDRGSP <- read.table('http://research.stlouisfed.org/fred2/data/IDRGSP.txt',
skip=11, header=TRUE)
IDRGSP$DATE <- as.Date(IDRGSP$DATE, '%Y-%m-%d')
IDRGSP$SERIES <- 'IDRGSP'
IDRGSP$DESC <- 'Real Total Gross Domestic Product by State for Idaho, Mil. of, A, NSA, 2012-06-05'

WYRGSP <- read.table('http://research.stlouisfed.org/fred2/data/WYRGSP.txt',
skip=11, header=TRUE)
WYRGSP$DATE <- as.Date(WYRGSP$DATE, '%Y-%m-%d')
WYRGSP$SERIES <- 'WYRGSP'
WYRGSP$DESC <- 'Real Total Gross Domestic Product by State for Wyoming, Mil. of, A, NSA, 2012-06-05'
RGSP <- rbind(IDRGSP, WYRGSP)

I want to loop but I can not get the paste to work correctly. I am trying
this:  Can someone help me figure out the loop so I can build a table for
all 50 states.
ab <- c(state.abb)
base <- 'http://research.stlouisfed.org/fred2/data/'
type <- "RGSP.txt', skip=11, header=TRUE"

tmp <- NULL;
for (a in ab) {
  url <- paste(base, a, type, sep="");

if (is.null(tmp))
tmp <- read.table(url)
  else tmp <- rbind(tmp, read.table(url))
}
tmp
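
(One likely reading of the snippet above is that skip=11 and header=TRUE ended
up inside the type string rather than in the read.table() call itself; a minimal
sketch of a loop that avoids the issue, reusing base as defined above:)

tmp <- NULL
for (a in state.abb) {
  url <- paste0(base, a, "RGSP.txt")    # paste only the file name
  tmp <- rbind(tmp, read.table(url, skip = 11, header = TRUE))
}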

thanks for your help




--
View this message in context: 
http://r.789695.n4.nabble.com/Help-with-R-loop-for-URL-download-from-FRED-to-create-US-time-series-tp4669209.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] padding specific missing values with NA to allow cbind

2013-06-10 Thread William Dunlap
Try adding the argument
   na.action = na.exclude
to your call to lm().  See help(na.exclude) for details.
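
(A minimal sketch of the whole flow -- mydata, x, y as in the original post,
stdres() from MASS; outliers are matched back to mydata by row name, which works
whether or not the standardized residuals come back padded:)

library(MASS)                                   # for stdres()
m1  <- lm(x ~ y, data = mydata, na.action = na.exclude)
sr  <- stdres(m1)                               # standardized residuals, named by row
bad <- names(sr)[!is.na(sr) & abs(sr) > 2]      # row names flagged as outliers
mydata$labels <- ifelse(rownames(mydata) %in% bad, rownames(mydata), "")
plot(x ~ y, data = mydata)
text(mydata$y, mydata$x, labels = mydata$labels, pos = 3)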

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Help with R loop for URL download from FRED to create US time series

2013-06-10 Thread jim holtman
This should do it for you:


 base <- "http://research.stlouisfed.org/fred2/data/"

 files <- lapply(state.abb, function(.state){
+ cat(.state, "\n")
+ input <- read.table(paste0(base, .state, "RGSP.txt")
+ , skip = 11
+ , header = TRUE
+ , as.is = TRUE
+ )
+ input$DATE <- as.Date(input$DATE, "%Y-%m-%d")
+ input$SERIES <- paste0(.state, "RGSP")
+ input
+ })
AL
AK
AZ
AR
CA
CO
CT
DE
FL
GA
HI
ID
IL
IN
IA
KS
KY
LA
ME
MD
MA
MI
MN
MS
MO
MT
NE
NV
NH
NJ
NM
NY
NC
ND
OH
OK
OR
PA
RI
SC
SD
TN
TX
UT
VT
VA
WA
WV
WI
WY

 result <- do.call(rbind, files)


 str(result)
'data.frame':   750 obs. of  3 variables:
 $ DATE  : Date, format: 1997-01-01 1998-01-01 1999-01-01
2000-01-01 ...
 $ VALUE : int  122541 126309 130898 132699 133888 137086 140020 146937
150968 153681 ...
 $ SERIES: chr  ALRGSP ALRGSP ALRGSP ALRGSP ...
 head(result,30)
 DATE  VALUE SERIES
1  1997-01-01 122541 ALRGSP
2  1998-01-01 126309 ALRGSP
3  1999-01-01 130898 ALRGSP
4  2000-01-01 132699 ALRGSP
5  2001-01-01 133888 ALRGSP
6  2002-01-01 137086 ALRGSP
7  2003-01-01 140020 ALRGSP
8  2004-01-01 146937 ALRGSP
9  2005-01-01 150968 ALRGSP
10 2006-01-01 153681 ALRGSP
11 2007-01-01 155388 ALRGSP
12 2008-01-01 155870 ALRGSP
13 2009-01-01 148074 ALRGSP
14 2010-01-01 151480 ALRGSP
15 2011-01-01 150330 ALRGSP
16 1997-01-01  37249 AKRGSP
17 1998-01-01  35341 AKRGSP
18 1999-01-01  34967 AKRGSP
19 2000-01-01  34192 AKRGSP
20 2001-01-01  35729 AKRGSP
21 2002-01-01  37111 AKRGSP
22 2003-01-01  36288 AKRGSP
23 2004-01-01  38179 AKRGSP
24 2005-01-01  37774 AKRGSP
25 2006-01-01  39836 AKRGSP
26 2007-01-01  40694 AKRGSP
27 2008-01-01  41039 AKRGSP
28 2009-01-01  44030 AKRGSP
29 2010-01-01  43591 AKRGSP
30 2011-01-01  44702 AKRGSP
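
(If the DESC column from the original post is also wanted, a hedged addition
inside the function(.state){...} body, after input$SERIES is set -- state.name
is base R's vector of full state names, parallel to state.abb:)

input$DESC <- paste("Real Total Gross Domestic Product by State for",
                    state.name[match(.state, state.abb)])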







-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Substituting the values on the y-axis

2013-06-10 Thread Jim Lemon

On 06/11/2013 12:26 AM, diddle1...@fastwebnet.it wrote:

Hello,

I plotted a graph on R showing how salinity (in ‰, y-axis) changes with time (in
years, x-axis). However, right from the beginning on the Excel spreadsheet the
values for salinity appeared as, for example, 35000‰ instead of 35‰, which I
guessed must have been a typing error from the website from which I extracted the
data (NOAA). Thus, I now would like to substitute these values with the
corresponding smaller value, as follows:

25000 35000 ->  25, 35   and so on.

Is there any way I can change this in R or do I have to modify these numbers
before inputting the data into R (for example in Excel)? If so, can anybody tell
me how to do either of these?


Hi Emanuela,
I think that the axis.mult function in the plotrix package will do what 
you want with mult=0.001. Obviously you won't want to display the 
transformation, so set mult.label="".
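
(Alternatively -- not what Jim suggests, just another option -- the column can
simply be rescaled before plotting; a sketch with hypothetical names dat, salinity, year:)

dat$salinity <- dat$salinity / 1000   # 35000 becomes 35
plot(salinity ~ year, data = dat, ylab = "Salinity (\u2030)")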


Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fwd: Problem with ODBC connection

2013-06-10 Thread Jeff Newmiller
Given the resounding silence, I would venture to guess that no-one here is 
interested in troubleshooting ODBC connections to Excel. The problem is most 
likely in the ODBC driver for Excel (not in R or RODBC), and Excel is NOT a 
database (so any data format problem is unlikely to be detected).
---
Jeff NewmillerThe .   .  Go Live...
DCN:jdnew...@dcn.davis.ca.usBasics: ##.#.   ##.#.  Live Go...
  Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/BatteriesO.O#.   #.O#.  with
/Software/Embedded Controllers)   .OO#.   .OO#.  rocks...1k
--- 
Sent from my phone. Please excuse my brevity.

Christofer Bogaso bogaso.christo...@gmail.com wrote:

Any response please? Was my question not clear to the list? Please let me
know.

Thanks and regards,

-- Forwarded message --
From: Christofer Bogaso bogaso.christo...@gmail.com
Date: Sat, Jun 8, 2013 at 9:39 PM
Subject: Re: Problem with ODBC connection
To: r-help r-help@r-project.org


Hello All,

My previous post remains unanswered, probably because the attachment was
not working properly.

So I am re-posting it again.

My problem is in reading an Excel-2003 file through an ODBC connection
using the RODBC package. Let's say I have this Excel file:

http://www.2shared.com/document/HS3JeFyW/MyFile.html


I saved it in my F: drive and tried reading the contents using an RODBC
connection:

 library(RODBC)
 MyData <- sqlFetch(odbcConnectExcel("f:/MyFile.xls"), )
 head(MyData, 30)


However it looks like the second column (with header 's') is not read
properly.

Can somebody here explain this bizarre thing? Did I do something wrong
in reading that?

Really appreciate it if someone could point out anything that might go
wrong.

Thanks and regards,


On Fri, Jun 7, 2013 at 4:46 PM, Christofer Bogaso 
bogaso.christo...@gmail.com wrote:

 Hello again,

 I am having problem with ODBC connection using the RODBC package.

 I am basically trying to read the attached Excel-2003 file using the
 RODBC package. Here is my code:

  head(sqlFetch(odbcConnectExcel("d:/1.xls"), ), 30);
 odbcCloseAll()
Criteria  s  d fd  ffd1
 f1fd2f2 fd3 f3 F12 F13 F14 F15 F16
F17
 F18 F19 F20
 1 a NA NA NA NA 0.
 0.27755576 -0.00040332321NA  NA NA 
NA
  NA  NA  NA  NA  NA  NA  NA  NA
 2 s NA  0 NA NA 0.
 0.  0.000NA  NA NA 
NA
  NA  NA  NA  NA  NA  NA  NA  NA
 3 d NA  0 NA NA 0.01734723
 0.06938894  0.2775558  5.00  NA NA 
NA
  NA  NA  NA  NA  NA  NA  NA  NA
 4 f NA NA NA NA NA
 NA NA -4.25  NA NA  NA  NA  NA  NA  NA 
NA
  NA  NA  NA
 5 f NA  0 NA NA 0.
 0.  0.000 -1.53  NA NA 
NA
  NA  NA  NA  NA  NA  NA  NA  NA
 6 f NA NA NA NA NA
 NA  0.000  0.00  NA NA  NA  NA  NA  NA  NA 
NA
  NA  NA  NA
 7 f NA NA NA NA NA
 NA  0.000NA  NA NA  NA  NA  NA  NA  NA 
NA
  NA  NA  NA
 8 f NA  0 NA NA NA
 NA NANA  NA NA  NA  NA  NA  NA  NA 
NA
  NA  NA  NA
 9 f NA  0 NA NA NA
 NA NANA  NA NA  NA  NA  NA  NA  NA 
NA
  NA  NA  NA
 10f NA NA NA NA NA
 NA NANA  NA NA  NA  NA  NA  NA  NA 
NA
  NA  NA  NA
 11f NA NA NA NA NA
 NA NANA  NA NA  NA  NA  NA  NA  NA 
NA
  NA  NA  NA
 12f NA NA NA NA NA
 NA NANA  NA NA  NA  NA  NA  NA  NA 
NA
  NA  NA  NA
 13f NA NA NA NA NA
 NA NANA  NA NA  NA  NA  NA  NA  NA 
NA
  NA  NA  NA

 Here you see the data in the second column could not be read at all.

 Can somebody point me if I did something wrong?

 Thanks and regards,


   [[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fwd: Problem with ODBC connection

2013-06-10 Thread Daniel Nordlund
I tried reading your workbook using your code, i.e.

library(RODBC)
MyData <- sqlFetch(odbcConnectExcel('mypath/Myfile.xls'), )
head(MyData, 30)

and got an error message saying that odbcConnectExcel is only usable with 
32-bit Windows and I have a 64-bit system, so I can't help you there.  But 
there are many other options in R for reading Excel workbooks.  I was able to 
read your data using the read.xls function from the gdata package.  I am not 
endorsing that package, it just happened to be the first package on my system 
that I tried.  
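
(For reference, a sketch of that gdata route -- the path is hypothetical, and
read.xls() needs Perl available on the system:)

library(gdata)
MyData <- read.xls("f:/MyFile.xls", sheet = 1, header = TRUE)
head(MyData, 30)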

So if you can't read the data one way, try another.  You could install and load
the sos package and run the following function

findFn('xls')

and you will get all sorts of suggestions.


Hope this is helpful,

Dan

Daniel Nordlund
Bothell, WA USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Fwd: Problem with ODBC connection

2013-06-10 Thread jwd
On Tue, 11 Jun 2013 02:19:14 +0545
Christofer Bogaso bogaso.christo...@gmail.com wrote:

Any real answer would be contingent on a reader being provided a
reproducible example. Since you don't provide that, there's not a lot
of point to an answer. However, to tilt at a windmill, depending on the
size and complexity of your data file, it might be easier to simply
export the data from Excel as a csv file and use read.table to bring it
in to R.
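
(For instance -- the file name is hypothetical:)

MyData <- read.csv("MyFile.csv", header = TRUE, stringsAsFactors = FALSE)
str(MyData)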

JWDougherty

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How can we access an element in a structure

2013-06-10 Thread jpm miao
Hi,

  I have a structure, which is the result of a function
  How can I access the elements in the gradient?

 dput(test1)
structure(-1.17782911684913, gradient = structure(c(-0.0571065371783791,
-0.144708170683529), .Dim = 1:2, .Dimnames = list(NULL, c("x1",
"x2"))))
 test1[[1]]
[1] -1.177829
 test1
[1] -1.177829
attr(,"gradient")
              x1         x2
[1,] -0.05710654 -0.1447082
 test1["gradient"]
[1] NA
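
(For what it is worth, attributes are reached with attr() rather than "["; a
sketch using the object above:)

g <- attr(test1, "gradient")   # the 1 x 2 gradient matrix
g[1, "x1"]                     # -0.05710654
g[1, "x2"]                     # -0.1447082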


  Thanks,

Miao

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.