Re: [R] Adding a year to existing date
On 17/11/11 17:33, arunkumar wrote:
> Hi, I need to add a year to a date field in a data frame. Please help me.
>
>   X  Date
>   1  2008-01-01
>   2  2008-02-01
>   3  2003-03-01

I can't find anything built in. This is probably because a year is an ill-defined unit; years vary in length in a somewhat peculiar fashion, so doing arithmetic with respect to years is frowned upon. However, you might try this:

`%+%` <- function(x, y) {
    if (!isTRUE(all.equal(y, round(y))))
        stop("Argument \"y\" must be an integer.\n")
    x <- as.POSIXlt(x)
    x$year <- x$year + y
    as.Date(x)
}

Then:

xxx <- as.Date(c("2008-01-01", "2008-02-01", "2003-03-01"))
xxx %+% 1
[1] "2009-01-01" "2009-02-01" "2004-03-01"

Dunno what dangers lurk; caveat utilitor.

cheers,
Rolf Turner

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
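A hedged illustration of the kind of danger Rolf alludes to: because the function bumps the POSIXlt year component and then lets the conversion back to Date normalise the result, adding a year to a leap day produces a date that does not exist in the target year. This is a sketch to try on your own system, using the `%+%` operator exactly as defined in the reply above:

```r
`%+%` <- function(x, y) {
    if (!isTRUE(all.equal(y, round(y))))
        stop("Argument \"y\" must be an integer.\n")
    x <- as.POSIXlt(x)
    x$year <- x$year + y
    as.Date(x)
}

# 2008-02-29 plus one year would be the invalid date 2009-02-29;
# the conversion back to Date normalises it, so check whether the
# rolled-over result (early March 2009) is what you actually want
as.Date("2008-02-29") %+% 1
```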
Re: [R] Reading data/variables
Thanks Sarah. I have read about the problems with attach(), and I will try to avoid it. I have now found the line that's causing the problem is:

setwd("z:/homework")

With that line in place, either in a program or in Rprofile.site (?), the moment I run R and simply enter (before reading any data)

summary(mydata)

I get sample statistics for a dozen variables! Do not save the workspace? I thought the option to save/use a binary file was meant to be convenient. I like working in the same working directory, and I like .RData files. Does this sound hopeless? Thanks.

At 09:26 PM 11/15/2011, Sarah Goslee wrote:

Hi,

The obvious answer is: don't use attach() and you'll never have that problem. And see further comments inline.

On Tue, Nov 15, 2011 at 6:05 PM, Steven Yen s...@utk.edu wrote:

Can someone help me with this variable/data reading issue? I read a csv file and transform/create an additional variable (called y). The first set of commands below produced different sample statistics for hw11$y and y. In the second set of commands I rename/use the variable name yy, and sample statistics for hw11$yy and yy are identical. Using y <- yy fixed it, but I am not sure why I would need to do that. That y appeared to have come from a variable called y from another data frame (unrelated to the current run). Help!

setwd("z:/homework")
sink("z:/homework/hw11.our", append=TRUE, split=TRUE)
hw11 <- read.csv("ij10b.csv", header=TRUE)
hw11$y <- hw11$e3
attach(hw11)
The following object(s) are masked _by_ '.GlobalEnv': y

Look there. R even *told* you that it was going to use the y in the global environment rather than the one you were trying to attach.

The other solution: don't save your workspace. Your other email on this topic suggested to me that there is a .RData file in your preferred working directory that contains an object y, and that's what is interfering with what you think should happen.
Deleting that file, or using a different directory, or removing y before you attach the data frame would all work. But truly, the best possible strategy is to avoid using attach() so you don't have to worry about which object named y is really being used, because you specify it explicitly.

(n <- dim(hw11)[1])
[1] 13765
summary(hw11$y)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.  0.4500      1.  1.6726      2.    140.
length(hw11$y)
[1] 13765
summary(y)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
    0.0     0.0     0.0 0.24958     0.0     1.0
length(y)
[1] 601

setwd("z:/homework")
sink("z:/homework/hw11.our", append=TRUE, split=TRUE)
hw11 <- read.csv("ij10b.csv", header=TRUE)
hw11$yy <- hw11$e3
attach(hw11)
hw11$yy <- hw11$e3
summary(hw11$yy)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.  0.4500      1.  1.6726      2.    140.
length(hw11$yy)
[1] 13765
summary(yy)
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
     0.  0.4500      1.  1.6726      2.    140.
length(yy)
[1] 13765

-- Sarah Goslee http://www.functionaldiversity.org

-- Steven T. Yen, Professor of Agricultural Economics, The University of Tennessee, http://web.utk.edu/~syen/
Re: [R] Numerical Format on axis
Thousand thanks to David and Don. Great help!

Mario
Aachen, Germany

From: MacQueen, Don macque...@llnl.gov
Cc: r-help@r-project.org; David Winsemius dwinsem...@comcast.net
Sent: Thursday, 17 November 2011, 2:30
Subject: Re: [R] Numerical Format on axis

To add to what David suggests, and since you're new to R, something like this:

plot(x, y, yaxt='n')
yticks <- pretty(y)
axis(2, at=yticks, labels=sprintf("%1.2f", yticks))

See the help page for par

?par

and look for the entry for 'xaxt' to see what the 'yaxt' arg to plot does.

-- Don MacQueen, Lawrence Livermore National Laboratory, 7000 East Ave., L-627, Livermore, CA 94550, 925-423-1062

On 11/16/11 6:35 AM, David Winsemius dwinsem...@comcast.net wrote:

On Nov 16, 2011, at 7:41 AM, Mario Giesel wrote:

Hello, list, I'm new to R and I'm trying to produce a chart with currency values on the y axis. Values should be e.g. 1,00, 1,50, 2,00, etc. In fact they are 1,0, 1,5, 2,0, etc. How do I get R to show two digits after the comma on that axis?

?sprintf
?format

On the left (geographic) side of the Atlantic, it might be:

sprintf("%1.2f", 1)
[1] "1.00"

I assume that your system is set up with different options() and that your punkts are going to be handled to your liking by sprintf.
-- David Winsemius, MD, West Hartford, CT
Re: [R] Contour on top of 2d histogram
Hi, thanks for the suggestion! I had tried it before, but it did not work - this was probably because I was using the image function to plot the 2d histogram. When I use hist2d directly and then contour with add=TRUE, it works. Thanks again,
Anna

From: R. Michael Weylandt [michael.weyla...@gmail.com]
Sent: Wednesday, 16 November 2011 18:43
To: Sramkova, Anna (IEE)
Cc: r-help@r-project.org
Subject: Re: [R] Contour on top of 2d histogram

Try the add = TRUE argument to contour.

Michael

On Wed, Nov 16, 2011 at 12:35 PM, Sramkova, Anna (IEE) anna.sramk...@iee.unibe.ch wrote:

Hi all, I would like to plot one data set as a 2d histogram and another one as a contour. I can do it separately with the hist2d and contour functions, but I wonder if there is a way to combine these two plots into a single one (the ranges of the two plots are the same). Any suggestions?

Thanks,
Anna
[R] RV: Reporting a conflict between ADMB and Rtools on Windows systems
From: Rubén Roa
Sent: Thursday, 17 November 2011, 9:53
To: 'us...@admb-project.org'
Subject: Reporting a conflict between ADMB and Rtools on Windows systems

Hi, I have to work under Windows; it's a company policy. I've just found that there is a conflict between the tools used to build R packages (Rtools) and ADMB, due to the need to put the location of the Rtools compiler in the PATH environment variable to make Rtools work.

On a Windows 7 64-bit machine with Rtools installed, I installed the latest ADMB-IDE version, and although I could translate ADMB code to cpp code, I could not build the cpp code into an executable via ADMB-IDE's compiler. On another machine, a Windows Vista 32-bit with Rtools installed, I also installed the latest ADMB-IDE, and this time it was not possible to create the .obj file on the way to building the executable with ADMB-IDE. On this machine I also have a previous ADMB version (6.0.1) that I used to run from the DOS shell. This ADMB also failed to build the .obj file.

Now, in PATH, the location info that makes Rtools work is:

c:\Rtools\bin;c:\Rtools\MinGW\bin;c:\Rtools\MinGW64\bin;C:\Program Files (x86)\MiKTeX 2.9\miktex\bin;

If from this list I remove the reference to the compiler, c:\Rtools\MinGW\bin, then ADMB works again. So beware of this conflict. Suggestions for a solution will be appreciated. Meanwhile, I run ADMB code on one computer and build R packages with Rtools on another computer.

Best,
Ruben

-- Dr. Ruben H. Roa-Ureta, Senior Researcher, AZTI Tecnalia, Marine Research Division, Txatxarramendi Ugartea z/g, 48395, Sukarrieta, Bizkaia, Spain
Re: [R] RV: Reporting a conflict between ADMB and Rtools on Windows systems
I assume you use a command window to build your packages. One possible solution might be to leave the path variables set for Rtools out of your global PATH and to create a separate shortcut to cmd for building R packages, where you set the path as needed by R CMD build/check. Something like:

cmd /K PATH c:\Rtools\bin;c:\Rtools\MinGW\bin;c:\Rtools\MinGW64\bin;C:\Program Files (x86)\MiKTeX 2.9\miktex\bin

(I haven't tried this, so it might need some tinkering to get it to actually work.)

HTH,
Jan

On 17-11-2011 9:54, Rubén Roa wrote:

From: Rubén Roa
Sent: Thursday, 17 November 2011, 9:53
To: 'us...@admb-project.org'
Subject: Reporting a conflict between ADMB and Rtools on Windows systems

Hi, I have to work under Windows; it's a company policy. I've just found that there is a conflict between the tools used to build R packages (Rtools) and ADMB, due to the need to put the location of the Rtools compiler in the PATH environment variable to make Rtools work. On a Windows 7 64-bit machine with Rtools installed, I installed the latest ADMB-IDE version, and although I could translate ADMB code to cpp code, I could not build the cpp code into an executable via ADMB-IDE's compiler. On another machine, a Windows Vista 32-bit with Rtools installed, I also installed the latest ADMB-IDE, and this time it was not possible to create the .obj file on the way to building the executable with ADMB-IDE. On this machine I also have a previous ADMB version (6.0.1) that I used to run from the DOS shell. This ADMB also failed to build the .obj file. Now, in PATH, the location info that makes Rtools work is:

c:\Rtools\bin;c:\Rtools\MinGW\bin;c:\Rtools\MinGW64\bin;C:\Program Files (x86)\MiKTeX 2.9\miktex\bin;

If from this list I remove the reference to the compiler, c:\Rtools\MinGW\bin, then ADMB works again. So beware of this conflict. Suggestions for a solution will be appreciated. Meanwhile, I run ADMB code on one computer and build R packages with Rtools on another computer.

Best,
Ruben

-- Dr. Ruben H. Roa-Ureta, Senior Researcher, AZTI Tecnalia, Marine Research Division, Txatxarramendi Ugartea z/g, 48395, Sukarrieta, Bizkaia, Spain
Re: [R] split list of characters in groups of 2
Hi: Here's one way:

apply(matrix(var.names, ncol = 2, byrow = TRUE), 1,
      function(x) paste(x[1], x[2], sep = ','))
[1] "a,b" "c,d" "e,f"

HTH,
Dennis

On Wed, Nov 16, 2011 at 9:46 PM, B77S bps0...@auburn.edu wrote:

hi, If I have a list of things, like this

var.names <- c("a", "b", "c", "d", "e", "f")

how can I get this:

"a,b" "c,d" "e,f"

thanks ahead of time.

-- View this message in context: http://r.789695.n4.nabble.com/split-list-of-characters-in-groups-of-2-tp4079031p4079031.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] permutation within rows of a matrix
On Wed, 2011-11-16 at 14:55 -0800, Peter Ehlers wrote:

I must be missing something. What's wrong with

t(apply(mat, 1, sample))

?

Only missing that I am either[*] (i) stupid, (ii) being too clever, or (iii) down on my coffee intake for the day.

G

[*] delete as applicable any that don't apply. ;-)

Peter Ehlers

On 2011-11-16 12:12, Gavin Simpson wrote:

On Wed, 2011-11-16 at 14:29 -0500, R. Michael Weylandt wrote:

Suppose your matrix is called X.

?sample
X[sample(nrow(X)), ]

That will shuffle the rows at random, not permute within the rows. Here is an alternative, first using one of my packages (permute - shameful promotion ;-) !):

mat <- matrix(sample(0:1, 100, replace = TRUE), ncol = 10)
require(permute)
perms <- shuffleSet(10, nset = 10)
## permute mat
t(sapply(seq_len(nrow(perms)),
         function(i, perms, mat) mat[i, perms[i, ]],
         mat = mat, perms = perms))

If you don't want to use permute, then you can do this via standard R functions:

perms <- t(replicate(nrow(mat), sample(ncol(mat))))
## permute mat
t(sapply(seq_len(nrow(perms)),
         function(i, perms, mat) mat[i, perms[i, ]],
         mat = mat, perms = perms))

HTH
G

Michael

On Wed, Nov 16, 2011 at 11:45 AM, Juan Antonio Balbuena balbu...@uv.es wrote:

Hello. This is probably a basic question but I am quite new to R. I need to permute elements within rows of a binary matrix, such as

      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    0    0    0    0    1    0    0    0    0     0
 [2,]    0    0    1    1    0    0    0    1    1     0
 [3,]    0    1    0    0    0    0    1    0    0     0
 [4,]    0    0    0    0    0    0    1    1    0     0
 [5,]    0    0    0    1    0    0    0    0    1     0
 [6,]    0    0    1    1    0    0    0    0    0     1
 [7,]    0    0    0    0    0    0    0    0    0     0
 [8,]    1    1    0    1    0    0    0    1    0     1
 [9,]    1    0    0    1    0    1    0    1    0     0
[10,]    0    0    0    0    0    0    0    1    0     1

That is, elements within each row are permuted freely and independently from the other rows. I see that it is workable by creating an array for each row, performing sample, and binding the arrays again, but I wonder whether there is a more efficient way of doing the trick. Any help will be much appreciated.
-- View this message in context: http://r.789695.n4.nabble.com/permutation-within-rows-of-a-matrix-tp4076989p4076989.html
Sent from the R help mailing list archive at Nabble.com.

--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Dr. Gavin Simpson            [t] +44 (0)20 7679 0522
ECRC, UCL Geography,         [f] +44 (0)20 7679 0565
Pearson Building,            [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London         [w] http://www.ucl.ac.uk/~ucfagls/
UK. WC1E 6BT.                [w] http://www.freshwaters.org.uk
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Re: [R] Getting unique colours
On Wednesday, 16 November 2011 at 20:02 -0800, Quercus wrote:

Hey everyone, I am new to R, and I'm making a scatter plot where I have a bunch of points that fall into 9 unique categories. I want each category to have a unique colour; however, with the coding I have (below), the colour black is repeated for two of my plot types. Does anyone know a quick way to get 9 unique colours?

Coding:

plotba = plot(predictedba ~ actualba, col=as.numeric(ecosite),
              pch=19, cex=1.5,
              ylab="Predicted Basal Area (m2/ha-1)",
              xlab="Actual Basal Area (m2/ha-1)")

Thanks!

You can use col=rainbow(9). For more choice of colour palettes, also see the RColorBrewer package.

Here's what it looks like currently: http://r.789695.n4.nabble.com/file/n4078889/predicted_height.jpeg

Sorry, the link doesn't work.

Regards
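A minimal sketch of the suggested fix (untested without the data; predictedba, actualba, and ecosite are the poster's objects, and ecosite is assumed to be a factor with 9 levels): build the palette once and index it by the factor so each level gets its own colour.

```r
pal <- rainbow(9)                      # or RColorBrewer::brewer.pal(9, "Set1")
plot(predictedba ~ actualba,
     col = pal[as.numeric(ecosite)],   # one colour per ecosite level
     pch = 19, cex = 1.5,
     ylab = "Predicted Basal Area (m2/ha-1)",
     xlab = "Actual Basal Area (m2/ha-1)")
legend("topleft", legend = levels(ecosite), col = pal, pch = 19)
```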
Re: [R] split list of characters in groups of 2
On 17.11.2011 10:31, Dennis Murphy wrote:

Hi: Here's one way:

apply(matrix(var.names, ncol = 2, byrow = TRUE), 1,
      function(x) paste(x[1], x[2], sep = ','))
[1] "a,b" "c,d" "e,f"

Or, for short, and slightly faster for huge data, use column-wise operations as in:

apply(matrix(var.names, nrow = 2), 2, paste, collapse = ",")

Best,
Uwe Ligges

HTH,
Dennis

On Wed, Nov 16, 2011 at 9:46 PM, B77S bps0...@auburn.edu wrote:

hi, If I have a list of things, like this

var.names <- c("a", "b", "c", "d", "e", "f")

how can I get this:

"a,b" "c,d" "e,f"

thanks ahead of time.

-- View this message in context: http://r.789695.n4.nabble.com/split-list-of-characters-in-groups-of-2-tp4079031p4079031.html
Sent from the R help mailing list archive at Nabble.com.
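For completeness, a fully vectorised base-R sketch that avoids apply() altogether (assuming, as in the thread, that var.names has even length): recycle logical indices to pull the odd and even elements, then let paste() pair them up.

```r
var.names <- c("a", "b", "c", "d", "e", "f")
odd  <- var.names[c(TRUE, FALSE)]   # elements 1, 3, 5, ...
even <- var.names[c(FALSE, TRUE)]   # elements 2, 4, 6, ...
paste(odd, even, sep = ",")
## [1] "a,b" "c,d" "e,f"
```

Because paste() is vectorised, this scales to long vectors without an explicit loop.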
Re: [R] Spatial Statistics using R
Take a look at http://geodacenter.asu.edu/ , Training section.

On Thu, Nov 17, 2011 at 4:28 AM, vioravis viora...@gmail.com wrote:

I am looking for online courses to learn spatial statistics using R. Statistics.com is offering an online course in December on the same topic, but that schedule doesn't suit mine. Are there any other similar modes for learning spatial statistics using R? Can someone please advise?

Thank you.
Ravi

-- View this message in context: http://r.789695.n4.nabble.com/Spatial-Statistics-using-R-tp4079092p4079092.html
Sent from the R help mailing list archive at Nabble.com.

--
Best regards,
Raphael Saldanha
saldanha.plan...@gmail.com
[R] Exclude NA while summing
Dear R users, I am new to R and have a query. I have a dataset with binary outputs, 0s and 1s, but along with them it has NAs too. I want to sum across each row and get the total, but whenever there is an NA in a row, the sum of that row is returned as NA, so I am not able to sum up the values.

row.sums.m <- apply(dummy.curr.res.m, 1, sum)

It would be helpful if I could get some input on this.

Regards,
Vikram
Re: [R] Introducing \n's so that par.strip.text can produce multiline strips in lattice
Hi: This worked for me - I needed to modify some of the strip labels to improve the appearance a bit, and I also reduced the strip font size a bit to accommodate the lengths of the strings. The main thing was to change \\n to \n. First, I created a new variable called Indic as a character variable and then did some minor surgery on three of the strings:

Indic <- as.character(imports$Indicator)
Indic[3 + 6 * (0:5)] <- "Chemicals and related\n products imports"
Indic[4 + 6 * (0:5)] <- "Pearls, semiprecious \nprecious stones imports"
Indic[5 + 6 * (0:5)] <- "Metaliferrous ores \nmetal scrap imports"

# Read Indic into the imports data frame as a factor:
imports$Indic <- factor(Indic)

# Redo the plot:
barchart(X03/1000 ~ time | Indic,
         data = imports[which(imports$time != 1), ],
         horiz = FALSE,
         scales = list(x = list(rot = 45, labels = paste("Mar", 2007:2011))),
         par.strip.text = list(lineheight = 1, lines = 2, cex = 0.8))

Dennis

On Wed, Nov 16, 2011 at 11:25 PM, Ashim Kapoor ashimkap...@gmail.com wrote:

Dear all, I have the following data, which has \\n in place of \n. I introduced \n's in the csv file so that I could use them in barchart in lattice. When I did that and read it into R using read.csv, it read them as \\n. My question is: how do I introduce \n in the middle of a long string of quoted text so that lattice can make multiline strips? Hitting Enter, which is supposed to introduce \n's, doesn't work, because when I go to the middle of the line and press Enter, OpenOffice thinks that I am done editing my text and takes me to the next line.
dput(imports)
structure(list(Indicator = structure(c(5L, 4L, 2L, 12L, 8L, 7L,
5L, 4L, 2L, 12L, 8L, 7L, 5L, 4L, 2L, 12L, 8L, 7L, 5L, 4L, 2L,
12L, 8L, 7L, 5L, 4L, 2L, 12L, 8L, 7L, 5L, 4L, 2L, 12L, 8L, 7L),
.Label = c("", "Chemicals and related\\n products imports",
"Coal export", "Gold imports", "Gold silver imports",
"Iron ore export", "Iron steel imports",
"Metaliferrous ores metal scrap imports", "Mica export",
"Ores minerals\\nexport", "Other ores \\nminerals export",
"Pearls precious \\n semiprecious stones imports",
"Processed minerals\\n export"), class = "factor"),
Units = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("", "Rs.crore"),
class = "factor"),
Expression = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("", "Ival"),
class = "factor"),
time = c(7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9,
10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 1, 1, 1, 1, 1, 1),
X03 = c(66170.46, 65337.72, 62669.86, 33870.17, 36779.35, 27133.25,
71829.14, 67226.04, 75086.89, 29505.61, 31750.99, 32961.26,
104786.39, 95323.8, 134276.63, 76263, 36363.61, 41500.36,
140440.36, 135877.91, 111269.69, 76678.27, 36449.89, 36808.06,
162253.77, 154346.72, 124895.76, 142437.03, 42872.16, 43881.85,
109096.024, 103622.438, 101639.766, 71750.816, 36843.2, 36456.956),
id = c(1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L,
3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L,
1L, 2L, 3L, 4L, 5L, 6L)),
row.names = c("1.7", "2.7", "3.7", "4.7", "5.7", "6.7", "1.8",
"2.8", "3.8", "4.8", "5.8", "6.8", "1.9", "2.9", "3.9", "4.9",
"5.9", "6.9", "1.10", "2.10", "3.10", "4.10", "5.10", "6.10",
"1.11", "2.11", "3.11", "4.11", "5.11", "6.11", "1.1", "2.1",
"3.1", "4.1", "5.1", "6.1"),
.Names = c("Indicator", "Units", "Expression", "time", "X03", "id"),
class = "data.frame",
reshapeLong = structure(list(varying = structure(list(
X03 = c("X03.07", "X03.08", "X03.09", "X03.10", "X03.11", "X03.1")),
.Names = "X03", v.names = "X03", times = c(7, 8, 9, 10, 11, 1)),
v.names = "X03", idvar = "id", timevar = "time"),
.Names = c("varying", "v.names", "idvar", "timevar")))

On which I want to run:

barchart(X03/1000 ~ time | Indicator,
         data = imports[which(imports$time != 1), ],
         horiz = FALSE,
         scales = list(x = list(rot = 45, labels = paste("Mar", 2007:2011))),
         par.strip.text = list(lineheight = 1, lines = 2))

Many thanks,
Ashim.
[R] aov how to get the SST?
Hello, I currently run aov in the following way:

throughput.aov <- aov(log(Throughput) ~ No_databases + Partitioning + No_middlewares + Queue_size, data = throughput)
summary(throughput.aov)
                Df Sum Sq Mean Sq  F value    Pr(>F)
No_databases     1 184.68 184.675 136.6945   2.2e-16 ***
Partitioning     1  70.16  70.161  51.9321 2.516e-12 ***
No_middlewares   2  44.22  22.110  16.3654 1.395e-07 ***
Queue_size       1   0.40   0.395   0.2926    0.5888
Residuals      440 594.44   1.351
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

In order to compute the fraction of variation, I need to know the total Sum Sq, and I assume it is like this:

SST = SS-No_databases + SS-Partitioning + SS-No_middlewares + SS-Queue_size
    = 184.68 + 70.16 + 44.22 + 0.40 = 299.46

So the fraction of variation explained by No_databases would be:

SS-No_databases / SST = 184.68 / 299.46 = 0.6167101

... and finally I can say that No_databases explains 61.7% of the variation in Throughput. Is this correct? If so, how can I do the same calculations using R? I haven't found a way to extract the Sum Sq out of the throughput.aov object. Is there a function to get the 0.6167101 and 61.7% results without having to do it manually? Even better if I can get a table containing all these fractions of variation. Since this is a 2^k experiment, I can't see how the Residuals fit into the formula. When I introduce replications (blocking factor), then I can also include an SSE term in the SST calculation.

TIA, Best regards, Giovanni
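No reply appears in this chunk, but as a hedged sketch (assuming the fitted object throughput.aov from above), the sums of squares can be pulled out of the first element of summary(), which is a data frame whose rows are the model terms plus Residuals:

```r
tab <- summary(throughput.aov)[[1]]   # the ANOVA table as a data frame
ss  <- tab[["Sum Sq"]]
names(ss) <- rownames(tab)
# fractions of the total; whether Residuals belongs in the denominator
# is the modelling question raised above, so both versions are shown
round(ss / sum(ss), 4)                             # including Residuals
round(ss[-length(ss)] / sum(ss[-length(ss)]), 4)   # model terms only
```

With the numbers quoted in the question, the terms-only version reproduces the manual 184.68/299.46 calculation for No_databases.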
Re: [R] Exclude NA while summing
Change to:

row.sums.m <- apply(dummy.curr.res.m, 1, sum, na.rm = TRUE)

Sent from my iPad

On Nov 17, 2011, at 5:18, Vikram Bahure economics.vik...@gmail.com wrote:

Dear R users, I am new to R and have a query. I have a dataset with binary outputs, 0s and 1s, but along with them it has NAs too. I want to sum across each row and get the total, but whenever there is an NA in a row, the sum of that row is returned as NA, so I am not able to sum up the values.

row.sums.m <- apply(dummy.curr.res.m, 1, sum)

It would be helpful if I could get some input on this.

Regards,
Vikram
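The same idea as a small self-contained sketch (the matrix m here is made up for illustration); note that rowSums() and colSums() also accept na.rm and are faster than apply() for plain sums:

```r
m <- matrix(c(1, 0, NA,
              0, 1, 1), nrow = 2, byrow = TRUE)
rowSums(m, na.rm = TRUE)   # per-row totals, NAs ignored: 1 2
colSums(m, na.rm = TRUE)   # per-column totals, NAs ignored: 1 1 1
```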
[R] How to resample one per group
Hello, I have got a data frame which looks like:

y <- c(1, 5, 6, 2, 5, 10)       # response
x <- c(2, 12, 8, 1, 16, 17)     # predictor
group <- factor(c(1, 2, 2, 3, 4, 4)) # group
df <- data.frame(y, x, group)

Now I'd like to resample that dataset: I want to get one row per group, so per total sample I get 4 rows into a new data frame. How can I do that? Is there any simple approach using an existing package? I looked at the function strata() from package sampling. I don't know if that is the right function, or if there is a simpler approach with sample(). What I unsuccessfully tried so far:

library(sampling)
strata(data = df, "group", size = rep(1, nlevels(group)))

Maybe you can help me to do this resampling...

Thank you,
Johannes
Re: [R] How to resample one per group
Something like this?

library(plyr)
ddply(df, .(group), function(x) {
    x[sample(nrow(x), 1), ]
})

Best regards,
Thierry

ir. Thierry Onkelinx, Instituut voor natuur- en bosonderzoek / Research Institute for Nature and Forest, team Biometrics & Quality Assurance, Gaverstraat 4, 9500 Geraardsbergen, Belgium, tel. +32 54/436 185, thierry.onkel...@inbo.be, www.inbo.be

To call in the statistician after the experiment is done may be no more than asking him to perform a post-mortem examination: he may be able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not ensure that a reasonable answer can be extracted from a given body of data. ~ John Tukey

-----Original message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Johannes Radinger
Sent: Thursday, 17 November 2011 12:37
To: r-help@r-project.org
Subject: [R] How to resample one per group

Hello, I have got a data frame which looks like:

y <- c(1, 5, 6, 2, 5, 10)       # response
x <- c(2, 12, 8, 1, 16, 17)     # predictor
group <- factor(c(1, 2, 2, 3, 4, 4)) # group
df <- data.frame(y, x, group)

Now I'd like to resample that dataset: I want to get one row per group, so per total sample I get 4 rows into a new data frame. How can I do that? Is there any simple approach using an existing package? I looked at the function strata() from package sampling. I don't know if that is the right function, or if there is a simpler approach with sample(). What I unsuccessfully tried so far:

library(sampling)
strata(data = df, "group", size = rep(1, nlevels(group)))

Maybe you can help me to do this resampling...
Thank you,
Johannes
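For reference, the same one-row-per-group draw can also be done in base R without extra packages (a sketch using the df defined in the question; the set.seed() call is added here only to make the draw reproducible):

```r
set.seed(1)  # reproducible draw; drop this for a fresh sample each time
one.per.group <- do.call(rbind,
    lapply(split(df, df$group),
           function(d) d[sample(nrow(d), 1), ]))
one.per.group  # four rows, one from each level of group
```

split() partitions the data frame by the grouping factor, lapply() draws one row from each piece, and do.call(rbind, ...) stacks the pieces back into a single data frame.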
Re: [R] Reading data/variables
Well, if your problem is that a workspace is being loaded automatically and you don't want that workspace, you have several options:

1. Use a different directory for each project, so that the file loaded by default is the correct one.
2. Don't save your workspace, but regenerate it each time.
3. Use R --vanilla (or your OS's equivalent) to start R without loading anything automatically, and use load() and save() to manually manage RData files.

Yes, it's convenient, but if you want to use a non-standard way of working you need to understand what you're doing.

Sarah

On Thu, Nov 17, 2011 at 3:10 AM, Steven Yen s...@utk.edu wrote:

Thanks Sarah. I have read about the problems with attach(), and I will try to avoid it. I have now found the line that's causing the problem is:

setwd("z:/homework")

With that line in place, either in a program or in Rprofile.site (?), the moment I run R and simply enter (before reading any data)

summary(mydata)

I get sample statistics for a dozen variables! Do not save the workspace? I thought the option to save/use a binary file was meant to be convenient. I like working in the same working directory, and I like .RData files. Does this sound hopeless? Thanks.

At 09:26 PM 11/15/2011, Sarah Goslee wrote:

Hi,

The obvious answer is: don't use attach() and you'll never have that problem. And see further comments inline.

On Tue, Nov 15, 2011 at 6:05 PM, Steven Yen s...@utk.edu wrote:

Can someone help me with this variable/data reading issue? I read a csv file and transform/create an additional variable (called y). The first set of commands below produced different sample statistics for hw11$y and y. In the second set of commands I rename/use the variable name yy, and sample statistics for hw11$yy and yy are identical. Using y <- yy fixed it, but I am not sure why I would need to do that. That y appeared to have come from a variable called y from another data frame (unrelated to the current run). Help!
setwd("z:/homework") sink("z:/homework/hw11.our", append=T, split=T) hw11 <- read.csv("ij10b.csv", header=T) hw11$y <- hw11$e3 attach(hw11) The following object(s) are masked _by_ '.GlobalEnv': y Look there. R even *told* you that it was going to use the y in the global environment rather than the one you were trying to attach. The other solution: don't save your workspace. Your other email on this topic suggested to me that there is a .RData file in your preferred working directory that contains an object y, and that's what is interfering with what you think should happen. Deleting that file, or using a different directory, or removing y before you attach the data frame would all work. But truly, the best possible strategy is to avoid using attach() so you don't have to worry about which object named y is really being used, because you specify it explicitly. (n <- dim(hw11)[1]) [1] 13765 summary(hw11$y) Min. 1st Qu. Median Mean 3rd Qu. Max. 0. 0.4500 1. 1.6726 2. 140. length(hw11$y) [1] 13765 summary(y) Min. 1st Qu. Median Mean 3rd Qu. Max. 0.0 0.0 0.0 0.24958 0.0 1.0 length(y) [1] 601 setwd("z:/homework") sink("z:/homework/hw11.our", append=T, split=T) hw11 <- read.csv("ij10b.csv", header=T) hw11$yy <- hw11$e3 attach(hw11) hw11$yy <- hw11$e3 summary(hw11$yy) Min. 1st Qu. Median Mean 3rd Qu. Max. 0. 0.4500 1. 1.6726 2. 140. length(hw11$yy) [1] 13765 summary(yy) Min. 1st Qu. Median Mean 3rd Qu. Max. 0. 0.4500 1. 1.6726 2. 140. length(yy) [1] 13765
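The masking problem above is easy to reproduce. This is a minimal sketch with made-up data (not the poster's files): an object named y already sitting in the global workspace -- for example, one silently restored from a saved .RData -- shadows the column you attach().

```r
# A small reproducible illustration of attach() masking (made-up data):
y <- rnorm(5)                  # pretend this was silently loaded from .RData
hw <- data.frame(y = 1:10)

attach(hw)                     # R warns: 'y' is masked _by_ .GlobalEnv
length(y)                      # 5 -- the global y wins, not the 10-row column
detach(hw)

# Safer alternatives that never depend on the search path:
length(hw$y)                   # 10
with(hw, length(y))            # 10
```

Being explicit with `hw$y` or `with()` sidesteps the search-path lookup entirely, which is why the advice in this thread is to avoid attach().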
[R] how to define the bound between parameters in nls()
Hi there, I have read the help page of nls(); there are lower and upper arguments for defining the bounds of parameters. For example, nls(y ~ 1 - a*exp(-k1*x) - (1-a)*exp(-k2*x), data=data.1, start=list(a=0.02, k1=0.01, k2=0.0004), upper=c(1,1,1), lower=c(0,0,0)) I hope to define k1 < k2, but I can't find a way. Any suggestions will be really appreciated. Regards, Jinsong
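One common workaround for an ordering constraint like k1 < k2 is to reparameterize rather than bound: write k2 = k1 + exp(d), which guarantees k2 > k1 for any unconstrained d. (Note also that nls() honours lower/upper only with algorithm = "port".) The sketch below uses synthetic data, not the poster's data.1, so treat the starting values as illustrative:

```r
# Reparameterize k2 = k1 + exp(d) so that k1 < k2 holds automatically.
set.seed(1)
x <- seq(0, 400, by = 4)
y <- 1 - 0.3 * exp(-0.05 * x) - 0.7 * exp(-0.002 * x) +
     rnorm(length(x), sd = 0.005)
dat <- data.frame(x = x, y = y)

fit <- nls(y ~ 1 - a * exp(-k1 * x) - (1 - a) * exp(-(k1 + exp(d)) * x),
           data = dat,
           start = list(a = 0.7, k1 = 0.002, d = log(0.048)))

est <- coef(fit)
k1.hat <- est[["k1"]]
k2.hat <- k1.hat + exp(est[["d"]])   # always larger than k1.hat
```

After fitting, report k1.hat and k2.hat rather than the raw d; standard errors for k2 would need the delta method or refitting.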
[R] how to read the text ?
hi, R users: I have such a text: num = 3 testco = 12 testno = 1;12;3 infp = test1;test2;test3 How can I read this text by readLines?
Re: [R] RV: Reporting a conflict between ADMB and Rtools on Windows systems
On Thu, Nov 17, 2011 at 3:54 AM, Rubén Roa r...@azti.es wrote: I've just found that there is a conflict between the tools used to build R packages (Rtools) and ADMB, due to the need to put the Rtools compiler's location in the PATH environment variable to make Rtools work. On a Windows 7 64-bit machine with Rtools installed, I installed the latest ADMB-IDE, and although I could translate ADMB code to cpp code, I could not build the cpp code into an executable via ADMB-IDE's compiler. On another machine, a Windows Vista 32-bit with Rtools installed, I also installed the latest ADMB-IDE, and this time it was not possible to create the .obj file on the way to building the executable. On this machine I also have a previous ADMB version (6.0.1) that I used to run from the DOS shell. This ADMB also failed to build the .obj file. Now, in PATH, the location info that makes Rtools work is: c:\Rtools\bin;c:\Rtools\MinGW\bin;c:\Rtools\MinGW64\bin;C:\Program Files (x86)\MiKTeX 2.9\miktex\bin; If from this list I remove the reference to the compiler, c:\Rtools\MinGW\bin, then ADMB works again. So beware of this conflict. Suggestions for a solution will be appreciated. Meanwhile, I run ADMB code on one computer and build R packages with Rtools on another. The batchfiles Rcmd.bat and Rgui.bat temporarily add R and Rtools to your path by looking them up in the registry and then calling Rcmd.exe or Rgui.exe, respectively. When R is finished, the path is restored to what it was before. By using those, it's not necessary to have either on your path. These are self-contained batch files with no dependencies, so they can simply be placed anywhere on the path in order to use them. For those and a few other batch files of interest to Windows users of R, see: http://batchfiles.googlecode.com -- Statistics Software Consulting GKX Group, GKX Associates Inc. 
tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com
Re: [R] Spatial Statistics using R
Thanks, Raphael. Just checked their website. It appears that they currently do not have any online courses planned.
Re: [R] Vectorizing for weighted distance
The fastest is probably to just implement the matrix calculation directly in R with the %*% operator: (X1-X2) %*% W %*% (X1-X2) You don't need to worry about the transposing if you are passing R vectors X1, X2. If they are 1-d matrices, you might need to. Michael On Thu, Nov 17, 2011 at 1:30 AM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote: Hi All, I am trying to convert the following piece of MATLAB code to R: XX1 = sum(w(:,ones(1,N1)).*X1.*X1,1); % square the elements of X1, weight them, and repeat this vector N1 times XX2 = sum(w(:,ones(1,N2)).*X2.*X2,1); % square the elements of X2, weight, and repeat this vector N2 times X1X2 = (w(:,ones(1,N1)).*X1)'*X2; % get the weighted 'covariance' term XX1T = XX1'; % transpose z = XX1T(:,ones(1,N2)) + XX2(ones(1,N1),:) - 2*X1X2; % get the squared weighted distance which is basically doing: z = (X1-X2)' W (X1-X2) What would be the best way (for SPEED) to do this? Or is vectorizing as above the best? Any hints, suggestions? Thanks, Sachin
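The MATLAB expansion above translates directly to R. A sketch with small made-up matrices (as in the MATLAB code, columns are observations and w holds one weight per dimension; the size names d, N1, N2 are illustrative):

```r
# Vectorized pairwise weighted squared distances via the expansion
# ||x1 - x2||^2_W = x1'Wx1 + x2'Wx2 - 2 x1'Wx2 for diagonal W = diag(w).
set.seed(42)
d <- 4; N1 <- 3; N2 <- 5
X1 <- matrix(rnorm(d * N1), d, N1)
X2 <- matrix(rnorm(d * N2), d, N2)
w  <- runif(d)

XX1  <- colSums(w * X1^2)                  # weighted squared norms, X1 columns
XX2  <- colSums(w * X2^2)                  # weighted squared norms, X2 columns
X1X2 <- t(w * X1) %*% X2                   # weighted cross terms
z    <- outer(XX1, XX2, "+") - 2 * X1X2    # z[i,j] = (X1[,i]-X2[,j])' W (X1[,i]-X2[,j])

# Spot-check one entry against the direct formula:
all.equal(z[2, 3], sum(w * (X1[, 2] - X2[, 3])^2))   # TRUE
```

Because `w` has length `nrow(X1)`, the product `w * X1` recycles the weights down each column, which is exactly the `w(:,ones(1,N1)).*X1` trick in the MATLAB original.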
Re: [R] Exclude NA while summing
Or, for this specific application: rowSums(XXX, na.rm = TRUE) Michael On Thu, Nov 17, 2011 at 5:51 AM, Jim Holtman jholt...@gmail.com wrote: Change to row.sums.m <- apply(dummy.curr.res.m, 1, sum, na.rm = TRUE) Sent from my iPad On Nov 17, 2011, at 5:18, Vikram Bahure economics.vik...@gmail.com wrote: Dear R users, I am new to R and have a query. I have a dataset with binary outputs, 0s and 1s, but along with them it has NAs too. I want to sum across each row, but whenever there is an NA in a row the sum of that row is returned as NA, so I am not able to sum up the values: row.sums.m <- apply(dummy.curr.res.m, 1, sum) It would be helpful if I could get some input on this. Regards Vikram
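A quick illustration of the na.rm behaviour with a made-up matrix:

```r
# One NA poisons a plain sum; na.rm = TRUE drops NAs before summing.
m <- matrix(c(1, NA, 3,
              4,  5, NA), nrow = 2, byrow = TRUE)
rowSums(m)                        # NA NA
rowSums(m, na.rm = TRUE)          # 4 9
apply(m, 1, sum, na.rm = TRUE)    # same answer; rowSums is just faster
```

rowSums() is implemented in C and avoids the per-row function-call overhead of apply(), which matters on large matrices.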
Re: [R] RV: Reporting a conflict between ADMB and Rtools on Windows systems
Thanks Gabor and Jan. The batch files solution seems the way to go. Will implement it! Rubén
Re: [R] Pairwise correlation
On Wed, Nov 16, 2011 at 11:22 PM, muzz56 musah...@gmail.com wrote: Thanks to everyone who replied to my post, I finally got it to work. I am however not sure how well it worked, since it ran so quickly, but it seems like I have a 2000 x 2000 data set. Behold the great and mighty power that is R! Don't worry -- on a decent machine the correlation of a 2k x 2k data set should be pretty fast. (It's about 9 seconds on my old-ish laptop with a bunch of other junk running.) My followup questions would be: how do I get only pairs with, say, a certain Pearson correlation value? Additionally, it seems like my output didn't retain the headers but instead replaced them with numbers, making it hard to know which gene pairs correlate. This is a little worrisome: R carries column names through cor(), so this would suggest you weren't using them. Were your headers listed as part of your data (instead of being names)? If so, they would have been taken as numbers. Take a look at dimnames(NAMEOFDATA) -- if your headers aren't there, then they are being treated as data instead of names. If they are, can you provide some reproducible code and we can debug more fully? The easiest way to send data is to use the dput() function to get a copy-pasteable plain text representation. It would also be great if you could restrict it to a subset of your data rather than the full 4M data points, but if that's hard to do, don't worry. You should have expected behavior like X <- matrix(1:9, 3) colnames(X) <- c("A", "B", "C") cor(X) # prints with labels Michael On 16 November 2011 17:11, Nordlund, Dan (DSHS/RDA) wrote: -----Original Message----- From: r-help-bounces@r-project.org On Behalf Of muzz56 Sent: Wednesday, November 16, 2011 12:28 PM Subject: Re: [R] Pairwise correlation Thanks Peter. 
I tried this after reading in the csv (read.csv) and converting the data to a matrix (as.matrix). But when I try the correlation, I keep getting the error (x must be numeric), yet when I view the data, it's numeric. What does R tell you if you execute the following? str(x) Just because the data looks like it is numeric when it prints doesn't mean it is. Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204
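The follow-up question -- extracting only the pairs above some correlation threshold -- was not answered in the thread. A sketch with random stand-in data and an arbitrary cutoff; upper.tri() ensures each pair is reported only once:

```r
# List variable pairs whose |correlation| exceeds a cutoff.
set.seed(1)
X <- matrix(rnorm(100 * 6), 100, 6)
colnames(X) <- paste0("gene", 1:6)   # hypothetical labels
cm <- cor(X)

cutoff <- 0.1
idx <- which(abs(cm) > cutoff & upper.tri(cm), arr.ind = TRUE)
pairs <- data.frame(var1 = rownames(cm)[idx[, 1]],
                    var2 = colnames(cm)[idx[, 2]],
                    r    = cm[idx],
                    stringsAsFactors = FALSE)
pairs   # one row per pair with |r| above the cutoff
```

For a 2000 x 2000 correlation matrix the same code works unchanged; only the cutoff and the size of `idx` differ.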
[R] R2 for a mixed-effects model with AR(1) error structure
Dear All, The following equation is a linear mixed-effects model with a linear trend and AR(1) error structure: y = B0 + B1*x + b0 + b1*x + e; e ~ AR(1) where y is the response, x is the predictor, B0 and B1 are fixed effects, and b0 and b1 are random effects. Could someone please advise me of a function to compute the R2 for the goodness of fit of the above model in an R package? And is the computation of R2 in a mixed model with AR(1) error structure similar to that in a mixed model with no error structure? thanks, Fir
Re: [R] Spatial Statistics using R
vioravis vioravis at gmail.com writes: Thanks, Raphael. Just checked their website. It appears that they currently do not have any online courses planned. You may find that this site: http://geostat-course.org/ has a wider range of possible courses.
Re: [R] Spatial Statistics using R
Hi Ravi, You would probably get more answers to this if you posted to the list r-sig-geo. The following course was advertised a week ago and might match your needs: http://www.itc.nl/personal/rossiter/teach/degeostats.html You might also find the videos from this year's GEOSTAT course in Landau interesting: http://www.archive.org/search.php?query=GEOSTAT%20Landau Cheers, Jon On 17-Nov-11 7:28, vioravis wrote: I am looking for online courses to learn Spatial Statistics using R. Statistics.com is offering an online course in December on the same topic, but that schedule doesn't suit mine. Are there any other similar modes for learning spatial statistics using R? Can someone please advise? Thank you. Ravi -- Jon Olav Skøien Joint Research Centre - European Commission Institute for Environment and Sustainability (IES) Global Environment Monitoring Unit Via Fermi 2749, TP 440, I-21027 Ispra (VA), ITALY jon.sko...@jrc.ec.europa.eu Tel: +39 0332 789206 Disclaimer: Views expressed in this email are those of the individual and do not necessarily represent official views of the European Commission.
Re: [R] Spatial Statistics using R
On 11/17/2011 06:28 AM, vioravis wrote: I am looking for online courses to learn Spatial Statistics using R. There is an online course by the ITC: http://www.itc.nl/Pub/Study/Courses/C12-GFM-DE-02 cheers, Paul -- Paul Hiemstra, Ph.D. Global Climate Division Royal Netherlands Meteorological Institute (KNMI) Wilhelminalaan 10 | 3732 GK | De Bilt | Kamer B 3.39 P.O. Box 201 | 3730 AE | De Bilt tel: +31 30 2206 494 http://intamap.geo.uu.nl/~paul http://nl.linkedin.com/pub/paul-hiemstra/20/30b/770
Re: [R] Pairwise correlation
I think something like this should do it, but I can't test without data: rownames(mydata) <- mydata[, 1] # put the elements of the first column into the rownames mydata <- mydata[, -1] # drop the column that now holds the rownames Michael On Thu, Nov 17, 2011 at 9:23 AM, Musa Hassan musah...@gmail.com wrote: Hi Michael, Thanks for the response. I have noticed that the error occurred during my data read. It appears that the rownames (which, when the data is transposed, become my colnames) were converted to numbers instead of strings, as they should be. The original header names don't change, just the rownames. I have to figure out how to import the data and have the strings not converted. Right now I am using: mydata <- read.csv("mydata.csv", header = TRUE, stringsAsFactors = FALSE) then to convert the data frame to a matrix: mydata <- data.matrix(mydata) Then I just do the correlation as Peter suggested: expression <- cor(t(expression)) Thanks.
[R] how to read a freetext line ?
hi everyone. Here I have a text where there are some integer and string variables, but I cannot read them by readLines and scan. The text is: weight ;30;130 food:2;1;12 color:white;black The first column is the names of the variables and the others are their values. The columns in different lines are different. Can anyone help me? -- TANG Jie Email: totang...@gmail.com Tel: 0086-2154896104 Shanghai Typhoon Institute, China
Re: [R] how to read a freetext line ?
Hi, On Thu, Nov 17, 2011 at 9:37 AM, Jie TANG totang...@gmail.com wrote: hi everyone. Here I have a text where there are some integer and string variables, but I cannot read them by readLines and scan. I've seen this question several times this morning. If that's you, please do not post multiple times. If you haven't gotten an answer in a couple of days, then it's okay to ask again, but the trouble is usually with your question, like here. the text is: weight ;30;130 food:2;1;12 color:white;black the first column is the names of the variables and the others are their values. the columns in different lines are different. Can anyone help me? What have you tried? What format do you need? For instance, reading the lines in as single strings is easy. Using strsplit() to separate each string into several strings is easy. But without knowing what you are trying to achieve, there's really no way to help you beyond suggesting those two functions. Sarah -- Sarah Goslee http://www.functionaldiversity.org
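One hedged way to combine those two functions (the text is inlined here as a stand-in for readLines() on the poster's file, and the first line is assumed to follow the same `name:value;value` pattern as the others):

```r
# Split each line into a name and a value part on ":",
# then split the value part into fields on ";".
txt <- c("weight:30;130",
         "food:2;1;12",
         "color:white;black")       # stand-in for readLines("yourfile.txt")

parts <- strsplit(txt, ":", fixed = TRUE)
vals  <- lapply(parts, function(p) strsplit(p[2], ";", fixed = TRUE)[[1]])
names(vals) <- trimws(vapply(parts, `[`, "", 1))

vals$weight               # "30" "130" -- still character
as.numeric(vals$weight)   # 30 130
```

Everything comes back as character; convert the entries that are really numbers with as.numeric() afterwards, since lines like `color:white;black` cannot be coerced.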
[R] package installation
I believe the problem is a column of zeroes in my x matrix. I have tried the suggestions in the documentation, so now, to try to confirm the problem, I'd like to run debug. Here's where I think the problem is: ###~~ Fitting the model using the lmer function ~~### (fitmodel <- lmer(modelformula, data, family = binomial(link = "logit"), nAGQ = 1)) mtrace(fitmodel) I added the mtrace to catch the error, but get the following: Error in mtrace(fitmodel) : Can't find fitmodel How can I debug this? - Original Message - From: Rolf Turner rolf.tur...@xtra.co.nz To: Scott Raynaud scott.rayn...@yahoo.com Cc: r-help@r-project.org r-help@r-project.org Sent: Wednesday, November 16, 2011 6:04 PM Subject: Re: [R] package installation On 17/11/11 05:37, Scott Raynaud wrote: That might be an option if it weren't my most important predictor. I'm thinking my best bet is to use MLwiN for the estimation since it will properly set fixed effects to 0. All my other sample-size simulation programs use SAS PROC IML, which I don't have/can't afford. I like R since it's free, but I can't work around the problem I'm currently having. This is the ``push every possible button until you get a result and to hell with what anything actually means'' approach to statistics. The probability of getting a *meaningful* result from this approach is close to zero. Why don't you try to *understand* what is going on, rather than wildly throwing every possible piece of software at the problem until one such piece runs? cheers, Rolf Turner
Re: [R] List of lists to data frame?
I don't know if this is faster, but ... out <- do.call(rbind, lapply(s, function(x) data.frame(x$category, x$name, as.vector(x$series)))) ## You can then name the columns of out via names() Note: No fancy additional packages are required. -- Bert On Wed, Nov 16, 2011 at 6:39 PM, Kevin Burton rkevinbur...@charter.net wrote: Say I have the following data: s <- list() s[["A"]] <- list(name="first", series=ts(rnorm(50), frequency=10, start=c(2000,1)), category="top") s[["B"]] <- list(name="second", series=ts(rnorm(60), frequency=10, start=c(2000,2)), category="next") If I use unlist, since this is a list of lists, I don't end up with a data frame. And the number of rows in the data frame should equal the number of time series entries; in the sample above it would be 110. I would expect that the name and category strings would be recycled for each row. My brute-force code attempts to build the data frame by appending to the master data frame, but like I said it is *very* slow. Kevin -----Original Message----- From: R. Michael Weylandt [mailto:michael.weyla...@gmail.com] Sent: Wednesday, November 16, 2011 5:26 PM To: rkevinbur...@charter.net Cc: r-help@r-project.org Subject: Re: [R] List of lists to data frame? unlist(..., recursive = FALSE) Michael On Wed, Nov 16, 2011 at 6:20 PM, rkevinbur...@charter.net wrote: I would like to make the following faster: df <- NULL for(i in 1:length(s)) { df <- rbind(df, cbind(names(s[i]), time(s[[i]]$series), as.vector(s[[i]]$series), s[[i]]$category)) } names(df) <- c("name", "time", "value", "category") return(df) The s object is a list of lists. It is constructed like: s[[object]] <- list(. . . . . .) where object would be the name associated with this list. s[[i]]$series is a 'ts' object and s[[i]]$category is a name. Constructing this list is reasonably fast, but to do some more processing on the data it would be easier if it were converted to a data frame. Right now the above code is unacceptably slow at converting this list of lists to a data frame. 
Any suggestions on how to optimize this are welcome. Thank you. Kevin -- Bert Gunter Genentech Nonclinical Biostatistics Internal Contact Info: Phone: 467-7374 Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
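Putting Bert's suggestion together with the poster's toy data gives a self-contained sketch: build one small data.frame per list element and rbind them all in a single do.call(), instead of growing df inside a loop (which is what makes the original code slow -- each rbind copies everything built so far).

```r
# Convert a list of lists (each holding a 'ts' plus metadata) to one data frame.
s <- list()
s[["A"]] <- list(name = "first",
                 series = ts(rnorm(50), frequency = 10, start = c(2000, 1)),
                 category = "top")
s[["B"]] <- list(name = "second",
                 series = ts(rnorm(60), frequency = 10, start = c(2000, 2)),
                 category = "next")

df <- do.call(rbind, lapply(names(s), function(nm) {
  x <- s[[nm]]
  data.frame(name = nm,
             time = as.numeric(time(x$series)),
             value = as.vector(x$series),
             category = x$category,
             stringsAsFactors = FALSE)
}))
nrow(df)   # 110 -- the scalar columns recycle to each series' length
```

A side benefit over the cbind()-based loop: cbind() on mixed types coerces everything to character, while per-element data.frame() keeps time and value numeric.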
Re: [R] package installation
Why are you trying to take the matrix trace of a regression model? (That's the only hit for mtrace on my system at least) Perhaps you mean to use traceback() or, even more useful, options(error = recover) Michael On Thu, Nov 17, 2011 at 9:49 AM, Scott Raynaud scott.rayn...@yahoo.com wrote: I believe the problem is a column of zeroes in my x matrix. I have tried the suggestions in the documentation, so now to try to confirm the probelm I'd like to run debug. Here's where I think the problem is: ###~~ Fitting the model using lmer funtion ~~### (fitmodel - lmer(modelformula,data,family=binomial(link=logit),nAGQ=1)) mtrace(fitmodel) I added the mtrace to catch the error, but get the following: Error in mtrace(fitmodel) : Can't find fitmodel How can I debug this? - Original Message - From: Rolf Turner rolf.tur...@xtra.co.nz To: Scott Raynaud scott.rayn...@yahoo.com Cc: r-help@r-project.org r-help@r-project.org Sent: Wednesday, November 16, 2011 6:04 PM Subject: Re: [R] package installtion On 17/11/11 05:37, Scott Raynaud wrote: That might be an option if it weren't my most important predictor. I'm thinking my best bet is to use MLWin for the estimation since it will properly set fixed effects to 0. All my other sample size simulation programs use SAS PROC IML which I don't have/can't afford. I like R since it's free, but I can't work around the problem I'm currently having. This is the ``push every possible button until you get a result and to hell with what anything actually means'' approach to statistics. The probability of getting a *meaningful* result from this approach is close to zero. Why don't you try to *understand* what is going on, rather than wildly throwing every possible piece of software at the problem until one such piece runs? 
cheers, Rolf Turner
Re: [R] how to read the text?
See Sarah's reply here: http://www.mail-archive.com/r-help@r-project.org/msg152883.html Michael On Thu, Nov 17, 2011 at 7:54 AM, haohao Tsing haohaor...@gmail.com wrote: Hi, R users: I have such a text:

num = 3
testco = 12
testno = 1;12;3
infp = test1;test2;test3

How can I read this text by readLines?
Re: [R] modelling and R misconceptions; was: package installation
This is hopeless, since you never seem to listen to our advice, therefore this will be my very last try: So you actually need local advice, both for statistical concepts and R-related. No statistics software can estimate effects of variables that you observed to be constant (e.g. 0) all the time. If any software does, please delete it at once from your machine. Instead, ask a local statistician for advice on your problem. You certainly want to show the data and your model to the local expert - since you don't show us. And then you want to ask for a local R course, since reading the documentation seems not to help. Applying mtrace() on a non-existing object shows this straight away. Uwe Ligges On 17.11.2011 15:49, Scott Raynaud wrote: I believe the problem is a column of zeroes in my x matrix. I have tried the suggestions in the documentation, so now, to try to confirm the problem, I'd like to run debug. Here's where I think the problem is:

###~~ Fitting the model using the lmer function ~~###
(fitmodel <- lmer(modelformula, data, family = binomial(link = "logit"), nAGQ = 1))
mtrace(fitmodel)

I added the mtrace to catch the error, but get the following: Error in mtrace(fitmodel) : Can't find fitmodel. How can I debug this? - Original Message - From: Rolf Turner rolf.tur...@xtra.co.nz To: Scott Raynaud scott.rayn...@yahoo.com Cc: r-help@r-project.org Sent: Wednesday, November 16, 2011 6:04 PM Subject: Re: [R] package installation On 17/11/11 05:37, Scott Raynaud wrote: That might be an option if it weren't my most important predictor. I'm thinking my best bet is to use MLwiN for the estimation, since it will properly set fixed effects to 0. All my other sample-size simulation programs use SAS PROC IML, which I don't have/can't afford. I like R since it's free, but I can't work around the problem I'm currently having. This is the ``push every possible button until you get a result and to hell with what anything actually means'' approach to statistics.
The probability of getting a *meaningful* result from this approach is close to zero. Why don't you try to *understand* what is going on, rather than wildly throwing every possible piece of software at the problem until one such piece runs? cheers, Rolf Turner
Re: [R] how to read a freetext line?
Hi, Please copy your replies to r-help so others may participate in the discussion. 2011/11/17 Jie TANG totang...@gmail.com: Yes, I have tried readLines by

config <- readLines(configfile, ok = TRUE, n = -1)

but when strsplit is used as below

food <- unlist(strsplit(config[2], ":"))

here food is a vector, but the value of food in the text, e.g. 12, is still a string "12", not an integer 12. So I have to use strtoi and strtrim, but I cannot tell beforehand whether the value of food is one digit or more (e.g. 1 or 12), so

food <- strtoi(strtrim(strsplit(config[2]), ":")[2], 1))

The number of characters to convert cannot be known beforehand, since the file will be changed by other users. So how can I read the numerical values from my configure file? Thank you. So your problem is with strsplit(), and not with reading in the data? I did it in many steps so that you can see how each bit works:

string1 <- "food:2;1;12"  # one of your lines
string2 <- strsplit(string1, ":")  # separate name from values by :
varname <- string2[[1]][1]
varname
[1] "food"
values <- unlist(strsplit(string2[[1]][2], ";"))  # separate individual values by ;
values
[1] "2"  "1"  "12"
values <- as.numeric(values)  # convert to numbers
values
[1]  2  1 12

2011/11/17 Sarah Goslee sarah.gos...@gmail.com Hi, On Thu, Nov 17, 2011 at 9:37 AM, Jie TANG totang...@gmail.com wrote: Hi everyone. Here I have a text where there are some integer and string variables, but I cannot read them by readLines and scan. I've seen this question several times this morning. If that's you, please do not post multiple times. If you haven't gotten an answer in a couple of days, then it's okay to ask again, but the trouble is usually with your question, like here. The text is:

weight ;30;130
food:2;1;12
color:white;black

The first column holds the names of the variables and the others are their values. The columns in different lines are different. Can anyone help me? What have you tried? What format do you need?
For instance, reading them in as a single string is easy. Using strsplit() to separate that single string into several strings is easy. But without knowing what you are trying to achieve, there's really no way to help you beyond suggesting those two functions.
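Sarah's two functions are enough to turn the whole file into a named list. A minimal sketch, under the assumption that every line has the form name:value;value;... (so the first line is normalized to "weight:30;130", and the lines vector stands in for readLines("config.txt"), a hypothetical file):

```r
# Parse "name:val1;val2;..." lines into a named list,
# converting value fields to numeric where possible.
lines <- c("weight:30;130", "food:2;1;12", "color:white;black")
parts <- strsplit(lines, ":")                 # name vs. values
config <- lapply(parts, function(p) {
  vals <- unlist(strsplit(p[2], ";"))         # individual values
  nums <- suppressWarnings(as.numeric(vals))  # try numeric conversion
  if (anyNA(nums)) vals else nums             # fall back to strings
})
names(config) <- vapply(parts, function(p) p[1], "")
config$food   # numeric 2 1 12
config$color  # character "white" "black"
```

Fields that do not convert cleanly (like the colors) stay as character vectors, so mixed files are handled without errors.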
Re: [R] optim seems to be finding a local minimum
One more thing: trying to defend R's honor, I've run optimx instead of optim (after dividing the IV by its max - same as for optim). I did not use L-BFGS-B with lower bounds anymore. Instead, I've used Nelder-Mead (no bounds). First, it was faster: for a loop across 10 different IVs, BFGS took 6.14 sec and Nelder-Mead took just 3.9 sec. Second, the solution was better - the Nelder-Mead fits were ALL better than the L-BFGS-B fits and ALL better than Excel Solver's solutions. Of course, those were small improvements, but still, it's nice! Dimitri On Mon, Nov 14, 2011 at 5:26 PM, Dimitri Liakhovitski dimitri.liakhovit...@gmail.com wrote: Just to provide some closure: I ended up dividing the IV by its max so that the input vector (IV) is now between zero and one. I still used optim:

myopt <- optim(fn = myfunc, par = c(1, 1), method = "L-BFGS-B", lower = c(0, 0))

I was able to get a great fit; in 3 cases out of 10 I beat Excel Solver, but in 7 cases I lost to Excel - but again, by really tiny margins (generally less than 1% of Excel's fit value). Thank you everybody! Dimitri On Fri, Nov 11, 2011 at 10:28 AM, John C Nash nas...@uottawa.ca wrote: Some tips: 1) Excel did not, as far as I can determine, find a solution. No point seems to satisfy the KKT conditions (there is a function kktc in optfntools on the R-forge project optimizer; it is called by optimx). 2) Scaling of the input vector is a good idea given the seeming wide range of values. That is, assuming this can be done. If the function depends on the relative values in the input vector rather than magnitude, this may explain the trouble with your function. That is, if the function depends on the relative change in the input vector and not its scale, then optimizers will have a lot of trouble if the scale factor for this vector is implicitly one of the optimization parameters.
3) If you can get the gradient function you will almost certainly be able to do better, especially in finding whether you have a minimum, i.e., a null gradient and positive-definite Hessian. When you have a gradient function, kktc uses the Jacobian of the gradient to get the Hessian, avoiding one level of digit cancellation. JN On 11/11/2011 10:20 AM, Dimitri Liakhovitski wrote: Thank you very much to everyone who replied! As I mentioned - I am not a mathematician, so sorry for stupid comments/questions. I intuitively understand what you mean by scaling. While the solution space for the first parameter (.alpha) is relatively compact (probably between 0 and 2), the second one (.beta) is all over the place - because it is a function of the IV (input vector). And that's, probably, my main challenge - that I am trying to write a routine for many different possible IVs that I might be facing (they may be in hundreds, in thousands, in millions). Should I be rescaling the IV somehow (e.g., by dividing it by its max) - or should I do something with the parameter .beta inside my function? So far, I've written a loop over many different starting points for both parameters. Then, I take the betas around the best solution so far, split the range into smaller steps for beta (as starting points), and optimize again for those starting points. What disappoints me is that even when I found a decent solution (the minimized value of 336), it was still worse than the Solver solution! And I am trying to prove to everyone here that we should do R, not Excel :-) Thanks again for your help, guys! Dimitri On Fri, Nov 11, 2011 at 9:10 AM, John C Nash nas...@uottawa.ca wrote: I won't requote all the other msgs, but the latest (and possibly a bit glitchy) version of optimx on R-forge 1) finds that some methods wander into domains where the user function fails try() (the new optimx runs try() around all function calls).
This includes L-BFGS-B. 2) reports that the scaling is such that you really might not expect to get a good solution, and then 3) actually gets a better result than the

xlf <- myfunc(c(0.888452533990788, 94812732.0897449))
xlf
[1] 334.607

with Kelley's variant of Nelder-Mead (from the dfoptim package), with

myoptx
  method                         par       fvalues fns  grs itns conv  KKT1  KKT2 xtimes
4 LBFGSB                      NA, NA 8.988466e+307  NA NULL NULL   NA    NA    NA   0.01
2 Rvmmin            0.1, 200186870.6      25593.83  20    1 NULL    0 FALSE FALSE   0.11
3 bobyqa 6.987875e-01, 2.001869e+08       1933.229  44   NA NULL    0 FALSE FALSE   0.24
1 nmkb   8.897590e-01, 9.470163e+07       334.1901 204   NA NULL    0 FALSE FALSE   1.08

But do note the terrible scaling. Hardly surprising that this function does not work. I'll have to delve deeper to see what the scaling setup should be, because of the nature of the function setup involving some of the data. (optimx includes parscale on all methods.) However, the original poster DID include code, so it was easy to do a quick check. Good for
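John's parenthetical remark - optimx (like optim) accepts parscale - is often all the rescaling that's needed, without touching the data. A toy sketch with an assumed stand-in objective whose second parameter lives near 1e8, roughly mimicking the poster's badly scaled .beta:

```r
# Tell the optimizer the typical magnitude of each parameter via
# control$parscale instead of rescaling the input vector by hand.
f <- function(p) (p[1] - 0.9)^2 + (p[2] / 1e8 - 0.95)^2  # hypothetical objective
fit <- optim(par = c(1, 1e8), fn = f, method = "BFGS",
             control = list(parscale = c(1, 1e8)))
fit$par  # close to c(0.9, 9.5e7)
```

Internally the optimizer then works on parameters of comparable magnitude, so the finite-difference gradient in the second coordinate is no longer swamped by the 1e8 scale.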
Re: [R] Adding a year to existing date
Here is an example that could probably be described as adding a year:

dates <- c('2008-01-01', '2009-03-02')
tmp <- as.POSIXlt(dates)
tmp$year <- tmp$year + 1
dates2 <- format(tmp)
dates
[1] "2008-01-01" "2009-03-02"
dates2
[1] "2009-01-01" "2010-03-02"

## to begin to understand how it works, give the command
## unclass(tmp)
## (and read the help pages)
## ?as.POSIXlt
## ?DateTimeClasses

Another example:

dates <- as.Date(c('2008-01-01', '2009-03-02'))
tmp <- as.POSIXlt(dates)
tmp$year <- tmp$year + 1
dates2 <- as.Date(tmp)

## ?as.Date
## ?Date

-Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062 On 11/16/11 8:33 PM, arunkumar akpbond...@gmail.com wrote: Hi, I need to add a year to a date field in the dataframe. Please help me.

X Date
1 2008-01-01
2 2008-02-01
3 2003-03-01

Thanks in advance -- View this message in context: http://r.789695.n4.nabble.com/Adding-a-year-to-existing-date-tp4078930p4078930.html Sent from the R help mailing list archive at Nabble.com.
[R] read.table with double precision
Dear all, I have a txt file with the following contents:

1 50.790643000 6.063498
2 50.790738000 6.063471
3 50.791081000 6.063380
4 50.791189000 6.063552

I am using read.table('myfile.txt', sep=" "), which unfortunately returns only integers and not the doubles that are required to store the 50.790643000. What can I do to force it to store things as doubles? B.R Alex
Re: [R] Adding a year to existing date
Just looking at the ambiguity in adding a year:

dates <- as.Date(c('2007-03-01','2008-02-29'))
tmp <- as.POSIXlt(dates)
tmp$year <- tmp$year + 1
dates2 <- as.Date(tmp)
dates2
[1] "2008-03-01" "2009-03-01"
dates2 - dates
Time differences in days
[1] 366 366

KJ "MacQueen, Don" macque...@llnl.gov wrote in message news:caea785f.7cfdb%macque...@llnl.gov...
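KJ's example shows 2008-02-29 rolling over to 2009-03-01. If you would rather pin such dates to Feb 28, a small base-R helper can detect the rollover and pull it back a day. This is my own sketch (add_year is not from the thread), building on the same POSIXlt trick:

```r
# Add n years to a Date vector; a Feb 29 that lands in a non-leap
# year is clamped to Feb 28 instead of rolling over to Mar 1.
add_year <- function(x, n = 1) {
  lt <- as.POSIXlt(x)
  feb29 <- lt$mon == 1 & lt$mday == 29   # POSIXlt months are 0-based
  lt$year <- lt$year + n
  out <- as.Date(lt)                     # an invalid Feb 29 normalizes to Mar 1
  rolled <- feb29 & as.POSIXlt(out)$mon == 2
  out[rolled] <- out[rolled] - 1         # pull back to Feb 28
  out
}
add_year(as.Date(c("2008-02-29", "2007-03-01")))
# [1] "2009-02-28" "2008-03-01"
```

Which convention is right (Mar 1 or Feb 28) depends on the application; the point of KJ's post stands - "adding a year" is inherently ambiguous.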
Re: [R] read.table with double precision
I'm having trouble replicating/understanding why that would happen, since I do it all the time. The only thing that raises a hint of suspicion is the blank-space separator sep=" ", but I'm pretty sure that's fine. What does str() give? Possibly factors? If you are sure that's happening as described, can you send a sample .txt (so it won't get scrubbed) and your exact import code? Michael On Nov 17, 2011, at 11:49 AM, Alaios ala...@yahoo.com wrote: Dear all, I have a txt file with the following contents:

1 50.790643000 6.063498
2 50.790738000 6.063471
3 50.791081000 6.063380
4 50.791189000 6.063552

I am using read.table('myfile.txt', sep=" "), which unfortunately returns only integers and not the doubles that are required to store the 50.790643000. What can I do to force it to store things as doubles? B.R Alex
Re: [R] read.table with double precision
Hi, On Thu, Nov 17, 2011 at 11:49 AM, Alaios ala...@yahoo.com wrote: Dear all, I have a txt file with the following contents:

1 50.790643000 6.063498
2 50.790738000 6.063471
3 50.791081000 6.063380
4 50.791189000 6.063552

I am using read.table('myfile.txt', sep=" "), which unfortunately returns only integers and not the doubles that are required to store the 50.790643000. Using that exact file you included? If so, then I suspect you are confusing display and storage, though even then my default session doesn't show integers. How do you know that it is returning only integers? What is options()$digits set to?

myfile
  V1        V2       V3
1  1 50.790643 6.063498
2  2 50.790738 6.063471
3  3 50.791081 6.063380
4  4 50.791189 6.063552

sprintf("%2.10f", myfile[1, 2])
[1] "50.7906430000"

What can I do to force it to store things as doubles? What is str(myfile)? Sarah -- Sarah Goslee http://www.functionaldiversity.org
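If the columns really are coming in as something other than doubles (e.g. as factors because of a stray character in the file), colClasses forces the issue explicitly. A sketch using a temporary file standing in for the poster's myfile.txt:

```r
# Force double storage explicitly; any rounding seen in print() is
# display only, not a loss of stored precision.
tf <- tempfile(fileext = ".txt")
writeLines(c("1 50.790643000 6.063498",
             "2 50.790738000 6.063471"), tf)
d <- read.table(tf, colClasses = c("integer", "numeric", "numeric"))
str(d)                    # V2 and V3 are num (double)
sprintf("%.9f", d$V2[1])  # "50.790643000" -- full precision is stored
```

str() and sprintf() together distinguish what is stored from what the default print method happens to show, which is Sarah's point above.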
Re: [R] changelog for MASS?
On Thu, Nov 17, 2011 at 7:33 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote: Hmmm... sorry -- the only thing I can suggest is maybe striking some sort of deal that you update when it gets however many months out of date: if you look here (http://cran.r-project.org/src/contrib/Archive/MASS/), you can see the last time each version of MASS was updated, and by seeing what you have, you can see about how out of date you are. In the context of MASS, I wouldn't worry so much: it's tied to a book, not active research, so I don't think it gets updated too often in big ways. The other thing is to actually compare differences in the source code, though that might be more trouble than it's worth. Michael

You can also get this info on crantastic (scroll to 'prev versions'): http://crantastic.org/packages/MASS Regards, Liviu

On Mon, Nov 14, 2011 at 4:30 PM, Xu Wang xuwang...@gmail.com wrote: Thanks Michael, But I can't see the dates on the NEWS, so I have no idea what changed from the last version, or from whichever version we actually have installed. Do you see what I mean? Thanks, Xu -- View this message in context: http://r.789695.n4.nabble.com/changelog-for-MASS-tp4034473p4040941.html Sent from the R help mailing list archive at Nabble.com.
Re: [R] package installation
See my responses in brackets below. - Original Message - From: Rolf Turner rolf.tur...@xtra.co.nz To: Scott Raynaud scott.rayn...@yahoo.com Cc: r-help@r-project.org Sent: Wednesday, November 16, 2011 6:04 PM Subject: Re: [R] package installation On 17/11/11 05:37, Scott Raynaud wrote: That might be an option if it weren't my most important predictor. I'm thinking my best bet is to use MLwiN for the estimation, since it will properly set fixed effects to 0. All my other sample-size simulation programs use SAS PROC IML, which I don't have/can't afford. I like R since it's free, but I can't work around the problem I'm currently having. This is the ``push every possible button until you get a result and to hell with what anything actually means'' approach to statistics [Well, I'm simply echoing the simulation software instructions in planning to use MLwiN. I assume the approach is validated. In the meantime, I'd like to have a deeper understanding of why R isn't working. I have a hunch, but don't know how to confirm it]. The probability of getting a *meaningful* result from this approach is close to zero [You're most certainly right if there is no sound rationale behind the method. In this case there is, and the probability is much higher than you state. That's not to say I haven't made an error somewhere. Maybe further investigation of the sort I endeavor to pursue will reveal that]. Why don't you try to *understand* what is going on [Precisely what I'm trying to do. However, I need help, which I hope I can find here.], rather than wildly throwing every possible piece of software at the problem [It's not wildly throwing every piece of software at the problem. It's simply a matter of understanding what works and what doesn't] until one such piece runs?
cheers, Rolf Turner
Re: [R] modelling and R misconceptions; was: package installation
My responses are in brackets below, plus a final note after the main text. - Original Message - From: Uwe Ligges lig...@statistik.tu-dortmund.de To: Scott Raynaud scott.rayn...@yahoo.com Cc: r-help@r-project.org Sent: Thursday, November 17, 2011 9:16 AM Subject: Re: [R] modelling and R misconceptions; was: package installation This is hopeless [That's a matter of perception - even concentration camp prisoners found a way to hope (see Viktor Frankl)], since you never [never is a strong word and many times leads to cognitive errors] seem to listen to our advice [It's possible that I misunderstood your recommendations (more likely), or that you communicated poorly (less likely)], therefore this will be my very last try: So you actually need local advice [Yes, I need advice - that's why I post here!], both for statistical concepts and R-related [I don't claim to be a statistical genius, but I can hold my own. Now, R is a different matter]. No statistics software can estimate effects of variables that you observed to be constant (e.g. 0) all the time [I think you misunderstood my intentions - I never wanted to estimate effects that are 0 all of the time]. If any software does, please delete it at once from your machine. Instead, ask a local statistician for advice on your problem. You certainly want to show the data and your model to the local expert - since you don't show us. [I gave a detailed explanation in a previous post, which I repeat here: |OK, I'm using William Browne's MLPowSim to create an R script which will simulate samples for estimation of sample size in mixed models. I have subjects | nested in hospitals, with hospitals treated as random and all of my covariates at level 1. My outcome is death, so it's binary, and I'll have a fixed and |random intercept. My interest is in the relation of the covariates to the outcome. | |My most important variable is gestational age (GA), which my investigators divide thusly: 23-24, 25-26, 27-28, 29-30 and 31-32.
I have recoded the | dummies for GA in the script according to the MLPowSim instructions to a random multinomial variable:
|
| macpred <- rmultinom(n2, 1, c(.1031, .1482, .2385, .4404, .0698))
| x[,3] <- macpred[1,][l2id]
| x[,4] <- macpred[2,][l2id]
| x[,5] <- macpred[3,][l2id]
| x[,6] <- macpred[4,][l2id]
|
|GA 23-24 is the reference with p=.0698. I started with a structured sampling scheme of 20, 60, 100, 120 and 140 level-2 units. My level-2 units have |different sizes. So at 20 I had 5 hospitals with 100 patients, 4 with 280, 3 with 460, 3 with 640, 3 with 820 and 2 with 1000. Thus, at 60 hospitals, I have 15, |12, 9, 9, 9, 6 with the same cell sample sizes. | |According to the MLPowSim documentation, with small probabilities it's possible to have a column of zeroes in the X matrix if there are not many units in |the random factor. R will choke on this, but MLwiN sets the associated fixed effects to 0. When R choked, I increased from 20 to 60 as my minimum, as |suggested in the MLPowSim documentation. Still no luck. Since this is a simulation, I assume once in a while that by chance a coefficient could be 0. In fact, Browne mentions as much in his documentation. There is a bit more to my simulation, but I thought I'd try to keep it as simple as possible, at least at the outset.] And then you want to ask for a local R course, since reading the documentation seems not to help [You got that right!]. Applying mtrace() on a non-existing object shows this straight away. Uwe Ligges Apparently I misunderstood the purpose of mtrace after reading the documentation - I thought it was to debug problems of the sort I've encountered. Michael Weylandt provided appropriate direction in the previous post, for which I am grateful. Not all of us can be intellectual superstars. That's why we ask for help. This much I did read and understand from the R posting guide: Responding to other posts: * Rudeness and ad hominem comments are not acceptable. Brevity is OK. It's a good lesson to learn.
On 17.11.2011 15:49, Scott Raynaud wrote: I believe the problem is a column of zeroes in my x matrix. I have tried the suggestions in the documentation, so now, to try to confirm the problem, I'd like to run debug. Here's where I think the problem is:

###~~ Fitting the model using the lmer function ~~###
(fitmodel <- lmer(modelformula, data, family = binomial(link = "logit"), nAGQ = 1))
mtrace(fitmodel)

I added the mtrace to catch the error, but get the following: Error in mtrace(fitmodel) : Can't find fitmodel. How can I debug this? - Original Message - From: Rolf Turner rolf.tur...@xtra.co.nz To: Scott Raynaud scott.rayn...@yahoo.com Cc: r-help@r-project.org Sent: Wednesday, November 16, 2011 6:04 PM Subject: Re: [R] package installation On 17/11/11 05:37, Scott Raynaud wrote:
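Scott's hunch - a simulated all-zero dummy column choking lmer - can be verified before fitting, with no debugger at all. A sketch with a hypothetical simulated design matrix (the column names are invented for illustration):

```r
# Detect constant (e.g. all-zero) columns in a simulated design matrix,
# as can happen when rmultinom() never draws a rare category.
set.seed(1)
x <- cbind(rbinom(50, 1, 0.5),  # a category that does get drawn
           rep(0, 50))          # a rare category never drawn this run
colnames(x) <- c("ga2526", "ga2324")
is_const <- apply(x, 2, function(col) all(col == col[1]))
is_const                              # ga2324 is flagged
x_ok <- x[, !is_const, drop = FALSE]  # drop before building the formula
colnames(x_ok)
```

Inside a simulation loop, dropping (or re-simulating) such columns before each lmer() call avoids the rank-deficiency failure mid-run.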
Re: [R] Help with error: no acceptable C compiler found in $PATH
Hmm, strange... if possible, this might be solvable by simply updating to the release version, R 2.14. If it's at all possible, I'd start there. Can you find the object it's unhappy about? On my machine, I do the following: 1) Open Finder 2) Macintosh HD -> Library -> Frameworks -> R.framework -> Versions -> 2.13 -> Resources -> library -> RCurl -> libs -> x86_64 -> RCurl.so Going the other way, are you sure you have curl on your system? I'm pretty sure it's standard on all Macs, but you never know... follow some of the instructions given here: http://www.omegahat.org/RCurl/FAQ.html You should be able to type curl-config in the terminal and get a meaningful response if it is. Did you change something on the OS level recently? I don't really know why this would have all fallen apart; I just re-reinstalled RCurl on R 2.13.2 / OS X 10.5.8 with no problem at all. Michael On Wed, Nov 16, 2011 at 11:14 AM, Hari Easwaran hariharan...@gmail.com wrote: Hi Michael, Thanks for your response. Using the binary seems to solve it partially. I am able to install (I think!) RCurl, but not able to load the library. Below is the info you required and the error while loading RCurl.

sessionInfo()
R version 2.13.2 (2011-09-30) Platform: x86_64-apple-darwin9.8.0/x86_64 (64-bit)
locale: [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
attached base packages: [1] stats graphics grDevices utils datasets methods base

install.packages("RCurl")
trying URL 'http://watson.nci.nih.gov/cran_mirror/bin/macosx/leopard/contrib/2.13/RCurl_1.7-0.tgz' Content type 'application/octet-stream' length 680511 bytes (664 Kb) opened URL == downloaded 664 Kb The downloaded packages are in /var/folders/a6/a60JdPfrHC0ZAizZWyNM-E+++TI/-Tmp-//RtmpYE7JLJ/downloaded_packages

library(RCurl)
Loading required package: bitops Error in dyn.load(file, DLLpath = DLLpath, ...)
: unable to load shared object '/Library/Frameworks/R.framework/Versions/2.13/Resources/library/RCurl/libs/x86_64/RCurl.so': dlopen(/Library/Frameworks/R.framework/Versions/2.13/Resources/library/RCurl/libs/x86_64/RCurl.so, 6): Library not loaded: @rpath/R.framework/Versions/2.13/Resources/lib/libR.dylib Referenced from: /Library/Frameworks/R.framework/Versions/2.13/Resources/library/RCurl/libs/x86_64/RCurl.so Reason: image not found Error: package/namespace load failed for 'RCurl' Warning: dependency ‘Rcompression’ is not available also installing the dependency ‘XML’ Seems like now I need 'Rcompression'. I googled this and found that the Rcompression package needs 'zlib' (http://www.omegahat.org/Rcompression/). However, the site says that zlib is already included as part of Mac OS X. I am wondering what to do? Puzzlingly, the previous R version was oblivious to these issues! Really appreciate any help. Sincerely, Hari On Tue, Nov 15, 2011 at 11:33 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote: Yes, you probably need some sort of C compiler, but why can't you just download the appropriate binary directly? I just did on OS X 10.5.8 (admittedly for R 2.13.2, not 2.14) with no problems. The output of sessionInfo() and install.packages("RCurl"), if you don't mind, please. Thanks, Michael On Tue, Nov 15, 2011 at 2:12 PM, Hari Easwaran hariharan...@gmail.com wrote: Dear all, I am trying to install a package from Bioconductor (biomaRt) for which I need the RCurl package. I get the following main error message when I try to install RCurl (and its dependencies): configure: error: no acceptable C compiler found in $PATH See `config.log' for more details. ERROR: configuration failed for package ‘RCurl’ I searched for possible solutions and read in some online mailing list that I might have to install Xcode to install the gcc compiler.
I am not sure if I should do this because I have installed RCurl in previous versions of R without any problems (on this same computer). I upgraded to the latest R (R version 2.14.0) and faced this problem. So I downgraded to R version 2.13.2 and still cannot install RCurl. I think my last successful installation of RCurl was with R version 2.11. Following is the complete error message and my R version details. I really appreciate any help or suggestions. Sincerely, Hari trying URL ' http://watson.nci.nih.gov/cran_mirror/src/contrib/XML_3.4-3.tar.gz' Content type 'application/octet-stream' length 906364 bytes (885 Kb) opened URL == downloaded 885 Kb trying URL ' http://watson.nci.nih.gov/cran_mirror/src/contrib/RCurl_1.7-0.tar.gz' Content type 'application/octet-stream' length 813252 bytes (794 Kb) opened URL == downloaded 794 Kb * installing *source* package ‘XML’ ... checking for gcc... no checking for cc... no checking for cl.exe... no
Re: [R] Non-finite finite-difference value error in eha's aftreg
This kind of error seems to surprise R users. It surprises me that it doesn't happen much more frequently. The BFGS method of optim(), from the 1990 Pascal version of my book, was called the Variable Metric method, as per Fletcher's 1970 paper from which it was drawn. It really works much better with analytic gradients, and the Rvmmin package, an all-R version that adds bounds and masks, is set up to generate a warning if they are not available. Even with bounds, the finite-difference derivative code can step over a cliff edge with

del <- (f(x + h) - f(x))/h

i.e., bounds may not be checked within the numerical derivative functions. And BFGS is not set up with bounds; L-BFGS-B, which has them, is actually a rather different method. If you get such error messages, why not capture the parameter vector and check the function computation at those parameters and nearby? Yes, a bit tedious, but rarely have I found it a waste of time. For information, there should be a small function available shortly on R-forge (project optimizer, likely in the optfntools package) to do an axial search around a set of parameters and generate some information about the functional surface. I still have to prepare documentation and examples, but if anxious, contact me off-list. JN Message: 21 Date: Wed, 16 Nov 2011 15:06:00 +0100 From: Milan Bouchet-Valat nalimi...@club.fr To: r-help r-help@r-project.org Subject: [R] Non-finite finite-difference value error in eha's aftreg Message-ID: 1321452360.13624.2.camel@milan Content-Type: text/plain; charset=UTF-8 Hi list! I'm getting an error message when trying to fit an accelerated failure time parametric model using the aftreg() function from package eha: Error in optim(beta, Fmin, method = "BFGS", control = list(trace = as.integer(printlevel)), : non-finite finite-difference value [2] This only happens when adding four specific covariates at the same time in the model (see below).
I understand that kind of problem can come from too-high correlation between my covariates, but is there anything I can do to avoid it? Does something need to be improved in aftreg.fit? My data set consists of 34,505 observations (years) of 2,717 individuals, which seems reasonable to me to fit a complex model like that (covariates are all factors with fewer than 10 levels). I can send it by private mail if somebody wants to help debug this. The details of the model and errors follow, but feel free to ask for more testing. I'm using R 2.13.1 (x86_64-redhat-linux-gnu), eha 2.0-5 and survival 2.36-9. Thanks for your help!
m <- aftreg(Surv(start, end, event) ~ homo1 + sexego + dipref1 + t.since.school.q,
            data = ms, dist = "loglogistic", id = ident)
Error in optim(beta, Fmin, method = "BFGS", control = list(trace = as.integer(printlevel)), : non-finite finite-difference value [2]
Calls: aftreg -> aftreg.fit -> aftp0 -> optim
traceback()
4: optim(beta, Fmin, method = "BFGS", control = list(trace = as.integer(printlevel)), hessian = TRUE)
3: aftp0(printlevel, ns, nn, id, strata, Y, X, offset, dis, means)
2: aftreg.fit(X, Y, dist, strats, offset, init, shape, id, control, center)
1: aftreg(Surv(start, end, event) ~ homo1 + sexego + dipref1 + t.since.school.q, data = ms, dist = "loglogistic", id = ident)
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
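A small illustration of JN's "check the function computation at those parameters and nearby" advice. The names are hypothetical (fn stands for the objective and par for the failing parameter vector captured from the optimizer); this is a sketch, not part of the thread.

```r
## Evaluate the objective at par and at small axial steps around it;
## a non-finite value flags the direction in which the surface falls off
## (the "cliff edge" the finite-difference code can step over).
probe <- function(fn, par, h = 1e-4) {
  base  <- fn(par)
  steps <- sapply(seq_along(par), function(i) {
    p <- par
    p[i] <- p[i] + h   # one coordinate at a time, as in a forward difference
    fn(p)
  })
  c(at.par = base, steps)
}
```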
Re: [R] How to Fit Inflated Negative Binomial
Tyler Rinker tyler_rinker at hotmail.com writes: try: library(pscl) There's a zeroinfl() for zero-inflated neg. binom. Tyler Dear All, I am trying to fit some data both as a negative binomial and a zero-inflated negative binomial. For the first case, I have no particular problems; see the small snippet below.
set.seed(123)  # to have reproducible results
## You don't actually need MASS::rnegbin; rnbinom in base R works fine
## (different parameter names)
x6 <- c(rep(0, 100), rnbinom(500, mu = 5, size = 4))
## sample() is irrelevant, it just permutes the results
library(pscl)
zz <- zeroinfl(x6 ~ 1 | 1, dist = "negbin")
exp(coef(zz)[1])     ## mu
zz$theta             ## theta
plogis(coef(zz)[2])  ## zprob
Alternatively you can use fitdistr() (from MASS) with the dzinbinom() function from the emdbook package:
library(MASS)     # fitdistr() lives here
library(emdbook)
fitdistr(x6, dzinbinom, start = list(mu = 4, size = 5, zprob = 0.2))
The pscl solution is likely to be much more robust. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] hierachical code system
Hi, Thanks for your reply. Based on your suggestions, I managed to simplify the code, but only a little. I don't see how I could do without a loop, given the nestedness of the hierarchy. See the code below, which is working, but I'd like to simplify it.
# sample data
theCodes <- c('STAT.01', 'STAT.01.01', 'STAT.01.01.01', 'STAT.01.01.02', 'STAT.01.01.03', 'STAT.01.01.04', 'STAT.01.01.05', 'STAT.01.01.06', 'STAT.01.01.06.01', 'STAT.01.01.06.02', 'STAT.01.01.06.03', 'STAT.01.01.06.04', 'STAT.01.01.06.05', 'STAT.01.01.06.06', 'STAT.01.02', 'STAT.01.02.01', 'STAT.01.02.02', 'STAT.01.02.03', 'STAT.01.02.03.01', 'STAT.01.02.03.02', 'STAT.01.02.03.03', 'STAT.01.02.03.04', 'STAT.01.02.03.05', 'STAT.01.03')
theValues <- as.numeric(c(NA, NA, 15074.23366, 4882.942034, 1619.59628, 1801.722877, 1019.973666, NA, 503.9239317, 917.2189347, 6018.830465, 1944.11311, 1427.575402, 1965.725428, NA, 5857.293612, 5933.770263, NA, 6077.089518, 1427.180073, 455.9387993, 859.766603, 1002.983331, 2225.328211))
df <- as.data.frame(cbind(code = theCodes, value = theValues))
df$value <- as.numeric(df$value)
# actual code
getDepth <- function(df) {
  df$diepte <- do.call(rbind, lapply(strsplit(df$code, "\\."), length)) - 1
  return(df)
}
getParents <- function(df) {
  df$parent <- substr(df$code, 1, 4 + (df$diepte - 1) * 3)
  return(df)
}
getTotals <- function(df, depth) {
  s <- subset(df, diepte == depth)
  if (!"parent" %in% names(df)) s <- getParents(s)
  agg <- aggregate(s["value"], s["parent"], FUN = sum, na.rm = TRUE)
  merged <- merge(df, agg, by.x = "code", by.y = "parent", all = TRUE, suffixes = c("", "_summed"))
  isSum <- !is.na(merged$value_summed)
  merged[isSum, "value"] <- merged[isSum, "value_summed"]
  merged$value_summed <- merged$parent <- NULL
  return(merged)
}
#library(debug)
#mtrace(getTotals)
df <- getDepth(df)
for (depth in max(df$diepte):2) {
  if (depth == max(df$diepte)) {
    x <- getTotals(df, depth)
  } else {
    x <- getTotals(x, depth)
  }
}
Cheers!! 
Albert-Jan ~~ All right, but apart from the sanitation, the medicine, education, wine, public order, irrigation, roads, a fresh water system, and public health, what have the Romans ever done for us? ~~ From: ONKELINX, Thierry thierry.onkel...@inbo.be To: Albert-Jan Roskam fo...@yahoo.com; R Mailing List r-help@r-project.org Sent: Wednesday, November 16, 2011 2:34 PM Subject: RE: [R] hierachical code system Dear Albert-Jan, The easiest way is to create extra variables with the corresponding aggregation level. substr() and strsplit() can be your friends. Once you have those variables you can use aggregate() or any other aggregating function. You don't need loops. Best regards, Thierry -Original message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On behalf of Albert-Jan Roskam Sent: Wednesday, 16 November 2011 14:28 To: R Mailing List Subject: [R] hierachical code system Hi, I have a hierarchical code system such as the example below (the printed data are easiest to read). I would like to write a function that returns an 'imputed' data frame, i.e. one where the parent values are calculated as the sum of the child values. So, for instance, STAT.01.01.06 is the sum of STAT.01.01.06.01 through STAT.01.01.06.06. The code I have written uses two for loops, and, moreover, does not work as intended. My starting point was to determine the code depth by counting the dots in the variable 'code' (using strsplit), then iterate over the tree from deep to shallow. Does anybody have a good idea as to how to approach this in R? 
theCodes <- c('STAT.01', 'STAT.01.01', 'STAT.01.01.01', 'STAT.01.01.02', 'STAT.01.01.03', 'STAT.01.01.04', 'STAT.01.01.05', 'STAT.01.01.06', 'STAT.01.01.06.01', 'STAT.01.01.06.02', 'STAT.01.01.06.03', 'STAT.01.01.06.04', 'STAT.01.01.06.05', 'STAT.01.01.06.06', 'STAT.01.02', 'STAT.01.02.01', 'STAT.01.02.02', 'STAT.01.02.03', 'STAT.01.02.03.01', 'STAT.01.02.03.02', 'STAT.01.02.03.03', 'STAT.01.02.03.04', 'STAT.01.02.03.05', 'STAT.01.03')
theValues <- c('NA', 'NA', '15074.23366', '4882.942034', '1619.59628', '1801.722877', '1019.973666', 'NA', '503.9239317', '917.2189347', '6018.830465', '1944.11311', '1427.575402', '1965.725428', 'NA', '5857.293612', '5933.770263', '6077.089518', 'NA', '1427.180073', '455.9387993', '859.766603', '1002.983331', '2225.328211')
df <- as.data.frame(cbind(code = theCodes, value = theValues))
print(df)
code value 1 STAT.01 NA 2 STAT.01.01 NA 3 STAT.01.01.01 15074.23366 4 STAT.01.01.02 4882.942034 5 STAT.01.01.03 1619.59628 6 STAT.01.01.04 1801.722877 7 STAT.01.01.05 1019.973666 8 STAT.01.01.06 NA 9 STAT.01.01.06.01 503.9239317 10 STAT.01.01.06.02
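A loop-free sketch of Thierry's substr()/strsplit()/aggregate() idea (my own code, not from the thread), assuming theValues is numeric as in the follow-up message. It derives each code's parent by dropping the last segment, then sums child values into NA parents; this fills one level of the tree, and repeating it from the deepest level upward completes the imputation.

```r
## Parent of "STAT.01.01.06.01" is "STAT.01.01.06": drop the final ".xx".
parent <- sub("\\.[^.]+$", "", theCodes)

## Sum all children sharing a parent, then fill NA parents from those sums.
sums   <- tapply(theValues, parent, sum, na.rm = TRUE)
filled <- ifelse(is.na(theValues) & theCodes %in% names(sums),
                 sums[theCodes], theValues)
```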
[R] how to include a factor or class Variable
Hi, How do I include a factor or class variable in the fixed effects of the lmer function? When I include it, it throws an error. Please help. My code:
data <- read.delim("C:/TestData/data.txt")
Mon = as.factor(data$Month)
lmerform = Y ~ X2 + X3 + Month:Mon + (1|State) + (1 + X5|State)
lmerfit = lmer(formula = lmerform, data = data)
summary(lmerfit)
My data:
State Year Month    Y    X2   X3   X4   X5   X6
GA    1960     1 27.8 397.5 42.2 50.7 78.3 65.8
FA    1960     2 29.9 413.3 38.1 52.0 79.2 66.9
GA    1961     3 29.8 439.2 40.3 54.0 79.2 67.8
FA    1961     4 30.8 459.7 39.5 55.3 79.2 69.6
GA    1962     1 31.2 492.9 37.3 54.7 77.4 68.7
FA    1962     2 33.3 528.6 38.1 63.7 80.2 73.6
GA    1963     3 35.6 560.3 39.3 69.8 80.4 76.3
-- View this message in context: http://r.789695.n4.nabble.com/how-to-include-a-factor-or-class-Variable-tp4079991p4079991.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
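The thread carries no answer, but one hedged guess at the intended model: the term Month:Mon crosses a numeric variable with its own factor version, which is rarely what is wanted, and Mon lives outside the data frame passed to lmer(). Putting the factor itself in the fixed part, inside the data frame, would look like this (a sketch under that assumption, not a confirmed fix):

```r
## Assumes Mon should simply enter as a fixed-effect factor.
library(lme4)
data$Mon <- as.factor(data$Month)   # keep the factor inside the data frame
fit <- lmer(Y ~ X2 + X3 + Mon + (1 | State) + (1 + X5 | State), data = data)
summary(fit)
```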
[R] set random numbers seed for different cpu's
Hi, I'm running the same R script (through a Linux shell) on several CPUs. This R program uses random numbers and the result should be different every time. But if I put jobs (through Torque) on several CPUs I get the same result. My program saves numbers in files with randomly generated names. It works like a charm on one CPU, but I get the same result from different CPUs. So my question is, how can I resolve this? How do I set the pseudo-random number seed so that different CPUs produce different results? Thank you in advance. -- View this message in context: http://r.789695.n4.nabble.com/set-random-numbers-seed-for-different-cpu-s-tp4080165p4080165.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Log-transform and specifying Gamma
Dear R help, I am trying to work out if I am justified in log-transforming data and specifying Gamma in the same glm. Does it have to be one or the other? I have attached an R script and the datafile to show what I mean. Also, I cannot find a mixed model that allows Gamma errors (so I cannot find a way of including random effects). What should I do? Many thanks, Pete
# trying to solve the question 'can you log-transform and specify Gamma in the same model'
ToadsBd <- read.table(file.choose(), header = T)
list(ToadsBd)
# first see how well treatment group predicts Bd score with non-log-transformed data
mod1 <- glm(Bd ~ factor(group))
summary(mod1)
# massively overdispersed. Are the data non-normal?
shapiro.test(Bd)
W = 0.3652, p-value = 5.666e-13
# yes, definitely non-normal
# try log-transforming data and see if that helps
plot(qqnorm(Bd), log = "y")
# log plot straightens it out, almost, so yes log-transform helps
# try model again with log-transformed Bd score
mod2 <- glm(logBd ~ factor(group))
summary(mod2)
# a big improvement but still overdispersed
# other options - specify an error family? Looks like original data are Gamma distributed
# should test if variance increases or remains constant with mean on the scale of the original, non-logged data
par(mfrow = c(2, 2))
plot(mod1)
# can you tell this from a diagnostic plot? Not sure how. If not, how do you assess this?
# in the meantime, assume it does and try Gamma (using default link = reciprocal) with non-logged data
mod3 <- glm(Bd ~ factor(group), family = Gamma)
summary(mod3)
# mod3 is a major improvement on mod1 and less dispersed than mod2 but has a much larger AIC than mod2
# is it valid to specify Gamma in a model where the data have been log-transformed?
# or does it have to be a choice between transformation or Gamma?
# if I specify both, the model is quite good, but it may not be valid. Please help! 
mod4 <- glm(logBd ~ factor(group), family = Gamma)
summary(mod4)
# residual deviance now well below df, not overdispersed, and the effect of group on Bd is significant
# I would also like to include assessment of the effect of site, but this is a random effect requiring a mixed model
# I cannot find a mixed model that works with Gamma errors. What can I do?
toad group      Bd  logBd startg site
   1     1     0.5  0.405   13.6    0
   2     1     0.3  0.262   15.9    0
   3     1     0.3  0.262   14.4    0
   4     1     0.4  0.336   15.3    0
   5     1     6.5  2.015   15.1    0
   6     1     0.1  0.095   15.7    0
   7     1     0.2  0.182   20.2    0
   8     1    17.7  2.929   17.3    0
   9     1     0.6  0.470   18.7    0
  10     1     0.1  0.095   24.6    1
  11     1     0.6  0.470   20      1
  12     1     9    2.303   16.3    1
  13     1     1.6  0.956   19.4    1
  14     1     3.4  1.482   12.8    1
  15     1     6.3  1.988   19.7    1
  16     2     1.3  0.833   12.6    0
  17     2    63.3  4.164   22.6    0
  18     2     0.7  0.531   18.3    0
  19     2    33.2  3.532   15.5    0
  20     2     2.2  1.163   13.2    0
  21     2   479    6.174   16.4    0
  22     2     0.1  0.095   19.1    0
  23     2    47.6  3.884   16.1    0
  24     2   195.6  5.281   14.1    0
  25     2    41    3.738   16.3    0
  26     2  1984.2  7.593   13.7    1
  27     2     6.3  1.988   13.9    1
  28     2   126.7  4.850   22      1
  29     2   105.1  4.664   12.7    1
  30     2  6747.8  8.817   18.2    1
  31     2   282.6  5.648   15.8    1
  32     3     1.6  0.956   18.6    0
  33     3  2576.3  7.854   15.3    0
  34     3 11240    9.327   17.4    0
  35     3   678.1  6.521   18.8    0
  36     3  9926.8  9.203   17.5    0
  37     3   103.4  4.648   16.1    0
  38     3  2401.7  7.784   15.5    0
  39     3  2616.4  7.870   16.5    0
  40     3    35.3  3.592   18.9    0
  41     3   174.7  5.169   22.7    0
  42     3   362    5.894   17.5    1
  43     3  2765.7  7.925   13.8    1
  44     3 29033.8 10.276   16.5    1
  45     3    34    3.555   21.1    1
  46     3   258.4  5.558   15.9    1
  47     3    10.1  2.407   14.9    1
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
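For the final question — a mixed model with Gamma errors — one possibility (my suggestion, not from the thread) is glmmPQL() in MASS, which fits GLMMs by penalized quasi-likelihood and accepts any glm family. Using a log link also sidesteps the double log-transform-plus-Gamma issue, since the transformation then lives in the link rather than in the response.

```r
## Sketch: random intercept for site, Gamma errors on the original Bd scale.
library(MASS)   # glmmPQL(); it calls lme() from the nlme package internally
fit <- glmmPQL(Bd ~ factor(group), random = ~ 1 | site,
               family = Gamma(link = "log"), data = ToadsBd)
summary(fit)
```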
Re: [R] Pairwise correlation
Hi Michael, Here is a sample of the data. Gene Array1 Array2 Array3 Array4 Array5 Array6 Array7 Array8 Array9 Array10 Array11 Fth1 26016.01 23134.66 17445.71 39856.04 27245.45 23622.98 37887.75 49857.46 25864.73 21852.51 29198.4 B2m 7573.64 7768.52 6608.24 8571.65 6380.78 6242.76 6903.92 7330.63 7256.18 5678.21 10937.05 Tmsb4x 6192.44 4277.22 5024.59 4851.51 3062.55 4562.43 7948.1 5018.58 3200.17 2855.77 6139.23 H2-D1 3141.41 3986.06 3328.62 4726.6 3589.89 2885.95 7509.88 5257.62 4742.26 3431.33 5300.72 Prdx5 3935.7 3938.9 3401.68 4193.14 4028.95 3438.19 6640.15 5486.61 4424.57 3368.83 5265.92 I want to retain the gene names in the data. What you've proposed will take them out, and I'll have to append them back to the results after the cor(). On 17 November 2011 09:33, Michael Weylandt [via R] ml-node+s789695n4080177...@n4.nabble.com wrote: I think something like this should do it, but I can't test without data: rownames(mydata) <- mydata[,1] # Put the elements in the first column as rownames mydata <- mydata[,-1] # drop the things that are now rownames Michael On Thu, Nov 17, 2011 at 9:23 AM, Musa Hassan [hidden email]http://user/SendEmail.jtp?type=nodenode=4080177i=0 wrote: Hi Michael, Thanks for the response. I have noticed that the error occurred during my data read. It appears that the rownames (which, when the data is transposed, become my colnames) were converted to numbers instead of the strings they should be. The original header names don't change, just the rownames. I have to figure out how to import the data and have the strings not converted. Right now I am using: mydata <- read.csv("mydata.csv", header = TRUE, stringsAsFactors = FALSE) then to convert the data frame to a matrix: mydata <- data.matrix(mydata) Then I just do the correlation as Peter suggested: expression <- cor(t(expression)) Thanks. On 17 November 2011 08:51, R. 
Michael Weylandt [hidden email]http://user/SendEmail.jtp?type=nodenode=4080177i=1 wrote: On Wed, Nov 16, 2011 at 11:22 PM, muzz56 [hidden email]http://user/SendEmail.jtp?type=nodenode=4080177i=2 wrote: Thanks to everyone who replied to my post, I finally got it to work. I am however not sure how well it worked since it ran so quickly, but it seems like I have a 2000 x 2000 data set. Behold the great and mighty power that is R! Don't worry -- on a decent machine the correlation of a 2k x 2k data set should be pretty fast. (It's about 9 seconds on my old-ish laptop with a bunch of other junk running) My followup questions would be: how do I get only pairs with, say, a certain Pearson correlation value? Additionally, it seems like my output didn't retain the headers but instead replaced them with numbers, making it hard to know which gene pairs correlate. This is a little worrisome: R carries column names through cor(), so this would suggest you weren't using them. Were your headers listed as part of your data (instead of being names)? If so, they would have been taken as numbers. Take a look at dimnames(NAMEOFDATA) -- if your headers aren't there, then they are being treated as data instead of names. If they are, can you provide some reproducible code and we can debug more fully. The easiest way to send data is to use the dput() function to get a copy-pasteable plain text representation. It would also be great if you could restrict it to a subset of your data rather than the full 4M data points, but if that's hard to do, don't worry. 
You should have expected behavior like
X <- matrix(1:9, 3)
colnames(X) <- c("A", "B", "C")
cor(X)  # Prints with labels
Michael On 16 November 2011 17:11, Nordlund, Dan (DSHS/RDA) [via R] [hidden email] http://user/SendEmail.jtp?type=nodenode=4080177i=3 wrote: -Original Message- From: [hidden email]http://user/SendEmail.jtp?type=nodenode=4078114i=0 [mailto: r-help-bounces@r- project.org] On Behalf Of muzz56 Sent: Wednesday, November 16, 2011 12:28 PM To: [hidden email]http://user/SendEmail.jtp?type=nodenode=4078114i=1 Subject: Re: [R] Pairwise correlation Thanks Peter. I tried this after reading in the csv (read.csv) and converting the data to a matrix (as.matrix). But when I try the correlation, I keep getting the error ('x' must be numeric), yet when I view the data, it's numeric. What does R tell you if you execute the following? str(x) Just because the data looks like it is numeric when it prints doesn't mean it is. Dan Daniel J. Nordlund Washington State Department of Social and Health Services Planning, Performance, and Accountability Research and Data Analysis Division Olympia, WA 98504-5204 __ [hidden email] http://user/SendEmail.jtp?type=nodenode=4078114i=2mailing list https://stat.ethz.ch/mailman/listinfo/r-help
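For the still-open follow-up in this thread — extracting only pairs above a given Pearson correlation — here is a sketch with hypothetical object names ("expression" as in the earlier message, cutoff 0.9 chosen arbitrarily):

```r
## cm is the gene-by-gene correlation matrix (genes in rows of expression);
## upper.tri() keeps one copy of each pair and drops the diagonal.
cm   <- cor(t(expression))
hits <- which(abs(cm) > 0.9 & upper.tri(cm), arr.ind = TRUE)
data.frame(gene1 = rownames(cm)[hits[, 1]],
           gene2 = colnames(cm)[hits[, 2]],
           r     = cm[hits])
```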
[R] Combining data
Hi all; It seemed to be easy at first, but I didn't manage to find the answer through a Google search. I have a set of data for every second of the experiment, but I don't need such a high resolution for my analysis. I want to replace every 30 rows of my data with their average value, and then save the new data set in a new csv file to be able to have a smaller Excel data sheet. What is the command for combining a certain number of rows into their average value? Thank you -- Nasrin Pak MSc Student in Environmental Physics University of Calgary [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
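A minimal sketch of one answer (file names hypothetical, all columns assumed numeric): index each row by its 30-row block, then average within blocks and write the result back out.

```r
x   <- read.csv("experiment.csv")          # hypothetical input file
grp <- (seq_len(nrow(x)) - 1) %/% 30       # 0,0,...,0,1,1,... block index
avg <- aggregate(x, by = list(block = grp), FUN = mean)
write.csv(avg, "experiment_30s_means.csv", row.names = FALSE)
```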
[R] lme contrast Error in `$<-.data.frame`(`*tmp*`, df, value = numeric(0)) :
I am trying to run an lme model and some contrasts for a matrix.
lnY
[1] 10.911628 11.198557 11.316971 11.464869 11.575233 11.612101 11.755903 11.722035 11.757705 11.863744 11.846515 11.852721 11.866936 11.838452 11.946680 11.885509
[17] 11.583309 11.750082 11.756005 11.630797 11.705536 11.566722 11.679448 11.703521 NA 11.570949 11.716919 11.573343 11.733770 11.720801 11.804124 11.775074
[33] 11.801669 11.856955 11.875859 11.851852 11.830149 11.920156 11.954247 11.880917 11.806162 7.823646 11.909182 NA NA 11.912386 12.048816 11.958284
[49] 11.929021 11.986062 11.968418 11.967999 11.911608
plate
[1] 2 1 2 2 1 1 1 2 2 1 2 1 3 3 3 3 3 3 4 4 4 4 4 4 5 5 5 5 5 5 6 6 6 6 6 6 7 7 7 7 7 7 8 8 8 8 8 9 9 9 9 9 9
Levels: 1 2 3 4 5 6 7 8 9
gb
[1] tSac tSac tAceK tAceK cDMSO cDMSO tAceK tSac cDMSO tAceK cDMSO tSac cDMSO cDMSO tSac tSac tAceK tAceK tSac cDMSO tSac tAceK cDMSO tAceK tSac cDMSO tAceK cDMSO
[29] tAceK tSac cDMSO cDMSO tSac tAceK tSac tAceK tSac tAceK cDMSO cDMSO tAceK tSac tAceK tSac cDMSO tAceK tSac tSac cDMSO tAceK tSac tAceK cDMSO
Levels: cDMSO tAceK tSac
time
[1] 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 1hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 4hr 15m 15m 15m 15m 15m 15m
[43] 15m 15m 15m 15m 15m 15m 15m 15m 15m 15m 15m
Levels: 15m 1hr 4hr
metab2 <- data.frame(plate, lnY, gb, time)
fm1 <- lme(lnY ~ time*gb, random = ~1|plate, metab2, na.action = na.omit)
t1 <- contrast(fm1, a = list(gb = "cDMSO", time = "15m"), b = list(gb = "tAceK", time = "15m"))
t2 <- contrast(fm1, a = list(gb = "cDMSO", time = "15m"), b = list(gb = "tSac", time = "15m"))
I am doing similar contrasts at the 1hr and 4hr times. Result:
t1
lme model parameter contrast
  Contrast      S.E.      Lower     Upper      t df Pr(>|t|)
0.01466447 0.3880718 -0.7459424 0.7752713 0.0439        0.97
t2
lme model parameter contrast
 Contrast     S.E.      Lower    Upper    t df Pr(>|t|)
0.8007098 0.401809 0.01317859 1.588241 1.99 39   0.0533
but it doesn't work when my lnY is
lnY
[1] 14.08164 14.03683 15.23784 14.86681 15.69648 15.62681 15.38057 13.79152 15.59356 15.26301 15.49928 14.02714 15.54317 15.44776 14.51406 14.26436 14.76043 15.01506
[19] 13.75356 15.36528 13.86303 14.40074 15.39995 14.34945 14.32001 15.41146 14.43210 15.87487 14.31152 13.75980 15.44153 15.72775 13.83677 14.35888 14.08998 14.40057
[37] 15.25646 15.21430 15.21883 15.09338 15.24249 15.15223 15.19692 15.10101 15.16232 15.81154 15.30002 15.31443 15.25059 15.10284 15.38775 15.28618 15.38108
I am able to fit the model, i.e. I am getting my fm1, but then
t1
lme model parameter contrast
Error in `$<-.data.frame`(`*tmp*`, df, value = numeric(0)) : replacement has 0 rows, data has 1
[[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] function sum for array
I'm looking for a function that allows me to sum the elements of an array along a dimension that can be different from the classical ones (rows or columns). Let's suppose for example that: - A is an array with dimensions 2 x 3 x 4 - I want to compute B, a 2 x 3 matrix with elements equal to the sum of the corresponding elements across the 4 strata. I've tried to use apply(A,3,sum) but the result is a vector, not a matrix. Another solution is the less elegant
B = matrix(rep(0,6), ncol=3)
for (t in 1:4) B = B + A[ , , t]
Can anybody help? S -- --- Simone Salvadei Faculty of Economics Department of Financial and Economic Studies and Quantitative Methods University of Rome Tor Vergata e-mail: simone.salva...@uniroma2.it federico.belo...@uniroma2.it url: http://www.economia.uniroma2.it/phd/econometricsempiricaleconomics/ http://www.econometrics.it/ --- [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
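One way to get the 2 x 3 matrix (not shown in the post itself) is to keep the first two margins in apply(), so the summation runs across the third dimension only:

```r
A <- array(1:24, dim = c(2, 3, 4))
B <- apply(A, c(1, 2), sum)   # 2 x 3 matrix: sums across the 4 strata
## equivalent to the loop: A[,,1] + A[,,2] + A[,,3] + A[,,4]
```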
Re: [R] hierarchical clustering within a size limit
You can print out the nodes and their corresponding clusters into a file with this: write.table(hc, file = "hc_40clusters.csv", quote = FALSE, sep = " ") -- View this message in context: http://r.789695.n4.nabble.com/hierarchical-clustering-within-a-size-limit-tp3515354p4080551.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Time Series w/ Unequal Time Steps
Hi. I am new to R and actually have several questions related to this topic. A row in my data looks like the following: 418 12 6/21/2010 9:37:12 40.7219593 -73.9962579 1.3406345525960568 0.019682641058810173 In order, the columns are id, week, date, time, latitude, longitude, heading and displacement (no actual header though). I would like to read in the date, time, heading and displacement from a file. I would like to combine date and time into a single DateTime object. Then, I would like to do two things: (1) view a 3d scatterplot of DateTime, heading and displacement and (2) do an autoregression for heading indexed by DateTime. Thanks for any help. Regards, Keith -- View this message in context: http://r.789695.n4.nabble.com/Time-Series-w-Unequal-Time-Steps-tp4080562p4080562.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
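A hedged sketch of the DateTime step (the file and column names are guesses from the sample row, not from any reply in the thread):

```r
d <- read.table("tracks.txt")   # hypothetical file name, no header row
names(d) <- c("id", "week", "date", "time", "lat", "lon",
              "heading", "displacement")
## Combine date and time into one POSIXct column, matching "6/21/2010 9:37:12".
d$DateTime <- as.POSIXct(paste(d$date, d$time),
                         format = "%m/%d/%Y %H:%M:%S")
```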
[R] R help
Hello: I have some trouble making a prediction from an AR(p) model. After I have the AR(p) model fitted, I want to use a new data set to make predictions. But I get the error: Error in newdata - object$x.mean : non-numeric argument to binary operator. A small version of my original data looks like:
X1 X2 X3 X4
40813.65 1 10 41.86755
40813.65 1 8 41.86755
40813.66 1 8 41.86755
40813.66 1 8 41.86755
40813.66 1 8 41.86755
40813.67 1 8 41.86755
40813.67 1 6 41.86755
40813.67 1 6 41.86755
40813.68 1 6 41.86755
40813.68 1 6 41.86755
40813.73 1 4 41.86755
Sh <- read.table("C:\\ Desktop\\Sh.txt", sep = ",", header = TRUE)
model <- ar.yw(Sh[,3])
My new data looks like:
X3
10 8 8 8 8 8 6 6 6 6 4 4 4 4 4 4 4 4 4 4 5 5
me <- read.table("C:\\Users\\351240\\Desktop\\me.txt", sep = ",", header = TRUE)
predict(model, me, n.ahead = 1)
Then I get the error: Error in newdata - object$x.mean : non-numeric argument to binary operator. Can someone help me please? Thanks, Ana Lucia [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
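A hedged guess at the fix (no reply appears in the digest): predict.ar() subtracts object$x.mean from newdata, so newdata must be a numeric series, but me is a one-column data frame. Passing the column itself should avoid the "non-numeric argument to binary operator" error:

```r
## me$X3 is the numeric vector inside the data frame read above.
predict(model, me$X3, n.ahead = 1)
```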
Re: [R] Difference between two time series
Hello Michael, Thanks again for your reply. Actually, I am working with wind data. I have some sample data for the actual load.
scan("/home/sam/Desktop/tt.dat") -> tt  ## This is the input for the actual output of the generation
t = ts(tt, start=8, end=24, frequency=1)
I have another random sequence for the Generator Dispatch:
scan("/home/sam/Desktop/ss.dat") -> ss  ## Input for the Generator Dispatch
s = ts(ss, start=10, end=22, frequency=1)
What I want to do now is take the max and min difference of these two sequences (t and s) over a fixed time interval. Something like:
X = max(t-s, start=10, end=12)  # I have an error here; I want the difference between the two over an interval
Y = min(t-s, start=10, end=12)
Then predict the max and min error between time t and t+1 on the basis of the information that I have at t-1. Thanks again. Sam -- View this message in context: http://r.789695.n4.nabble.com/Difference-between-two-time-series-tp819843p4080672.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
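One way (my suggestion, not from the thread) to take extrema over a fixed interval is to restrict each series with window() first; ordinary max() and min() then apply, and the two windows are guaranteed to align:

```r
## t and s are the ts objects defined above; window() clips both to [10, 12].
X <- max(window(t, start = 10, end = 12) - window(s, start = 10, end = 12))
Y <- min(window(t, start = 10, end = 12) - window(s, start = 10, end = 12))
```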
Re: [R] White lines on persp plots in pdf format
On Nov 17, 2011, at 10:13 AM, Miguel Lacerda wrote: Hi, I am using the persp function to plot 3D surfaces, but the plots have little white lines when I print them to a pdf file (visible in Acrobat, Foxit, Evince, Xpdf and Gimp). This does not happen when I create png or tiff images. Here is some sample code:
pdf("test.pdf")
x <- seq(0, 1, length = 101)
f <- dnorm(x, 0, 0.25)
z <- c()
for (i in 1:100) z <- cbind(z, f)
persp(z, col = "red", theta = 40, phi = 10, shade = 1.5, d = 4, border = NA)
dev.off()
The resulting graph is attached. Anyone know how to get rid of the little white lines? Thanks! Miguel See: http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-are-there-unwanted-borders and the Note section of ?pdf. HTH, Marc Schwartz __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] set random numbers seed for different cpu's
http://search.dilbert.com/comic/Random%20Number%20Generator In all seriousness, you could set the seed differently on each machine after putting jobs through Torque (i.e., as part of the batch script, maybe using some piece of hardware id you can get through system() somehow or other: possibly network id?) and you're very,very,very,very likely to get different results. Michael On Thu, Nov 17, 2011 at 9:30 AM, fantomas tomas.iesman...@gmail.com wrote: Hi I'm running the same R script (throuth linux shell) of several cpu's. This R program uses random numbers and the result should be different every time. But if put jobs (through Torque) for several cpu's I get the same result. As a resealt my program saves numbers in file with randomly generated names. works like a charm on one cpu, but I get the same result from different cpu's. So my question is, how can I resolve this? How to set pseudo random number seed so that different cpu's would produce different results? Thank you in advance. -- View this message in context: http://r.789695.n4.nabble.com/set-random-numbers-seed-for-different-cpu-s-tp4080165p4080165.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] set random numbers seed for different cpu's
Sorry -- that came off as very muddled. What I meant to say: To make it (almost) certain you will get different results on each machine, you can reset the PRNG seed on each machine in some way unique to that machine. What immediately came to mind was the IP address, which you can access with something like this:
x <- system("curl -s http://checkip.dyndns.org | sed 's/[a-zA-Z/ :]//g'", intern = TRUE)  # Note you might have to tweak it for your OS
Michael On Thu, Nov 17, 2011 at 3:31 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote: http://search.dilbert.com/comic/Random%20Number%20Generator In all seriousness, you could set the seed differently on each machine after putting jobs through Torque (i.e., as part of the batch script, maybe using some piece of hardware id you can get through system() somehow or other: possibly network id?) and you're very, very, very, very likely to get different results. Michael On Thu, Nov 17, 2011 at 9:30 AM, fantomas tomas.iesman...@gmail.com wrote: Hi I'm running the same R script (throuth linux shell) of several cpu's. This R program uses random numbers and the result should be different every time. But if put jobs (through Torque) for several cpu's I get the same result. As a resealt my program saves numbers in file with randomly generated names. works like a charm on one cpu, but I get the same result from different cpu's. So my question is, how can I resolve this? How to set pseudo random number seed so that different cpu's would produce different results? Thank you in advance. -- View this message in context: http://r.789695.n4.nabble.com/set-random-numbers-seed-for-different-cpu-s-tp4080165p4080165.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
__ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
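A simpler alternative to the IP-address trick (a sketch of my own, not from the thread): seed each worker from the wall clock plus its own process id, so jobs launched at the same instant on the same node still diverge.

```r
## Each Torque job gets a distinct pid, so simultaneous starts differ too.
set.seed(as.integer(Sys.time()) %% 100000L + Sys.getpid())
```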
Re: [R] Pairwise correlation
I can't see how it's stored like that once the email servers garble it up. Use dput() to create a plain text representation and paste that back in. Thanks, Michael On Thu, Nov 17, 2011 at 9:37 AM, muzz56 musah...@gmail.com wrote: Hi Michael, Here is a sample of the data. Gene Array1 Array2 Array3 Array4 Array5 Array6 Array7 Array8 Array9 Array10 Array11 Fth1 26016.01 23134.66 17445.71 39856.04 27245.45 23622.98 37887.75 49857.46 25864.73 21852.51 29198.4 B2m 7573.64 7768.52 6608.24 8571.65 6380.78 6242.76 6903.92 7330.63 7256.18 5678.21 10937.05 Tmsb4x 6192.44 4277.22 5024.59 4851.51 3062.55 4562.43 7948.1 5018.58 3200.17 2855.77 6139.23 H2-D1 3141.41 3986.06 3328.62 4726.6 3589.89 2885.95 7509.88 5257.62 4742.26 3431.33 5300.72 Prdx5 3935.7 3938.9 3401.68 4193.14 4028.95 3438.19 6640.15 5486.61 4424.57 3368.83 5265.92 I want to retain the gene names in the data. What you've proposed will take them out and I'll have to append them back to the results after the cor() On 17 November 2011 09:33, Michael Weylandt [via R] ml-node+s789695n4080177...@n4.nabble.com wrote: I think something like this should do it, but I can't test without data: rownames(mydata) <- mydata[,1] # Put the elements in the first column as rownames mydata <- mydata[,-1] # drop the things that are now rownames Michael On Thu, Nov 17, 2011 at 9:23 AM, Musa Hassan [hidden email]http://user/SendEmail.jtp?type=nodenode=4080177i=0 wrote: Hi Michael, Thanks for the response. I have noticed that the error occurred during my data read. It appears that the rownames (which when the data is transposed become my colnames) were converted to numbers instead of strings as they should be. The original header names don't change, just the rownames. I have to figure out how to import the data and have the strings not converted. 
Right now I am using:

mydata <- read.csv("mydata.csv", header=TRUE, stringsAsFactors=FALSE)

then, to convert the data frame to a matrix:

mydata <- data.matrix(mydata)

Then I just do the correlation as Peter suggested:

expression <- cor(t(expression))

Thanks.

On 17 November 2011 08:51, R. Michael Weylandt [hidden email] wrote:

On Wed, Nov 16, 2011 at 11:22 PM, muzz56 [hidden email] wrote:

Thanks to everyone who replied to my post, I finally got it to work. I am however not sure how well it worked since it ran so quickly, but it seems like I have a 2000 x 2000 data set.

Behold the great and mighty power that is R! Don't worry -- on a decent machine the correlation of a 2k x 2k data set should be pretty fast. (It's about 9 seconds on my old-ish laptop with a bunch of other junk running.)

My follow-up questions would be: how do I get only pairs with, say, a certain Pearson correlation value? Additionally, it seems like my output didn't retain the headers but instead replaced them with numbers, making it hard to know which gene pairs correlate.

This is a little worrisome: R carries column names through cor(), so this would suggest you weren't using them. Were your headers listed as part of your data (instead of being names)? If so, they would have been taken as data. Take a look at dimnames(NAMEOFDATA) -- if your headers aren't there, then they are being treated as data instead of names. If they are, can you provide some reproducible code and we can debug more fully. The easiest way to send data is to use the dput() function to get a copy-pasteable plain text representation. It would also be great if you could restrict it to a subset of your data rather than the full 4M data points, but if that's hard to do, don't worry.
You should have expected behavior like:

X <- matrix(1:9, 3)
colnames(X) <- c("A", "B", "C")
cor(X)  # Prints with labels

Michael

On 16 November 2011 17:11, Nordlund, Dan (DSHS/RDA) [via R] [hidden email] wrote:

-----Original Message-----
From: [hidden email] [mailto:r-help-bounces@r-project.org] On Behalf Of muzz56
Sent: Wednesday, November 16, 2011 12:28 PM
To: [hidden email]
Subject: Re: [R] Pairwise correlation

Thanks Peter. I tried this after reading in the csv (read.csv) and converting the data to a matrix (as.matrix). But when I tried the correlation, I keep getting the error ('x' must be numeric), yet when I view the data, it's numeric.

What does R tell you if you execute the following?

str(x)

Just because the data looks like it is numeric when it prints doesn't mean it is.

Dan

Daniel J. Nordlund
Washington State Department of Social and Health Services
Planning, Performance, and Accountability Research
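Tying the thread together: cor() carries dimnames through, so keeping the gene names as rownames (Michael's suggestion) is all that is needed. A minimal, self-contained sketch, using the sample values quoted above (only the first three arrays, purely for brevity):

```r
## Sketch: when gene names are rownames rather than a data column,
## cor() keeps them as dimnames. Values are the first three columns
## of the sample rows quoted in the thread.
expr <- matrix(c(26016.01, 23134.66, 17445.71,
                  7573.64,  7768.52,  6608.24,
                  6192.44,  4277.22,  5024.59),
               nrow = 3, byrow = TRUE,
               dimnames = list(c("Fth1", "B2m", "Tmsb4x"),
                               c("Array1", "Array2", "Array3")))
gene.cor <- cor(t(expr))   # gene-by-gene correlation matrix, 3 x 3
rownames(gene.cor)         # gene labels retained through cor()
```

When reading from a file, the equivalent is read.csv(..., row.names = 1) so the first column becomes rownames instead of data.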
Re: [R] function sum for array
It might not be as general as you have in mind, but this works:

X <- array(1:24, c(2,3,4))
rowSums(X, dims = 2)

Combined with aperm() it's pretty powerful.

Michael

On Thu, Nov 17, 2011 at 11:24 AM, Simone Salvadei simone.salva...@gmail.com wrote:

I'm looking for a function that allows one to sum the elements of an array along a dimension that can be different from the classical ones (rows or columns). Let's suppose, for example, that:
- A is an array with dimensions 2 x 3 x 4
- I want to compute B, a 2 x 3 matrix with elements equal to the sum of the corresponding elements on each of the 4 strata.

I've tried to use apply(A,3,sum) but the result is a vector, not a matrix. Another solution is the less elegant:

B <- matrix(rep(0,6), ncol=3)
for(t in 1:4) B <- B + A[ , , t]

May anybody help?

S

--
Simone Salvadei
Faculty of Economics
Department of Financial and Economic Studies and Quantitative Methods
University of Rome Tor Vergata
e-mail: simone.salva...@uniroma2.it, federico.belo...@uniroma2.it
url: http://www.economia.uniroma2.it/phd/econometricsempiricaleconomics/
     http://www.econometrics.it/
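A quick check, on the same 2 x 3 x 4 shape as the question, that the rowSums(), apply() and aperm() routes agree:

```r
## rowSums(A, dims = 2) sums over every dimension after the second,
## i.e. over the 4 strata; apply() over margins c(1, 2) gives the same
## values (as integer rather than double).
A <- array(1:24, c(2, 3, 4))
B.rowsums <- rowSums(A, dims = 2)      # 2 x 3 matrix
B.apply   <- apply(A, c(1, 2), sum)    # same values
## Summing over a different dimension: move it last with aperm(),
## then rowSums again -- e.g. a 3 x 4 matrix of sums over dimension 1:
B.dim1 <- rowSums(aperm(A, c(2, 3, 1)), dims = 2)
```

Note that apply(A, 3, sum) gives a length-4 vector of per-stratum totals, which is why Simone saw a vector; the margin to keep is c(1, 2), not 3.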
Re: [R] how to include a factor or class Variable
Please post this to r-sig-mixed-models instead. Or, better yet, consult your local statistician, as your question indicates a profound lack of understanding that may require more back-and-forth discussion than can occur on an internet help site.

-- Bert

On Thu, Nov 17, 2011 at 5:36 AM, arunkumar akpbond...@gmail.com wrote:

Hi. How do I include a factor or class variable in the fixed effects of the lmer function? When I include it, it throws an error. Please help.

My code:

data <- read.delim("C:/TestData/data.txt")
Mon <- as.factor(data$Month)
lmerform <- Y ~ X2 + X3 + Month:Mon + (1|State) + (1 + X5|State)
lmerfit <- lmer(formula=lmerform, data=data)
summary(lmerfit)

My data:

State Year Month    Y    X2   X3   X4   X5   X6
GA    1960     1 27.8 397.5 42.2 50.7 78.3 65.8
FA    1960     2 29.9 413.3 38.1 52   79.2 66.9
GA    1961     3 29.8 439.2 40.3 54   79.2 67.8
FA    1961     4 30.8 459.7 39.5 55.3 79.2 69.6
GA    1962     1 31.2 492.9 37.3 54.7 77.4 68.7
FA    1962     2 33.3 528.6 38.1 63.7 80.2 73.6
GA    1963     3 35.6 560.3 39.3 69.8 80.4 76.3

-- View this message in context: http://r.789695.n4.nabble.com/how-to-include-a-factor-or-class-Variable-tp4079991p4079991.html
Sent from the R help mailing list archive at Nabble.com.

--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
Re: [R] Combining data
There is no single command to do all of what you want. Read the posting guide for advice on how to ask questions that are more likely to receive helpful answers. The mean() function is a command for combining a certain number of data into their average value. The write.csv() function will create a new csv file. The aggregate() function may help.

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 11/17/11 7:37 AM, Nasrin Pak astronas...@gmail.com wrote:

Hi all; It seemed to be easy at first, but I didn't manage to find the answer through a Google search. I have a set of data for every second of the experiment, but I don't need such a high resolution for my analysis. I want to replace every 30 rows of my data with their average value, and then save the new data set in a new csv file to be able to have a smaller Excel data sheet. What is the command for combining a certain number of data into their average value? Thank you

--
Nasrin Pak
MSc Student in Environmental Physics
University of Calgary
Re: [R] Combining data
Well, for "What is the command for combining a certain number of data into their average value?", one way would be (calling the data vector x):

colMeans(matrix(x[seq_len(30 * floor(length(x)/30))], nrow=30))

Note that this will leave out the mean of any values with indices beyond the largest multiple of 30 that is less than or equal to the length of x. There are probably 87 other ways to do this, many of which might be better, simpler, faster, or slicker.

-- Bert

On Thu, Nov 17, 2011 at 1:15 PM, MacQueen, Don macque...@llnl.gov wrote:

There is no single command to do all of what you want. Read the posting guide for advice on how to ask questions that are more likely to receive helpful answers. The mean() function is a command for combining a certain number of data into their average value. The write.csv() function will create a new csv file. The aggregate() function may help.

-Don

--
Don MacQueen
Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 11/17/11 7:37 AM, Nasrin Pak astronas...@gmail.com wrote:

Hi all; It seemed to be easy at first, but I didn't manage to find the answer through a Google search. I have a set of data for every second of the experiment, but I don't need such a high resolution for my analysis. I want to replace every 30 rows of my data with their average value, and then save the new data set in a new csv file to be able to have a smaller Excel data sheet. What is the command for combining a certain number of data into their average value? Thank you

--
Nasrin Pak
MSc Student in Environmental Physics
University of Calgary
--
Bert Gunter
Genentech Nonclinical Biostatistics
Internal Contact Info:
Phone: 467-7374
Website: http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
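Bert's trick is easiest to see on a small vector, averaging in blocks of 3 instead of 30:

```r
## Reshape the first 9 elements into a 3-row matrix so each column is
## one block of 3; colMeans() then gives the block averages. The
## leftover x[10] is dropped, as Bert notes.
x <- 1:10
block <- 3
n.used <- block * floor(length(x) / block)   # 9: largest full-block length
block.means <- colMeans(matrix(x[seq_len(n.used)], nrow = block))
block.means                                  # 2 5 8
```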
Re: [R] Vectorizing for weighted distance
I'm not quite sure of what you mean by not worry if it's 1d R matrices. X1 and X2 are both n by d matrices and W is d by d. Thanks for the help though. Any other ideas? Thanks Sachin On Friday, November 18, 2011, R. Michael Weylandt michael.weyla...@gmail.com wrote: The fastest is probably to just implement the matrix calculation directly in R with the %*% operator. (X1-X2) %*% W %*% (X1-X2) You don't need to worry about the transposing if you are passing R vectors X1,X2. If they are 1-d matrices, you might need to. Michael On Thu, Nov 17, 2011 at 1:30 AM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote: Hi All, I am trying to convert the following piece of matlab code to R: XX1 = sum(w(:,ones(1,N1)).*X1.*X1,1); #square the elements of X1, weight it and repeat this vector N1 times XX2 = sum(w(:,ones(1,N2)).*X2.*X2,1); #square the elements of X2, weigh and repeat this vector N2 times X1X2 = (w(:,ones(1,N1)).*X1)'*X2; #get the weighted 'covariance' term XX1T = XX1'; #transpose z = XX1T(:,ones(1,N2)) + XX2(ones(1,N1),:) - 2*X1X2;#get the squared weighted distance which is basically doing: z=(X1-X2)' W (X1-X2) What would the best way (for SPEED) to do this? or is vectorizing as above the best? Any hints, suggestions? Thanks, Sachin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] merging corpora and metadata
Greetings! I lose all my metadata after concatenating corpora. This is an example of what happens:

meta(corpus.1)
   MetaID cid fid selfirst selend fname
1       0   1  11     2169   2518 WCPD-2001-01-29-Pg217.scrb
2       0   1  14     9189   9702 WCPD-2003-01-13-Pg39.scrb
3       0   1  14     2109   2577 WCPD-2003-01-13-Pg39.scrb
...
17      0   1 114    17863  18256 WCPD-2007-04-30-Pg515.scrb

meta(corpus.2)
   MetaID cid fid selfirst selend fname
1       0   2   2    11016  11600 DCPD-200900595.scrb
2       0   2   6    19510  20098 DCPD-201000636.scrb
3       0   2   6    23935  24573 DCPD-201000636.scrb
...
94      0   2 127    16225  17128 WCPD-2009-01-12-Pg22-3.scrb

tot.corpus <- c(corpus.1, corpus.2)
meta(tot.corpus)
    MetaID
1        0
2        0
3        0
...
111      0

This is from the structure of corpus.1:

 ..$ MetaData:List of 2
 .. ..$ create_date: POSIXlt[1:1], format: "2011-11-17 21:09:57"
 .. ..$ creator    : chr "henk"
 ..$ Children: NULL
 ..- attr(*, "class")= chr "MetaDataNode"
 - attr(*, "DMetaData")='data.frame': 17 obs. of 6 variables:
  ..$ MetaID  : num [1:17] 0 0 0 0 0 0 0 0 0 0 ...
  ..$ cid     : int [1:17] 1 1 1 1 1 1 1 1 1 1 ...
  ..$ fid     : int [1:17] 11 14 14 17 46 80 80 80 91 91 ...
  ..$ selfirst: num [1:17] 2169 9189 2109 8315 9439 ...
  ..$ selend  : num [1:17] 2518 9702 2577 8881 10102 ...
  ..$ fname   : chr [1:17] "WCPD-2001-01-29-Pg217.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2004-05-17-Pg856.scrb" ...
 - attr(*, "class")= chr [1:3] "VCorpus" "Corpus" "list"

Any idea on what I could do to keep the metadata in the merged corpus?

Thanks, Henri-Paul

--
Henri-Paul Indiogine
Curriculum & Instruction
Texas A&M University
TutorFind Learning Centre
Email: hindiog...@gmail.com
Skype: hindiogine
Website: http://people.cehd.tamu.edu/~sindiogine
Re: [R] Vectorizing for weighted distance
I fail to see why you would need another idea: you asked how to multiply matrices efficiently, and I told you how to multiply matrices efficiently. If you want to calculate (X1-X2)' times W times (X1-X2), then simply do so:

X1 <- matrix(1:6, 3)
X2 <- matrix(7:12, 3)
W <- matrix(runif(9), 3)
t(X1-X2) %*% W %*% (X1-X2)

which gives

         [,1]     [,2]
[1,] 142.7789 142.7789
[2,] 142.7789 142.7789

You could squeeze out one iota more of speed with

crossprod(X1-X2, W) %*% (X1-X2)

to get the same result, but unless you are doing massive-scale linear processing, I'm not sure it's worth the loss of clarity. I was only giving you a heads up on the sometimes confusing difference between matrix multiplication in MATLAB and in R, by which a vector is not a 1-d matrix and so does not require explicit transposition.

Michael

On Thu, Nov 17, 2011 at 4:35 PM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote:

I'm not quite sure what you mean by "not worry if it's 1-d R matrices". X1 and X2 are both n by d matrices and W is d by d. Thanks for the help though. Any other ideas?

Thanks, Sachin

On Friday, November 18, 2011, R. Michael Weylandt michael.weyla...@gmail.com wrote:

The fastest is probably to just implement the matrix calculation directly in R with the %*% operator:

(X1-X2) %*% W %*% (X1-X2)

You don't need to worry about the transposing if you are passing R vectors X1, X2. If they are 1-d matrices, you might need to.
Michael On Thu, Nov 17, 2011 at 1:30 AM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote: Hi All, I am trying to convert the following piece of matlab code to R: XX1 = sum(w(:,ones(1,N1)).*X1.*X1,1); #square the elements of X1, weight it and repeat this vector N1 times XX2 = sum(w(:,ones(1,N2)).*X2.*X2,1); #square the elements of X2, weigh and repeat this vector N2 times X1X2 = (w(:,ones(1,N1)).*X1)'*X2; #get the weighted 'covariance' term XX1T = XX1'; #transpose z = XX1T(:,ones(1,N2)) + XX2(ones(1,N1),:) - 2*X1X2; #get the squared weighted distance which is basically doing: z=(X1-X2)' W (X1-X2) What would the best way (for SPEED) to do this? or is vectorizing as above the best? Any hints, suggestions? Thanks, Sachin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Spatial Statistics using R
These might get you started:

"Analysing spatial point patterns in R" by Adrian Baddeley, CSIRO and University of Western Australia
http://www.csiro.au/files/files/p10ib.pdf

"Spatial Regression Analysis in R: A Workbook" by Luc Anselin, Spatial Analysis Laboratory
http://geodacenter.asu.edu/system/files/rex1.pdf

--
David L Carlson
Associate Professor of Anthropology
Texas A&M University
College Station, TX 77843-4352

-----Original Message-----
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of vioravis
Sent: Thursday, November 17, 2011 12:29 AM
To: r-help@r-project.org
Subject: [R] Spatial Statistics using R

I am looking for online courses to learn spatial statistics using R. Statistics.com is offering an online course in December on the same topic, but that schedule doesn't suit mine. Are there any other similar ways to learn spatial statistics using R? Can someone please advise? Thank you.

Ravi

-- View this message in context: http://r.789695.n4.nabble.com/Spatial-Statistics-using-R-tp4079092p4079092.html
Sent from the R help mailing list archive at Nabble.com.
Re: [R] Vectorizing for weighted distance
Hi Michael,

Thanks for that. X1 and X2 are typically 1000 by 3 matrices, and I am hoping to scale up to much larger dimensions (say 20,000 by 3). I do appreciate your help, and it seems like this is the best way to do this; I was just wondering if I could squeeze out just a bit more performance, that's all. Anyway, thanks again, much appreciated.

Thanks, Sachin

On Fri, Nov 18, 2011 at 9:15 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote:

I fail to see why you would need another idea: you asked how to multiply matrices efficiently, and I told you how to multiply matrices efficiently. If you want to calculate (X1-X2)' times W times (X1-X2), then simply do so:

X1 <- matrix(1:6, 3)
X2 <- matrix(7:12, 3)
W <- matrix(runif(9), 3)
t(X1-X2) %*% W %*% (X1-X2)

which gives

         [,1]     [,2]
[1,] 142.7789 142.7789
[2,] 142.7789 142.7789

You could squeeze out one iota more of speed with crossprod(X1-X2, W) %*% (X1-X2) to get the same result, but unless you are doing massive-scale linear processing, I'm not sure it's worth the loss of clarity. I was only giving you a heads up on the sometimes confusing difference between matrix multiplication in MATLAB and in R, by which a vector is not a 1-d matrix and so does not require explicit transposition.

Michael

On Thu, Nov 17, 2011 at 4:35 PM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote:

I'm not quite sure what you mean by "not worry if it's 1-d R matrices". X1 and X2 are both n by d matrices and W is d by d. Thanks for the help though. Any other ideas?

Thanks, Sachin

On Friday, November 18, 2011, R. Michael Weylandt michael.weyla...@gmail.com wrote:

The fastest is probably to just implement the matrix calculation directly in R with the %*% operator:

(X1-X2) %*% W %*% (X1-X2)

You don't need to worry about the transposing if you are passing R vectors X1, X2. If they are 1-d matrices, you might need to.
Michael On Thu, Nov 17, 2011 at 1:30 AM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote: Hi All, I am trying to convert the following piece of matlab code to R: XX1 = sum(w(:,ones(1,N1)).*X1.*X1,1); #square the elements of X1, weight it and repeat this vector N1 times XX2 = sum(w(:,ones(1,N2)).*X2.*X2,1); #square the elements of X2, weigh and repeat this vector N2 times X1X2 = (w(:,ones(1,N1)).*X1)'*X2; #get the weighted 'covariance' term XX1T = XX1'; #transpose z = XX1T(:,ones(1,N2)) + XX2(ones(1,N1),:) - 2*X1X2;#get the squared weighted distance which is basically doing: z=(X1-X2)' W (X1-X2) What would the best way (for SPEED) to do this? or is vectorizing as above the best? Any hints, suggestions? Thanks, Sachin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Vectorizing for weighted distance
I'm starting to get a clearer idea of what you mean: there are two (possibly three) routes you can go:

1) If your matrices are sparse (mostly zero), there's some specialized work on multiplying them quickly.

2) You can look at the RcppArmadillo package, which interfaces to a very high quality linear algebra backend. I think this one is likely to give a very nice speedup without requiring too much additional work.

3) (This one is the most technically difficult, but it can be pretty powerful if done correctly.) You can recompile R using a BLAS (Basic Linear Algebra Subprograms) library that's optimized for your machine, rather than the generic one that most computers come with. Something like this: http://math-atlas.sourceforge.net/

Michael

On Thu, Nov 17, 2011 at 5:51 PM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote:

Hi Michael,

Thanks for that. X1 and X2 are typically 1000 by 3 matrices, and I am hoping to scale up to much larger dimensions (say 20,000 by 3). I do appreciate your help, and it seems like this is the best way to do this; I was just wondering if I could squeeze out just a bit more performance, that's all. Anyway, thanks again, much appreciated.

Thanks, Sachin

On Fri, Nov 18, 2011 at 9:15 AM, R. Michael Weylandt michael.weyla...@gmail.com wrote:

I fail to see why you would need another idea: you asked how to multiply matrices efficiently, and I told you how to multiply matrices efficiently. If you want to calculate (X1-X2)' times W times (X1-X2), then simply do so:

X1 <- matrix(1:6, 3)
X2 <- matrix(7:12, 3)
W <- matrix(runif(9), 3)
t(X1-X2) %*% W %*% (X1-X2)

which gives

         [,1]     [,2]
[1,] 142.7789 142.7789
[2,] 142.7789 142.7789

You could squeeze out one iota more of speed with crossprod(X1-X2, W) %*% (X1-X2) to get the same result, but unless you are doing massive scale linear processing, I'm not sure it's worth the loss of clarity.
I was only giving you a heads up on the sometimes confusing difference between matrix multiplication in MATLAB and in R by which a vector is not a 1d matrix and so does not require explicit transposition. Michael On Thu, Nov 17, 2011 at 4:35 PM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote: I'm not quite sure of what you mean by not worry if it's 1d R matrices. X1 and X2 are both n by d matrices and W is d by d. Thanks for the help though. Any other ideas? Thanks Sachin On Friday, November 18, 2011, R. Michael Weylandt michael.weyla...@gmail.com wrote: The fastest is probably to just implement the matrix calculation directly in R with the %*% operator. (X1-X2) %*% W %*% (X1-X2) You don't need to worry about the transposing if you are passing R vectors X1,X2. If they are 1-d matrices, you might need to. Michael On Thu, Nov 17, 2011 at 1:30 AM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote: Hi All, I am trying to convert the following piece of matlab code to R: XX1 = sum(w(:,ones(1,N1)).*X1.*X1,1); #square the elements of X1, weight it and repeat this vector N1 times XX2 = sum(w(:,ones(1,N2)).*X2.*X2,1); #square the elements of X2, weigh and repeat this vector N2 times X1X2 = (w(:,ones(1,N1)).*X1)'*X2; #get the weighted 'covariance' term XX1T = XX1'; #transpose z = XX1T(:,ones(1,N2)) + XX2(ones(1,N1),:) - 2*X1X2; #get the squared weighted distance which is basically doing: z=(X1-X2)' W (X1-X2) What would the best way (for SPEED) to do this? or is vectorizing as above the best? Any hints, suggestions? Thanks, Sachin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. 
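To make the equivalence discussed in this thread concrete, here is a small check; the matrices are random, so the point is only that the two forms agree:

```r
## t(D) %*% W %*% D versus crossprod(D, W) %*% D -- crossprod(A, B)
## computes t(A) %*% B without materialising the transpose.
set.seed(1)
X1 <- matrix(rnorm(12), nrow = 4)   # 4 x 3
X2 <- matrix(rnorm(12), nrow = 4)
W  <- matrix(runif(16), nrow = 4)   # conformable with nrow(X1)
D  <- X1 - X2
z1 <- t(D) %*% W %*% D
z2 <- crossprod(D, W) %*% D
all.equal(z1, z2)                   # the two forms give the same 3 x 3 result
```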
Re: [R] return only pairwise correlations greater than given value
This is probably not the prettiest or most efficient function ever, but this seems to do what I wanted:

spec.cor <- function(dat, r, ...){
  require(reshape)
  d1 <- data.frame(cor(dat))
  d2 <- melt(d1)
  d2[,3] <- rep(rownames(d1), nrow(d2)/length(unique(d2[,1])))
  d2 <- d2[, c("variable", "V3", "value")]
  colnames(d2) <- c("V1", "V2", "value")
  d2 <- d2[with(d2, which(V1 != V2, arr.ind=TRUE)), ]
  d2 <- d2[which(d2[,3] >= r | d2[,3] <= -r, arr.ind=TRUE), ]
  d2[,1:2] <- t(apply(d2[,1:2], MARGIN=1, function(x) sort(x)))
  d2 <- unique(d2)
  return(d2)
}

data(mtcars)
spec.cor(mtcars[,2:5], .6)

Using  as id variables
     V1   V2      value
2   cyl disp  0.9020329
3   cyl   hp  0.8324475
4   cyl drat -0.6999381
7  disp   hp  0.7909486
8  disp drat -0.7102139

I'm not sure how to make melt() quit giving the "Using ... as id variables" message, but I don't really care either.

B77S wrote:

Thanks Michael, I just started on the following code (below), and realized I should ask, as this likely exists already. Basically what I'd like is for the function to return (basically) what you just suggested, plus the names of the two variables (I suppose pasted together would be good). I hope that is clear; obviously I didn't get so far as to add the names to the output.

sig.cor <- function(dat, r, ...){
  cv2 <- data.frame(cor(dat))
  var.names <- rownames(cv2)
  list.cv2 <- which(cv2 >= r | cv2 <= -r, arr.ind=TRUE)
  cor.r <- cv2[list.cv2[which(list.cv2[,"row"] != list.cv2[,"col"]), ]]
  cor.names <- var.names[list.cv2[which(list.cv2[,"row"] != list.cv2[,"col"]), ]]
  return(cor.r)
}

data(mtcars)
sig.cor(mtcars[,2:5], .90)
# [1] 0.9020329 0.9020329
# Ideally this would look like this: cyl-disp 0.9020329

Michael Weylandt wrote:

What exactly do you mean "returns them"? More generally, I suppose, what do you have in mind to do with this? You could do something like this:

BigCorrelation <- function(X){
  return(which(abs(cor(X)) > 0.9, arr.ind = TRUE))
}

but it hardly seems worth its own function call.
On Thu, Nov 17, 2011 at 12:42 AM, B77S lt;bps0002@gt; wrote: Hello, I would like to find out if a function already exists that returns only pairwise correlations above/below a certain threshold (e.g, -.90, .90) Thank you. -- View this message in context: http://r.789695.n4.nabble.com/return-only-pairwise-correlations-greater-than-given-value-tp4079028p4079028.html Sent from the R help mailing list archive at Nabble.com. __ R-help@ mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@ mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://r.789695.n4.nabble.com/return-only-pairwise-correlations-greater-than-given-value-tp4079028p4081534.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] return only pairwise correlations greater than given value
Hi Brad,

You do not really need to reshape the correlation matrix. This seems to do what you want:

spec.cor <- function(dat, r, ...) {
  x <- cor(dat, ...)
  x[upper.tri(x, TRUE)] <- NA
  i <- which(abs(x) >= r, arr.ind = TRUE)
  data.frame(matrix(colnames(x)[as.vector(i)], ncol = 2), value = x[i])
}

spec.cor(mtcars[, 2:5], .6)

Cheers, Josh

On Wed, Nov 16, 2011 at 9:58 PM, B77S bps0...@auburn.edu wrote:

Thanks Michael, I just started on the following code (below), and realized I should ask, as this might exist. Basically what I'd like is for the function to return (basically) what you just suggested, plus the names of the two variables (I suppose pasted together would be good). I hope that is clear.

sig.cor <- function(dat, r, ...){
  cv2 <- data.frame(cor(dat))
  var.names <- rownames(cv2)
  list.cv2 <- which(cv2 >= r | cv2 <= -r, arr.ind=TRUE)
  cor.r <- cv2[list.cv2[which(list.cv2[,"row"] != list.cv2[,"col"]), ]]
  cor.names <- var.names[list.cv2[which(list.cv2[,"row"] != list.cv2[,"col"]), ]]
  return(cor.r)
}

Michael Weylandt wrote:

What exactly do you mean "returns them"? More generally, I suppose, what do you have in mind to do with this? You could do something like this:

BigCorrelation <- function(X){
  return(which(abs(cor(X)) > 0.9, arr.ind = TRUE))
}

but it hardly seems worth its own function call.

On Thu, Nov 17, 2011 at 12:42 AM, B77S <bps0002@> wrote:

Hello, I would like to find out if a function already exists that returns only pairwise correlations above/below a certain threshold (e.g., -.90, .90). Thank you.

-- View this message in context: http://r.789695.n4.nabble.com/return-only-pairwise-correlations-greater-than-given-value-tp4079028p4079028.html
Sent from the R help mailing list archive at Nabble.com.
-- View this message in context: http://r.789695.n4.nabble.com/return-only-pairwise-correlations-greater-than-given-value-tp4079028p4079044.html
Sent from the R help mailing list archive at Nabble.com.

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, ATS Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
Re: [R] Combining data
On Nov 17, 2011, at 10:37 AM, Nasrin Pak wrote:

Hi all; It seemed to be easy at first, but I didn't manage to find the answer through a Google search. I have a set of data for every second of the experiment, but I don't need such a high resolution for my analysis. I want to replace every 30 rows of my data with their average value, and then save the new data set in a new csv file to be able to have a smaller Excel data sheet. What is the command for combining a certain number of data into their average value?

This aggregates mean values in groups of ten:

aggregate(data.frame(a=rnorm(100)), list(rep(1:10, each=10)), FUN=mean)

   Group.1           a
1        1 -0.59492893
2        2  0.20087525
3        3 -0.06310919
4        4 -0.60778424
5        5 -0.01435818
6        6 -0.01159243
7        7  0.05921309
8        8 -0.04881492
9        9  0.43796040
10      10 -0.02968688

Thank you

--
Nasrin Pak
MSc Student in Environmental Physics
University of Calgary

David Winsemius, MD
West Hartford, CT
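Adapting David's aggregate() call to the question's block size of 30 and writing the result to a smaller csv (a sketch; the filename "averaged.csv" is made up here):

```r
## 300 one-second readings -> 10 rows of 30-second block means,
## then saved to a new, smaller csv file.
dat <- data.frame(a = rnorm(300))
grp <- rep(seq_len(nrow(dat) / 30), each = 30)          # 1,1,...,2,2,... in 30s
out <- aggregate(dat, by = list(block = grp), FUN = mean)
write.csv(out, "averaged.csv", row.names = FALSE)
```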
Re: [R] return only pairwise correlations greater than given value
Excellent; thanks Josh.

Joshua Wiley-2 wrote: Hi Brad, You do not really need to reshape the correlation matrix. This seems to do what you want:

spec.cor <- function(dat, r, ...) {
  x <- cor(dat, ...)
  x[upper.tri(x, TRUE)] <- NA
  i <- which(abs(x) >= r, arr.ind = TRUE)
  data.frame(matrix(colnames(x)[as.vector(i)], ncol = 2), value = x[i])
}
spec.cor(mtcars[, 2:5], .6)

Cheers, Josh

On Wed, Nov 16, 2011 at 9:58 PM, B77S bps0002@... wrote: Thanks Michael, I had just started on the following code (below), and realized I should ask, as this might already exist. Basically what I'd like is for the function to return (basically) what you just suggested, plus the names of the two variables (I suppose pasted together would be good). I hope that is clear.

sig.cor <- function(dat, r, ...) {
  cv2 <- data.frame(cor(dat))
  var.names <- rownames(cv2)
  list.cv2 <- which(cv2 >= r | cv2 <= -r, arr.ind = TRUE)
  cor.r <- cv2[list.cv2[which(list.cv2[, "row"] != list.cv2[, "col"]), ]]
  cor.names <- var.names[list.cv2[which(list.cv2[, "row"] != list.cv2[, "col"]), ]]
  return(cor.r)
}

Michael Weylandt wrote: What exactly do you mean by "returns them"? More generally, what do you have in mind to do with this? You could do something like this:

BigCorrelation <- function(X) {
  return(which(abs(cor(X)) > 0.9, arr.ind = TRUE))
}

but it hardly seems worth its own function call.

On Thu, Nov 17, 2011 at 12:42 AM, B77S bps0002@... wrote: Hello, I would like to find out if a function already exists that returns only pairwise correlations above/below a certain threshold (e.g., -.90, .90). Thank you.
-- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/
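Josh's helper is self-contained, so it can be checked against a built-in data set. A quick run on mtcars columns 2:5 (cyl, disp, hp, drat), with arrows and quoting restored:

```r
# spec.cor(): list variable pairs whose absolute correlation is >= r.
spec.cor <- function(dat, r, ...) {
  x <- cor(dat, ...)
  x[upper.tri(x, TRUE)] <- NA               # keep each pair once, drop diagonal
  i <- which(abs(x) >= r, arr.ind = TRUE)   # which() silently skips the NAs
  data.frame(matrix(colnames(x)[as.vector(i)], ncol = 2), value = x[i])
}
res <- spec.cor(mtcars[, 2:5], 0.6)
res  # all pairs except hp-drat clear the 0.6 threshold
```

The returned data frame has one row per qualifying pair, with both variable names and the correlation value.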
Re: [R] merging corpora and metadata
Hi Henri-Paul, This can be rather tricky. It would really help if you could give us a reproducible example. In this case, because you are dealing with non-standard data structures (or at least added attributes), that means the data exactly as R sees it: either (A) code to create some data that demonstrates your problem, or (B) the output of calling dput(corpus.1) (see ?dput for what it does and what to do). One possibility (though it does not concatenate per se):

combined <- list(corpus.1, corpus.2)

*If* (there are attributes only in corpus.1 OR corpus.2) OR (the attribute names in corpus.1 and corpus.2 are unique), then you could do:

combined <- c(corpus.1, corpus.2)
attributes(combined) <- c(attributes(corpus.1), attributes(corpus.2))

but note that it is *very* likely that at least the names attributes overlap, so you would need to address that somehow. If attributes overlap, you need to somehow merge them, and what an appropriate way to do that is, I have no idea without knowing more about the data and what is expected by the functions that work with it. Best regards, Josh

On Thu, Nov 17, 2011 at 1:43 PM, Henri-Paul Indiogine hindiog...@gmail.com wrote: Greetings! I lose all my metadata after concatenating corpora. This is an example of what happens:

meta(corpus.1)
   MetaID cid fid selfirst selend                      fname
1       0   1  11     2169   2518 WCPD-2001-01-29-Pg217.scrb
2       0   1  14     9189   9702  WCPD-2003-01-13-Pg39.scrb
3       0   1  14     2109   2577  WCPD-2003-01-13-Pg39.scrb
17      0   1 114    17863  18256 WCPD-2007-04-30-Pg515.scrb

meta(corpus.2)
   MetaID cid fid selfirst selend                       fname
1       0   2   2    11016  11600         DCPD-200900595.scrb
2       0   2   6    19510  20098         DCPD-201000636.scrb
3       0   2   6    23935  24573         DCPD-201000636.scrb
94      0   2 127    16225  17128 WCPD-2009-01-12-Pg22-3.scrb

tot.corpus <- c(corpus.1, corpus.2)
meta(tot.corpus)
    MetaID
1        0
2        0
3        0
111      0

This is from the structure of corpus.1:

..$ MetaData:List of 2
.. ..$ create_date: POSIXlt[1:1], format: "2011-11-17 21:09:57"
.. ..$ creator    : chr "henk"
..$ Children: NULL
..- attr(*, "class")= chr "MetaDataNode"
- attr(*, "DMetaData")='data.frame': 17 obs. of 6 variables:
..$ MetaID  : num [1:17] 0 0 0 0 0 0 0 0 0 0 ...
..$ cid     : int [1:17] 1 1 1 1 1 1 1 1 1 1 ...
..$ fid     : int [1:17] 11 14 14 17 46 80 80 80 91 91 ...
..$ selfirst: num [1:17] 2169 9189 2109 8315 9439 ...
..$ selend  : num [1:17] 2518 9702 2577 8881 10102 ...
..$ fname   : chr [1:17] "WCPD-2001-01-29-Pg217.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2004-05-17-Pg856.scrb" ...
- attr(*, "class")= chr [1:3] "VCorpus" "Corpus" "list"

Any idea on what I could do to keep the metadata in the merged corpus? Thanks, Henri-Paul -- Henri-Paul Indiogine Curriculum & Instruction Texas A&M University TutorFind Learning Centre Email: hindiog...@gmail.com Skype: hindiogine Website: http://people.cehd.tamu.edu/~sindiogine

-- Joshua Wiley Ph.D. Student, Health Psychology Programmer Analyst II, ATS Statistical Consulting Group University of California, Los Angeles https://joshuawiley.com/
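Josh's "merge the attributes yourself" idea can be sketched generically. This is not tm's API, and whether tm's meta() accessors would accept the result is untested; the names below (the "DMetaData" attribute, mirroring the str() output, and the toy "demo" class) are assumptions for illustration only:

```r
# Combine two list-like objects and row-bind one shared data.frame
# attribute, reattaching it after c() strips attributes.
merge_with_meta <- function(a, b, attr_name = "DMetaData") {
  out <- c(unclass(a), unclass(b))   # plain list concatenation
  attr(out, attr_name) <- rbind(attr(a, attr_name), attr(b, attr_name))
  class(out) <- class(a)
  out
}

a <- structure(list("doc1", "doc2"),
               DMetaData = data.frame(cid = 1, fid = 1:2), class = "demo")
b <- structure(list("doc3"),
               DMetaData = data.frame(cid = 2, fid = 3), class = "demo")
m <- merge_with_meta(a, b)
```

The combined object keeps all three documents and a four-column metadata frame with one row per document.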
Re: [R] merging corpora and metadata
What package is all this from? You might check whether it provides a special rbind/cbind method. I don't think you can easily change the behavior of c(). Michael

On Nov 17, 2011, at 4:43 PM, Henri-Paul Indiogine hindiog...@gmail.com wrote: Greetings! I lose all my metadata after concatenating corpora. This is an example of what happens:

meta(corpus.1)
   MetaID cid fid selfirst selend                      fname
1       0   1  11     2169   2518 WCPD-2001-01-29-Pg217.scrb
2       0   1  14     9189   9702  WCPD-2003-01-13-Pg39.scrb
3       0   1  14     2109   2577  WCPD-2003-01-13-Pg39.scrb
17      0   1 114    17863  18256 WCPD-2007-04-30-Pg515.scrb

meta(corpus.2)
   MetaID cid fid selfirst selend                       fname
1       0   2   2    11016  11600         DCPD-200900595.scrb
2       0   2   6    19510  20098         DCPD-201000636.scrb
3       0   2   6    23935  24573         DCPD-201000636.scrb
94      0   2 127    16225  17128 WCPD-2009-01-12-Pg22-3.scrb

tot.corpus <- c(corpus.1, corpus.2)
meta(tot.corpus)
    MetaID
1        0
2        0
3        0
111      0

This is from the structure of corpus.1:

..$ MetaData:List of 2
.. ..$ create_date: POSIXlt[1:1], format: "2011-11-17 21:09:57"
.. ..$ creator    : chr "henk"
..$ Children: NULL
..- attr(*, "class")= chr "MetaDataNode"
- attr(*, "DMetaData")='data.frame': 17 obs. of 6 variables:
..$ MetaID  : num [1:17] 0 0 0 0 0 0 0 0 0 0 ...
..$ cid     : int [1:17] 1 1 1 1 1 1 1 1 1 1 ...
..$ fid     : int [1:17] 11 14 14 17 46 80 80 80 91 91 ...
..$ selfirst: num [1:17] 2169 9189 2109 8315 9439 ...
..$ selend  : num [1:17] 2518 9702 2577 8881 10102 ...
..$ fname   : chr [1:17] "WCPD-2001-01-29-Pg217.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2004-05-17-Pg856.scrb" ...
- attr(*, "class")= chr [1:3] "VCorpus" "Corpus" "list"

Any idea on what I could do to keep the metadata in the merged corpus?
Thanks, Henri-Paul -- Henri-Paul Indiogine Curriculum & Instruction Texas A&M University TutorFind Learning Centre Email: hindiog...@gmail.com Skype: hindiogine Website: http://people.cehd.tamu.edu/~sindiogine
[R] calling self written R functions
Hi All, I have written a function (say) called foo, saved in a file called foo.R. Going by Matlab syntax, I usually just change my folder path and can then call it at will. In R, what is the usual way of calling/loading it? R doesn't seem to automatically find the function from a folder (which might be a stupid thing to attempt in the first place). Thanks, Sachin
Re: [R] merging corpora and metadata
Hi Michael,

require(sos)
findFn("{meta}", sortby = "Function")
## see that only two functions have the exact name 'meta'
## one is titled "Meta Data Management" in the package 'tm'
## seems a pretty likely choice

Also, the fact that it is a truly terrible idea does not mean it is not easy:

mvir <- new.env()
mvir$c <- function(x, ...) { cat("sure you can!\n"); mean(x, ...) }
attach(mvir)
c(x = 1:10)
detach(mvir)
rm(mvir)

Cheers, Josh

On Thu, Nov 17, 2011 at 5:25 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote: What package is all this from? You might check whether it provides a special rbind/cbind method. I don't think you can easily change the behavior of c(). Michael

On Nov 17, 2011, at 4:43 PM, Henri-Paul Indiogine hindiog...@gmail.com wrote: Greetings! I lose all my metadata after concatenating corpora. This is an example of what happens:

meta(corpus.1)
   MetaID cid fid selfirst selend                      fname
1       0   1  11     2169   2518 WCPD-2001-01-29-Pg217.scrb
2       0   1  14     9189   9702  WCPD-2003-01-13-Pg39.scrb
3       0   1  14     2109   2577  WCPD-2003-01-13-Pg39.scrb
17      0   1 114    17863  18256 WCPD-2007-04-30-Pg515.scrb

meta(corpus.2)
   MetaID cid fid selfirst selend                       fname
1       0   2   2    11016  11600         DCPD-200900595.scrb
2       0   2   6    19510  20098         DCPD-201000636.scrb
3       0   2   6    23935  24573         DCPD-201000636.scrb
94      0   2 127    16225  17128 WCPD-2009-01-12-Pg22-3.scrb

tot.corpus <- c(corpus.1, corpus.2)
meta(tot.corpus)
    MetaID
1        0
2        0
3        0
111      0

This is from the structure of corpus.1:

..$ MetaData:List of 2
.. ..$ create_date: POSIXlt[1:1], format: "2011-11-17 21:09:57"
.. ..$ creator    : chr "henk"
..$ Children: NULL
..- attr(*, "class")= chr "MetaDataNode"
- attr(*, "DMetaData")='data.frame': 17 obs. of 6 variables:
..$ MetaID  : num [1:17] 0 0 0 0 0 0 0 0 0 0 ...
..$ cid     : int [1:17] 1 1 1 1 1 1 1 1 1 1 ...
..$ fid     : int [1:17] 11 14 14 17 46 80 80 80 91 91 ...
..$ selfirst: num [1:17] 2169 9189 2109 8315 9439 ...
..$ selend  : num [1:17] 2518 9702 2577 8881 10102 ...
..$ fname   : chr [1:17] "WCPD-2001-01-29-Pg217.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2004-05-17-Pg856.scrb" ...
- attr(*, "class")= chr [1:3] "VCorpus" "Corpus" "list"

Any idea on what I could do to keep the metadata in the merged corpus? Thanks, Henri-Paul
Re: [R] calling self written R functions
Hi Sachin, Nope, R does not work that way. You do have several options, though. For a function or two, consider creating/editing an .Rprofile file; https://www.google.com/?q=Rprofile should bring up a fair number of pages describing this, and you might look at a few. If you find yourself accumulating a little collection of functions, with some of them possibly depending on each other, and/or wanting documentation for your function(s), it is time to write a package. There was a nice video tutorial on this at an LA R User Group meeting not too long ago; you can find it here: http://www.youtube.com/watch?v=TER-rQoVs0k You can also see the official manual on extensions for how to write packages: http://cran.r-project.org/doc/manuals/R-exts.html Cheers! Josh

On Thu, Nov 17, 2011 at 5:26 PM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote: Hi All, I have written a function (say) called foo, saved in a file called foo.R. Going by Matlab syntax, I usually just change my folder path and can then call it at will. In R, what is the usual way of calling/loading it? R doesn't seem to automatically find the function from a folder (which might be a stupid thing to attempt in the first place). Thanks, Sachin
Re: [R] calling self written R functions
?source

source("/path/to/foo.R")

will load it into R. Sarah

On Thu, Nov 17, 2011 at 8:26 PM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote: Hi All, I have written a function (say) called foo, saved in a file called foo.R. Going by Matlab syntax, I usually just change my folder path and can then call it at will. In R, what is the usual way of calling/loading it? R doesn't seem to automatically find the function from a folder (which might be a stupid thing to attempt in the first place). Thanks, Sachin

-- Sarah Goslee http://www.functionaldiversity.org
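Sarah's source() call generalizes to a whole folder of helper files, which gets close to the Matlab workflow the poster described. A sketch; the directory name "~/Rfunctions" is a placeholder, not from the thread:

```r
# Source every .R file found in a folder of personal helper functions.
# If the folder does not exist, list.files() returns character(0) and
# the loop simply does nothing.
r_files <- list.files("~/Rfunctions", pattern = "\\.R$", full.names = TRUE)
for (f in r_files) source(f)
```

Putting such a loop in an .Rprofile file (as Josh suggested) makes the helpers available at startup.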
Re: [R] merging corpora and metadata
Hi Josh, You're absolutely right. I suppose one could set up some sort of S3 thing for Henri's problem:

c <- function(..., recursive = FALSE) UseMethod("c")
c.default <- base::c
c.corpus <- function(..., recursive = FALSE) {
  ans <- c.default(...)
  attributes(ans) <- do.call(c, lapply(list(...), attributes))
  ans
}

But agreed, it seems deeply risky. Cheers, Michael

On Thu, Nov 17, 2011 at 9:01 PM, Joshua Wiley jwiley.ps...@gmail.com wrote: Hi Michael,

require(sos)
findFn("{meta}", sortby = "Function")
## see that only two functions have the exact name 'meta'
## one is titled "Meta Data Management" in the package 'tm'
## seems a pretty likely choice

Also, the fact that it is a truly terrible idea does not mean it is not easy:

mvir <- new.env()
mvir$c <- function(x, ...) { cat("sure you can!\n"); mean(x, ...) }
attach(mvir)
c(x = 1:10)
detach(mvir)
rm(mvir)

Cheers, Josh

On Thu, Nov 17, 2011 at 5:25 PM, R. Michael Weylandt michael.weyla...@gmail.com wrote: What package is all this from? You might check whether it provides a special rbind/cbind method. I don't think you can easily change the behavior of c(). Michael

On Nov 17, 2011, at 4:43 PM, Henri-Paul Indiogine hindiog...@gmail.com wrote: Greetings! I lose all my metadata after concatenating corpora. This is an example of what happens:

meta(corpus.1)
   MetaID cid fid selfirst selend                      fname
1       0   1  11     2169   2518 WCPD-2001-01-29-Pg217.scrb
2       0   1  14     9189   9702  WCPD-2003-01-13-Pg39.scrb
3       0   1  14     2109   2577  WCPD-2003-01-13-Pg39.scrb
17      0   1 114    17863  18256 WCPD-2007-04-30-Pg515.scrb

meta(corpus.2)
   MetaID cid fid selfirst selend                       fname
1       0   2   2    11016  11600         DCPD-200900595.scrb
2       0   2   6    19510  20098         DCPD-201000636.scrb
3       0   2   6    23935  24573         DCPD-201000636.scrb
94      0   2 127    16225  17128 WCPD-2009-01-12-Pg22-3.scrb

tot.corpus <- c(corpus.1, corpus.2)
meta(tot.corpus)
    MetaID
1        0
2        0
3        0
111      0

This is from the structure of corpus.1:

..$ MetaData:List of 2
.. ..$ create_date: POSIXlt[1:1], format: "2011-11-17 21:09:57"
.. ..$ creator    : chr "henk"
..$ Children: NULL
..- attr(*, "class")= chr "MetaDataNode"
- attr(*, "DMetaData")='data.frame': 17 obs. of 6 variables:
..$ MetaID  : num [1:17] 0 0 0 0 0 0 0 0 0 0 ...
..$ cid     : int [1:17] 1 1 1 1 1 1 1 1 1 1 ...
..$ fid     : int [1:17] 11 14 14 17 46 80 80 80 91 91 ...
..$ selfirst: num [1:17] 2169 9189 2109 8315 9439 ...
..$ selend  : num [1:17] 2518 9702 2577 8881 10102 ...
..$ fname   : chr [1:17] "WCPD-2001-01-29-Pg217.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2003-01-13-Pg39.scrb" "WCPD-2004-05-17-Pg856.scrb" ...
- attr(*, "class")= chr [1:3] "VCorpus" "Corpus" "list"

Any idea on what I could do to keep the metadata in the merged corpus? Thanks, Henri-Paul
Re: [R] Pairwise correlation
Here's a function Josh Wiley provided in another thread:

spec.cor <- function(dat, r, ...) {
  x <- cor(dat, ...)
  x[upper.tri(x, TRUE)] <- NA
  i <- which(abs(x) >= r, arr.ind = TRUE)
  data.frame(matrix(colnames(x)[as.vector(i)], ncol = 2), value = x[i])
}

Michael

On Thu, Nov 17, 2011 at 4:08 PM, Musa Hassan musah...@gmail.com wrote: Hi Michael, I was able to solve this. I used the WGCNA library, which allows stringsAsFactors to be defined in the workspace, so everything stored as strings remains strings. My problem now is parsing through the results to pull out only significant correlations, defined by a certain Pearson correlation value, say 0.8.

On 17 November 2011 15:32, R. Michael Weylandt michael.weyla...@gmail.com wrote: I can't see how it's stored like that; the email servers garble it up. Use dput() to create a plain-text representation and paste that back in. Thanks, Michael

On Thu, Nov 17, 2011 at 9:37 AM, muzz56 musah...@gmail.com wrote: Hi Michael, Here is a sample of the data.

Gene     Array1   Array2   Array3   Array4   Array5   Array6   Array7   Array8   Array9  Array10  Array11
Fth1   26016.01 23134.66 17445.71 39856.04 27245.45 23622.98 37887.75 49857.46 25864.73 21852.51  29198.4
B2m     7573.64  7768.52  6608.24  8571.65  6380.78  6242.76  6903.92  7330.63  7256.18  5678.21 10937.05
Tmsb4x  6192.44  4277.22  5024.59  4851.51  3062.55  4562.43   7948.1  5018.58  3200.17  2855.77  6139.23
H2-D1   3141.41  3986.06  3328.62   4726.6  3589.89  2885.95  7509.88  5257.62  4742.26  3431.33  5300.72
Prdx5    3935.7   3938.9  3401.68  4193.14  4028.95  3438.19  6640.15  5486.61  4424.57  3368.83  5265.92

I want to retain the gene names in the data.
What you've proposed will take them out, and I'll have to append them back to the results after cor().

On 17 November 2011 09:33, Michael Weylandt [via R] wrote: I think something like this should do it, but I can't test without data:

rownames(mydata) <- mydata[, 1]  # put the elements in the first column as rownames
mydata <- mydata[, -1]           # drop the things that are now rownames

Michael

On Thu, Nov 17, 2011 at 9:23 AM, Musa Hassan [hidden email] wrote: Hi Michael, Thanks for the response. I have noticed that the error occurred during my data read. It appears that the rownames (which become my colnames when the data is transposed) were converted to numbers instead of strings, as they should be. The original header names don't change, just the rownames. I have to figure out how to import the data without the strings being converted. Right now I am using:

mydata <- read.csv("mydata.csv", header = TRUE, stringsAsFactors = FALSE)

then, to convert the data frame to a matrix:

mydata <- data.matrix(mydata)

Then I just do the correlation as Peter suggested:

expression <- cor(t(expression))

Thanks.

On 17 November 2011 08:51, R. Michael Weylandt [hidden email] wrote: On Wed, Nov 16, 2011 at 11:22 PM, muzz56 [hidden email] wrote: Thanks to everyone who replied to my post; I finally got it to work. I am however not sure how well it worked, since it ran so quickly -- it seems like I have a 2000 x 2000 data set. Behold the great and mighty power that is R! Don't worry -- on a decent machine the correlation of a 2k x 2k data set should be pretty fast.
(It's about 9 seconds on my old-ish laptop with a bunch of other junk running.)

My followup questions would be: how do I get only pairs with, say, a certain Pearson correlation value? Additionally, it seems like my output didn't retain the headers but instead replaced them with numbers, making it hard to know which gene pairs correlate.

This is a little worrisome: R carries column names through cor(), so this would suggest you weren't using them. Were your headers listed as part of your data (instead of being names)? If so, they would have been taken as numbers. Take a look at dimnames(NAMEOFDATA) -- if your headers aren't there, then they are being treated as data instead of names. If they are, can you provide some reproducible code so we can debug more fully? The easiest way to send data is to use the dput() function to get a copy-pasteable plain-text representation. It would also be great if you could restrict it to a subset of your data rather than the full 4M data points, but if that's hard to do, don't worry. You should have expected behavior like:

X <- matrix(1:9, 3)
colnames(X) <- c("A", "B", "C")
cor(X)  # prints with labels

Michael

On 16 November 2011 17:11, Nordlund, Dan (DSHS/RDA) [via R] [hidden email] wrote:
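Keeping identifiers out of the numeric data is the crux of Michael's point. A sketch with made-up gene/array names (not the poster's file) showing that names survive cor() when they live in dimnames, plus a threshold filter like the 0.8 cutoff asked about:

```r
# Identifiers belong in dimnames, not in the data, so cor() keeps them.
set.seed(7)
X <- matrix(rnorm(40), nrow = 4,
            dimnames = list(paste0("gene", 1:4), paste0("array", 1:10)))
cc <- cor(t(X))                                  # gene-by-gene correlations
hits <- which(abs(cc) >= 0.8 & upper.tri(cc), arr.ind = TRUE)
pairs <- data.frame(gene1 = rownames(cc)[hits[, 1]],
                    gene2 = rownames(cc)[hits[, 2]],
                    r = cc[hits])
```

upper.tri() keeps each pair once and excludes the diagonal, so self-correlations never appear in the result.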
Re: [R] calling self written R functions
Looks like the function I was looking for was source(). Thanks also, Joshua -- I certainly do need to make a package once I finish this set of re-coding from Matlab into R. Fingers crossed the effort is worth it. Thanks, Sachin

On Fri, Nov 18, 2011 at 1:34 PM, Sarah Goslee sarah.gos...@gmail.com wrote: ?source

source("/path/to/foo.R")

will load it into R. Sarah

On Thu, Nov 17, 2011 at 8:26 PM, Sachinthaka Abeywardana sachin.abeyward...@gmail.com wrote: Hi All, I have written a function (say) called foo, saved in a file called foo.R. Going by Matlab syntax, I usually just change my folder path and can then call it at will. In R, what is the usual way of calling/loading it? R doesn't seem to automatically find the function from a folder (which might be a stupid thing to attempt in the first place). Thanks, Sachin

-- Sarah Goslee http://www.functionaldiversity.org
Re: [R] Log-transform and specifying Gamma
Peter Minting peter_minting at hotmail.com writes:

Dear R help, I am trying to work out if I am justified in log-transforming data and specifying Gamma in the same glm. Does it have to be one or the other?

No, but I've never seen it done.

I have attached an R script and the datafile to show what I mean. Also, I cannot find a mixed model that allows Gamma errors (so I cannot find a way of including random effects). What should I do? Many thanks, Pete

ToadsBd <- read.table("Bd.txt", header = TRUE,
                      colClasses = c(rep("factor", 2), rep("numeric", 3), "factor"))
with(ToadsBd, table(group, site))  ## 47 toads, 3 groups per site
library(ggplot2)
library(mgcv)
## plot points, add linear regressions per group/site
ggplot(ToadsBd, aes(x = startg, y = logBd, colour = group)) +
  geom_point() + facet_grid(. ~ site) + geom_smooth(method = "lm")
## not much going on with startg ... PERHAPS
## similar slopes across sites?
ggplot(ToadsBd, aes(x = site:group, y = logBd, colour = site)) +
  geom_boxplot() + geom_point()
## I'm curious -- I thought the groups were just blocking
## factors, but maybe not? The patterns of groups 1, 2, 3
## are consistent across sites ...
## take a quick look at the raw data ...
ggplot(ToadsBd, aes(x = site:group, y = Bd, colour = site)) +
  geom_boxplot() + geom_point()
mod1 <- lm(logBd ~ group*site*startg, data = ToadsBd)
summary(mod1)
oldpar <- par(mfrow = c(2, 2))
plot(mod1)
par(oldpar)
## we definitely have to take care of the heteroscedasticity ...
library(MASS)
boxcox(mod1)
## square root transform ... ??
ToadsBd <- transform(ToadsBd, sqrtLogBd = sqrt(logBd))
mod2 <- lm(sqrtLogBd ~ group*site*startg, data = ToadsBd)
oldpar <- par(mfrow = c(2, 2))
plot(mod2)
par(oldpar)
## still not perfect, but perhaps OK
library(coefplot2)
coefplot2(mod2)
mod3 <- update(mod2, . ~ . - group:site:startg)
coefplot2(mod3)
drop1(mod3, test = "F")
mod4 <- update(mod3, . ~ group + site + startg)
coefplot2(mod4)
## look at results on the new (transformed) scale
ggplot(ToadsBd, aes(x = site:group, y = sqrtLogBd, colour = site)) +
  geom_boxplot() + geom_point()
## Conclusions:
## don't mess around with random effects for only three groups in two sites
## I have done a fair amount of stepwise selection, so the p-values
## really can't be taken seriously, but it was clear from the
## beginning that there were differences among groups, which *seem*
## to be consistent among sites (which really makes me wonder what
## the groups are). (The weak effect of site might well go away
## once one took the effect of snooping into account ...)
## sqrt(log(x)) seems to be adequate to get reasonably
## homogeneous variances, but it really is a very strong transformation.
## It makes the results somewhat hard to interpret. Alternatively,
## you could just look at a nonparametric test (e.g. Kruskal-Wallis
## on site:group), but nonparametric tests make it hard to
## add lots of structure to the model
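On the original question: a Gamma GLM with a log link is the usual alternative to log-transforming the response first, since the link puts the mean structure on the log scale while the response stays untransformed. A sketch on simulated data (not the poster's toad data):

```r
# Gamma errors, log link: the fitted mean is exp(linear predictor).
# Data are simulated purely for illustration.
set.seed(1)
x <- runif(100)
mu <- exp(1 + 2 * x)                        # true mean on the response scale
y <- rgamma(100, shape = 5, rate = 5 / mu)  # Gamma with mean mu
fit <- glm(y ~ x, family = Gamma(link = "log"))
coef(fit)                                   # intercept near 1, slope near 2
```

For the random-effects part of the question, lme4's glmer() reportedly accepts family = Gamma(link = "log"), though such fits can be delicate; that is worth checking against the current lme4 documentation.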
[R] S4 : defining [<- using inheritance from 2 classes
Hi the list, I define a class 'C' that inherits from two classes 'A' and 'B'. 'A' and 'B' have no slots with similar names.

setClass(
  Class = "C",
  contains = c("A", "B")
)

To define the get operator '[' for class C, I simply use the get of A or B (the constant 'SLOT_OF_A' is a character vector holding the names of all the slots of A):

setMethod("[", "C", function(x, i, j, drop) {
  if (i %in% SLOT_OF_A) {
    x <- as(x, "A")
  } else {
    x <- as(x, "B")
  }
  return(x[i, j])
})

Is it possible to do something similar for the set operator '[<-'? Thanks Christophe
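Yes: the assignment analogue is defined with setReplaceMethod() (equivalently, setMethod() on "[<-"). A minimal toy sketch; the classes below are stand-ins for the poster's real A, B, and C, and the direct slot() access bypasses whatever '[' logic the real parent classes implement, so the body would need adapting to delegate via as() as in the getter:

```r
library(methods)

# Toy stand-ins for the poster's classes.
setClass("A", representation(a = "numeric"))
setClass("B", representation(b = "character"))
setClass("C", contains = c("A", "B"))

# Getter: index by slot name.
setMethod("[", "C", function(x, i, j, ..., drop = TRUE) slot(x, i))

# Setter: "[<-" defined via setReplaceMethod().
setReplaceMethod("[", "C", function(x, i, j, ..., value) {
  slot(x, i) <- value
  validObject(x)
  x
})

obj <- new("C", a = c(1, 2, 3), b = "hi")
obj["a"] <- c(9, 8, 7)
```

Note the replacement method's signature must end in `value`, matching the "[<-" generic.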
[R] Drawing ticks in the 3rd and 4th row of a lattice
Dear all, I want to draw ticks on the 3rd and 4th rows of panels in a lattice plot. How do I do this? In my search of the help, I discovered the parameter 'alternating', which controls where the tick labels go but does not suffice for me. I am running this command:

barchart(X03/1000 ~ time | Company, data = df1[which(df1$time != 1), ],
         horiz = FALSE,
         scales = list(x = list(rot = 45,
                                labels = paste("Mar", c("07", "08", "09", "10", "11")))),
         par.strip.text = list(lineheight = 1, lines = 2, cex = .75),
         ylab = "In Rs. Million", xlab = "", layout = c(3, 4), as.table = TRUE,
         between = list(y = 1))

where my data are:

dput(df1)
structure(list(Company = structure(c(9L, 7L, 1L, 6L, 8L, 4L,
2L, 5L, 11L, 10L, 9L, 7L, 1L, 6L, 8L, 4L, 2L, 5L, 11L, 10L,
9L, 7L, 1L, 6L, 8L, 4L, 2L, 5L, 11L, 10L, 9L, 7L, 1L, 6L, 8L,
4L, 2L, 5L, 11L, 10L, 9L, 7L, 1L, 6L, 8L, 4L, 2L, 5L, 11L, 10L,
9L, 7L, 1L, 6L, 8L, 4L, 2L, 5L, 11L, 10L), .Label = c("Bharat Petroleum Corpn. Ltd.",
"Chennai Petroleum Corpn. Ltd.", "Company Name", "Essar Oil Ltd.",
"Hindalco Industries Ltd.", "Hindustan Petroleum Corpn. Ltd.",
"Indian Oil Corpn. Ltd.", "Mangalore Refinery & Petrochemicals Ltd.",
"Reliance Industries Ltd.", "Steel Authority Of India Ltd.",
"Sterlite Industries (India) Ltd."), class = "factor"), time = c(7,
7, 7, 7, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 8, 8, 8, 8, 9,
9, 9, 9, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 10, 10, 10,
10, 11, 11, 11, 11, 11, 11, 11, 11, 11, 11, 1, 1, 1, 1, 1, 1,
1, 1, 1, 1), X03 = c(722931.1, 751620.5, 304456.3, 294868.9,
192712.6, 36695.4, 188313.4, 98954.9, 100088.7, 72379.9, 848517.5,
864562.2, 347310.9, 301022.1, 253514.5, 165661.6, 206377.7,
108897, 109336.3, 71207.6, 1003504.6, 1145993.8, 392261.5, 341086,
289737.4, 359837.2, 252964.3, 90036.2, 90474.8, 127623.2, 1411082.1,
907480.4, 364637.5, 290915.7, 255397.4, 328557.2, 202855.3,
118725.4, 116647.6, 106254.9, 1772254.7, 1204856.9, 469935.6,
313527.6, 320131.1, 384323.5, 260813.9, 137403.3, 137238.5,
136888.4, 1151658, 974902.76, 375720.36, 308284.06, 262298.6,
255014.98, 64.92, 110803.36, 110757.18, 102870.8)), row.names = c("Reliance Industries Ltd..7",
"Indian Oil Corpn. Ltd..7", "Bharat Petroleum Corpn. Ltd..7",
"Hindustan Petroleum Corpn. Ltd..7", "Mangalore Refinery & Petrochemicals Ltd..7",
"Essar Oil Ltd..7", "Chennai Petroleum Corpn. Ltd..7", "Hindalco Industries Ltd..7",
"Sterlite Industries (India) Ltd..7", "Steel Authority Of India Ltd..7",
"Reliance Industries Ltd..8", "Indian Oil Corpn. Ltd..8", "Bharat Petroleum Corpn. Ltd..8",
"Hindustan Petroleum Corpn. Ltd..8", "Mangalore Refinery & Petrochemicals Ltd..8",
"Essar Oil Ltd..8", "Chennai Petroleum Corpn. Ltd..8", "Hindalco Industries Ltd..8",
"Sterlite Industries (India) Ltd..8", "Steel Authority Of India Ltd..8",
"Reliance Industries Ltd..9", "Indian Oil Corpn. Ltd..9", "Bharat Petroleum Corpn. Ltd..9",
"Hindustan Petroleum Corpn. Ltd..9", "Mangalore Refinery & Petrochemicals Ltd..9",
"Essar Oil Ltd..9", "Chennai Petroleum Corpn. Ltd..9", "Hindalco Industries Ltd..9",
"Sterlite Industries (India) Ltd..9", "Steel Authority Of India Ltd..9",
"Reliance Industries Ltd..10", "Indian Oil Corpn. Ltd..10", "Bharat Petroleum Corpn. Ltd..10",
"Hindustan Petroleum Corpn. Ltd..10", "Mangalore Refinery & Petrochemicals Ltd..10",
"Essar Oil Ltd..10", "Chennai Petroleum Corpn. Ltd..10", "Hindalco Industries Ltd..10",
"Sterlite Industries (India) Ltd..10", "Steel Authority Of India Ltd..10",
"Reliance Industries Ltd..11", "Indian Oil Corpn. Ltd..11", "Bharat Petroleum Corpn. Ltd..11",
"Hindustan Petroleum Corpn. Ltd..11", "Mangalore Refinery & Petrochemicals Ltd..11",
"Essar Oil Ltd..11", "Chennai Petroleum Corpn. Ltd..11", "Hindalco Industries Ltd..11",
"Sterlite Industries (India) Ltd..11", "Steel Authority Of India Ltd..11",
"Reliance Industries Ltd..1", "Indian Oil Corpn. Ltd..1", "Bharat Petroleum Corpn. Ltd..1",
"Hindustan Petroleum Corpn. Ltd..1", "Mangalore Refinery & Petrochemicals Ltd..1",
"Essar Oil Ltd..1", "Chennai Petroleum Corpn. Ltd..1", "Hindalco Industries Ltd..1",
"Sterlite Industries (India) Ltd..1", "Steel Authority Of India Ltd..1"),
.Names = c("Company", "time", "X03"), reshapeLong = structure(list(
varying = structure(list(X03 = c("X03.07", "X03.08", "X03.09",
"X03.10", "X03.11", "X03.1")), .Names = "X03", v.names = "X03",
times = c(7, 8, 9, 10, 11, 1)), v.names = "X03", idvar = "Company",
timevar = "time"), .Names = c("varying", "v.names", "idvar",
"timevar")), class = "data.frame")
Re: [R] merging corpora and metadata
Hi Joshua!

2011/11/17 Joshua Wiley jwiley.ps...@gmail.com:
> One possibility (though it does not concatenate per se):
> combined <- list(corpus.1, corpus.2)

Thanks, I will look into it.

> *if* (there are only attributes in corpus.1 OR corpus.2) OR (the attribute names in corpus.1 and corpus.2 are unique), then you could do:

Unfortunately this is not the case. In the meanwhile I rewrote the code that generates the corpus so that the documents are combined into a single corpus _before_ the metadata are added. That solved the problem.

Thanks for your feedback and suggestions.

Henri-Paul

--
Henri-Paul Indiogine
Curriculum & Instruction, Texas A&M University
TutorFind Learning Centre
Email: hindiog...@gmail.com
Skype: hindiogine
Website: http://people.cehd.tamu.edu/~sindiogine
[R] calculating symmetric matrix
Hi All,

I need to do the calculation W %*% d, and I know that the result is symmetric (since W = t(d) %*% w). My question: given the symmetry, can I compute only the lower/upper triangle (n(n+1)/2 elements) rather than all n^2 elements of the matrix? Is there a way to do this efficiently? My 'n' can be quite large (up to 10,000 or more).

Thanks,
Sachin
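For reference, one standard answer (a sketch, not from the thread): R's crossprod() computes t(x) %*% y in a single BLAS call and avoids forming an explicit transpose, so a product of the shape the poster describes, t(d) %*% w %*% d with w symmetric, can be written as crossprod(d, w %*% d). The matrix sizes below are small illustrative stand-ins.

```r
set.seed(1)
n <- 200
d <- matrix(rnorm(n * n), n, n)
w <- crossprod(matrix(rnorm(n * n), n, n))  # a symmetric weight matrix

# Direct form of the product: t(d) %*% w %*% d
direct <- t(d) %*% w %*% d

# crossprod() form: the same symmetric result, with no explicit transpose copy
fast <- crossprod(d, w %*% d)

all.equal(direct, fast)
```

For the special case w = identity, crossprod(d) alone computes t(d) %*% d and can exploit the symmetry of the result directly.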
Re: [R] Introducing \n's so that par.strip.text can produce multiline strips in lattice
Dear Dennis,

Many thanks. I was wondering if there was a way to edit the variable and put \n's in it. Is there?

Thank you,
Ashim

On Thu, Nov 17, 2011 at 4:07 PM, Dennis Murphy djmu...@gmail.com wrote:

Hi:

This worked for me - I needed to modify some of the strip labels to improve the appearance a bit and also reduced the strip font size a bit to accommodate the lengths of the strings. The main thing was to change \\n to \n. Firstly, I created a new variable called Indic as a character variable and then did some minor surgery on three of the strings:

Indic <- as.character(imports$Indicator)
Indic[3 + 6 * (0:5)] <- "Chemicals and related\n products imports"
Indic[4 + 6 * (0:5)] <- "Pearls, semiprecious \nprecious stones imports"
Indic[5 + 6 * (0:5)] <- "Metaliferrous ores \nmetal scrap imports"

# Read Indic into the imports data frame as a factor:
imports$Indic <- factor(Indic)

# Redo the plot:
barchart(X03/1000 ~ time | Indic,
         data = imports[which(imports$time != 1), ],
         horiz = FALSE,
         scales = list(x = list(rot = 45, labels = paste("Mar", 2007:2011))),
         par.strip.text = list(lineheight = 1, lines = 2, cex = 0.8))

Dennis

On Wed, Nov 16, 2011 at 11:25 PM, Ashim Kapoor ashimkap...@gmail.com wrote:

Dear all,

I have the following data, which has \\n in place of \n. I introduced \n's in the csv file so that I could use it in barchart in lattice. When I did that and read it into R using read.csv, it read it as \\n. My question is how do I introduce \n in the middle of a long string of quoted text so that lattice can make multiline strips. Hitting Enter, which is supposed to introduce \n's, doesn't work because when I go to the middle of the line and press Enter, Open Office thinks that I am done editing my text and takes me to the next line.
dput(imports) structure(list(Indicator = structure(c(5L, 4L, 2L, 12L, 8L, 7L, 5L, 4L, 2L, 12L, 8L, 7L, 5L, 4L, 2L, 12L, 8L, 7L, 5L, 4L, 2L, 12L, 8L, 7L, 5L, 4L, 2L, 12L, 8L, 7L, 5L, 4L, 2L, 12L, 8L, 7L ), .Label = c(, Chemicals and related\\n products imports, Coal export, Gold imports, Gold silver imports, Iron ore export, Iron steel imports, Metaliferrous ores metal scrap imports, Mica export, Ores minerals\\nexport, Other ores \\nminerals export, Pearls precious \\n semiprecious stones imports, Processed minerals\\n export ), class = factor), Units = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c(, Rs.crore), class = factor), Expression = structure(c(2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c(, Ival), class = factor), time = c(7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 8, 9, 9, 9, 9, 9, 9, 10, 10, 10, 10, 10, 10, 11, 11, 11, 11, 11, 11, 1, 1, 1, 1, 1, 1), X03 = c(66170.46, 65337.72, 62669.86, 33870.17, 36779.35, 27133.25, 71829.14, 67226.04, 75086.89, 29505.61, 31750.99, 32961.26, 104786.39, 95323.8, 134276.63, 76263, 36363.61, 41500.36, 140440.36, 135877.91, 111269.69, 76678.27, 36449.89, 36808.06, 162253.77, 154346.72, 124895.76, 142437.03, 42872.16, 43881.85, 109096.024, 103622.438, 101639.766, 71750.816, 36843.2, 36456.956), id = c(1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L, 1L, 2L, 3L, 4L, 5L, 6L)), row.names = c(1.7, 2.7, 3.7, 4.7, 5.7, 6.7, 1.8, 2.8, 3.8, 4.8, 5.8, 6.8, 1.9, 2.9, 3.9, 4.9, 5.9, 6.9, 1.10, 2.10, 3.10, 4.10, 5.10, 6.10, 1.11, 2.11, 3.11, 4.11, 5.11, 6.11, 1.1, 2.1, 3.1, 4.1, 5.1, 6.1 ), .Names = c(Indicator, Units, Expression, time, X03, id), class = data.frame, reshapeLong = structure(list(varying = structure(list( X03 = c(X03.07, X03.08, 
X03.09, X03.10, X03.11, X03.1)), .Names = X03, v.names = X03, times = c(7, 8, 9, 10, 11, 1)), v.names = X03, idvar = id, timevar = time), .Names = c(varying, v.names, idvar, timevar)))

On which I want to run:

barchart(X03/1000 ~ time | Indicator,
         data = imports[which(imports$time != 1), ],
         horiz = FALSE,
         scales = list(x = list(rot = 45, labels = paste("Mar", 2007:2011))),
         par.strip.text = list(lineheight = 1, lines = 2))

Many thanks,
Ashim.
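On the follow-up question of editing the variable in place rather than the csv file: a minimal sketch (with toy levels standing in for imports$Indicator) that converts the literal two-character "\\n" that read.csv preserves into a real newline directly in the factor levels:

```r
# Toy stand-in for imports$Indicator: levels contain a literal backslash + n
Indicator <- factor(c("Ores minerals\\nexport", "Processed minerals\\n export"))

# Replace the literal backslash-n with a real newline in every level.
# The pattern "\\\\n" is the regex \\n, i.e. backslash followed by 'n'.
levels(Indicator) <- gsub("\\\\n", "\n", levels(Indicator))
```

Editing levels() touches each distinct label once, so every row of the factor is fixed at no extra cost; gsub(.., fixed = TRUE) with pattern "\\n" would work equally well.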
[R] Need Help
Hi,

I need to make a subset of my species abundance matrix containing only the species (columns) with a total abundance (column sum) greater than 0.5, to do ordination in the vegan package. I used the following code but it is not working. Can you please give me a solution?

gl1 <- subset(grassland[,5:44], colSums > 0.05, select=2)

gl1 is my new matrix, and grassland[,5:44] is my original matrix.

thanks,
Dilshan
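For reference, subset()'s second argument filters rows, which is one reason the call above fails; selecting columns by their sums needs logical column indexing instead. A minimal sketch with a made-up stand-in for the grassland data (the 0.5 threshold follows the prose; the real data frame is assumed to hold species abundances in columns 5:44):

```r
set.seed(42)
grassland <- as.data.frame(matrix(runif(10 * 44), nrow = 10))  # toy stand-in

abund <- grassland[, 5:44]                        # the species columns
gl1 <- abund[, colSums(abund) > 0.5, drop = FALSE]  # keep abundant species only
```

drop = FALSE keeps the result a data frame even if only one species passes the threshold.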
[R] building biOps on macports, and configure--vars
All, the MacOSX binary build of the biOps package is broken on CRAN, so I am trying to compile from source. I am very close; the trick is apparently that this package depends on fftw3, libjpeg and libtiff. My fftw3 is in /usr/local/, but my libjpeg and libtiff are in /opt/local/ since I got them through MacPorts. In general, I'd like to keep as many of my third-party libraries as I can in the MacPorts system, which means that I need to point R CMD INSTALL to /opt/local.

Here is the problem I am running into. This is my current command line:

sudo R CMD INSTALL --configure-vars='LIBS=-L/opt/local/lib CPPFLAGS=-I/opt/local/include/' --configure-args='--includedir=/opt/local/include --libdir=/opt/local/include' biOps_0.2.1.1.tar.gz

I found that by using the --configure-vars argument I can get the 'configure' script to finish (without it, the 'configure' script of course fails to find libjpeg and libtiff, and exits). However, after 'configure' runs, when the installer actually tries to build the biOps package, it fails as follows:

gcc-4.2 -arch x86_64 -std=gnu99 -I/Library/Frameworks/R.framework/Resources/include -I/Library/Frameworks/R.framework/Resources/include/x86_64 -I/usr/local/include -fPIC -g -O2 -c jpegio.c -o jpegio.o
jpegio.c:36:21: error: jpeglib.h: No such file or directory
jpegio.c: In function ‘read_jpg_img_info’:
jpegio.c:52: error: storage size of ‘cinfo’ isn’t known
jpegio.c:53: error: storage size of ‘jerr’ isn’t known
[[... about 30 lines of this follow...]]
make: *** [jpegio.o] Error 1
ERROR: compilation failed for package ‘biOps’

The problem is, obviously, that the gcc call is missing -I/opt/local/include, even though I put that in both --configure-vars and --configure-args! I also note that '--configure-args' seems to have no effect at all; it's '--configure-vars' that allows the 'configure' script to finish successfully.
I've searched and read various docs, but I don't know what else I can do at this point (I'd like to avoid copying my libraries into /usr/local if at all possible, and it does seem that the point of the --configure-* arguments is to let me do that). Am I missing a necessary argument? Is it a bug in the biOps package? Is there any solution/workaround that doesn't involve copying stuff into /usr/local?

Thanks for any help.

Yours,
Timothy Teravainen
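One workaround worth trying (an assumption on my part, not something confirmed in this thread): --configure-vars only affects the configure script, whereas flags placed in a user-level ~/.R/Makevars file are picked up by R's build system for the compilation step itself, so the failing gcc line above would gain the MacPorts paths.

```make
## ~/.R/Makevars -- hypothetical sketch; paths assume a MacPorts layout
CPPFLAGS = -I/opt/local/include
LDFLAGS  = -L/opt/local/lib
```

This keeps the MacPorts libraries where they are and avoids copying anything into /usr/local.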
[R] Delete Rows Dynamically Within a Loop
Ok guys, as requested, I will add more info so that you understand why a simple vector operation is not possible. It's not easy to explain in few words, but let's see.

I have a huge number of points over a 2D space. I divide my space into a grid with a given resolution, say, 100 m. The main loop, which I am not sure is mandatory or not (any alternative is welcome), goes through EACH cell/pixel that contains at least 2 points (right now I am using the method quadratcount within the package spatstat). Inside this loop, i.e. for each of these non-empty cells, I have to find and keep only a maximum of 10 male-female pairs that are within 3 meters of each other. The 3-meter buffer can be done using the disc function within spatstat. To select points falling inside a buffer you can use the method pnt.in.poly within the SDMTools package. All that because pixels have a maximum capacity that cannot be exceeded.

Since in each cell there can be hundreds or thousands of points, I am trying to find a smart way to use another loop/similar method to:
1) go through each point at a time
2) create a buffer and select points of the opposite sex
3) save the closest male-female (0-1) pair in another dataframe (called new_colonies)
4) remove those points from the dataframe so that it shrinks and I don't have to consider them anymore
5) as soon as that new dataframe reaches 10 rows, stop everything and go to the next cell (thus skipping all remaining points).
Here is the code that I developed to be run within each cell (right now it takes too long):

head(df, 20):
           X       Y Sex ID
2   583058.2 2882774   1  1
3   582915.6 2883378   0  2
4   582592.8 2883297   1  3
5   582793.0 2883410   1  4
6   582925.7 2883397   1  5
7   582934.2 2883277   0  6
8   582874.7 2883336   0  7
9   583135.9 2882773   1  8
10  582955.5 2883306   1  9
11  583090.2 2883331   0 10
12  582855.3 2883358   1 11
13  582908.9 2883035   1 12
14  582608.8 2883715   0 13
15  582946.7 2883488   1 14
16  582749.8 2883062   0 15
17  582906.4 2883317   0 16
18  582598.9 2883390   0 17
19  582890.2 2883413   0 18
20  582752.8 2883361   0 19
21  582953.1 2883230   1 20

for(i in 1:dim(df)[1]){
  new_colonies <- data.frame(ID1=0, ID2=0, X=0, Y=0)
  discbuff <- disc(radius, centre=c(df$X[i], df$Y[i]))
  # define the points and polygon
  pnts = cbind(df$X[-i], df$Y[-i])
  polypnts = cbind(x = discbuff$bdry[[1]]$x, y = discbuff$bdry[[1]]$y)
  out = pnt.in.poly(pnts, polypnts)
  out$ID <- df$ID[-i]
  if (any(out$pip == 1)) {
    pnt.inBuffID <- out$ID[which(out$pip == 1)]
    cond <- df$Sex[i] != df$Sex[pnt.inBuffID]
    if (any(cond)){
      eucdist <- sqrt((df$X[i] - df$X[pnt.inBuffID][cond])^2 +
                      (df$Y[i] - df$Y[pnt.inBuffID][cond])^2)
      IDvect <- pnt.inBuffID[cond]
      new_colonies_temp <- data.frame(
        ID1=df$ID[i],
        ID2=IDvect[which(eucdist==min(eucdist))],
        X=(df$X[i] + df$X[pnt.inBuffID][cond][which(eucdist==min(eucdist))]) / 2,
        Y=(df$Y[i] + df$Y[pnt.inBuffID][cond][which(eucdist==min(eucdist))]) / 2)
      new_colonies <- rbind(new_colonies, new_colonies_temp)
      if (dim(new_colonies)[1] == maxdensity) break
    }
  }
}
new_colonies <- new_colonies[-1,]

--
View this message in context: http://r.789695.n4.nabble.com/Delete-Rows-Dynamically-Within-a-Loop-tp4081777p4081777.html
Sent from the R help mailing list archive at Nabble.com.
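On step 4 specifically, a minimal sketch (toy data and hypothetical IDs, not the poster's df) of removing a matched pair so the working data frame shrinks; note that iterating with while (nrow(df) > 0) over the shrinking frame avoids the stale indices that a precomputed 1:dim(df)[1] produces once rows start disappearing:

```r
# Toy stand-in: six points with alternating sexes
df <- data.frame(ID = 1:6, Sex = c(1, 0, 1, 0, 1, 0))

pair <- c(2, 3)                   # IDs of a matched female/male pair (made up)
df <- df[!(df$ID %in% pair), ]    # drop both rows; df shrinks by two
```

Indexing by ID rather than by row position keeps the removal correct even after earlier deletions have renumbered the rows.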
[R] Obtaining a derivative of nls() SSlogis function
Hello,

I am wondering if someone can help me. I have the following function that I derived using nls() SSlogis. I would like to find its derivative. I thought I had done this using deriv(), but for some reason it isn't working out for me. Here is the function:

asym <- 84.951
xmid <- 66.90742
scal <- -6.3
x.seq <- seq(1, 153,, 153)
nls.fn <- asym/((1+exp((xmid-x.seq)/scal)))

# try #1
deriv(nls.fn)
# Error in .Internal(deriv.default(expr, namevec, function.arg, tag, hessian)) :
#   'namevec' is missing

# try #2
deriv(nls.fn, namevec=c("asym", "xmid", "scal"))
# this doesn't seem to give me the expression, and the gradients are zero.

I've tried to do this with Ryacas as well, but I'm lost. Can anyone help?

Thank you,
Katrina
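A likely cause (my reading, not a reply from the thread): deriv() wants an unevaluated expression or formula, while nls.fn above is already a numeric vector, so there is nothing symbolic left to differentiate. A sketch of the two usual forms, assuming the derivative wanted is either with respect to x or with respect to the parameters:

```r
asym <- 84.951
xmid <- 66.90742
scal <- -6.3
x.seq <- seq(1, 153, length.out = 153)

# Derivative of the logistic with respect to the input variable x:
dfdx <- D(expression(asym / (1 + exp((xmid - x) / scal))), "x")
grad.x <- eval(dfdx, list(asym = asym, xmid = xmid, scal = scal, x = x.seq))

# Gradients with respect to the parameters, as in try #2, but passing a
# formula so deriv() sees the expression rather than its evaluated value:
g <- deriv(~ asym / (1 + exp((xmid - x) / scal)),
           namevec = c("asym", "xmid", "scal"),
           function.arg = c("asym", "xmid", "scal", "x"))
vals <- g(asym, xmid, scal, x.seq)
grad.params <- attr(vals, "gradient")  # 153 x 3 gradient matrix
```

The function.arg argument makes deriv() return a ready-to-call function whose result carries the gradient as an attribute.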