Re: [R] strangely long floating point with write.table()

2014-03-15 Thread Mike Miller
Thanks for the ideas.  It is great to have such skilled assistance with 
this issue.  That said, I don't think we've solved this one yet.


Looking back at where my numbers came from, I found that I had read in 
integers from a file, divided by 1000, then (critically) subtracted those 
numbers from 2.  It turns out that the important part seems to be the 
subtraction, not the data source.


It isn't necessary to read in the data to get the effect.  Here is a 
simple example:


write.table(c(1,2)-c(0.995,1.995), file="data.txt", row.names=F, col.names=F)

$ cat data.txt
0.005
0.00499999999999989
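
The mismatch between the console and the file comes down to formatting 
width: print() honors options(digits), 7 by default, while write.table() 
converts numbers with the internal equivalent of digits = 15 (see 
?write.table).  A minimal sketch:

print(2 - 1.995)             # 0.005, rounded to 7 significant digits
as.character(2 - 1.995)      # "0.00499999999999989", 15 significant digits
sprintf("%.17g", 2 - 1.995)  # the full double-precision value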

Here is another simple example that uses seq() and does not require 
reading in data.  As you can see, the output for both commands should be 
the same, but there is a big difference in how the numbers are represented 
in the output.  What causes the inconsistency within and between these two 
output files?



write.table(1-seq(0.995,0.840,-.005), file="data1.txt", row.names=F, col.names=F)
write.table(2-seq(1.995,1.840,-.005), file="data2.txt", row.names=F, col.names=F)


$ head -33 data[12].txt
==> data1.txt <==
0.005
0.01
0.015
0.02
0.025
0.03
0.035
0.04
0.045
0.05
0.055
0.0600000000000001
0.0649999999999999
0.0700000000000001
0.075
0.08
0.085
0.09
0.095
0.1
0.105
0.11
0.115
0.12
0.125
0.13
0.135
0.14
0.145
0.15
0.155
0.16

==> data2.txt <==
0.00499999999999989
0.00999999999999979
0.0149
0.0198
0.0249
0.0298
0.0349
0.0398
0.0449
0.0498
0.0549
0.0598
0.0649
0.0698
0.075
0.0798
0.085
0.0899
0.095
0.0999
0.105
0.11
0.115
0.12
0.125
0.13
0.135
0.14
0.145
0.15
0.155
0.16

Importantly, if I do this...

write.table(seq(0.005,0.160,.005), file="data.txt", row.names=F, col.names=F)

...I'm producing all the same values, but no number in the output file 
exceeds three digits to the right of the decimal.
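
For what it's worth, a hedged workaround is to round away the accumulated 
representation error before writing, e.g.:

z <- 2 - seq(1.995, 1.840, -.005)
write.table(round(z, 3), file="data2.txt", row.names=F, col.names=F)  # round to the intended 3 decimals first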


Thanks again for all of the helpful comments and ideas.

Best,
Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota
http://scholar.google.com/citations?user=EV_phq4AAAAJ


On Sat, 15 Mar 2014, Duncan Murdoch wrote:


On 14-03-14 11:03 PM, Mike Miller wrote:

On Fri, 14 Mar 2014, Duncan Murdoch wrote:


On 14-03-14 8:59 PM, Mike Miller wrote:

What I'm using:

R version 3.0.1 (2013-05-16) -- "Good Sport"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)


That's not current, but it's not very old...


According to some docs, options(digits) controls numerical precision in
output of write.table().  I'm using the default value for digits:


getOption("digits")

[1] 7

I have a bunch of numbers in a data frame that are only a few digits to
the right of the decimal:


That's not enough to reproduce this.  Put together a self-contained
reproducible example if you're wondering why something behaves as it
does. With just a bunch of output, you'll just get uninformed guesses.



Thanks for the tip.  Here's what I've done:


data2 <- data[c(94,120),c(18,20,21)]


Thanks, I got the data2.Rdata file.  Peter was right, you don't have what you 
think you have in that dataframe.  See below.



save(data2, file="data2.Rdata")
q("no")


$ R

load("data2.Rdata")
data2

V18   V20  V21
94  0.008 0.008 0.64
120 0.023 0.023 0.000529


I'll create a dataframe that looks like yours:

data3 <- data.frame(V18=c(0.008, 0.023), V20=c(0.008, 0.023), 

V21=c(0.64, 0.000529))

data3

   V18   V20  V21
1 0.008 0.008 0.64
2 0.023 0.023 0.000529


But it's not the same:


data2-data3

 V18   V20   V21
94   6.938894e-18  6.938894e-18  1.219727e-19
120 -9.020562e-17 -9.020562e-17 -4.119968e-18

I can't tell where these errors crept in; they are likely there in your 
"data" object, which you didn't give us.  I'd guess as Peter did that your 
numbers are the results of computations that introduced rounding error.
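
A quick way to see this, sketched with the data2 object from the thread:

identical(data2$V18[1], 0.008)   # FALSE: not the same double
all.equal(data2$V18[1], 0.008)   # TRUE: equal within tolerance
data2$V18[1] - 0.008             # the tiny residue, about 6.9e-18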


Duncan Murdoch


write.table(data2, file="data2.txt", sep="\t", row.names=F, col.names=F)


$ cat data2.txt
0.00801 0.00801 6.41e-05
0.0229  0.0229  0.0005289996

The data2.Rdata file is attached to this message.

I guess that is enough to reproduce this exact finding.  I don't know how
it works in general.

I don't have a newer version of R available right now.  It did the same
thing on an older version (2.15.1).

Interestingly, on a different machine with an even older version (2.12.2)
I see something a little different:

0.008   0.008   6.41e-05
0.0229  0.0229  0.0005289996

Re: [R] strangely long floating point with write.table()

2014-03-14 Thread Mike Miller

On Fri, 14 Mar 2014, Duncan Murdoch wrote:


On 14-03-14 8:59 PM, Mike Miller wrote:

What I'm using:

R version 3.0.1 (2013-05-16) -- "Good Sport"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)


That's not current, but it's not very old...

According to some docs, options(digits) controls numerical precision in 
output of write.table().  I'm using the default value for digits:



getOption("digits")

[1] 7

I have a bunch of numbers in a data frame that are only a few digits to 
the right of the decimal:


That's not enough to reproduce this.  Put together a self-contained 
reproducible example if you're wondering why something behaves as it 
does. With just a bunch of output, you'll just get uninformed guesses.



Thanks for the tip.  Here's what I've done:


data2 <- data[c(94,120),c(18,20,21)]
save(data2, file="data2.Rdata")
q("no")


$ R

load("data2.Rdata")
data2

  V18   V20  V21
94  0.008 0.008 0.64
120 0.023 0.023 0.000529

write.table(data2, file="data2.txt", sep="\t", row.names=F, col.names=F)


$ cat data2.txt
0.00801 0.00801 6.41e-05
0.0229  0.0229  0.0005289996

The data2.Rdata file is attached to this message.

I guess that is enough to reproduce this exact finding.  I don't know how 
it works in general.


I don't have a newer version of R available right now.  It did the same 
thing on an older version (2.15.1).


Interestingly, on a different machine with an even older version (2.12.2) 
I see something a little different:


0.008   0.008   6.41e-05
0.0229  0.0229  0.0005289996

Best,
Mike

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] strangely long floating point with write.table()

2014-03-14 Thread Mike Miller

What I'm using:

R version 3.0.1 (2013-05-16) -- "Good Sport"
Copyright (C) 2013 The R Foundation for Statistical Computing
Platform: x86_64-unknown-linux-gnu (64-bit)


According to some docs, options(digits) controls numerical precision in 
output of write.table().  I'm using the default value for digits:



getOption("digits")

[1] 7

I have a bunch of numbers in a data frame that are only a few digits to 
the right of the decimal:




data[c(94,120), c(18,20,21)]

  V18   V20  V21
94  0.008 0.008 0.64
120 0.023 0.023 0.000529


I write the data to a file:


write.table(data, file="data.txt", sep="\t", row.names=F, col.names=F)



Then I look at those same values and this is what I see:

$ gawk -F'\t' 'NR==94 || NR==120 {print $18,$20,$21}' data.txt
0.00801 0.00801 6.41e-05
0.0229 0.0229 0.0005289996


This is the weird thing:  Only those two records get the long, annoyingly 
"precise" 17-digit numbers.  Other records look like this:


0.052 1.052 1.106704
0.178 0.178 0.031684

I understand that binary representations won't reflect decimal 
representations precisely, etc., but why do I get this junk for only two 
records out of 197 records?  Records that contain only integral values 
can't get it wrong, but the other 30 of 32 records with decimals look fine 
-- see below.


Also, if precision should be to 7 digits, why am I getting 17 digits for 
exactly two of the records?  Why does this happen for all three numbers in 
those two records?


If you think this is a bug that I should report elsewhere, let me know.

Thanks.

Mike



$ gawk -F'\t' '{print $18,$20,$21}' data.txt | grep -F .
0.944 0.944 0.891136
0.885 1.885 3.553225
0.052 1.052 1.106704
0.178 0.178 0.031684
1.996 1.996 3.984016
0.86 1.86 3.4596
0.765 1.765 3.115225
0.986 1.986 3.944196
0.998 0.998 0.996004
0.998 0.998 0.996004
0.956 0.956 0.913936
0.99 1.99 3.9601
0.00801 0.00801 6.41e-05
0.99 0.99 0.9801
0.0229 0.0229 0.0005289996
0.938 0.938 0.879844
0.034 1.034 1.069156
0.86 1.86 3.4596
0.911 1.911 3.651921
0.971 0.971 0.942841
0.994 0.994 0.988036
0.418 0.418 0.174724
0.805 1.805 3.258025
0.996 1.996 3.984016
0.998 1.998 3.992004
0.623 1.623 2.634129
0.998 0.998 0.996004
1.628 1.628 2.650384
0.981 0.981 0.962361
0.998 0.998 0.996004
1.676 1.676 2.808976
0.986 1.986 3.944196

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Another question about bootstrapping cor

2014-03-06 Thread Conklin, Mike (GfK)
Read the help on boot.

Specifically, in non-parametric bootstrapping the statistic function takes the 
data (the original data) and an index vector that shows which rows are taken in 
the bootstrap sample.  So you need to do the following:

x <- 1:15
y <- c(2,4,1,3,5, 7,6, 9,10,8, 14, 13, 11, 15, 12) 
x[3] <- NA; x[11] <- NA; x[8] <- NA; y[2] <- NA; y[8] <- NA; y[12] <- NA
cor(x,y,use="complete.obs",method="kendall")
library(boot)
tmpdf <- data.frame(x,y)
cor1 <- function(data,idx) {
  df<-data[idx,]
  cor(df$x,df$y,use="complete.obs",method="kendall")}

corboot <- boot(tmpdf,cor1,R=999)

Bootstrap Statistics :
 original   biasstd. error
t1* 0.867 -0.001852283   0.1099832

In your previous function, the index vector was being passed to cor1 as its 
second argument, so you were correlating the data with the indices.

To see the individual results use:

corboot$t

t is a matrix of the bootstrap replicates.
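
To see where the original-sample value falls within the bootstrap 
distribution, a sketch using only components that boot() returns:

orig <- corboot$t0                 # statistic computed on the original data
mean(corboot$t <= orig)            # empirical quantile of the original value
hist(corboot$t); abline(v = orig)  # visual check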

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of David Parkhurst
Sent: Thursday, March 06, 2014 12:28 PM
To: r-help@r-project.org
Subject: [R] Another question about bootstrapping cor

When I use the fix that Arun K. provided to my earlier example, I wonder how to 
find out where in the 999 bootstrap repetitions the value for the actual data 
fits.

Here is the fixed code:
x <- 1:15
y <- c(2,4,1,3,5, 7,6, 9,10,8, 14, 13, 11, 15, 12)
x[3] <- NA; x[11] <- NA; x[8] <- NA
y[2] <- NA; y[8] <- NA; y[12] <- NA
cor(x,y,use="complete.obs",method="kendall")
library(boot)
tmpdf <- data.frame(x,y)
##  corboot <- boot(tmpdf, cor(x,y,use="complete.obs",method="kendall"), R=999)
cor1 <- function(x,y) {cor(x,y,use="complete.obs",method="kendall")}
corboot <- boot(tmpdf,cor1,R=999)

When I run that, I get
 > corboot

ORDINARY NONPARAMETRIC BOOTSTRAP


Call:
boot(data = tmpdf, statistic = cor1, R = 999)


Bootstrap Statistics :
  original biasstd. error
t1* 1.000 -1.0062026   0.2490860
t2* 0.867 -0.8694414   0.2491383

t2* is the estimate for the actual data.  What is t1*?  And again, how do I 
find where in the distribution of the 999 reps that true value falls?
I would expect print(corboot) to give me more information, but it just repeats 
that same output.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[R] AIX 7.1 and R build problems

2014-03-05 Thread Mike Beddo
Has anyone managed to build R-3.0.2 from source on AIX 7.1 using gcc 4.2.0?  The 
configure script fails with:

...
checking whether wctrans exists and is declared... no
checking whether iswblank exists and is declared... no
checking whether wctype exists and is declared... no
checking whether iswctype exists and is declared... no
configure: error: Support for MBCS locales is required.

Scanning through config.log, the configure script seems to be happy with the 
C99 compliance at hand.  I have tried reading NEWS, README, and the R-help 
archives, but I can't get past this.  I'm all Google'd out.  Is my compiler too old?

_
Michael Beddo
Senior Scientist

Data Ventures, Inc.
1475 Central Ave. Suite 230  |  Los Alamos, NM 87544
tel  505.695.2132 
http://www.dataventures.com  |  "Advanced - Effective - Actionable - Proven. 
Analytics"

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] r noobie, reading my text file into r

2014-02-06 Thread Conklin, Mike (GfK)
When starting out I sometimes find it easier to do the following:

Ceosalary<-read.table(file.choose(),sep="\t")

This will give you a dialog box to find the file you want and you won't have to 
worry about getting the full path exactly right. 
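
A couple of base-R sanity checks along the same lines (a sketch, reusing the 
path from the original post):

getwd()                                           # where R looks for relative paths
list.files("C:/Users/mz106_010/Desktop")          # is ceosalary.txt really there?
file.exists("C:/Users/mz106_010/Desktop/ceosalary.txt")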

Hth,

Mike

W. Michael Conklin
Executive Vice President | Marketing Science
GfK Custom Research, LLC | 8401 Golden Valley Road | Minneapolis, MN, 55427 
T +1 763 417 4545 | M +1 612 567 8287 
www.gfk.com 



-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of frankreynolds
Sent: Thursday, February 06, 2014 12:01 PM
To: r-help@r-project.org
Subject: [R] r noobie, reading my text file into r

Hi everyone, this is my first time using r and I think I'm overlooking 
something small and I just need some help seeing it. I have a file in notepad 
that I need to read into r.
> ceosalary<-read.table(file="C:/Users/mz106_010/Desktop/ceosalary.csv",
> header
> = TRUE,sep="\t")
Error in file(file, "rt") : cannot open the connection In addition: Warning 
message:
In file(file, "rt") :
  cannot open file 'C:/Users/mz106_010/Desktop/ceosalary.csv': No such file or 
directory
> ceosalary<-read.table(file="C:/Users/mz106_010/Desktop/ceosalary.txt",
> header
> = TRUE,sep="\t")
Error in file(file, "rt") : cannot open the connection In addition: Warning 
message:
In file(file, "rt") :
  cannot open file 'C:/Users/mz106_010/Desktop/ceosalary.txt': No such file or 
directory

Am I writing one of those incorrectly?  What can I do to fix the problem?  Any 
help would be greatly appreciated.  Thanks for your time, everyone!



--
View this message in context: 
http://r.789695.n4.nabble.com/r-noobie-reading-my-text-file-into-r-tp4684871.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[R] help with ggplot legend specification

2013-10-31 Thread Conklin, Mike (GfK)
I am creating a scatterplot with the following code.

  pl <- ggplot(df, aes(x=Importance, y=Performance, fill=PBF, size=gapsize)) +
    geom_point(shape=21, colour="black") +
    scale_size_area(max_size=pointsizefactor)

Points are plotted where the size of the point is related to a metric variable, 
gapsize, and the fill color of the point is related to the variable PBF, which 
is a 4-level factor.  This works exactly as I want, with the points varying in 
size based on the metric and being color coded.  I get two legends on the side 
of the plot, one related to the size of the dot and the other showing the color 
coding.  The problem is that the dots in the color-coding legend are so small 
that it is impossible to discern what color they are.  The dots in the plot are 
large, so it is clear what colors they are, but the legend is useless.  How can 
I increase the size of the points in the color legend?
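
One common approach, offered as a hedged sketch (it assumes a ggplot2 version 
with guides() and override.aes): enlarge only the points drawn in the fill 
legend, leaving the plot itself alone.

pl + guides(fill = guide_legend(override.aes = list(size = 5)))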

pointsizefactor<-5

df

Importance Performance gapsize labels   PBF
q50451   0.7079463  -0.7213622   2  a W
q50452   0.4489164  -0.5552116   1  b G
q50453   0.7714138  -0.6940144   5  c F
q50454   0.6284830  -0.6011352   3  d S
q50455   0.7131063  -0.6800826   4  e G
q50456   0.7038184  -0.6026832   6  f S
q50457   0.5201238  -0.3539732   8  g G
q50458   0.9195046  -0.8214654   2  h F
q50459   0.3797730  -0.4184727   1  i W
q504510  0.8065015  -0.6305470   7  j G
q504511  0.6062951  -0.4442724   6  k S
q504512  0.6253870  -0.4478844   8  l G
q504513  0.3813209  -0.4102167   2  m W
q504514  0.3813209  -0.3436533   3  n F
q504515  0.5185759  -0.4365325   5  o G
q504516  0.5872033  -0.4556244   6  p S
q504518  0.5397317  -1.000   1  q S
q504519  0.5882353  -0.4674923   9  r S
q504520  0.4205366  -0.4164087   4  s W
q504521  0.7616099  -0.3323013  10  t F
q504522  0.7213622  -0.6088751   7  u G
q504523  0.6780186  -0.6130031   8  v G
q504524  0.6904025  -0.3937049  10  w W
q504525  0.4143447  -0.4669763   4  x W
q504526  0.5779154  -0.2982456   9  y F
q504527  0.6718266  -0.3457172  10  z G


Thanks all

//Mike

W. Michael Conklin
Executive Vice President | Marketing Science
GfK Custom Research, LLC | 8401 Golden Valley Road | Minneapolis, MN, 55427
T +1 763 417 4545 | M +1 612 567 8287
www.gfk.com<http://www.gfk.com/>


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] automatic history file append with every command?

2013-09-17 Thread Mike Miller
In the bash shell we can use PROMPT_COMMAND="history -a" to tell bash to 
always append the last command to the history file.  It will do this with 
every command so that if bash crashes or an ssh connection is lost, the 
command history will still be available in the history file.


With R, I see there is a savehistory() command (link below), but I have 
some questions:


(1) does savehistory() append to the current history file or overwrite it?

(2) if it appends, when savehistory() is invoked repeatedly, does it only 
append commands that haven't already been appended?  Or does it append the 
entire history to the file?


(3) Is there a way to invoke history saving automatically so that the file 
is always updated?



I have the impression that savehistory() only overwrites and cannot 
append.  If that's true, I might still get what I want, but I'd be doing a 
lot of unnecessary writing.  Luckily, my R history is usually quite short. 
So if I were to enter 100 commands, appending would write 100 lines, but 
overwriting would write 5050 lines (1 + 2 + ... + 100, since the whole 
file would be rewritten after every command).
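
For what it's worth, a hedged, untested sketch of one way to get bash-like 
behavior is a task callback, which runs after every top-level command:

## savehistory() overwrites, so this rewrites the whole file each time
save_hist <- function(expr, value, ok, visible) {
  try(savehistory("~/.Rhistory"), silent = TRUE)
  TRUE   # returning TRUE keeps the callback registered
}
addTaskCallback(save_hist)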



savehistory:

http://stat.ethz.ch/R-manual/R-devel/library/utils/html/savehistory.html


Best,
Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] netlogo r-extension loadlibrary() failures

2013-08-28 Thread Mike Landis
Trying to access R from Netlogo5 (using the NetLogo R-Extension), 
running the configuration validation tests in 
NetLogo5/extensions/r/Systemcheck.nlogo, I get several loadlibrary() 
errors ...


in rJava Check2,
> library(rJava); .path.package('rJava')
Error : .onLoad failed in loadNamespace() for 'rJava', details:
  call: inDL(x, as.logical(local), as.logical(now), ...)
  error: unable to load shared object 
'D:/Programs/R/R-3.0.1/library/rJava/libs/x64/rJava.dll':

  LoadLibrary failure:  The specified module could not be found.
[ though that is in fact where the rJava.dll is located ]

in the JavaGD Check,
> library(JavaGD); .path.package('JavaGD')
Error : .onLoad failed in loadNamespace() for 'JavaGD', details:
  call: inDL(x, as.logical(local), as.logical(now), ...)
  error: unable to load shared object 
'D:/Programs/R/R-3.0.1/library/JavaGD/libs/x64/JavaGD.dll':

  LoadLibrary failure:  The specified module could not be found.
[ the exact location of the JavaGD.dll ]

In the Check R test, I get...
java.lang.UnsatisfiedLinkError: no R in java.library.path
 at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1878)
 at java.lang.Runtime.loadLibrary0(Runtime.java:849)
 at java.lang.System.loadLibrary(System.java:1087)
 at org.nlogo.extension.r.systemcheck.Entry$RCheck.perform(Entry.java:208)
 at org.nlogo.prim._extern.perform(_extern.java:54)
 at org.nlogo.nvm.Context.stepConcurrent(Context.java:91)
 at org.nlogo.nvm.ConcurrentJob.step(ConcurrentJob.java:82)
 at 
org.nlogo.job.JobThread.org$nlogo$job$JobThread$$runPrimaryJobs(JobThread.scala:143)

 at org.nlogo.job.JobThread$$anonfun$run$1.apply$mcV$sp(JobThread.scala:78)
 at org.nlogo.job.JobThread$$anonfun$run$1.apply(JobThread.scala:76)
 at org.nlogo.job.JobThread$$anonfun$run$1.apply(JobThread.scala:76)
 at scala.util.control.Exception$Catch.apply(Exception.scala:88)
 at org.nlogo.util.Exceptions$.handling(Exceptions.scala:41)
 at org.nlogo.job.JobThread.run(JobThread.scala:75)

NetLogo 5.0
main: org.nlogo.app.AppFrame
thread: AWT-EventQueue-0
Java HotSpot(TM) 64-Bit Server VM 1.7.0_25 (Oracle Corporation; 
1.7.0_25-b16)

operating system: Windows 7 6.1 (amd64 processor)
Scala version 2.9.1.final
JOGL: (3D View not initialized)
OpenGL Graphics: (3D View not initialized)
model: Systemcheck

11:29:31.892 AddJobEvent (org.nlogo.window.ButtonWidget) AWT-EventQueue-0
11:29:31.784 InputBoxLoseFocusEvent (org.nlogo.window.ButtonWidget) 
AWT-EventQueue-0
11:29:31.757 PeriodicUpdateEvent (org.nlogo.app.App$$anon$1 
(org.nlogo.window.GUIWorkspace)) AWT-EventQueue-0
11:29:31.557 PeriodicUpdateEvent (org.nlogo.app.App$$anon$1 
(org.nlogo.window.GUIWorkspace)) AWT-EventQueue-0
11:29:31.357 PeriodicUpdateEvent (org.nlogo.app.App$$anon$1 
(org.nlogo.window.GUIWorkspace)) AWT-EventQueue-0
11:29:31.157 PeriodicUpdateEvent (org.nlogo.app.App$$anon$1 
(org.nlogo.window.GUIWorkspace)) AWT-EventQueue-0
11:29:30.957 PeriodicUpdateEvent (org.nlogo.app.App$$anon$1 
(org.nlogo.window.GUIWorkspace)) AWT-EventQueue-0
11:29:30.757 PeriodicUpdateEvent (org.nlogo.app.App$$anon$1 
(org.nlogo.window.GUIWorkspace)) AWT-EventQueue-0
11:29:30.557 PeriodicUpdateEvent (org.nlogo.app.App$$anon$1 
(org.nlogo.window.GUIWorkspace)) AWT-EventQueue-0
11:29:30.345 PeriodicUpdateEvent (org.nlogo.app.App$$anon$1 
(org.nlogo.window.GUIWorkspace)) AWT-EventQueue-0

[ though $PATH includes %R_HOME%\bin\x64\ where there is an r.dll ]

All three are loadlibrary() errors, where the target is located exactly 
where it can't be loaded from.  What can I do to fix that?


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] descriptive stats by cells in factorial design

2013-08-07 Thread Mike Miller
20 Parent MZ Female 11 notES 42.3 5.362 30.3 38.6 41.8 46.0 56.6 172 0
21 Parent MZ Female 17 notES 46.4 5.177 34.9 42.4 46.0 49.5 63.2 155 0
22 Parent MZ   Male 11    ES 43.4 5.351 31.3 40.0 43.4 46.5 64.7 197 0
23 Parent MZ   Male 11 notES 41.6 4.656 32.1 38.0 41.4 44.6 65.3 331 0
24 Parent MZ   Male 17 notES 46.7 5.242 34.5 43.1 45.9 49.0 63.8 131 0


Thanks to all.

Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] descriptive stats by cells in factorial design

2013-08-06 Thread Mike Miller
descriptivefun <- function(x, ...) {c(mean=mean(x, ...), sd=sd(x, ...), 
quantile(x, ...), N=sum(!is.na(x)), NAs=sum(is.na(x)))}

That will allow me to feed the na.rm=T argument to the mean, sd and 
quantile functions.  By not naming the quantile function (e.g., not using 
q=quantile(x, ...)) I allow the builtin column names to be used unaltered 
(i.e., 0%, 25%, 50%, 75%, 100%).  I also did not use length() because it 
will count NA values and I want to see the sample sizes used for mean, sd 
and quantile.  To deal with that problem I created a function with output 
named "N" to count those sample sizes and one with output named "NAs" to 
count the number of NAs.  Then I do this:



summaryBy(Age ~ Generation + Zygosity + Sex + Cohort + ESstatus, data=x, 
FUN=descriptivefun, na.rm=T)

   Generation Zygosity    Sex Cohort ESstatus Age.mean    Age.sd Age.0% Age.25% Age.50% Age.75% Age.100% Age.N Age.NAs
1   Offspring       DZ Female     11       ES 17.78528 0.3535863  16.93 17.6000  17.775 17.9650    18.92   106       0
2   Offspring       DZ Female     11    notES 18.13679     0.968  16.76 17.8525  18.190 18.4575    19.50   162       0
3   Offspring       DZ Female     17    notES 17.47529 0.4569588  16.56 17.0700  17.590 17.8700    18.29   191       0
4   Offspring       DZ   Male     11       ES 17.76149 0.3467540  17.18 17.5150  17.715    18.1    18.71   134       0
5   Offspring       DZ   Male     11    notES 17.87667 0.5187333  16.83 17.4600  17.860 18.2400    19.02   153       0
6   Offspring       DZ   Male     17    notES 17.50418 0.3915823  16.73 17.1900  17.530 17.8300    18.52   165       0
7   Offspring       MZ Female     11       ES 17.87628 0.4506530  16.86 17.6775  17.805 18.1000    19.12   196       0
8   Offspring       MZ Female     11    notES 18.05739 0.6103713  16.76 17.6300  18.050 18.4200    19.70   291       0
9   Offspring       MZ Female     17    notES 17.41061 0.4956190  16.55 16.9700  17.340 17.8200    18.45   395       0
10  Offspring       MZ   Male     11       ES 17.77174 0.3236917  16.84 17.5800  17.790 17.9700    19.02   195       0
11  Offspring       MZ   Male     11    notES 17.87718 0.6472397  16.56 17.3300  17.855 18.2100    20.01   284       0
12  Offspring       MZ   Male     17    notES 17.49114 0.3961757  16.65 17.1775  17.500 17.8100    18.35   332       0
13     Parent       DZ Female     11       ES 44.61512 5.1246314  32.17 41.3400  44.680 48.2800    57.95   121       0
14     Parent       DZ Female     11    notES 42.54346 4.3670998  34.03 39.3450  42.110 45.5500    57.06   107       0
15     Parent       DZ Female     17    notES 46.30559 4.9177705  36.10 42.7275  45.765 48.3350    62.69    68       0
16     Parent       DZ   Male     11       ES 44.60206 4.5605484  34.31 41.4475  44.890 47.4975    58.75   126       0
17     Parent       DZ   Male     11    notES 42.71121 4.9600561  32.05 39.2400  42.760 45.2700    58.20   157       0
18     Parent       DZ   Male     17    notES 46.77458 4.0226198  40.18 44.1250  46.000 48.8200    61.12    59       0
19     Parent       MZ Female     11       ES 44.23476 5.0214627  29.55 40.6925  44.125 47.7300    56.73   206       0
20     Parent       MZ Female     11    notES 42.31988 5.3622671  30.31 38.6050  41.835 46.0175    56.58   172       0
21     Parent       MZ Female     17    notES 46.36490 5.1770435  34.88 42.4200  45.950 49.4950    63.18   155       0
22     Parent       MZ   Male     11       ES 43.40787 5.3507439  31.28 39.9700  43.440 46.4800    64.65   197       0
23     Parent       MZ   Male     11    notES 41.56363 4.6564818  32.10 38.0250  41.390 44.6450    65.29   331       0
24     Parent       MZ   Male     17    notES 46.69298 5.2421896  34.45 43.1500  45.890 49.0050    63.80   131       0

I think that output looks very nice.  One thing that I don't understand is 
why my function produces %.5f output for every value but the 
doBy::summaryBy() function uses different formats in different columns. 
Compare the above output with this output:



descriptivefun(x$Age)

      mean        sd        0%       25%       50%       75%      100%          N       NAs
  28.49302  13.29077  16.55000  17.65000  18.23000  42.25500  65.29000  4434.00000   0.00000

It's not a big deal, but it would be cool if I could tell 
doBy::summaryBy() how to format the numbers using something like 
format=c(rep("%.2f",7), rep("%d",2)).
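
In the meantime, a hedged workaround (a sketch; as far as I know summaryBy() 
has no such format argument) is to reformat the numeric columns of the 
returned data frame afterwards:

res <- summaryBy(Age ~ Generation + Zygosity + Sex + Cohort + ESstatus,
                 data=x, FUN=descriptivefun, na.rm=T)
num <- sapply(res, is.numeric)
res[num] <- lapply(res[num], sprintf, fmt="%.2f")  # note: also reformats any numeric grouping column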


Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota



On Mon, 5 Aug 2013, David Carlson wrote:

This is a bit simpler. The function quantile() labels the output whereas 
fivenum() does not:


aggregate(Age ~ Generation + Zygosity + Sex + Cohort + ESstatus, data=x,
          function(x) c(mean=mean(x), sd=sd(x), quantile(x)))



On Mon, 5 Aug 2013, Dr. Thomas W. MacFarland wrote:


Dear Dr. Miller:

Pasted below is syntax that should mostly answer your recent question to 
the R mailing list:


descriptivefun <

Re: [R] descriptive stats by cells in factorial design

2013-08-04 Thread Mike Miller
15 Parent DZ Female 17 notES 46.30559 4.9177705 36.10 42.7275 45.765 48.3350 62.69  68
16 Parent DZ   Male 11    ES 44.60206 4.5605484 34.31 41.4475 44.890 47.4975 58.75 126
17 Parent DZ   Male 11 notES 42.71121 4.9600561 32.05 39.2400 42.760 45.2700 58.20 157
18 Parent DZ   Male 17 notES 46.77458 4.0226198 40.18 44.1250 46.000 48.8200 61.12  59
19 Parent MZ Female 11    ES 44.23476 5.0214627 29.55 40.6925 44.125 47.7300 56.73 206
20 Parent MZ Female 11 notES 42.31988 5.3622671 30.31 38.6050 41.835 46.0175 56.58 172
21 Parent MZ Female 17 notES 46.36490 5.1770435 34.88 42.4200 45.950 49.4950 63.18 155
22 Parent MZ   Male 11    ES 43.40787 5.3507439 31.28 39.9700 43.440 46.4800 64.65 197
23 Parent MZ   Male 11 notES 41.56363 4.6564818 32.10 38.0250 41.390 44.6450 65.29 331
24 Parent MZ   Male 17 notES 46.69298 5.2421896 34.45 43.1500 45.890 49.0050 63.80 131


Thanks very much.

Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] rms plot.Predict when type="p"

2013-08-03 Thread Mike Babyak
As always, Frank, thanks for the help.  Much appreciated.

Mike Babyak 
Department of Psychiatry and Behavioral Sciences 
Duke University Medical Center 
R version 3.0.1, Windows 7
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] descriptive stats by cells in factorial design

2013-08-03 Thread Mike Miller
I'm looking for recommendations for a good way to do this.  There must be 
a good function in some package...


I have a 5-way factorial design, two levels per factor, so 32 cells, and I 
mostly just want the means and standard deviations for the contents of 
every cell.  I can write a for loop, but maybe there is a nice way to show 
the model and get the results I want without having to do that.


Similarly, it would be nice to also have the range and maybe some 
percentiles, if there is a function that would just pump them out.


Finally, if something would draw a series of box plots for the cells of 
the factorial design, that could be nice to have.


It seems like this sort of thing must already have been worked out.

Thanks in advance.

Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Editing code in a function? related to allowing zero weights in lmer()

2013-08-02 Thread Mike Rennie
To close the thread, I wanted to post my solution to this issue.

Searching some other help threads, I found this:

https://stat.ethz.ch/pipermail/r-help/2008-August/171321.html

So, here's what I did:

I got at the script for lmerFrames by doing this:

edit(lme4:::lmerFrames)

then edited the code to allow zero weights, and assigned it to a new
function:

my.lmerFrames<-function (...}

and then did this:

library(R.utils)

reassignInPackage("lmerFrames", pkgName="lme4", my.lmerFrames)


Voila, things execute, no error.



Note also that R.utils requires some additional packages (R.oo, R.methodsS3)
before you can use it.
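
For the archive, a hedged alternative that uses only base R's utils package
(assignInNamespace() is documented for this kind of replacement):

my.lmerFrames <- lme4:::lmerFrames   # start from the hidden function
## ... edit my.lmerFrames as needed ...
assignInNamespace("lmerFrames", my.lmerFrames, ns="lme4")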


On Fri, Aug 2, 2013 at 12:03 AM, Mike Rennie  wrote:

> Indeed- thanks for the tips to get me going. This just kicks things up a
> notch for me, having happily used packages without wanting to tinker with
> them till now.
>
> Cheers.
>
>
> On Thu, Aug 1, 2013 at 11:57 PM, Bert Gunter wrote:
>
>> 1. ?"::"
>>
>> 2. ?getAnywhere
>>
>> 3. Please consult the "R language definition" manual -- or even "An
>> Introduction to R" -- to learn about R programming. It **is** a
>> programming language, you know!
>>
>> Cheers,
>> Bert
>>
>> On Thu, Aug 1, 2013 at 8:54 PM, Mike Rennie 
>> wrote:
>> > Hi folks,
>> >
>> > I've not before had to edit code right in a function, but I think I need
>> > to. I am using lmer() and want to use a model that uses zero weights. I
>> > found this thread from 2009:
>> >
>> > https://stat.ethz.ch/pipermail/r-sig-mixed-models/2009q1/001995.html
>> >
>> > but I'm unsure of how I would actually go about editing the code in the
>> > package.
>> >
>> > Part of the problem is that I can't even find lmerFrames(). I can see it
>> > being called in lmer, but if I just type in
>> >
>> > lmerFrames
>> >
>> > or
>> >
>> > fix(lmerFrames)
>> >
>> > I get nothing.
>> >
>> > The note from Doug Bates is that the function is "hidden"- is there a
>> trick
>> > to getting at these hidden functions to edit them?
>> >
>> > Any thoughts or suggestions welcome.
>> >
>> > Cheers,
>> >
>> > --
>> > Michael Rennie, Research Scientist
>> > Fisheries and Oceans Canada, Freshwater Institute
>> > Winnipeg, Manitoba, CANADA
>> >
>> > [[alternative HTML version deleted]]
>> >
>> > __
>> > R-help@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>
>> --
>>
>> Bert Gunter
>> Genentech Nonclinical Biostatistics
>>
>> Internal Contact Info:
>> Phone: 467-7374
>> Website:
>>
>> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>>
>
>
>
> --
> Michael Rennie, Research Scientist
> Fisheries and Oceans Canada, Freshwater Institute
> Winnipeg, Manitoba, CANADA
>



-- 
Michael Rennie, Research Scientist
Fisheries and Oceans Canada, Freshwater Institute
Winnipeg, Manitoba, CANADA

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] install-packages

2013-08-01 Thread Mike Rennie
Try typing this into the Google search bar:

[R] install packages

The majority of the results on the first page will help you out.


On Thu, Aug 1, 2013 at 8:43 PM, Said Filahi  wrote:

> hello,
> i am new and I want to know how to install a packare on R
>
> thank you
>
>
> said filahi
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Michael Rennie, Research Scientist
Fisheries and Oceans Canada, Freshwater Institute
Winnipeg, Manitoba, CANADA

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Editing code in a function?

2013-08-01 Thread Mike Rennie
Hi folks,

I've not before had to edit code right in a function, but I think I need
to. I am using lmer() and want to use a model that uses zero weights. I
found this thread from 2009:

https://stat.ethz.ch/pipermail/r-sig-mixed-models/2009q1/001995.html

but I'm unsure of how I would actually go about editing the code in the
package.

Part of the problem is that I can't even find lmerFrames(). I can see it
being called in lmer, but if I just type in

lmerFrames

or

fix(lmerFrames)

I get nothing.

The note from Doug Bates is that the function is "hidden"- is there a trick
to getting at these hidden functions to edit them?

Any thoughts or suggestions welcome.

Cheers,

-- 
Michael Rennie, Research Scientist
Fisheries and Oceans Canada, Freshwater Institute
Winnipeg, Manitoba, CANADA

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] rms plot.Predict when type="p"

2013-08-01 Thread Mike Babyak
Hi all,

I'm trying to change the pch, color, lty, and lwd in a type="p" plot
produced by plot.Predict in the rms package.  What I'm shooting for is a
separate plotting symbol and symbol color for each level of a factor (and
also to manage the line width for the CIs).
Here's what I hope is a working example

#make up some data
n = 30
x1   = runif(n)
group = factor(rep(1:3, length.out=n))
e = rnorm(n, 0, 1)

#population model
y = as.numeric(group) + 0.2*x1 + e

#make a dataframe
population.df <- data.frame(x1,group,y)
#clean up
rm(n,x1,group,e,y)

#load rms package
require(rms)

#store predictor characteristics
d<-datadist(population.df)
options(datadist="d")

# model.
f1 <- ols(y ~ x1+group, population.df)
#model result
summary(f1)

#get predicted values
pf1<-Predict(f1,group)

#plot
myplot<-plot(pf1,~group,nlines=T,type='p',
ylab='fitted Y',
xlab='Treatment')

#print plot with defaults
myplot

#try to alter settings
plot.symbol.settings<-trellis.par.get("plot.symbol")
#have a look at settings
str(plot.symbol.settings)


plot.symbol.settings$pch<-c(1,2,3)
plot.symbol.settings$col<-c(2,1,3)
plot.symbol.settings$lwd<-3
plot.symbol.settings$cex<-1.5

trellis.par.set("plot.symbol", plot.symbol.settings)

#print again with new settings
myplot

As you can see, only the first pch and color are passed.  So, obviously I
am missing something important.

Also, I notice that the lwd argument has no effect, so I am assuming this
is controlled by something else, but I haven't found it yet.

I'd be grateful if somebody could point me in the right direction.
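
One thing worth checking, offered as a sketch rather than a confirmed fix:
when points are grouped, lattice takes its symbol settings from
"superpose.symbol" rather than "plot.symbol", so the analogous edit would be

sup <- trellis.par.get("superpose.symbol")
sup$pch <- c(1,2,3)
sup$col <- c(2,1,3)
sup$cex <- 1.5
trellis.par.set("superpose.symbol", sup)
myplot   # reprint with the new settings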

Thanks,

Mike Babyak
Department of Psychiatry and Behavioral Sciences
Duke University Medical Center
R version 3.0.1, Windows 7

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ggmap, hexbin and length distortion in Lat/long

2013-05-24 Thread Mike Bock
I am working with spatial data in ggmap, generally with great success. I have a 
huge data set with the coordinates in NAD 83 UTM Zone 11 (meters).  To map the 
data, the coordinates were converted to lat/long in GIS prior to use in R and 
ggmap/ggplot. I am using hexagonal binning to aggregate the data:
#create bins and calculate stats

hb<-hexbin(DF$lon,DF$lat,xbins=80,IDs=TRUE)
hb.avg<-hexTapply(hb,DF$Res,mean,na.rm=TRUE)
hb.mx<-hexTapply(hb,DF$Res,max,na.rm=TRUE)
hb.p80<-hexTapply(hb,DF$Res,quantile,prob=0.80,na.rm=TRUE)
#create df for ggplot
hx_dat <- data.frame(hcell2xy(hb), count = hb@count,
xo = hb@xcm, yo = hb@ycm, Mean=hb.avg,Max=hb.mx,
p80=hb.p80)

#Base Map
#BBox is the bounding box
Base<-get_map(BBox,source='google')
m_hx<-ggmap(Base,legend = "bottom", base_layer=ggplot(aes(x=x,y=y),data=hx_dat))
#Map of means
a<-0.55
hc<-'grey60'

m_hx+geom_hex(aes(x = x, y = y, fill = Mean),
  color = hc, alpha = a, stat = "identity") +
  scale_fill_gradientn("Mean", colours = rev(rainbow(4)), trans = 'sqrt')

...and so on for other stats
I can also run statistical analyses on hx_dat.
By creating hexbins based on lat/long it seems there will be distortion due to 
the differences in length of a degree at different locations on the earth's 
surface. What is the most efficient way to eliminate this distortion? Should I 
run hexbin in NAD83 and convert the x/y coordinates to lat/long? Can I get 
ggmap to convert the baselayer to NAD83 and just do everything in NAD83 (my 
preferred option)?
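
One candidate approach, sketched here under the assumption that the binning 
should happen in projected meters (the sp and rgdal packages below are not 
part of the original code):

library(sp); library(rgdal)
pts <- SpatialPoints(DF[, c("lon", "lat")],
                     proj4string = CRS("+proj=longlat +datum=NAD83"))
utm <- spTransform(pts, CRS("+proj=utm +zone=11 +datum=NAD83 +units=m"))
hb  <- hexbin(coordinates(utm)[,1], coordinates(utm)[,2], xbins=80, IDs=TRUE)
## ...then spTransform() the hexagon centers back to lon/lat for ggmap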
I have tried converting lat/long to NAD83 and back, but the coordinates are 
coming up in the eastern Pacific and not in California, so I am missing 
something, and I am not sure that is the best way to solve the problem anyway. 
Thanks in advance; any help is greatly appreciated.
Mike

Michael J. Bock, PhD | Senior Manager
mb...@environcorp.com








__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Behaviors of diag() with character vector in R 3.0.0

2013-04-09 Thread Mike Cheung
Thanks, A.K.

I managed to create diagonal matrices for character vectors. Since this new
behavior broke a package that I have written, I would like to make sure that
it was not introduced by mistake. If this new behavior is final, I will
modify my code to fit it.

Mike
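
A small wrapper along the lines of A.K.'s suggestion (a sketch):

chardiag <- function(v) {
  m <- diag(length(v))   # numeric identity matrix
  diag(m) <- v           # the replacement coerces the matrix to character
  m
}
chardiag(c("a", "b"))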


On Tue, Apr 9, 2013 at 10:14 PM, arun  wrote:

> Hi,
> You could try this:
> v <- c("a", "b")
> mat1<-diag(length(v))
>  diag(mat1)<- v
>  mat1
> # [,1] [,2]
> #[1,] "a"  "0"
> #[2,] "0"  "b"
>
>
>  v1<- letters[1:5]
> mat2<- diag(length(v1))
>  diag(mat2)<- v1
> mat2
> # [,1] [,2] [,3] [,4] [,5]
> #[1,] "a"  "0"  "0"  "0"  "0"
> #[2,] "0"  "b"  "0"  "0"  "0"
> #[3,] "0"  "0"  "c"  "0"  "0"
> #[4,] "0"  "0"  "0"  "d"  "0"
> #[5,] "0"  "0"  "0"  "0"  "e"
> A.K.
>
>
>
>
> - Original Message -
> From: Mike Cheung 
> To: r-help 
> Cc:
> Sent: Tuesday, April 9, 2013 3:15 AM
> Subject: [R] Behaviors of diag() with character vector in R 3.0.0
>
> Dear all,
>
> According to CHANGES IN R 3.0.0:
> o diag() as used to generate a diagonal matrix has been re-written
>   in C for speed and less memory usage.  It now forces the result
>   to be numeric in the case diag(x) since it is said to have 'zero
>   off-diagonal entries'.
>
> diag(x) does not work for character vector in R 3.0.0 any more. For
> example,
> v <- c("a", "b")
>
> ## R 2.15.3
> diag(v)
>  [,1] [,2]
> [1,] "a"  "0"
> [2,] "0"  "b"
>
> ## R 3.0.0
> diag(v)
>  [,1] [,2]
> [1,]   NA0
> [2,]0   NA
> Warning message:
> In diag(v) : NAs introduced by coercion
>
> Regarding the character matrix, it still works. For example,
> m <- matrix(c("a", "b", "c", "d"), nrow=2)
> diag(m)
> ## Both R 2.15.3 and 3.0.0
> [1] "a" "d"
>
> n <- matrix(0, ncol=2, nrow=2)
> diag(n) <- v
> n
> ## Both R 2.15.3 and 3.0.0
>  [,1] [,2]
> [1,] "a"  "0"
> [2,] "0"  "b"
>
> I understand that the above behavior follows exactly what the manual says.
> It appears to me that the version in 2.15.3 is more general as it works for
> both numeric and character vectors and matrices, whereas the version in
> 3.0.0 works for character matrices but not character vectors.
>
> Would it be possible to retain the behaviors of diag() for character
> vectors? Thanks.
>
> Mike
> --
> -
> Mike W.L. Cheung   Phone: (65) 6516-3702
> Department of Psychology   Fax:   (65) 6773-1843
> National University of Singapore
> http://courses.nus.edu.sg/course/psycwlm/internet/
> -
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>


-- 
-
 Mike W.L. Cheung   Phone: (65) 6516-3702
 Department of Psychology   Fax:   (65) 6773-1843
 National University of Singapore
 http://courses.nus.edu.sg/course/psycwlm/internet/
-

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Behaviors of diag() with character vector in R 3.0.0

2013-04-09 Thread Mike Cheung
Dear all,

According to CHANGES IN R 3.0.0:
 o diag() as used to generate a diagonal matrix has been re-written
  in C for speed and less memory usage.  It now forces the result
  to be numeric in the case diag(x) since it is said to have 'zero
  off-diagonal entries'.

diag(x) does not work for character vector in R 3.0.0 any more. For example,
v <- c("a", "b")

## R 2.15.3
diag(v)
 [,1] [,2]
[1,] "a"  "0"
[2,] "0"  "b"

## R 3.0.0
diag(v)
 [,1] [,2]
[1,]   NA0
[2,]0   NA
Warning message:
In diag(v) : NAs introduced by coercion

Regarding the character matrix, it still works. For example,
m <- matrix(c("a", "b", "c", "d"), nrow=2)
diag(m)
## Both R 2.15.3 and 3.0.0
[1] "a" "d"

n <- matrix(0, ncol=2, nrow=2)
diag(n) <- v
n
## Both R 2.15.3 and 3.0.0
 [,1] [,2]
[1,] "a"  "0"
[2,] "0"  "b"

I understand that the above behavior follows exactly what the manual says.
It appears to me that the version in 2.15.3 is more general as it works for
both numeric and character vectors and matrices, whereas the version in
3.0.0 works for character matrices but not character vectors.

Would it be possible to retain the behaviors of diag() for character
vectors? Thanks.

Mike
-- 
-
 Mike W.L. Cheung   Phone: (65) 6516-3702
 Department of Psychology   Fax:   (65) 6773-1843
 National University of Singapore
 http://courses.nus.edu.sg/course/psycwlm/internet/
-

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] prcomp() and varimax()

2013-04-07 Thread Mike Amato
Thanks for the reply. Maybe my problem is that prcomp() and varimax() 
are calculating "cumulative proportion of variance" differently?
When I use the tol parameter with prcomp(), it restricts the number of 
components to 3 and reports that the cumulative variance explained by 
the third component is 100%.
But when I try to pass that 3-component analysis to varimax(), the 
cumulative variance of the third component drops to 20%. The cumulative 
proportion of variance explained by a component should not change 
following rotation, so it seems like it should be either 50% (as in the 
original 15 component model pca1) or else 75% (as in the smaller 
unrotated model pca2). But component 3 in the rotated model (pca3) has a 
value that is neither of those.

I suspect maybe I am not using varimax() correctly, especially because 
it doesn't make sense that all components in the rotated model (pca3) 
would explain an identical amount of variance; this is real data, so the 
first component should explain more variance than the second, and so on.

Thanks for the help,
Mike
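
For later readers, a hedged sketch of one explanation: varimax() is normally 
applied to loadings (eigenvectors scaled by the component standard 
deviations), while pca2$rotation has unit-length columns, which would make 
every rotated column report the same 1/15 share of variance.

rawLoadings <- pca2$rotation %*% diag(pca2$sdev[1:3], 3, 3)
pca3b <- varimax(rawLoadings)   # pca3b is a hypothetical name, not from the thread
pca3b$loadings                  # SS loadings should now track explained variance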



On 4/7/2013 6:38 AM, S Ellison wrote:
>> My concern is with the reported proportions of variance for the 3
>> components after varimax rotation. It looks like each of my 3 components
>> explains 1/15 of the total variance, summing to a cumulative proportion
>> of 20% of variance explained. But those 3 components I retained should
>> now be the only components in the analysis, so they should be able to
>> account for 100% of the explained variance.
> Am I misreading what you just wrote? One percentage (20%) is a proportion of 
> the total variance in the data; the other is the proportion of the variance 
> explained by the model. These are different things; they should not usually 
> be the same.
>
> ***
> This email and any attachments are confidential. Any use, copying or
> disclosure other than by the intended recipient is unauthorised. If
> you have received this message in error, please notify the sender
> immediately via +44(0)20 8943 7000 or notify postmas...@lgcgroup.com
> and delete this message and any copies from your computer and network.
> LGC Limited. Registered in England 2991879.
> Registered office: Queens Road, Teddington, Middlesex, TW11 0LY, UK
Hello,
I am attempting to do a principal components analysis on 15 survey 
items. I want to use a varimax rotation on the retained components, but 
I am dubious of the output I am getting, and so I suspect I am doing 
something wrong. I proceed in the following steps:

  1) use prcomp() to inspect all 15 components, and decide which to retain
  2) run prcomp() again, using the "tol" parameter to omit unwanted 
components
  3) pass the output of step 2 to varimax()

My concern is with the reported proportions of variance for the 3 
components after varimax rotation. It looks like each of my 3 components 
explains 1/15 of the total variance, summing to a cumulative proportion 
of 20% of variance explained. But those 3 components I retained should 
now be the only components in the analysis, so they should be able to 
account for 100% of the explained variance.

I am able to get reliable seeming results using principal() from the 
"psych" package, in which the total amount of variance explained by my 
retained components does not differ before or after rotation. But 
principal() uses varimax(), so I suspect I am either doing something 
wrong or misinterpreting the output when using the base package functions.

Am I doing something wrong when attempting to retain only 3 components?
Am I using varimax() incorrectly?
Am I misinterpreting the returned values from varimax()?

Thanks for any help,
Mike



Here is a link to the data file I am using: 
https://www.dropbox.com/s/scypebzy0nnhlwk/pca_sampledata.txt

### step 1 ###
 > d1 = read.table("pca_sampledata.txt", T)
 > m1 = with(d1, ~ g.enjoy + g.look + g.cost + g.fit + g.health + 
g.resale + b.withstand + b.satisfy + b.vegetated + b.everyone + b.harmed 
+ b.eco + b.ingenuity + b.security + b.proud)
 > pca1 = prcomp(m1)
 > summary(pca1) #output truncated for this posting
Importance of components:
   PC1PC2PC3 PC4 PC5 ... PC15
Standard deviation 1.5531 1.3064 1.1695 0.93512 0.92167 ... 0.35500
Proportion of Variance 0.2199 0.1556 0.1247 0.07972 0.07744 ... 0.01149
Cumulative Proportion  0.2199 0.3755 0.5002 0.57988 0.65732 ... 1.0


### step 2 ###
 > pca2 = prcomp(m1, tol=.75)
 > summary(pca2) #full output shown
Importance of components:
   PC1PC2PC3
Standard deviation 1.5531 1.3064 1.1695
Proportion of Variance 0.4397 0.3111 0.2493
Cumulative Proportion  0.4397 0.7507 1.


### step 3 ###
 > pca3 = varimax(pca2$rotation)
 > pca3
 > 

[R] prcomp() and varimax()

2013-04-04 Thread Mike Amato

Hello,
I am attempting to do a principal components analysis on 15 survey 
items. I want to use a varimax rotation on the retained components, but 
I am dubious of the output I am getting, and so I suspect I am doing 
something wrong. I proceed in the following steps:


 1) use prcomp() to inspect all 15 components, and decide which to retain
 2) run prcomp() again, using the "tol" parameter to omit unwanted 
components

 3) pass the output of step 2 to varimax()

My concern is with the reported proportions of variance for the 3 
components after varimax rotation. It looks like each of my 3 components 
explains 1/15 of the total variance, summing to a cumulative proportion 
of 20% of variance explained. But those 3 components I retained should 
now be the only components in the analysis, so they should be able to 
account for 100% of the explained variance.


I am able to get reliable seeming results using principal() from the 
"psych" package, in which the total amount of variance explained by my 
retained components does not differ before or after rotation. But 
principal() uses varimax(), so I suspect I am either doing something 
wrong or misinterpreting the output when using the base package functions.


Am I doing something wrong when attempting to retain only 3 components?
Am I using varimax() incorrectly?
Am I misinterpreting the returned values from varimax()?

Thanks for any help,
Mike



Here is a link to the data file I am using: 
https://www.dropbox.com/s/scypebzy0nnhlwk/pca_sampledata.txt


### step 1 ###
> d1 = read.table("pca_sampledata.txt", T)
> m1 = with(d1, ~ g.enjoy + g.look + g.cost + g.fit + g.health + 
g.resale + b.withstand + b.satisfy + b.vegetated + b.everyone + b.harmed 
+ b.eco + b.ingenuity + b.security + b.proud)

> pca1 = prcomp(m1)
> summary(pca1) #output truncated for this posting
Importance of components:
  PC1PC2PC3 PC4 PC5 ...PC15
Standard deviation 1.5531 1.3064 1.1695 0.93512 0.92167 ... 0.35500
Proportion of Variance 0.2199 0.1556 0.1247 0.07972 0.07744 ... 0.01149
Cumulative Proportion  0.2199 0.3755 0.5002 0.57988 0.65732 ... 1.0


### step 2 ###
> pca2 = prcomp(m1, tol=.75)
> summary(pca2) #full output shown
Importance of components:
  PC1PC2PC3
Standard deviation 1.5531 1.3064 1.1695
Proportion of Variance 0.4397 0.3111 0.2493
Cumulative Proportion  0.4397 0.7507 1.


### step 3 ###
> pca3 = varimax(pca2$rotation)
> pca3
> ...
>  PC1   PC2   PC3
> SS loadings1.000 1.000 1.000
> Proportion Var 0.067 0.067 0.067
> Cumulative Var 0.067 0.133 0.200

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Question: how to convert raw to numeric

2013-04-03 Thread Mike Chen
I know that there is a function to convert binary data to string named
rawToChar.but  I wander is there any similar function for "Integer" and
"float".I need to read some binary file in "integer" and "float" data.
I can do this job in this way: (as below)
first convert 4 byte raw to bits then pack bits back to "integer", but it
not work for "float",I worry about the performance.

raw4 = raw_buffer[1:4]

bits = rawToBits(raw4)

packBits(bits, type = "integer")
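
One hedged suggestion: readBin() already interprets raw vectors as integers
or floats directly, which should avoid the bit-level round trip:

readBin(raw_buffer, what = "integer", n = 1, size = 4)
readBin(raw_buffer, what = "double",  n = 1, size = 4)  # size = 4 reads a single-precision float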


Best wishes

Really hope to get your response

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Finding the knots in a smoothing spline using nknots

2013-03-06 Thread Mike Nielsen
Thanks, David!  That makes sense.  I shall re-read the manual page.
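
For the archive, a sketch of the rescaling described below, using the toy
example from the original post (untested):

knots_x <- ss$fit$min + ss$fit$range * unique(ss$fit$knot)
plot(x, y, type="l")
points(knots_x, predict(ss, knots_x)$y, pch=19)   # knots mapped back to the x scale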

Regards,

Mike Nielsen


On Wed, Feb 27, 2013 at 12:19 PM, David Winsemius wrote:

>
> On Feb 27, 2013, at 6:39 AM, Mike Nielsen wrote:
>
> > Hi r-helpers.
> >
> > Please forgive my ignorance, but I would like to plot a smoothing spline
> > (smooth.spline) from package "stats", and show the knots in the plot,
> and I
> > can't seem to figure out where smooth.spline has located the knots (when
> I
> > use nknots).  Unfortunately, I don't know a lot about splines, but I know
> > that they provide me an easy way to estimate the location of local maxima
> > and minima on varying time-scales (number of knots) in my original data.
> >
> > I see there is a fit$knot, but it's not clear to me what those values
> are:
> > for some reason I had expected that they would be contained in my
> original
> > y values, but they're not.
>
> It appears they are in the range [0, 1], and ss$fit$min and
> ss$fit$range provide the scaling data (for the x-values rather than
> the y-values):
>
> > unique(ss$fit$knot)
>  [1] 0. 0.04095904 0.08291708 0.12487512 0.16583417 0.20779221
> 0.24975025 0.29070929
>  [9] 0.33266733 0.37462537 0.41658342 0.45754246 0.49950050 0.54145854
> 0.58241758 0.62437562
> [17] 0.66633367 0.70829171 0.74925075 0.79120879 0.83316683 0.87412587
> 0.91608392 0.95804196
> [25] 1.
>
> I would think that in your case, with x0 being 0, you could just use
> ss$fit$range*unique(ss$fit$knot) as your knot positions. In the more
> general case you would need to add ss$fit$min. I tried confirming this
> hunch by looking at "Statistical Models in S", at MASS (4th ed.), and at
> the R code, but the R code calls a FORTRAN routine, so you would need to
> pull the source to confirm.
>
> --
> David.
>
> >  I tried generating nknots equally spaced points
> > in my x, but when I plotted the points that corresponded to my original y
> > values at those equally-spaced x values, I found that the spline did not
> > pass through them, which, perhaps naively, I thought it might.
> >
> > Also, the manual says that yin comprises "the y values used at the
> unique y
> > values" -- should this read "at the unique x values"?
> >
> > Could someone kindly point to a resource where I can get a slightly
> fuller
> > explanation?  I looked at the code for smooth.spline, but can't readily
> > follow it.
> >
> > Here's a toy example:
> >
> >> x<-seq(from=0,to=4*pi,length=1002)
> >> y<-sin(x)
> >> ss<-smooth.spline(x,y=y,all.knots=F,nknots=25)
> >> ss
> > Call:
> > smooth.spline(x = x, y = y, all.knots = F, nknots = 25)
> >
> > Smoothing Parameter  spar= -0.4573636  lambda= 1.006117e-09 (14
> iterations)
> > Equivalent Degrees of Freedom (Df): 26.99935
> > Penalized Criterion: 3.027077e-06
> > GCV: 3.190666e-09
> >> str(ss)
> > List of 15
> > $ x   : num [1:1002] 0 0.0126 0.0251 0.0377 0.0502 ...
> > $ y   : num [1:1002] 2.88e-05 1.26e-02 2.51e-02 3.77e-02 5.02e-02 ...
> > $ w   : num [1:1002] 1 1 1 1 1 1 1 1 1 1 ...
> > $ yin : num [1:1002] 0 0.0126 0.0251 0.0377 0.0502 ...
> > $ data:List of 3
> >  ..$ x: num [1:1002] 0 0.0126 0.0251 0.0377 0.0502 ...
> >  ..$ y: num [1:1002] 0 0.0126 0.0251 0.0377 0.0502 ...
> >  ..$ w: num [1:1002] 1 1 1 1 1 1 1 1 1 1 ...
> > $ lev : num [1:1002] 0.2238 0.177 0.1399 0. 0.0891 ...
> > $ cv.crit : num 3.19e-09
> > $ pen.crit: num 3.03e-06
> > $ crit: num 3.19e-09
> > $ df  : num 27
> > $ spar: num -0.457
> > $ lambda  : num 1.01e-09
> > $ iparms  : Named int [1:3] 1 0 14
> >  ..- attr(*, "names")= chr [1:3] "icrit" "ispar" "iter"
> > $ fit :List of 5
> >  ..$ knot : num [1:31] 0 0 0 0 0.041 ...
> >  ..$ nk   : num 27
> >  ..$ min  : num 0
> >  ..$ range: num 12.6
> >  ..$ coef : num [1:27] 2.88e-05 1.72e-01 5.19e-01 9.04e-01 1.05 ...
> >  ..- attr(*, "class")= chr "smooth.spline.fit"
> > $ call: language smooth.spline(x = x, y = y, all.knots = F, nknots =
> > 25)
> > - attr(*, "class")= chr "smooth.spline"
> >>
> >
> > Many thanks!
> >
> >
> > Regards,
> >
> > Mike Nielsen
> >
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius
> Alameda, CA, USA
>
>
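
For the archive, a minimal sketch checking David's hunch that the knot
positions on the x scale are ss$fit$min + ss$fit$range * unique(ss$fit$knot):

x <- seq(from = 0, to = 4*pi, length = 1002)
y <- sin(x)
ss <- smooth.spline(x, y = y, all.knots = FALSE, nknots = 25)
knots.x <- ss$fit$min + ss$fit$range * unique(ss$fit$knot)
plot(x, y, type = "l")
points(knots.x, predict(ss, knots.x)$y, pch = 19)   # knots overlaid on the fit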


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] solving x in a polynomial function

2013-03-01 Thread Mike Rennie
Doh - I'm a moron. I get it now. The last line is the confirmation that
the function "realroots" is working. Sorry - late in the day on a Friday.

Thanks everyone for your help with this- Uber-useful, and much appreciated.

Mike
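
To pull the thread together, here is the working solution in one runnable
piece (Rui's function with Peter's coef() fix applied):

a <- 1:10
b <- c(1, 2, 2.5, 3, 3.5, 4, 6, 7, 7.5, 8)
dat <- data.frame(a = a, b = b)
po.lm <- lm(a ~ b + I(b^2) + I(b^3) + I(b^4), data = dat)

realroots <- function(model, y) {
    is.zero <- function(x, tol = .Machine$double.eps^0.5) abs(x) < tol
    if (names(coef(model))[1] == "(Intercept)")
        r <- polyroot(c(coef(model)[1] - y, coef(model)[-1]))
    else
        r <- polyroot(c(-y, coef(model)))
    Re(r[is.zero(Im(r))])
}

r <- realroots(po.lm, 5)                      # real b values where fitted a = 5
predict(po.lm, newdata = data.frame(b = r))   # confirmation: both predictions are 5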

On 3/1/13, Mike Rennie  wrote:
> Hi Peter,
>
> With the edit you suggested, now I'm just getting back the value of a
> that I put in, not the expected value of b...
>
>> cbind(a,b)
>        a   b
>  [1,]  1 1.0
>  [2,]  2 2.0
>  [3,]  3 2.5
>  [4,]  4 3.0
>  [5,]  5 3.5
>  [6,]  6 4.0
>  [7,]  7 6.0
>  [8,]  8 7.0
>  [9,]  9 7.5
> [10,] 10 8.0
>
> #a of 5 should return b of 3.5
>
>> realroots <- function(model, b){
> + is.zero <- function(x, tol = .Machine$double.eps^0.5) abs(x) <
> tol
> + if(names(coef(model))[1] == "(Intercept)")
> +
> + r <- polyroot(c(coef(model)[1] - b, coef(model)[-1]))
> + else
> + r <- polyroot(c(-b, coef(model)))
> + Re(r[is.zero(Im(r))])
> + }
>>
>> r <- realroots(po.lm, 5)
>> predict(po.lm, newdata = data.frame(b = r))
> 1 2
> 5 5
>
>
> This function just returns what I feed it as written.
>
> Mike
>
> On 3/1/13, Peter Ehlers  wrote:
>> On 2013-03-01 13:06, Mike Rennie wrote:
>>> Hi guys,
>>>
>>> Perhaps on the right track? However, not sure if it's correct. I fixed
>>> the bug that A.K. indicated above (== vs =), but the values don't seem
>>> right. From the original data, an a of 3 should give b of 2.5.
>>>
>>>> realroots <- function(model, b){
>>> + is.zero <- function(x, tol = .Machine$double.eps^0.5) abs(x) <
>>> tol
>>> + if(names(model)[1] == "(Intercept)")
>>> + r <- polyroot(c(coef(model)[1] - b, coef(model)[-1]))
>>> + else
>>> + r <- polyroot(c(-b, coef(model)))
>>> + Re(r[is.zero(Im(r))])
>>> + }
>>>>
>>>> r <- realroots(po.lm, 3)
>>>> predict(po.lm, newdata = data.frame(b = r)) # confirm
>>> 1
>>> 1.69
>>>
>>> So I think there's a calculation error somewhere.
>>
>> You need to replace the following line
>>
>>if(names(model)[1] == "(Intercept)")
>>
>> with
>>
>>if(names(coef(model))[1] == "(Intercept)")
>>
>>
>> Peter Ehlers
>>
>>
>>>
>>>
>>> On 3/1/13, arun  wrote:
>>>> Hi Rui,
>>>>
>>>> Looks like a bug:
>>>> ###if(names(model)[1] = "(Intercept)")
>>>> Should this be:
>>>>
>>>> if(names(model)[1] == "(Intercept)")
>>>>
>>>> A.K.
>>>>
>>>>
>>>>
>>>> - Original Message -
>>>> From: Rui Barradas 
>>>> To: Mike Rennie 
>>>> Cc: r-help Mailing List 
>>>> Sent: Friday, March 1, 2013 3:18 PM
>>>> Subject: Re: [R] solving x in a polynomial function
>>>>
>>>> Hello,
>>>>
>>>> Try the following.
>>>>
>>>>
>>>> a <- 1:10
>>>> b <- c(1, 2, 2.5, 3, 3.5, 4, 6, 7, 7.5, 8)
>>>>
>>>> dat <- data.frame(a = a, b = b)  # for lm(), it's better to use a df
>>>> po.lm <- lm(a~b+I(b^2)+I(b^3)+I(b^4), data = dat); summary(po.lm)
>>>>
>>>>
>>>> realroots <- function(model, b){
>>>>  is.zero <- function(x, tol = .Machine$double.eps^0.5) abs(x) < tol
>>>>  if(names(model)[1] = "(Intercept)")
>>>>  r <- polyroot(c(coef(model)[1] - b, coef(model)[-1]))
>>>>  else
>>>>  r <- polyroot(c(-b, coef(model)))
>>>>  Re(r[is.zero(Im(r))])
>>>> }
>>>>
>>>> r <- realroots(po.lm, 5.5)
>>>> predict(po.lm, newdata = data.frame(b = r))  # confirm
>>>>
>>>>
>>>>
>>>> Hope this helps,
>>>>
>>>> Rui Barradas
>>>>
>>>> Em 01-03-2013 18:47, Mike Rennie escreveu:
>>>>> Hi there,
>>>>>
>>>>> Does anyone know how I solve for x from a given y in a polynomial
>>>>> function? Here's some example code:
>>>>>
>>>>> ##example file
>>>>>
>>>>>

Re: [R] solving x in a polynomial function

2013-03-01 Thread Mike Rennie
Hi Peter,

With the edit you suggested, now I'm just getting back the value of a
that I put in, not the expected value of b...

> cbind(a,b)
       a   b
 [1,]  1 1.0
 [2,]  2 2.0
 [3,]  3 2.5
 [4,]  4 3.0
 [5,]  5 3.5
 [6,]  6 4.0
 [7,]  7 6.0
 [8,]  8 7.0
 [9,]  9 7.5
[10,] 10 8.0

#a of 5 should return b of 3.5

> realroots <- function(model, b){
+ is.zero <- function(x, tol = .Machine$double.eps^0.5) abs(x) < tol
+ if(names(coef(model))[1] == "(Intercept)")
+
+ r <- polyroot(c(coef(model)[1] - b, coef(model)[-1]))
+ else
+ r <- polyroot(c(-b, coef(model)))
+ Re(r[is.zero(Im(r))])
+ }
>
> r <- realroots(po.lm, 5)
> predict(po.lm, newdata = data.frame(b = r))
1 2
5 5


This function just returns what I feed it as written.

Mike

On 3/1/13, Peter Ehlers  wrote:
> On 2013-03-01 13:06, Mike Rennie wrote:
>> Hi guys,
>>
>> Perhaps on the right track? However, not sure if it's correct. I fixed
>> the bug that A.K. indicated above (== vs =), but the values don't seem
>> right. From the original data, an a of 3 should give b of 2.5.
>>
>>> realroots <- function(model, b){
>> + is.zero <- function(x, tol = .Machine$double.eps^0.5) abs(x) <
>> tol
>> + if(names(model)[1] == "(Intercept)")
>> + r <- polyroot(c(coef(model)[1] - b, coef(model)[-1]))
>> + else
>> + r <- polyroot(c(-b, coef(model)))
>> + Re(r[is.zero(Im(r))])
>> + }
>>>
>>> r <- realroots(po.lm, 3)
>>> predict(po.lm, newdata = data.frame(b = r)) # confirm
>> 1
>> 1.69
>>
>> So I think there's a calculation error somewhere.
>
> You need to replace the following line
>
>if(names(model)[1] == "(Intercept)")
>
> with
>
>if(names(coef(model))[1] == "(Intercept)")
>
>
> Peter Ehlers
>
>
>>
>>
>> On 3/1/13, arun  wrote:
>>> Hi Rui,
>>>
>>> Looks like a bug:
>>> ###if(names(model)[1] = "(Intercept)")
>>> Should this be:
>>>
>>> if(names(model)[1] == "(Intercept)")
>>>
>>> A.K.
>>>
>>>
>>>
>>> - Original Message -
>>> From: Rui Barradas 
>>> To: Mike Rennie 
>>> Cc: r-help Mailing List 
>>> Sent: Friday, March 1, 2013 3:18 PM
>>> Subject: Re: [R] solving x in a polynomial function
>>>
>>> Hello,
>>>
>>> Try the following.
>>>
>>>
>>> a <- 1:10
>>> b <- c(1, 2, 2.5, 3, 3.5, 4, 6, 7, 7.5, 8)
>>>
>>> dat <- data.frame(a = a, b = b)  # for lm(), it's better to use a df
>>> po.lm <- lm(a~b+I(b^2)+I(b^3)+I(b^4), data = dat); summary(po.lm)
>>>
>>>
>>> realroots <- function(model, b){
>>>  is.zero <- function(x, tol = .Machine$double.eps^0.5) abs(x) < tol
>>>  if(names(model)[1] = "(Intercept)")
>>>  r <- polyroot(c(coef(model)[1] - b, coef(model)[-1]))
>>>  else
>>>  r <- polyroot(c(-b, coef(model)))
>>>  Re(r[is.zero(Im(r))])
>>> }
>>>
>>> r <- realroots(po.lm, 5.5)
>>> predict(po.lm, newdata = data.frame(b = r))  # confirm
>>>
>>>
>>>
>>> Hope this helps,
>>>
>>> Rui Barradas
>>>
>>> Em 01-03-2013 18:47, Mike Rennie escreveu:
>>>> Hi there,
>>>>
>>>> Does anyone know how I solve for x from a given y in a polynomial
>>>> function? Here's some example code:
>>>>
>>>> ##example file
>>>>
>>>> a<-1:10
>>>>
>>>> b<-c(1,2,2.5,3,3.5,4,6,7,7.5,8)
>>>>
>>>> po.lm<-lm(a~b+I(b^2)+I(b^3)+I(b^4)); summary(po.lm)
>>>>
>>>> (please ignore that the model is severely overfit- that's not the
>>>> point).
>>>>
>>>> Let's say I want to solve for the value b where a = 5.5.
>>>>
>>>> Any thoughts? I did come across the polynom package, but I don't think
>>>> that does it- I suspect the answer is simpler than I am making it out
>>>> to be. Any help would be welcome.
>>>>
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>>
>
>


-- 
Michael Rennie, Research Scientist
Fisheries and Oceans Canada, Freshwater Institute
Winnipeg, Manitoba, CANADA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] solving x in a polynomial function

2013-03-01 Thread Mike Rennie
Hi guys,

Perhaps on the right track? However, not sure if it's correct. I fixed
the bug that A.K. indicated above (== vs =), but the values don't seem
right. From the original data, an a of 3 should give b of 2.5.

> realroots <- function(model, b){
+ is.zero <- function(x, tol = .Machine$double.eps^0.5) abs(x) < tol
+ if(names(model)[1] == "(Intercept)")
+ r <- polyroot(c(coef(model)[1] - b, coef(model)[-1]))
+ else
+ r <- polyroot(c(-b, coef(model)))
+ Re(r[is.zero(Im(r))])
+ }
>
> r <- realroots(po.lm, 3)
> predict(po.lm, newdata = data.frame(b = r)) # confirm
   1
1.69

So I think there's a calculation error somewhere.






On 3/1/13, arun  wrote:
> Hi Rui,
>
> Looks like a bug:
> ###if(names(model)[1] = "(Intercept)")
> Should this be:
>
> if(names(model)[1] == "(Intercept)")
>
> A.K.
>
>
>
> - Original Message -
> From: Rui Barradas 
> To: Mike Rennie 
> Cc: r-help Mailing List 
> Sent: Friday, March 1, 2013 3:18 PM
> Subject: Re: [R] solving x in a polynomial function
>
> Hello,
>
> Try the following.
>
>
> a <- 1:10
> b <- c(1, 2, 2.5, 3, 3.5, 4, 6, 7, 7.5, 8)
>
> dat <- data.frame(a = a, b = b)  # for lm(), it's better to use a df
> po.lm <- lm(a~b+I(b^2)+I(b^3)+I(b^4), data = dat); summary(po.lm)
>
>
> realroots <- function(model, b){
>     is.zero <- function(x, tol = .Machine$double.eps^0.5) abs(x) < tol
>     if(names(model)[1] = "(Intercept)")
>         r <- polyroot(c(coef(model)[1] - b, coef(model)[-1]))
>     else
>         r <- polyroot(c(-b, coef(model)))
>     Re(r[is.zero(Im(r))])
> }
>
> r <- realroots(po.lm, 5.5)
> predict(po.lm, newdata = data.frame(b = r))  # confirm
>
>
>
> Hope this helps,
>
> Rui Barradas
>
> Em 01-03-2013 18:47, Mike Rennie escreveu:
>> Hi there,
>>
>> Does anyone know how I solve for x from a given y in a polynomial
>> function? Here's some example code:
>>
>> ##example file
>>
>> a<-1:10
>>
>> b<-c(1,2,2.5,3,3.5,4,6,7,7.5,8)
>>
>> po.lm<-lm(a~b+I(b^2)+I(b^3)+I(b^4)); summary(po.lm)
>>
>> (please ignore that the model is severely overfit- that's not the point).
>>
>> Let's say I want to solve for the value b where a = 5.5.
>>
>> Any thoughts? I did come across the polynom package, but I don't think
>> that does it- I suspect the answer is simpler than I am making it out
>> to be. Any help would be welcome.
>>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


-- 
Michael Rennie, Research Scientist
Fisheries and Oceans Canada, Freshwater Institute
Winnipeg, Manitoba, CANADA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] solving x in a polynomial function

2013-03-01 Thread Mike Rennie
Hi there,

Does anyone know how I solve for x from a given y in a polynomial
function? Here's some example code:

##example file

a<-1:10

b<-c(1,2,2.5,3,3.5,4,6,7,7.5,8)

po.lm<-lm(a~b+I(b^2)+I(b^3)+I(b^4)); summary(po.lm)

(please ignore that the model is severely overfit- that's not the point).

Let's say I want to solve for the value b where a = 5.5.

Any thoughts? I did come across the polynom package, but I don't think
that does it- I suspect the answer is simpler than I am making it out
to be. Any help would be welcome.

-- 
Michael Rennie, Research Scientist
Fisheries and Oceans Canada, Freshwater Institute
Winnipeg, Manitoba, CANADA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Finding the knots in a smoothing spline using nknots

2013-02-27 Thread Mike Nielsen
Hi r-helpers.

Please forgive my ignorance, but I would like to plot a smoothing spline
(smooth.spline) from package "stats", and show the knots in the plot, and I
can't seem to figure out where smooth.spline has located the knots (when I
use nknots).  Unfortunately, I don't know a lot about splines, but I know
that they provide me an easy way to estimate the location of local maxima
and minima on varying time-scales (number of knots) in my original data.

I see there is a fit$knot, but it's not clear to me what those values are:
for some reason I had expected that they would be contained in my original
y values, but they're not.  I tried generating nknots equally spaced points
in my x, but when I plotted the points that corresponded to my original y
values at those equally-spaced x values, I found that the spline did not
pass through them, which, perhaps naively, I thought it might.

Also, the manual says that yin comprises "the y values used at the unique y
values" -- should this read "at the unique x values"?

Could someone kindly point to a resource where I can get a slightly fuller
explanation?  I looked at the code for smooth.spline, but can't readily
follow it.

Here's a toy example:

> x<-seq(from=0,to=4*pi,length=1002)
> y<-sin(x)
> ss<-smooth.spline(x,y=y,all.knots=F,nknots=25)
> ss
Call:
smooth.spline(x = x, y = y, all.knots = F, nknots = 25)

Smoothing Parameter  spar= -0.4573636  lambda= 1.006117e-09 (14 iterations)
Equivalent Degrees of Freedom (Df): 26.99935
Penalized Criterion: 3.027077e-06
GCV: 3.190666e-09
> str(ss)
List of 15
 $ x   : num [1:1002] 0 0.0126 0.0251 0.0377 0.0502 ...
 $ y   : num [1:1002] 2.88e-05 1.26e-02 2.51e-02 3.77e-02 5.02e-02 ...
 $ w   : num [1:1002] 1 1 1 1 1 1 1 1 1 1 ...
 $ yin : num [1:1002] 0 0.0126 0.0251 0.0377 0.0502 ...
 $ data:List of 3
  ..$ x: num [1:1002] 0 0.0126 0.0251 0.0377 0.0502 ...
  ..$ y: num [1:1002] 0 0.0126 0.0251 0.0377 0.0502 ...
  ..$ w: num [1:1002] 1 1 1 1 1 1 1 1 1 1 ...
 $ lev : num [1:1002] 0.2238 0.177 0.1399 0. 0.0891 ...
 $ cv.crit : num 3.19e-09
 $ pen.crit: num 3.03e-06
 $ crit: num 3.19e-09
 $ df  : num 27
 $ spar: num -0.457
 $ lambda  : num 1.01e-09
 $ iparms  : Named int [1:3] 1 0 14
  ..- attr(*, "names")= chr [1:3] "icrit" "ispar" "iter"
 $ fit :List of 5
  ..$ knot : num [1:31] 0 0 0 0 0.041 ...
  ..$ nk   : num 27
  ..$ min  : num 0
  ..$ range: num 12.6
  ..$ coef : num [1:27] 2.88e-05 1.72e-01 5.19e-01 9.04e-01 1.05 ...
  ..- attr(*, "class")= chr "smooth.spline.fit"
 $ call: language smooth.spline(x = x, y = y, all.knots = F, nknots =
25)
 - attr(*, "class")= chr "smooth.spline"
>

Many thanks!


Regards,

Mike Nielsen

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple - Finding vector in a vector

2012-10-08 Thread Mike Spam
Great, works perfectly now!
Thanks for your fast help!

Nico


2012/10/8 David L Carlson :
> Building on Jessica and Steve's use of embed:
>
>> X <- c(NA, 1, NA, 1, 1, 1, 1, 1, 1, NA, 1)
>> Z <- rowSums(embed(X, 3))
>> Z[is.na(Z)] <- 1
>> Z
> [1] 1 1 1 3 3 3 3 1 1
>
> --
> David L Carlson
> Associate Professor of Anthropology
> Texas A&M University
> College Station, TX 77843-4352
>
>> -Original Message-
>> From: r-help-boun...@r-project.org [mailto:r-help-bounces@r-
>> project.org] On Behalf Of Jessica Streicher
>> Sent: Monday, October 08, 2012 9:19 AM
>> To: Mike Spam
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Simple - Finding vector in a vector
>>
>> > x <- c(NA, 1, NA, 1, 1, 1, 1, 1, 1, NA, 1)
>> > embed(x,3)
>>      [,1] [,2] [,3]
>>  [1,]   NA    1   NA
>>  [2,]    1   NA    1
>>  [3,]    1    1   NA
>>  [4,]    1    1    1
>>  [5,]    1    1    1
>>  [6,]    1    1    1
>>  [7,]    1    1    1
>>  [8,]   NA    1    1
>>  [9,]    1   NA    1
>>
>> > which(rowSums(embed(x,3))==3)
>> [1] 4 5 6 7
>>
>> gives you the first of the 3
>>
>> On 08.10.2012, at 15:38, Mike Spam wrote:
>>
>> > Sorry, I just realized that it outputs the lengths of all runs. I can
>> > work with this function, but it would be much faster and easier if it
>> > were possible to get the positions of every match.
>> >
>> > example:
>> >
>> > NA  1 NA  1  1  1  1  1  1 NA  1
>> >
>> > rle returns
>> > lengths: int [1:6] 1 1 1 6 1 1
>> >
>> > What I need would be something like,
>> > 1 1 1 3 3 3 3 1 1
>> >
>> > but anyway I can work with rle, if there is no suitable function.
>> >
>> > thanks,
>> > Nico
>> >
>> >
>> >
>> > 2012/10/8 Mike Spam :
>> >> Hey Rui,
>> >>
>> >> Perfect! Thanks!! :)
>> >>
>> >> Nico
>> >>
>> >> 2012/10/8 Rui Barradas :
>> >>> Hello,
>> >>>
>> >>> See ?rle
>> >>>
>> >>> Hope this helps,
>> >>>
>> >>> Rui Barradas
>> >>> Em 08-10-2012 13:55, Mike Spam escreveu:
>> >>>>
>> >>>> Hi,
>> >>>>
>> >>>> Just a simple question.
>> >>>> Assume I have a vector,
>> >>>>
>> >>>> FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE FALSE
>> >>>> or
>> >>>> NA  1  1  1 NA  1 NA  1 NA
>> >>>>
>> >>>> What I need is the position where an element is the same - three
>> >>>> (or in general multiple) times in a row.
>> >>>>
>> >>>> In this case I want to get the position where it is TRUE TRUE TRUE
>> >>>> or 1 1 1.  It doesn't matter if it is the first, middle or last
>> >>>> element, so the output could be 2 or 3 or 4.
>> >>>>
>> >>>> My idea would be to lag the vector and calculate differences... but
>> >>>> I would prefer any built-in (or time-saving) function. :)
>> >>>>
>> >>>> thanks,
>> >>>>
>> >>>> Nico
>> >>>>
>> >>>> __
>> >>>> R-help@r-project.org mailing list
>> >>>> https://stat.ethz.ch/mailman/listinfo/r-help
>> >>>> PLEASE do read the posting guide
>> >>>> http://www.R-project.org/posting-guide.html
>> >>>> and provide commented, minimal, self-contained, reproducible code.
>> >>>
>> >>>
>> >
>> > __
>> > R-help@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide http://www.R-project.org/posting-
>> guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-
>> guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple - Finding vector in a vector

2012-10-08 Thread Mike Spam
Sorry, I just realized that it outputs the lengths of all runs. I can
work with this function, but it would be much faster and easier if it
were possible to get the positions of every match.

example:

NA  1 NA  1  1  1  1  1  1 NA  1

rle returns
lengths: int [1:6] 1 1 1 6 1 1

What I need would be something like,
1 1 1 3 3 3 3 1 1

but anyway I can work with rle, if there is no suitable function.

thanks,
Nico



2012/10/8 Mike Spam :
> Hey Rui,
>
> Perfect! Thanks!! :)
>
> Nico
>
> 2012/10/8 Rui Barradas :
>> Hello,
>>
>> See ?rle
>>
>> Hope this helps,
>>
>> Rui Barradas
>> Em 08-10-2012 13:55, Mike Spam escreveu:
>>>
>>> Hi,
>>>
>>> Just a simple question.
>>> Assume I have a vector,
>>>
>>> FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE FALSE
>>> or
>>> NA  1  1  1 NA  1 NA  1 NA
>>>
>>> What I need is the position where an element is the same - three (or
>>> in general multiple) times in a row.
>>>
>>> In this case I want to get the position where it is TRUE TRUE TRUE or
>>> 1 1 1.  It doesn't matter if it is the first, middle or last element,
>>> so the output could be 2 or 3 or 4.
>>>
>>> My idea would be to lag the vector and calculate differences... but I
>>> would prefer any built-in (or time-saving) function. :)
>>>
>>> thanks,
>>>
>>> Nico
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simple - Finding vector in a vector

2012-10-08 Thread Mike Spam
Hey Rui,

Perfect! Thanks!! :)

Nico

2012/10/8 Rui Barradas :
> Hello,
>
> See ?rle
>
> Hope this helps,
>
> Rui Barradas
> Em 08-10-2012 13:55, Mike Spam escreveu:
>>
>> Hi,
>>
>> Just a simple question.
>> Assume I have a vector,
>>
>> FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE FALSE
>> or
>> NA  1  1  1 NA  1 NA  1 NA
>>
>> What I need is the position where an element is the same - three (or
>> in general multiple) times in a row.
>>
>> In this case I want to get the position where it is TRUE TRUE TRUE or
>> 1 1 1.  It doesn't matter if it is the first, middle or last element,
>> so the output could be 2 or 3 or 4.
>>
>> My idea would be to lag the vector and calculate differences... but I
>> would prefer any built-in (or time-saving) function. :)
>>
>> thanks,
>>
>> Nico
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Simple - Finding vector in a vector

2012-10-08 Thread Mike Spam
Hi,

Just a simple question.
Assume I have a vector,

FALSE  TRUE  TRUE  TRUE FALSE  TRUE FALSE  TRUE FALSE
or
NA  1  1  1 NA  1 NA  1 NA

What I need is the position where an element is the same - three (or
in general multiple) times in a row.

In this case I want to get the position where it is TRUE TRUE TRUE or
1 1 1.  It doesn't matter if it is the first, middle or last element,
so the output could be 2 or 3 or 4.

My idea would be to lag the vector and calculate differences... but I
would prefer any built-in (or time-saving) function. :)

thanks,

Nico
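
For the archive, a sketch building on Rui's ?rle pointer: recover the
starting position of every run of length three or more.

x <- c(NA, 1, NA, 1, 1, 1, 1, 1, 1, NA, 1)
r <- rle(!is.na(x))                           # runs of non-NA values
starts <- cumsum(c(1, head(r$lengths, -1)))   # index where each run begins
starts[r$values & r$lengths >= 3]             # here: 4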

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R process must die - can I save history? [SOLVED]

2012-10-02 Thread Mike Miller
This is a solution for UNIX/Linux-type OS users and a lot of it is only 
related to R in the sense that it can allow you to reconnect to an R 
session.  I managed to do it and to save history, thanks to helpful 
messages from Ista Zahn and Brian Ripley.


To explain this, I will call my two Linux boxes Desktop and Server.  I 
logged into Desktop, an Ubuntu box, and ran this command:


ps aux | grep ssh

That showed me these ssh processes and a few extraneous things:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
mbmiller  3520  0.0  0.0  46944  4668 pts/5S+   Jul25   0:23 ssh -X Server
mbmiller  4602  0.0  0.0  47720  5544 pts/9S+   Aug09   1:07 ssh -X Server
mbmiller 25614  0.0  0.0  45584  3344 pts/11   S+   Sep24   0:00 ssh -X Server

I launched an xterm from which I would try to recapture those ssh 
sessions.  (Spoiler: This worked for every ssh process.)  I was missing 
the reptyr binary, so I installed it like so:


sudo apt-get install reptyr

The reptyr man page told me that I had to do this to allow reptyr to work 
(I could change the 0 back to 1 after finishing):


$ sudo su
root# echo 0 > /proc/sys/kernel/yama/ptrace_scope
root# exit

After that I just ran reptyr for each process, for example:

reptyr 3520

A few lines of text would appear (the last saying "Set the controlling 
tty"), I'd hit "enter" and it would give me my command prompt from my old 
shell, or the R prompt if R was running in that shell.  I was then able to 
save command histories, etc.  When I tried to exit from R using q("yes") 
it accepted the command, but it did not return my bash prompt.  To deal 
with that, I tried Brian Ripley's recommendation:


I logged into Server via ssh and ran this command:

ps aux | grep R

Which returned this along with some irrelevant stuff:

USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
mbmiller  5156  0.0  0.1 213188  9244 pts/1S+   Aug06   1:00 
/share/apps/R-2.15.1/lib64/R/bin/exec/R -q

I tried the kill command...

kill -USR1 5156

...and that returned my bash prompt immediately in the other xterm.  R was 
done, the .Rhistory looked perfect, and .RData also was there.  I logged 
into Server, went to the appropriate directory, ran R and found that all 
of the objects were there and working correctly.


So that was amazing.  I could reattach to the R session and also kill it 
without losing history and data.  This is a big deal for me because I get 
stuck like that about once a year and it's always a huge pain.


Mike


On Tue, 2 Oct 2012, Ista Zahn wrote (off-list):

If you can find the process ID you can try connecting the process to 
another terminal with reptyr (https://github.com/nelhage/reptyr), and 
then just use savehistory() as usual. You don't say what flavor of Linux 
you're dealing with, but it looks like there are packaged versions for 
at least Ubuntu, Fedora, and Arch.



On Tue, 2 Oct 2012, Prof Brian Ripley wrote:

Maybe not.  On a Unix-alike see ?Signals.  If you can find the pid of 
the R process and it is still running (and not e.g. suspended),


kill -USR1 <pid>

will save the workspace and history.




Original query:

On Tue, 2 Oct 2012, Mike Miller wrote:

I connected from my desktop Linux box to a Linux server using ssh in an 
xterm, but that xterm was running in Xvnc.  I'm running R on the server 
in that xterm (over ssh).  Something went wrong with Xvnc that has 
caused it to hang, probably this bug:


https://bugs.launchpad.net/ubuntu/+source/vnc4/+bug/819473

So I can't get back to that ssh session or to R.  I had done a bunch of 
work in R but the command history hasn't been written out.  If I kill R, 
I assume the command history is gone.  I wish I could somehow cause R to 
dump the command history.  Is there any way to tell the running R 
process to write the history somewhere?


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R process must die - can I save history?

2012-10-02 Thread Mike Miller
I connected from my desktop Linux box to a Linux server using ssh in an 
xterm, but that xterm was running in Xvnc.  I'm running R on the server in 
that xterm (over ssh).  Something went wrong with Xvnc that has caused it 
to hang, probably this bug:


https://bugs.launchpad.net/ubuntu/+source/vnc4/+bug/819473

So I can't get back to that ssh session or to R.  I had done a bunch of 
work in R but the command history hasn't been written out.  If I kill R, I 
assume the command history is gone.  I wish I could somehow cause R to 
dump the command history.  Is there any way to tell the running R process 
to write the history somewhere?


Thanks in advance.

Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] effective way to return only the first argument of "which()"

2012-09-20 Thread Mike Spam
Thank you very much, especially Milan and Bert!
I will do some speed tests and fit the function to my needs.

I think the best way would be a modified function in C...
But I am not familiar enough with C.  Perhaps this would be a simple
but useful extension.
If someone has a solution, I would appreciate a post to this mailing list.

Cheers and thanks to all,
Nico


2012/9/19 Bert Gunter :
> Well, following up on this observation, which can be put under the
> heading of "Sometimes vectorization can be much slower than explicit
> loops" , I offer the following:
>
>  firsti  <-function(x,k)
> {
>   i <- 1
>   while(x[i]<=k){i <- i+1}
>   i
> }
>
>> system.time(for(i in 1:100)which(x>.99)[1])
>    user  system elapsed
>    19.1     2.4    22.2
>
>> system.time(for(i in 1:100)which.max(x>.99))
>    user  system elapsed
>   30.45    6.75   37.46
>
>> system.time(for(i in 1:100)firsti(x,.99))
>    user  system elapsed
>    0.03    0.00    0.03
>
> ## About a 500 - 1000 fold speedup !
>
>> firsti(x,.99)
> [1] 122
>
> It doesn't seem to scale too badly, either (whatever THAT means!):
> (Of course, the which() versions are essentially identical in timing,
> and so are omitted)
>
>> system.time(for(i in 1:100)firsti(x,.))
>    user  system elapsed
>    2.70    0.00    2.72
>
>> firsti(x,.)
> [1] 18200
>
> Of course, at some point, the explicit looping is awful -- with k =
> .99, the index was about 36, and the timing test took 54
> seconds.
>
> So I guess the point is -- as always -- that the optimal approach
> depends on the nature of the data. Prudence and robustness clearly
> demands the vectorized which() approaches if you have no information.
> But if you do know something about the data, then you can often write
> much faster tailored solutions. Which is hardly revelatory, of course.
>
> Cheers to all,
> Bert
>
> On Wed, Sep 19, 2012 at 8:55 AM, Milan Bouchet-Valat  
> wrote:
>> Le mercredi 19 septembre 2012 à 15:23 +, William Dunlap a écrit :
>>> The original method is faster than which.max for longish numeric vectors
>>> (in R-2.15.1), but you should check time and memory usage on your
>>> own machine:
>>>
>>> > x <- runif(18e6)
>>> > system.time(for(i in 1:100)which(x>0.99)[1])
>>>    user  system elapsed
>>>   11.64    1.05   12.70
>>> > system.time(for(i in 1:100)which.max(x>0.99))
>>>    user  system elapsed
>>>   16.38    2.94   19.35
>> If you expect that such an element appears near the beginning of
>> the vector, a custom hack might well be more efficient. The problem with
>> ">", which() and which.max() is that they will go over all the elements
>> of the vector even if it's not needed at all. So you can start with a
>> small subset of the vector, and increase its size in a few steps until
>> you find the value you're looking for.
>>
>> Proof of concept (the values of n obviously need to be adapted):
>> x <-runif(1e7)
>>
>> find <- function(x, lim) {
>> len <- length(x)
>>
>> for(n in 2^(14:0)) {
>> val <- which(x[seq.int(1L, len/n)] > lim)
>>
>> if(length(val) > 0) return(val[1])
>> }
>>
>> return(NULL)
>> }
>>
>>> system.time(for(i in 1:100)which(x>0.999)[1])
>> utilisateur système  écoulé
>>   9.740   5.795  15.890
>>> system.time(for(i in 1:100)which.max(x>0.999))
>> utilisateur système      écoulé
>>  14.288   9.510  24.562
>>> system.time(for(i in 1:100)find(x, .999))
>> utilisateur système  écoulé
>>   0.017   0.002   0.019
>>> find(x, .999)
>> [1] 1376
>>
>> (Looks almost like cheating... ;-)
>>
>>
>>
>>
>>
>>> Bill Dunlap
>>> Spotfire, TIBCO Software
>>> wdunlap tibco.com
>>>
>>>
>>> > -Original Message-
>>> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] 
>>> > On Behalf
>>> > Of Jeff Newmiller
>>> > Sent: Wednesday, September 19, 2012 8:06 AM
>>> > To: Mike Spam; r-help@r-project.org
>>> > Subject: Re: [R] effective way to return only the first argument of 
>>> > "which()"
>>> >
>>> > ?which.max
>>> > ---
>>> > Jeff Newmiller

Re: [R] effective way to return only the first argument of "which()"

2012-09-19 Thread Mike Spam
Hi,

Thanks Michael, but I think this is even slower.

 x <- sample(2000)
 which(x < 5)[1]
 which.max(x < 5)
 system.time(for(i in 1:100) which.max(x < 5))
    User  System verstrichen
   60.84   13.70   86.33
 system.time(for(i in 1:100) which(x < 5)[1])
    User  System verstrichen
   40.45    8.25   48.95



Thanks,
Nico

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] effective way to return only the first argument of "which()"

2012-09-19 Thread Mike Spam
Hi,

I was looking for a function like "which()" that returns only the first match.
Compare:

x <- c(1,2,3,4,5,6)
y <- 4
which(x>y)

returns:
5,6

which(x>y)[1]
returns:
5

which(x>y)[1] is exactly what I need, and I did use this, but the dataset
is too big (~18 million points).
That's why I need a more efficient way to get the first element of a
vector that is bigger/smaller than a specific number.

I found "match()", but that function only works for exact equality.



Thanks,
Nico
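
For the record, one more base R option: Position() scans left to right and
stops at the first element satisfying a predicate, so it never looks past
the first match (though it loops element-wise in R, so time it on your data):

x <- c(1, 2, 3, 4, 5, 6)
y <- 4
Position(function(v) v > y, x)   # 5, same as which(x > y)[1]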

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Compiling R2.15.1 on ubuntu with x86-64 architecture and shared library

2012-09-17 Thread Conklin, Mike (GfK Custom Research NA)
I am sure I am providing insufficient information, please ask for more.

I installed R 2.14.2 on my Ubuntu laptop with an AMD64 processor, and also 
installed RStudio, and everything worked fine.

Now, I tried to build R 2.15.1 from source and installed it using defaults. 
RStudio now complained that R was not built as a shared library.

Went back and uninstalled, and configured with --enable-R-shlib

Now make fails at:
/usr/bin/ld: CConverters.o: relocation R_X86_64_32S against '.rodata' can not 
be used when making a shared object; recompile with -fPIC

Being new to Linux, I have no idea how to recompile with -fPIC

Any help would be appreciated.

Mike

W. Michael Conklin
Executive Vice President | GfK Marketing Science | Consumer Experiences North 
America
GfK Custom Research, LLC | 8401 Golden Valley Road | Minneapolis, MN, 55427
T +1 763 417 4545 | M +1 612 567 8287
www.gfk.com



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] split plot experiment with a repeated measure

2012-09-13 Thread Mike Palmer
Hello,

I would like advice on whether and how I might viably analyse this dataset: the 
experiment has 3 blocks with a three-level main-plot treatment, split to a 
two-level treatment. Most parameters are also measured repeatedly at 1-3 month 
intervals. Ideally I would like to test the full 
response ~ treatment_1 * treatment_2 * sample_time model, but I don't know what 
the error structure should be, or even whether this is a viable analysis. I 
should probably add that neither the main nor the sub plots are randomised.

Thanks

Mike Palmer

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Setwd to a directory that requires authentication

2012-09-11 Thread Mike Smith
I work at a company where we log on to Windows using a username and
password.  There is a global server with files that I need to use R to do
some analysis on.  That server requires my Windows credentials to log on.
When I access the server from Internet Explorer it automatically uses my
Windows credentials to log on.  When I use R it doesn't work.  I have tried:

setInternet2(TRUE)
con = url(description="http://username:passw...@server.net", open="r")
open(con)

which gave me a timeout error.  I have also tried:

setwd("server.net")

which gives me the error:

Error in setwd("server.net") : cannot change working directory


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Pseudo R squared in gls model

2012-08-24 Thread Mike Miller

On Thu, 23 Aug 2012, Gary Dong wrote:

I'm wondering if the gls function reports a pseudo R².  I do not see it in 
summary().  If the package does not report it, can I calculate it in this 
way?


Adjusted pseudo R squared = 1 - [(Loglik(beta) - k ) / Loglik(null)] where
k is the number of IVs.



We've been using this R² instead, but I'm not sure if it is available in 
any package:


Buse, A. (1973), "Goodness of Fit in Generalized Least Squares 
Estimation," American Statistician, 27, 106-108.


http://www.jstor.org/stable/10.2307/2683631

Equation 15.  Make sure to compute y-bar correctly (it is the regression 
coefficient in the intercept-only model).


To do GLS, we have been using OLS after transformation (multiplying by the 
"T" matrix in Buse's notation) where T'T = inv(V).  So the equation on the 
final page works for us.
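
For the archive, a sketch of Buse's equation 15 as I read it (buse.r2 is my
own hypothetical helper, not from any package), assuming a known error
covariance matrix V for the model y = Xb + e:

buse.r2 <- function(y, X, V) {
    Vi   <- solve(V)
    b    <- solve(t(X) %*% Vi %*% X, t(X) %*% Vi %*% y)        # GLS coefficients
    e    <- y - X %*% b                                        # GLS residuals
    one  <- matrix(1, length(y))
    ybar <- solve(t(one) %*% Vi %*% one, t(one) %*% Vi %*% y)  # GLS grand mean
    d    <- y - c(ybar)
    1 - c(t(e) %*% Vi %*% e) / c(t(d) %*% Vi %*% d)
}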


Let me know if you can't access that PDF and I'll send a copy.

Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Question on meaning of '%+%', '%?%' (? = various single letter) in code

2012-05-31 Thread Mike Hilt
Hello.  I’ve seen several uses of '%?%' in R code (with ? = a single
letter or other characters used as the middle character in a 3
character string with '%' as the 1st and 3rd characters).

I’ve also recently seen ‘%+%’ usage at:

http://vita.had.co.nz/papers/ggplot2-wires.pdf

on p. 7 of the document the following code appears:

(Note the '%+%' string at end of 1st line.)

ggplot(threept, aes(x, y)) %+%
lineup(permute("result"), n = 8) +
geom_point(aes(colour = result, group = 1), alpha = 1/4) +
scale_colour_manual(values = c("made" = mnsl("10B 6/6"),
"missed" = mnsl("10R 6/6"))) +
xlab(NULL) + ylab(NULL) + coord_equal() +
facet_wrap(~ .sample, nrow = 2)


I've looked long and hard on the internet and on CRAN and can't
find a reference to what '%+%', or more generally '%?%' (? = any letter),
does in an R function/script.

I've also asked several friends who have significant R experience and
all of us are stumped by what '%+%', or more generally '%?%' does in an
R function/script.

x
On p. 12 of:
R Language Definitions
appears:

‘%x% Special binary operators, x can be replaced by any valid name’

and

‘R deals with entire vectors of data at a time, and most of the
elementary operators and basic mathematical functions like log are
vectorized (as indicated in the table above).  This means that e.g.
adding two vectors of the same length will create a vector containing
the element-wise sums, implicitly looping over the vector index.  This
applies also to other operators like -, *, and / as well as to higher
dimensional structures.  Notice in particular that multiplying two
matrices does not produce the usual matrix product (the %*% operator
exists for that purpose).  Some finer points relating to vectorized
operations will be discussed in Section 3.3 [Elementary arithmetic
operations], page 16.’

Page 16 doesn’t specifically address ‘%?%’, at least in a way I can
understand.

xxx
I tried an experiment with %+%, and it failed to work!

I loaded xts package (must have xts and zoo installed b/c xts has a
dependency on zoo.)

Entered the following:

> library(xts)
> data(sample_matrix)
> x1 <- sample_matrix
> x1
> x3 <- x1 %/% x1
> x3
> x4 <- x1 %+% x1

MH (Mike Hilt comment):
After entering ‘x3’, the data was listed with all values = 1,
which is what should happen when binary division is done – CHECK!

MH:
After entering ‘x4 <- x1 %+% x1’

R returns:
‘Error: could not find function "%+%"’

MH:
So %+% is not a Special binary operator, as %x% is
described on p. 12 of the CRAN doc:
R Language Definitions.



Could someone help me out and let me know what ‘%?%’
(where ? = a single letter in a 3 character string with ‘%’
being the 1st and 3rd characters), and/or ‘%+%’ does in
R code/function?
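
(For the archive: %...% names are user-defined binary operators.  R lets you
create one by assigning a two-argument function to the quoted name, and a
package must define such an operator before you can use it -- ggplot2 defines
its own "%+%", which is why base R says "could not find function".  A minimal
sketch with a made-up operator:)

"%+%" <- function(a, b) paste(a, b)   # hypothetical string-joining operator
"abc" %+% "def"                       # returns "abc def"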


Thank you,
Mike Hilt
mikehil...@gmail.com

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Multiple rms summary plots in a single device

2012-05-25 Thread Mike Harwood
I would like to incorporate multiple summary plots from the rms
package into a single device and to control the titles, and also to
open a new device when I reach a specified number of plots.  Currently
I am only getting a single "plot(summary(" graph in the upper left-
hand corner of each successive device.  However, in the rms
documention I see instances of a loop being used with "par(mfrow(" for
multiple plots in a single device(e.g. residuals.lrm), and these
examples work on my system.  Please advise regarding options that must
be specified to "plot(summary(", or in the construction of my loop.
Below are sample code and my sessionInfo().  Please note that I am
using data.table to facilitate my "real analysis", but I can replicate
the issue with tData as a data.frame (using seg <- subset(tData,
groups == segment) logic), but I included the data.table logic in case
it may be having some influence.  Thank you!

Mike


library(rms)         # provides datadist(), summary(), plot()
library(data.table)

tData <- data.frame(groups=as.factor(1:8), low=as.factor(1:4),
                    high=as.factor(seq(100, 400, 100)), rand=runif(400))
tData <- data.table(tData)
setkeyv(tData, 'groups')


dd <- datadist(tData)
options(datadist = 'dd')

doSumPlot <- function(segment){
seg <<- tData[groups == segment,]
plot(summary(rand ~
+ low
+ high
,data = seg
), main=paste('Group:', segment))
}


for(i in 1:length(levels(tData$groups))){
cat('Group: ', i, '\n')
if(i == 1 ){
dev.new()
par(mfrow=c(2,2))
}
if(i/5 == round(i/5, 0)){
dev.new()
par(mfrow=c(2,2))
}
# dev.new()
doSumPlot(levels(tData$groups)[i])
}


> sessionInfo()
R version 2.15.0 (2012-03-30)
Platform: i386-pc-mingw32/i386 (32-bit)

locale:
[1] LC_COLLATE=English_United States.1252
[2] LC_CTYPE=English_United States.1252
[3] LC_MONETARY=English_United States.1252
[4] LC_NUMERIC=C
[5] LC_TIME=English_United States.1252

attached base packages:
[1] splines   stats graphics  grDevices utils datasets
methods
[8] base

other attached packages:
[1] rms_3.5-0Hmisc_3.9-3  survival_2.36-14
data.table_1.8.0

loaded via a namespace (and not attached):
[1] cluster_1.14.2 grid_2.15.0lattice_0.20-6 tools_2.15.0

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Reading Excel Formulas as values

2012-05-15 Thread Mike Smith
When I read Excel files using the read.xlsx() command, any cells that have
formulas in them come up as NA.

Is there a way to read just the numeric value of the cell without using the
"paste values" command in Excel?  I need to read in hundreds of Excel
spreadsheets and compile them into one large super spreadsheet
automatically, hence the reason I cannot reformat each sheet manually.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Can't read xlsx file into R. Seem to have XLConnect loaded.

2012-05-09 Thread Mike Hilt
I have spent hours on R in Windows 7.  Just installed 2 days ago so the R
package should be current.



Currently I am using the RGui (64-bit) for Windows.



I cannot read an Excel file into R from my computer.  I have spent hours on
this.  Completely crazy!!



I have the XLConnect package loaded.  I think it is loaded because when I
enter:



> loadedNamespaces()

 [1] "base"  "datasets"  "graphics"  "grDevices"

 [5] "methods"   "rJava" "stats" "tools"

 [9] "utils" "XLConnect" "XLConnectJars"



XLConnect is listed.  Does that mean it is loaded?



> getwd()

[1] "F:/RwkCB3/R OG SBCD 1205v01"


> list.files()

 [1] "~$ R OG SBCD 1205 MH1.docx"   "~WRL2121.tmp"

 [3] "01 R OG SBCD 1205 MH1 BU1.docx"   "01 R OG SBCD 1205 MH1.docx"

 [5] "01 R OG SBCD 1205v01 BU2" "Dta1.txt"

 [7] "Gas Dly YTD1204 v01 BU.xlsm"  "Gas Dly YTD1204 v01.xlsm"

 [9] "Oil Dly YTD1204 v01 BU.xlsm"  "Oil Dly YTD1204 v01.xlsm"

[11] "Oil Dly YTD1204 v01.xlsx" "R OG SBCD 1205v01 sCt.lnk"

[13] "R setwd to C Prog R sCt OtDt.lnk"



So apparently "Oil Dly YTD1204 v01.xlsx" exists in my working directory.



SO WHY DOES THE FOLLOWING BEHAVE THE WAY IT DOES?



> OlPrcFl <- loadWorkbook(“Oil Dly YTD1204 v01.xlsx”, create = FALSE)

Error: unexpected input in "OlPrcFl <- loadWorkbook(“"
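
(For the archive: the "unexpected input" is the curly quote “ pasted in from
a word processor; R's parser only accepts straight quotes.  Retyped with
plain ASCII quotes, the same call should parse:)

library(XLConnect)
OlPrcFl <- loadWorkbook("Oil Dly YTD1204 v01.xlsx", create = FALSE)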




I can read an xlsx file in when I do:


> OlPrcFl <- loadWorkbook(file.choose())

That is not a real, long-term solution.


I have the same problem installing packages -- I can't get R to load a
package when I specify a path.  It works when I use file.choose().




I would really appreciate it if you could tell me what is going on.



Thanks,

Mike Hilt

mikehil...@gmail.com


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] question on jitter in plot.Predict in rms

2012-04-30 Thread Mike Babyak
Dear colleagues,

I have a question regarding controlling the jitter when plotting
predictions in the rms package.  Below I've simulated some data that
reflect what I'm working with.  The model predicts a continuous variable
with an ordinal score, a two-level group, and a continuous covariate.  Of
primary interest is a plot of the group by score interaction, where the
score is the ordinal variable, and the group Ns are quite disparate.

When I produce the plot for the predicted values with the data=llist
argument, as expected I get datadensity hatch marks.  However, in the group
with the smaller N, I get jittered datadensity points, while in the group
with the larger N, the jitter apparently defaults to single vertical lines,
which I assume is because the jitter would look like a black blob.  Some of
my co-authors are a bit worried about how that looks, so here I am.

Apart from abandoning data=llist altogether, is there a way to modify the
jitter in the smaller group so it behaves like the larger one?

Of secondary importance, anything you can tell me about improving my clumsy
little simulation code would be welcome -- I have little to no idea what I'm
doing there.  For example, would there be a way to produce the group
variable with the disparate Ns more directly?

Thanks,

Mike Babyak
Behavioral Medicine Research Center
Duke University Medical Center



#question about jitter/llist in rms
#R v 2.14.1 under windows 7

require(rms)
#simulate some data
n = 5000
age   = runif(n)
score = runif(n) + 0.5*age
group <- as.numeric(sample(c(FALSE,TRUE), n, replace=TRUE, prob=c(.1, .9)))
ordscore = as.numeric(factor(rep(1:5, length.out=n)))
table(group,ordscore)
e = rnorm(n, 0, 0.1)

#true model
y = group + 0.6*ordscore + group*ordscore  + .2*age + e

#convert group to factor
group.f<-as.factor(group)

#save data characterics
dd1<-datadist(age,ordscore,group.f)
options(datadist="dd1")

#estimate model
reg1<-ols(y~group.f+ordscore+group.f*ordscore+age,x=T,y=T)

#plot results
preg<-Predict(reg1,ordscore,group.f)

#produces interaction plot with datadensity hatch marks
plot(preg,data=llist(ordscore,group.f))
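
Re the secondary question, a more direct way to draw the unbalanced group
indicator (same 90/10 split, sketched with a Bernoulli draw):

group <- rbinom(n, 1, 0.9)   # 1 with probability 0.9, 0 otherwise
group.f <- as.factor(group)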


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] generate random numbers for lotteries

2012-04-29 Thread Mike Miller

On Sun, 29 Apr 2012, Daniel Nordlund wrote:

I don't know what the OP is really trying to accomplish yet, and I am 
not motivated (yet) to try to figure it out.  However, all this 
"flooring" and "ceiling) and "rounding" is not necessary for generating 
uniform random integers.  For N integers in the range from A to B it is 
as simple as


sample(A:B, N, replace=TRUE)


That is easier to work with.  Thanks.  Jim Holtman also was pointing to 
the sample() function.  I'm used to making all random numbers from runif() 
because I use that kind of tactic in a lot of different languages.


sample() is faster, too.
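
A quick sketch for checking that claim on your own machine:

N <- 1e6
system.time(x1 <- sample(1:10, N, replace = TRUE))
system.time(x2 <- floor(runif(N, min = 1, max = 11)))   # same range of integers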

Mike

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] generate random numbers for lotteries

2012-04-29 Thread Mike Miller

On Mon, 30 Apr 2012, Vale Fara wrote:

ok, what to do is to generate two sets (x,y) of integer uniform random 
numbers so that the following condition is satisfied: the sum of the 
numbers obtained in x,y matched two by two (first number obtained in x 
with first number obtained in y and so on) is in mean equal to a value 
z, mean value that I can decide before the randomization.


Hope this is more clear than before...


It isn't very clear to me.  If you generate random X,Y pairs such that 
(X+Y)/2 = z, then you have only one random number and another that is 
completely dependent on it:


X = random
Y = 2z - X.

I'll just tell you one thing you might be able to use, but I don't have 
time for this.


To make a vector of N uniformly-distributed random integers in the range 
from integer A to integer B, inclusive, you can do this:


floor( runif(N, min=A, max=B+1) )

The floor() function rounds down to the nearest integer.  Depending on the 
exact nature of the algorithm, it might be possible for B+1 to happen, but 
it would be extremely unlikely, if it really is possible.


This should do the same thing:

floor((B-A+1)*runif(N)+A)

The ceiling function can accomplish the same thing.  To make random 
integers from 1 to K, do this:


ceiling( K*runif(N) )

Mike

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] generate random numbers for lotteries

2012-04-28 Thread Mike Miller

On Fri, 27 Apr 2012, Vale Fara wrote:


I am working with lotteries and I need to generate two sets of uniform
random numbers.

Requirements:
1) each set has 60 random numbers


random integers?


2) random numbers in the first set are taken from an interval (0-10),
whereas numbers in the second set are taken from a higher interval
(15-25)


Depends on whether you mean integers.  R has functions.  Here's one:

http://www.astrostatistics.psu.edu/su07/R/html/stats/html/Uniform.html



3) numbers generated in the first set should be matched to numbers in
the second set (row by row) so that the expected value of each couple
of random numbers (i.e. of each lottery) is around to a given value
(12.5 +/- 5, where 12.5 is the median value of the interval extremes).


Do you mean that the mean for the pair of numbers must be between 7.5 and 
17.5, inclusive?  That means the sum must be from 15 to 35.  Well, you are 
in luck because if you make the numbers as you suggested above, that will 
happen -- you don't have to do anything special to make it happen.




For the computation of the expected value, the probabilities in each
lottery are ½ and ½.


For what outcome?  You lost me.



How do this? Any help given would be greatly appreciated.


I hope that helps.

Mike__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Completely Off Topic: Link to IOM report on use of "-omics" tests in clinical trials

2012-03-27 Thread Mike Marchywka





Thanks, I had totally missed this controversy, but from a quick read of the 
summary the impact on open source analysis was unclear.  Can you explain the 
punchline?  I think many users of R have concluded that the biggest problem in 
most analyses is first getting the data and then verifying any results you 
derive, both issues that sound related to your post.
(The jumble below is illustrative of what hotmail has been doing with plain 
text; getting plain data without all the formatting junk is a recurring 
problem, LOL.)






> Date: Mon, 26 Mar 2012 22:38:56 +0100
> From: iaingallagher@btopenworld.com
> To: gunter.berton@gene.com; r-help@r-project.org
> Subject: Re: [R] Completely Off Topic: Link to IOM report on use of
> "-omics" tests in clinical trials
>
> I followed this case while it was ongoing.
>
> It was a very interesting example of basic mistakes but also (for me) of
> journal politicking.
>
> Keith Baggerly and Kevin Coombes wrote a great paper - "DERIVING
> CHEMOSENSITIVITY FROM CELL LINES: FORENSIC BIOINFORMATICS AND REPRODUCIBLE
> RESEARCH IN HIGH-THROUGHPUT BIOLOGY" in The Annals of Applied Statistics
> (2009, Vol. 3, No. 4, 1309–1334) which explains some of the background and
> investigative work they had to do to bring those mistakes to light.
>
> Best
>
> iain
>
> - Original Message -
> From: Bert Gunter
> To: r-help@r-project.org
> Cc:
> Sent: Monday, 26 March 2012, 19:12
> Subject: [R] Completely Off Topic: Link to IOM report on use of "-omics"
> tests in clinical trials
>
> Warning: This has little directly to do with R, although R and related
> tools (e.g. sweave and other reproducible research tools) have a
> natural role to play.
>
> The IOM report:
>
> http://www.iom.edu/Reports/2012/Evolution-of-Translational-Omics.aspx
>
> that arose out of the Duke Univ. genomics testing scandal has been
> released. My thanks to Keith Baggerly for forwarding this. I believe
> that many R users in the medical research community will find this
> interesting, and I hope I do not venture too far out of line by
> passing on the link to readers of this list. It **will** have an
> important impact on so-called Personalized Health Care (which I guess
> affects all of us), and open source analytical (statistical)
> methodology is a central issue.
>
> For those interested, try the summary first.
>
> Best to all,
> Bert
>
> --
>
> Bert Gunter
> Genentech Nonclinical Biostatistics
>
> Internal Contact Info:
> Phone: 467-7374
> Website:
> http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] sqrt(-x) vs. -x^0.5

2012-03-21 Thread Mike Williamson
Thanks Sarah, All,

I guess I never thought of a negative sign as an "operation", but
knowing that it is considered an operation explains everything nicely.
Somewhere in its underbelly, I suppose -4 is represented as "0 - 4"?
Either way, I'm glad it is consistent & accurate, so that I didn't find
myself in another pickle like the ones "weak" typing and attempts to use
time/date classes in 'R' have brought me.
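
A quick sketch of the precedence at work (unary minus is applied after ^):

-4^0.5        # parsed as -(4^0.5), so it returns -2
(-4)^0.5      # NaN: -4 has no real square root
sqrt(-4)      # NaN, with a warning
(-4+0i)^0.5   # 0+2i, if you really want the complex root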

   Thanks!
  Mike


---
[The theory of gravity] is to me so great an absurdity that I believe no
Man who has in philosophical matters a competent faculty of thinking can
ever fall into it.  -- Isaac Newton



On Wed, Mar 21, 2012 at 4:37 PM, Rolf Turner  wrote:

> On 22/03/12 12:17, Sarah Goslee wrote:
>
>> It's order of operations, and a good reason to always use
>> parentheses: which is evaluated first, the unary minus or
>> the raising-to-powers?
>>
>> (-4)^0.5
>> -(4^0.5)
>>
>> sqrt(-4)
>> -sqrt(4)
>>
>
> If the OP *really* wants the square root of -4 he could do
> sqrt(-4+0i) or (-4+0i)^0.5 (and get 0+2i in either case).
>
>cheers,
>
>Rolf Turner
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] sqrt(-x) vs. -x^0.5

2012-03-21 Thread Mike Williamson
Hi Everyone,

I did a search through the archives and did not find an answer,
although I must admit it is a hard search to do (^0.5 is tough to
explicitly search for).

I am sure there is some mathematically accurate reason to explain the
following, but I guess I either never learned it or have since forgotten it.

In 'R', when I type (for instance):

sqrt(-4)

I get

NaN

but if I type in:

-4 ^ 0.5

I get

-2

I presume this is on purpose and correct, but it's the first time I've
come across any theoretical difference between ^0.5 and sqrt.  What is the
theoretical difference / meaning between these two operations?

  Thanks!
   Mike


---
[The theory of gravity] is to me so great an absurdity that I believe no
Man who has in philosophical matters a competent faculty of thinking can
ever fall into it.  -- Isaac Newton

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] confidence intervals in dotplots in a for loop

2012-03-06 Thread Byerly, Mike M (DFG)
# I have some population estimates and confidence intervals for various
# size classes of animals captured with two gear types.  I'd like to plot
# the estimates along with the 90 and 95% CI's by size class for each
# gear type.

# The data can be read in as:

estimates <-
c(67.42,30.49,32.95,23.53,10.26,6.03,23.53,0.93,50.72,24.2,25.84,18.54,
  7.16,3.6,9.35,0.33,87.28,37.25,40.16,28.59,13.77,8.92,40.74,1.68,48.28,23.09,
  24.49,17.7,6.63,3.28,7.79,0.26,91.63,38.74,41.6,29.74,14.49,9.51,44.16,1.88)

estimates.m <- matrix(estimates, 8, 5)

colnames(estimates.m) <- list("est","lci90","uci90","lci95","uci95")

id <- c(1,1,2,2,3,3,4,4)

size <- c("All","All","Large","Large","Medium","Medium","Small","Small")

gear <- rep(c("Camera","Dredge"),4)

cds.est <- data.frame(id,size,gear,estimates.m)

# The following script works fine for plotting one size class at a time
# using dotplot() and is how I would like the plots to look

library(lattice)

dat1 <- cds.est[cds.est$id == 1,]

dotplot(gear ~ est, data=dat1,
    xlim = c(min(dat1$lci95 -10), max(dat1$uci95 +10)),
    xlab = 'Density (num / Nmi^2)',
    main = as.vector(unique(dat1$size)),
    panel = function(x,y) {
        panel.xyplot(x,y,pch=16,cex=1)
        panel.segments(dat1$lci95,as.numeric(y),dat1$uci95,as.numeric(y),
            lty=1, col=1)
        panel.segments(dat1$lci90,as.numeric(y),dat1$uci90,as.numeric(y),
            lty=1, lwd=4, col='grey60')
        panel.xyplot(x,y,pch=16,cex=1.2,col='white')
        panel.xyplot(x,y,pch=1,cex=1.1,col='black')
    })

# Since I have multiple size classes and will be producing similar plots
# for other data sets, I've written the following script using a loop.
# The script loops over the "id" variable in the cds.est data frame and
# stores the plots in a list.  Since dotplot() is part of the lattice
# package, I used grid.arrange to tile the plots.

library(grid)

library(gridExtra)

id2 <- 1:max(cds.est$id)

plots <- vector("list", length(id2))

j <- 0

for (i in id2) {

    dat <- cds.est[cds.est$id == i,]

    plots[[ j <- j+1 ]] <-
        dotplot(gear ~ est, data=dat,
            xlim = c(min(dat$lci95 -10), max(dat$uci95 +10)),
            xlab = 'Density (num / Nmi^2)',
            main = as.vector(unique(dat$size)),
            panel = function(x,y) {
                panel.xyplot(x,y,pch=16,cex=1)
                panel.segments(dat$lci95,as.numeric(y),dat$uci95,as.numeric(y),
                    lty=1, col=1)
                panel.segments(dat$lci90,as.numeric(y),dat$uci90,as.numeric(y),
                    lty=1, lwd=4, col='grey60')
                panel.xyplot(x,y,pch=16,cex=1.2,col='white')
                panel.xyplot(x,y,pch=1,cex=1.1,col='black')
            })

}

do.call(grid.arrange, plots)

# The script runs and produces all the plots with the correct estimates,
# but the CI's are not plotting correctly.  Does anyone have suggestions
# on what is causing this?

Thanks, Mike


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] The Future of R | API to Public Databases

2012-01-14 Thread Mike Marchywka







LOL, I remember posting about this in the past.  The US gov agencies vary, but 
most are quite good.  The big problem appears to be people who push proprietary 
or commercial "standards" for which only one effective source exists.  Some 
formats, like Excel and PDF, come to mind, and there is a disturbing trend 
towards their adoption in some places where raw data is needed by many.  The 
best thing to do is contact the information provider and let them know you 
want raw data, not images or stuff that works in limited commercial software 
packages.  Often data sources are valuable and the revenue model impacts 
availability.

If you are just arguing over different open formats, it is usually easy for 
someone to write some conversion code and publish it -- CSV to JSON would not 
be a problem, for example.  Data of course are quite variable and there is 
nothing wrong with giving the provider his choice.
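
A sketch of how small such a conversion can be, assuming the RJSONIO package 
(one of several JSON packages on CRAN) and a hypothetical input file:

library(RJSONIO)                        # assumed available from CRAN
df <- read.csv("input.csv")             # hypothetical CSV source
writeLines(toJSON(df), "output.json")   # republish the same data as JSON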


> Date: Sat, 14 Jan 2012 10:21:23 -0500
> From: ja...@rampaginggeek.com
> To: r-help@r-project.org
> Subject: Re: [R] The Future of R | API to Public Databases
>
> Web services are only part of the problem. In essence, there are at
> least two facets:
> 1. downloading the data using some protocol
> 2. mapping the data to a common model
>
> Having #1 makes the import/download easier, but it really becomes useful
> when both are included. I think #2 is the harder problem to address.
> Software can usually be written to handle #1 by making a useful
> abstraction layer. #2 means that data has consistent names and meanings,
> and this requires people to agree on common definitions and a common
> naming convention.
>
> RDF (Resource Description Framework) and its related technologies
> (SPARQL, OWL, etc) are one of the many attempts to try to address this.
> While this effort would benefit R, I think it's best if it's part of a
> larger effort.
>
> Services such as DBpedia and Freebase are trying to unify many data sets
> using RDF.
>
> The task view and package ideas a great ideas. I'm just adding another
> perspective.
>
> Jason
>
> On 01/13/2012 05:18 PM, Roy Mendelssohn wrote:
> > HI Benjamin:
> >
> > What would make this easier is if these sites used standardized web 
> > services, so it would only require writing once. data.gov is the worst 
> > example, they spun their own, weak service.
> >
> > There is a lot of environmental data available through OPenDAP, and that is 
> > supported in the ncdf4 package. My own group has a service called ERDDAP 
> > that is entirely RESTFul, see:
> >
> > http://coastwatch.pfel.noaa.gov/erddap
> >
> > and
> >
> > http://upwell.pfeg.noaa.gov/erddap
> >
> > We provide R (and matlab) scripts that automate the extract for certain 
> > cases, see:
> >
> > http://coastwatch.pfeg.noaa.gov/xtracto/
> >
> > We also have a tool called the Environmental Data Connector (EDC) that 
> > provides a GUI from with R (and ArcGIS, Matlab and Excel) that allows you 
> > to subset data that is served by OPeNDAP, ERDDAP, certain Sensor 
> > Observation Service (SOS) servers, and have it read directly into R. It is 
> > freely available at:
> >
> > http://www.pfeg.noaa.gov/products/EDC/
> >
> > We can write such tools because the service is either standardized 
> > (OPeNDAP, SOS) or is easy to implement (ERDDAP).
> >
> > -Roy
> >
> >
> > On Jan 13, 2012, at 1:14 PM, Benjamin Weber wrote:
> >
> >> Dear R Users -
> >>
> >> R is a wonderful software package. CRAN provides a variety of tools to
> >> work on your data. But R is not apt to utilize all the public
> >> databases in an efficient manner.
> >> I observed the most tedious part with R is searching and downloading
> >> the data from public databases and putting it into the right format. I
> >> could not find a package on CRAN which offers exactly this fundamental
> >> capability.
> >> Imagine R is the unified interface to access (and analyze) all public
> >> data in the easiest way possible. That would create a real impact,
> >> would put R a big leap forward and would enable us to see the world
> >> with different eyes.
> >>
> >> There is a lack of a direct connection to the API of these databases,
> >> to name a few:
> >>
> >> - Eurostat
> >> - OECD
> >> - IMF
> >> - Worldbank
> >> - UN
> >> - FAO
> >> - data.gov
> >> - ...
> >>
> >> The ease of access to the data is the key of information processing with R.
> >>
> >> How can we handle the flow of information noise? R has to give an
> >> answer to that with an extensive API to public databases.
> >>
> >> I would love your comments and ideas as a contribution in a vital 
> >> discussion.
> >>
> >> Benjamin
> >>
> >> __
> >> R-help@r-project.org mailing list
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide 
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.

Re: [R] how to combine grouped data and ungrouped data in a trellis xyplot

2012-01-09 Thread Mike Dahman
Thanks Deepayan. That did the trick.


xyplot(cpu~dt|zone, data=filt_zone_df, ylim=c(0,100), groups=pool,
   auto.key=list(columns=min(4,length(unique(filt_zone_df$pool))),
                 lines=T, points=F),
   type="l",
   main="Test Chart",
   ylab="% Utilization",
   panel=function(x,y,groups,subscripts,...){
       panel.xyplot(x,y,groups=groups,subscripts=subscripts,...)
       panel.lines(filt_zone_df$dt[subscripts],
                   filt_zone_df$mem[subscripts], col="red")
}, as.Table=T, subscripts=T)

Regards,

-mike


On Mon, Jan 9, 2012 at 5:06 AM, Deepayan Sarkar
wrote:

> On Sun, Jan 8, 2012 at 9:45 AM, Mike Dahman  wrote:
> > I'm hoping the community knowledge can help me out here. I have found
> great
> > value in R, especially using it to generate charts, but I am still
> scaling
> > the learning curve in a number of ways.
> >
> > I am looking plot one grouped line and one ungrouped line in a lattice
> plot.
> >
> > I can plot one grouped line like this (the line's color in each panel
> > becomes dependent on the newpool value):
> >
> >
> >
> >
> xyp<-xyplot(cpucap~date|zone,data=df,type="l",groups=newpool,auto.key=list(points=F,lines=T),
> >   main=paste(df$server[1]," CPU Caps\n",df$date[1]," to
> > ",df$date[nrow(df)],sep="")
> >)
> >print(xyp)
> >
> >
> > and I can plot two ungrouped lines using a panel=function with subscripts
> > like this (maybe not the best way, but I found an example doing it this
> > way):
> >
> >xyplot(cpu~dt|zone,data=filt_zone_df,ylim=c(0,100),
> >   main=paste(server," - Zone CPU (Blue) & Memory (Red)
> >
> Util\n",filt_zone_df$ts[1],"-",filt_zone_df$ts[nrow(filt_zone_df)],sep=""),
> >   panel=function(x,y,subscripts){
> >   panel.lines(x,y)
> >
> panel.lines(filt_zone_df$dt[subscripts],filt_zone_df$mem[subscripts],col="red")
> >}, as.Table=T, subscripts=T)
> >
> >
> > but I'm struggling with plotting one line that is grouped and one that
> > isn't. When I try to pass group to the first panel.xyplot() function in
> the
> > panel=function it either does nothing or bombs out.
> >
> >
> xyplot(cpu~dt|zone,data=servdf,ylim=c(0,100),groups=pool,auto.key=list(points=F,lines=T),type="l",
> >   main="test",
> >   panel=function(x,y,groups,subscripts,...){
> >   panel.xyplot(x,y,groups,...) # would
> > like this to be colored based on the groups=pool
>
> Try
>
>   panel.xyplot(x, y, groups = groups, subscripts = subscripts,
> ...)
>
> -Deepayan
>
> >
> >  panel.lines(servdf$dt[subscripts],servdf$mem[subscripts],col="red")
> >}, as.Table=T, subscripts=T)
> >
> >
> > A little nudge in the right direction is appreciated. I'm getting tripped
> > up on how to get the groups definition into the panel function and also
> > into the panel.xyplot function within it. I've tried using a number of
> > variations in the arguments with the panel=function definition and the
> call
> > to panel.xyplot() within it, but no success. My assumption was that the
> use
> > of '...' would pass it on down, but that doesn't seem to be the
> > case, especially since most of the examples I can find from googling show
> > folks listing group as an argument, and sometimes have something like
> > groups=groups. I've tried a number of things and thought it is time to
> ask
> > for help.
> >
> > Regards,
> >
> > -mike
> >
> >[[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] glmmPQL and predict

2012-01-09 Thread Mike Harwood
Is the labeling/naming of levels in the documentation for the
predict.glmmPQL function "backwards"?  The documentation states "Level
values increase from outermost to innermost grouping, with level zero
corresponding to the population predictions".  Taking the sample in
the documentation:

fit <- glmmPQL(y ~ trt + I(week > 2), random = ~1 |  ID,
   family = binomial, data = bacteria)

> head(predict(fit, bacteria, level = 0, type="response"))
[1] 0.9680779 0.9680779 0.8587270 0.8587270 0.9344832 0.9344832
> head(predict(fit, bacteria, level = 1, type="response"))
  X01   X01   X01   X01   X02   X02
0.9828449 0.9828449 0.9198935 0.9198935 0.9050782 0.9050782
> head(predict(fit, bacteria, type="response")) ## population prediction
  X01   X01   X01   X01   X02   X02
0.9828449 0.9828449 0.9198935 0.9198935 0.9050782 0.9050782

The returned values for level=1 and for the default call (no level
specified) match, which is not what I expected based upon the
documentation.  Applying the inverse logit to the intercept coefficients
from the fitted regression, the level=0 value matches when only the fixed
effect is used, and the level=1 value matches when the random effect
intercept is included:

> 1/(1+exp(-3.412014)) ## only the fixed effect
[1] 0.9680779
> 1/(1+exp(-1*(3.412014+0.63614382))) ## fixed and random effect intercepts
[1] 0.9828449

Thanks!

Mike

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to combine grouped data and ungrouped data in a trellis xyplot

2012-01-07 Thread Mike Dahman
I'm hoping the community knowledge can help me out here. I have found great
value in R, especially using it to generate charts, but I am still scaling
the learning curve in a number of ways.

I am looking plot one grouped line and one ungrouped line in a lattice plot.

I can plot one grouped line like this (the line's color in each panel
becomes dependent on the newpool value):



xyp<-xyplot(cpucap~date|zone,data=df,type="l",groups=newpool,auto.key=list(points=F,lines=T),
   main=paste(df$server[1]," CPU Caps\n",df$date[1]," to
",df$date[nrow(df)],sep="")
)
print(xyp)


and I can plot two ungrouped lines using a panel=function with subscripts
like this (maybe not the best way, but I found an example doing it this
way):

xyplot(cpu~dt|zone,data=filt_zone_df,ylim=c(0,100),
   main=paste(server," - Zone CPU (Blue) & Memory (Red)
Util\n",filt_zone_df$ts[1],"-",filt_zone_df$ts[nrow(filt_zone_df)],sep=""),
   panel=function(x,y,subscripts){
   panel.lines(x,y)
   
panel.lines(filt_zone_df$dt[subscripts],filt_zone_df$mem[subscripts],col="red")
}, as.Table=T, subscripts=T)


but I'm struggling with plotting one line that is grouped and one that
isn't. When I try to pass group to the first panel.xyplot() function in the
panel=function it either does nothing or bombs out.

xyplot(cpu~dt|zone,data=servdf,ylim=c(0,100),groups=pool,auto.key=list(points=F,lines=T),type="l",
   main="test",
   panel=function(x,y,groups,subscripts,...){
   panel.xyplot(x,y,groups,...) # would
like this to be colored based on the groups=pool

 panel.lines(servdf$dt[subscripts],servdf$mem[subscripts],col="red")
}, as.Table=T, subscripts=T)


A little nudge in the right direction is appreciated. I'm getting tripped
up on how to get the groups definition into the panel function and also
into the panel.xyplot function within it. I've tried using a number of
variations in the arguments with the panel=function definition and the call
to panel.xyplot() within it, but no success. My assumption was that the use
of '...' would pass it on down, but that doesn't seem to be the
case, especially since most of the examples I can find from googling show
folks listing group as an argument, and sometimes have something like
groups=groups. I've tried a number of things and thought it is time to ask
for help.

Regards,

-mike

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Dropping columns from data frame

2012-01-06 Thread Mike Harwood
Thank you, David.  I was merely using "head" to limit the code/output.
My question remains, because a created data frame has the same columns
as was output from "head":

> head(orig.df,3)
  num1.10 num11.20 lc1.10 lc11.20 uc1.10 uc11.20
1   1   11  a   k  A   K
2   2   12  b   l  B   L
3   3   13  c   m  C   M
> # Illustration 1: contiguous columns at beginning of data frame
> head(orig.df[,-c(1:3)],2)
  lc11.20 uc1.10 uc11.20
1   k  A   K
2   l  B   L
> new.df <- orig.df[,-c(1:3)]
> head(new.df,2)
  lc11.20 uc1.10 uc11.20
1   k  A   K
2   l  B   L
>
> # Illustration 2: non-contiguous columns
> head(orig.df[,-c(1,3,5)],2)
  num11.20 lc11.20 uc11.20
1   11   k   K
2   12   l   L
> new.df <- orig.df[,-c(1,3,5)]
> head(new.df,2)
  num11.20 lc11.20 uc11.20
1   11   k   K
2   12   l   L
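
(For what it's worth, a minimal sketch of dropping a *named* set of columns
without the NULL assignment; the drop.cols vector here is hypothetical:)

drop.cols <- c('num1.10', 'lc1.10', 'uc1.10')          # names to drop
new.df <- orig.df[, !(names(orig.df) %in% drop.cols)]  # keep everything else
head(new.df, 2)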




On Jan 6, 9:49 am, David Winsemius  wrote:
> On Jan 6, 2012, at 10:00 AM, Mike Harwood wrote:
>
> > How does R do it, and should I ever be worried?  I always remove
> > columns by index, and it works exactly as I would naively expect - but
> > HOW?  The second illustration, which deletes non contiguous columns,
> > represents what I do all the time and have some trepidation about
> > because I don't know the mechanics (e.g. why doesn't the column
> > formerly-known-as-4 become 3 after column 1 is dropped: doesn't vector
> > removal from a df/list invoke a loop in C?).
>
> You are NOT "removing columns". You are returning (to `head` and then
> to `print`) an extract from the dataframe, but that does not change
> the original dataframe. To effect a change you would need to assign
> the value back to the same name as the original dataframe.
>
> --
> David
>
>
>
>
>
>
>
>
>
> >  Can I delete a named
> > list of columns, which are examples 4 and 5 and which generate the
> > "unary error' mesages, without resorting to "orig.df$num1.10 <- NULL"?
>
> > Thanks!
>
> > orig.df <- data.frame(cbind(
> >    1:10
> >    ,11:20
> >    ,letters[1:10]
> >    ,letters[11:20]
> >    ,LETTERS[1:10]
> >    ,LETTERS[11:20]
> >    ))
> > names(orig.df) <- c(
> >    'num1.10'
> >    ,'num11.20'
> >    ,'lc1.10'
> >    ,'lc11.20'
> >    ,'uc1.10'
> >    ,'uc11.20'
> >    )
> > # Illustration 1: contiguous columns at beginning of data frame
> > head(orig.df[,-c(1:3)])
>
> > # Illustration 2: non-contiguous columns
> > head(orig.df[,-c(1,3,5)])
>
> > # Illustration 3: contiguous columns at end of data frame
> > head(orig.df[,-c(4:6)])    ## as expected
>
> > # Illustrations 4-5: unary errors
> > head(orig.df[,-c(as.list('num1.10', 'lc1.10', 'uc1.10'))])
> > head(orig.df[,-c('num1.10', 'lc1.10', 'uc1.10')])
>
> > Mike
>
> > __
> > r-h...@r-project.org mailing list
> >https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT
>
> __
> r-h...@r-project.org mailing listhttps://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guidehttp://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Dropping columns from data frame

2012-01-06 Thread Mike Harwood
How does R do it, and should I ever be worried?  I always remove
columns by index, and it works exactly as I would naively expect - but
HOW?  The second illustration, which deletes non contiguous columns,
represents what I do all the time and have some trepidation about
because I don't know the mechanics (e.g. why doesn't the column
formerly-known-as-4 become 3 after column 1 is dropped: doesn't vector
removal from a df/list invoke a loop in C?).  Can I delete a named
list of columns, which are examples 4 and 5 and which generate the
"unary error' mesages, without resorting to "orig.df$num1.10 <- NULL"?

Thanks!

orig.df <- data.frame(cbind(
1:10
,11:20
,letters[1:10]
,letters[11:20]
,LETTERS[1:10]
,LETTERS[11:20]
))
names(orig.df) <- c(
'num1.10'
,'num11.20'
,'lc1.10'
,'lc11.20'
,'uc1.10'
,'uc11.20'
)
# Illustration 1: contiguous columns at beginning of data frame
head(orig.df[,-c(1:3)])

# Illustration 2: non-contiguous columns
head(orig.df[,-c(1,3,5)])

# Illustration 3: contiguous columns at end of data frame
head(orig.df[,-c(4:6)]) ## as expected

# Illustrations 4-5: unary errors
head(orig.df[,-c(as.list('num1.10', 'lc1.10', 'uc1.10'))])
head(orig.df[,-c('num1.10', 'lc1.10', 'uc1.10')])


Mike

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Can levelplot colorkeys display a logarithmic scale evenly?

2012-01-05 Thread Mike MacFerrin
I'm using the {lattice} "levelplot" function to make a (more or less) 2-d 
histogram, and for the most part it's working fine with my data.  However, I 
can't get the color key to do what I need.  I can give it labels and custom 
cutoffs, but my cutoff lines (and hence my labels) aren't evenly spaced; 
instead they're more-or-less logarithmic, starting at [0,20,50,100...] and 
continuing on up to 5 million.

Levelplot scales the ticks/labels linearly on the color key, leaving most of 
my labels scrunched down atop each other at the bottom and only the last few 
(..."500K","1M","2M","5M") really visible on the rest of the key.  I want 
each gap (no matter its numerical range) to occupy one evenly-spaced "block" 
on the color key, so they're all readable by the user.  I want it to display 
values on a logarithmic scale (0,1,10,100,1000,...).  The colorkey only seems 
to support linear scales (0,2,4,6,8,10,...), at least by default, and I don't 
see any options in the help to change that default.

Does levelplot support this?  Or am I destined to draw the colorkey manually 
(very painfully using polygon calls and user coordinates)?  Or are there other 
levelplot-like functions that would facilitate this?  Any help is appreciated,

- Mike

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] empty files created with trellis xyplot jpeg device

2012-01-01 Thread Mike Dahman
Thanks Michael.

That did the trick. Despite googling most of the day yesterday, I didn't
quite have the right search string to find that one. Almost feels like the
answer was hiding in plain sight, now that you point me to it.

I added some code to save the xyplots to a variable and then print it at
the end of the function.

before:

xyplot(...)

after:

xyp<-xyplot(...)
print(xyp)

Works great.

Regards,

-mike


On Sun, Jan 1, 2012 at 1:40 AM, R. Michael Weylandt <
michael.weyla...@gmail.com> wrote:

> I'm guessing R FAQ 7.22: http://cran.r-project.org/doc/FAQ/R-FAQ.html
>
> The subtlety is that in an interactive session print is automatically
> called at the final evaluation of most everything, but you have to
> prompt it yourself in non-interactive use (and depending on details, in
> some function calls)
>
> Michael Weylandt
>
> On Sat, Dec 31, 2011 at 6:50 PM, Mike Dahman 
> wrote:
> > New years greetings.
> >
> > I have been setting up a function to generate multiple jpeg charts. When
> > the calls are issued at the interactive console, the jpeg files are
> > generated without an issue. When I try to issue the same calls from a
> > function, some chart files are empty. It appears to only be related to
> > trellis charts. Any help to troubleshoot this is appreciated.
> >
> > Regards,
> >
> > -mike
> >
> >
> > R version 2.14.0 (2011-10-31)
> > Copyright (C) 2011 The R Foundation for Statistical Computing
> > ISBN 3-900051-07-0
> > Platform: x86_64-unknown-linux-gnu (64-bit)
> >
> >
> > # validate devices
> >> capabilities()
> >jpeg  png tifftcltk  X11 aqua http/ftp  sockets
> >TRUEFALSEFALSEFALSE TRUEFALSE TRUE TRUE
> >  libxml fifo   clediticonv  NLS  profmemcairo
> >TRUE TRUE TRUE TRUE TRUEFALSEFALSE
> >
> > # Example functionality from the interactive console
> >
> > # I am going to use a zone variable to help duplicate the code in the
> > function
> >> zone
> > [1] "isoranp-z1"
> >
> > # call a function to pull in data and assign it to a data frame
> >> testz<-get_zonedata_url(2011,51,zone)
> >
> > # validate the data frame
> >> str(testz)
> > 'data.frame':   2016 obs. of  14 variables:
> >  $ ts : Factor w/ 2016 levels "12/18/2011 00:00",..: 1 2 3 4 5 6 7 8
> 9
> > 10 ...
> >  $ server : Factor w/ 1 level "phx1npf4sn2": 1 1 1 1 1 1 1 1 1 1 ...
> >  $ zone   : Factor w/ 1 level "isoranp-z1": 1 1 1 1 1 1 1 1 1 1 ...
> >  $ pool   : Factor w/ 1 level "ORA-S1": 1 1 1 1 1 1 1 1 1 1 ...
> >  $ cpucap : num  4 4 4 4 4 4 4 4 4 4 ...
> >  $ memcap : int  26 26 26 26 26 26 26 26 26 26 ...
> >  $ swapcap: int  26 26 26 26 26 26 26 26 26 26 ...
> >  $ cpu: num  78.2 206.8 198.4 366.4 112.1 ...
> >  $ poolsz : int  42 42 42 42 42 42 42 42 42 42 ...
> >  $ mem: num  75.5 75.3 75.6 74.5 74.3 ...
> >  $ swp: num  80.2 80.1 79.6 79 78.9 ...
> >  $ dates  : chr  "12/18/2011" "12/18/2011" "12/18/2011" "12/18/2011" ...
> >  $ times  : chr  "00:00:00" "00:05:00" "00:10:00" "00:15:00" ...
> >  $ dt :Classes 'chron', 'dates', 'times'  atomic [1:2016] 15326 15326
> > 15326 15326 15326 ...
> >  .. ..- attr(*, "format")= chr [1:2] "m/d/y" "h:m:s"
> >  .. ..- attr(*, "origin")= Named num [1:3] 1 1 1970
> >  .. .. ..- attr(*, "names")= chr [1:3] "month" "day" "year"
> >
> > # set up a jpeg device
> >> trellis.device(jpeg,file=paste("charts/",zone,".zone_util.jpg",sep=""))
> >
> > # call a function to generate the chart
> > # the plot_zone_util function uses xyplot
> >> plot_zone_util(testz)
> >
> > #close the device
> >> dev.off()
> >
> > The jpeg file is generated as expected:
> > (User and group has been removed)
> >
> > ls -l isoranp-z1.zone_util.jpg
> > -rw-rw-r-- 1 <> <> 32123 Dec 31 16:13 isoranp-z1.zone_util.jpg
> >
> > # here is the plot_zone_util function for reference
> > plot_zone_util <- function(zone_df){
> >
> >xyplot(cpu~dt|server,data=zone_df,ylim=c(0,100),
> >   main=paste(zone_df$zone[1]," CPU (Blue) & Memory (Red)
> > Util\n",zone_df$ts[1],"-",zone_df$ts[nrow(zone_df)],sep=""),
> >   xlab="date",
> >   ylab="utilization (%)",
> >   panel=function(x,y,subscripts){
> >   panel.lines(x,y)
> >
> >  panel.lines(zone_df$dt[subscripts],zone_df$mem[subscripts],col="red")
> > }, as.Table=T, subscripts=T)
> > }

[R] empty files created with trellis xyplot jpeg device

2011-12-31 Thread Mike Dahman
New years greetings.

I have been setting up a function to generate multiple jpeg charts. When
the calls are issued at the interactive console, the jpeg files are
generated without an issue. When I try to issue the same calls from a
function, some chart files are empty. It appears to only be related to
trellis charts. Any help to troubleshoot this is appreciated.

Regards,

-mike


R version 2.14.0 (2011-10-31)
Copyright (C) 2011 The R Foundation for Statistical Computing
ISBN 3-900051-07-0
Platform: x86_64-unknown-linux-gnu (64-bit)


# validate devices
> capabilities()
jpeg  png tifftcltk  X11 aqua http/ftp  sockets
TRUEFALSEFALSEFALSE TRUEFALSE TRUE TRUE
  libxml fifo   clediticonv  NLS  profmemcairo
TRUE TRUE TRUE TRUE TRUEFALSEFALSE

# Example functionality from the interactive console

# I am going to use a zone variable to help duplicate the code in the
function
> zone
[1] "isoranp-z1"

# call a function to pull in data and assign it to a data frame
> testz<-get_zonedata_url(2011,51,zone)

# validate the data frame
> str(testz)
'data.frame':   2016 obs. of  14 variables:
 $ ts : Factor w/ 2016 levels "12/18/2011 00:00",..: 1 2 3 4 5 6 7 8 9
10 ...
 $ server : Factor w/ 1 level "phx1npf4sn2": 1 1 1 1 1 1 1 1 1 1 ...
 $ zone   : Factor w/ 1 level "isoranp-z1": 1 1 1 1 1 1 1 1 1 1 ...
 $ pool   : Factor w/ 1 level "ORA-S1": 1 1 1 1 1 1 1 1 1 1 ...
 $ cpucap : num  4 4 4 4 4 4 4 4 4 4 ...
 $ memcap : int  26 26 26 26 26 26 26 26 26 26 ...
 $ swapcap: int  26 26 26 26 26 26 26 26 26 26 ...
 $ cpu: num  78.2 206.8 198.4 366.4 112.1 ...
 $ poolsz : int  42 42 42 42 42 42 42 42 42 42 ...
 $ mem: num  75.5 75.3 75.6 74.5 74.3 ...
 $ swp: num  80.2 80.1 79.6 79 78.9 ...
 $ dates  : chr  "12/18/2011" "12/18/2011" "12/18/2011" "12/18/2011" ...
 $ times  : chr  "00:00:00" "00:05:00" "00:10:00" "00:15:00" ...
 $ dt :Classes 'chron', 'dates', 'times'  atomic [1:2016] 15326 15326
15326 15326 15326 ...
  .. ..- attr(*, "format")= chr [1:2] "m/d/y" "h:m:s"
  .. ..- attr(*, "origin")= Named num [1:3] 1 1 1970
  .. .. ..- attr(*, "names")= chr [1:3] "month" "day" "year"

# set up a jpeg device
> trellis.device(jpeg,file=paste("charts/",zone,".zone_util.jpg",sep=""))

# call a function to generate the chart
# the plot_zone_util function uses xyplot
> plot_zone_util(testz)

#close the device
> dev.off()

The jpeg file is generated as expected:
(User and group has been removed)

ls -l isoranp-z1.zone_util.jpg
-rw-rw-r-- 1 <> <> 32123 Dec 31 16:13 isoranp-z1.zone_util.jpg

# here is the plot_zone_util function for reference
plot_zone_util <- function(zone_df){

xyplot(cpu~dt|server,data=zone_df,ylim=c(0,100),
   main=paste(zone_df$zone[1]," CPU (Blue) & Memory (Red)
Util\n",zone_df$ts[1],"-",zone_df$ts[nrow(zone_df)],sep=""),
   xlab="date",
   ylab="utilization (%)",
   panel=function(x,y,subscripts){
   panel.lines(x,y)

 panel.lines(zone_df$dt[subscripts],zone_df$mem[subscripts],col="red")
}, as.Table=T, subscripts=T)
}


##

# Try and do the same thing within a function:::

> gen_zone_charts(zone,2011,51)

# Note the zone_util.jpg file is zero length. It is a lattice chart.
# The other two charts generate ok. User and group have been removed.

-rw-rw-r-- 1 <> <> 22376 Dec 31 16:20 isoranp-z1.zone_cpu.jpg
-rw-rw-r-- 1 <> <> 18910 Dec 31 16:20 isoranp-z1.zone_mem.jpg
-rw-rw-r-- 1 <> <> 0 Dec 31 16:20 isoranp-z1.zone_util.jpg

# here is the gen_zone_charts function:

> gen_zone_charts
function(zone,year,wk){

data_frame<-get_zonedata_url(year,wk,zone)

# this results in a 0 length file
# i have tried using jpeg(), and trellis.device() with the same results
#jpeg(file=paste("charts/",zone,".zone_util.jpg",sep=""))

trellis.device(jpeg,file=paste("charts/",zone,".zone_util.jpg",sep=""))
#uses xyplot - works fine being called from the console
plot_zone_util(data_frame)
dev.off()

# this works ok
jpeg(file=paste("charts/",zone,".zone_cpu.jpg",sep=""))
# uses combination of boxplot and plot with a preceeding par()
plot_zone_cpu(data_frame)
dev.off()

# this works ok
jpeg(file=paste("charts/",zone,".zone_mem.jpg",sep=""))
# uses combination of plot and plot with a preceeding par()
plot_zone_mem(data_frame)
dev.off()

}

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] HELP!! - PHP calling R to execute a r-code file (*.r)

2011-12-31 Thread Mike Marchywka












> Date: Fri, 30 Dec 2011 16:04:08 -0600
> From: xiuquan.w...@gmail.com
> To: r-help@r-project.org
> Subject: [R] HELP!! - PHP calling R to execute a r-code file (*.r)
>
> Hi,
>
> I have met a tough problem when using PHP to call R to generate some plots.
> I tested it okay in my personal computer with WinXP. But when I was trying
> to update to my server (Win2003 server), I found it did not work. Below is
> the details:

I've run into lots of problems like this.  Generally, first check the php 
error log file (I have no idea where it is on your machine) and see if you 
can get your script to dump output somewhere, possibly with an absolute path 
so you know where to look for it LOL.  Often the change in user creates 
unexpected problems with file permissions and libraries and paths.  You need 
to check the specific directories for permissions, not just the top level.
I would also point out that there is Rapache available as well as Rserve.  
Curious if people are using R in any other unique situations server side.  We 
have a java webserver which I use to invoke R via bash scripts and generate 
rather complicated files.  These could take very long to generate, but if you 
have a flexible caching system, it can be easy to re-use output files or even 
generate them ahead of time.  Starting "R" or any other process is not 
instantaneous, and often image generation is quite time consuming.  There are 
a lot of issues making it work well in a server setting in real time.  Scale 
up has also been an issue.  Apache's threading or process model is quite 
expensive if you care about performance.  We were able to use a "netty" front 
end and so far that has worked very well.  PHP AFAIK is not thread safe, 
however.

>
> 1> r-code file (E:/mycode.r):
> --
> jpeg("E:/mytest.jpg")
> plot(1:10)
> dev.off()
> --
>
> 2> php code:
> ---
> exec("R CMD BATCH --vanilla --slave --no-timing E:/mycode.r && exit");
> ---
> 3> results:
> for WinXP: the image can be generated successfully.
> for Server(win2003): can not be generated.
>
> BTW, I have added a user "everyone" with full control permission to the E
> disk.
>
[[elided Hotmail spam]]
>
> Thanks.
>
>
> All the best,
> Xiuquan Wang
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
  
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Quotes inside char string

2011-12-20 Thread Mike Pfeiff
Duncan, Thank you for referring me to Uwe's answer.  

-Original Message-
From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com] 
Sent: Tuesday, December 20, 2011 1:14 PM
To: Mike Pfeiff
Cc: 'R-help@r-project.org'
Subject: Re: [R] Quotes inside char string

On 20/12/2011 2:10 PM, Mike Pfeiff wrote:
> How do I return a character string with quotes inside string?
>
> For example, what logic do I use if I want to return the following:
>
> Test Score="A"
>
> I tried the following  
>
> Score<-paste("Test Score=","A",sep='"')
>
> But it returned a "\" inside:

No it didn't, as Uwe explained to you.

Duncan Murdoch


>
> "Test Score=\"A"
>
>
>
> Any assistance would be greatly appreciated.
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Quotes inside char string

2011-12-20 Thread Mike Pfeiff
Thank you Uwe.  I can't get my sqlQuery to work and thought that it must be 
the "\" embedded in the text string; I guess not, if R doesn't pass the "\".  
Thanks for your response. 

-Original Message-
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] 
Sent: Tuesday, December 20, 2011 1:20 PM
To: Mike Pfeiff
Cc: 'R-help@r-project.org'
Subject: Re: [R] Quotes inside char string



On 20.12.2011 20:10, Mike Pfeiff wrote:
> How do I return a character string with quotes inside string?
>
> For example, what logic do I use if I want to return the following:
>
> Test Score="A"
>
> I tried the following  
>
> Score<-paste("Test Score=","A",sep='"')
>
> But it returned a "\" inside:
>
>
> "Test Score=\"A"
>
>
>
> Any assistance would be greatly appreciated.

That is the print()ed R representation of it. The underlying value
*within* the quoted string is actually:
   Test Score="A
and that is always true, e.g. in your example with the SQL query this
question probably is derived from.
cat() shows the actual version (but does not return it).
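
A quick sketch that makes the distinction visible:

Score <- paste("Test Score=", "A", sep='"')
print(Score)      # escaped display form:  "Test Score=\"A"
cat(Score, "\n")  # actual contents:       Test Score="A
nchar(Score)      # 13 -- no backslash is stored in the string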

Uwe Ligges




>
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Quotes inside char string

2011-12-20 Thread Mike Pfeiff
How do I return a character string with quotes inside string?

For example, what logic do I use if I want to return the following:

Test Score="A"

I tried the following  

   Score<-paste("Test Score=","A",sep='"')

But it returned a "\" inside:


   "Test Score=\"A"



Any assistance would be greatly appreciated.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] RODBC Error: 'getCharCE' must be called on a CHARSXP

2011-12-20 Thread Mike Pfeiff
Thanks.   But if I do as you suggested...

txt<-'SELECT Date,  Region, Price FROM TableXYZ WHERE Type="Domestic"'

returns...

"SELECT TradeDate, Hub, SettlementPrice FROM PowerSettlements WHERE 
Contract=\"ERN\""

which, because of the extra "\", is improper SQL form.

Maybe I should be asking: what is the proper way to get a double quote inside 
a character string?
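
(A sketch of two ways around this; the backslash is only print() escaping, 
and most SQL dialects want single quotes around string literals anyway -- 
table and column names as in the original query:)

# single-quote the SQL literal inside a double-quoted R string
txt <- "SELECT Date, Region, Price FROM TableXYZ WHERE Type='Domestic'"
# or keep the SQL double quotes; the \" exists only in the printed form
txt2 <- 'SELECT Date, Region, Price FROM TableXYZ WHERE Type="Domestic"'
cat(txt2, "\n")  # confirms no backslash in the actual string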

-Original Message-
From: Uwe Ligges [mailto:lig...@statistik.tu-dortmund.de] 
Sent: Tuesday, December 20, 2011 8:46 AM
To: Mike Pfeiff
Cc: 'r-help@r-project.org'
Subject: Re: [R] RODBC Error: 'getCharCE' must be called on a CHARSXP



On 20.12.2011 14:55, Mike Pfeiff wrote:
> I am trying to connect to an internal database and use the sqlQuery command 
> to reduce and retrieve data using the following code:
>
> channel<-odbcConnect("some_dsn", uid="", pwd="")
> txt<-'SELECT Date, Region, Price FROM TableXYZ WHERE Type="Domestic"'
> sqlQuery(channel, cat(txt,"\n"),errors=TRUE,)

sqlQuery(channel, txt, errors=TRUE)

seems more plausible (since cat returns NULL).

Uwe Ligges



> close(channel)
>
> However, I get the following error immediately after sqlQuery command:
>
> Error in odbcQuery(channel, query, rows_at_time) :
>'getCharCE' must be called on a CHARSXP
>
> I believe my connection is good because I used the following commands to 
> successfully view the columns:
>
> sqlColumns(channel, TableXYZ)
>
> > There doesn't seem to be much info on "getCharCE" and/or "CHARSXP".
> > Any guidance the group could provide this very new user to R would be 
> > greatly appreciated
>
>
>
>
>   [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] RODBC Error: 'getCharCE' must be called on a CHARSXP

2011-12-20 Thread Mike Pfeiff
I am trying to connect to an internal database and use the sqlQuery command to 
reduce and retrieve data using the following code:

channel <-odbcConnect("some_dsn", uid="", pwd="")
txt<-'SELECT Date, Region, Price FROM TableXYZ WHERE Type="Domestic"'
sqlQuery(channel, cat(txt,"\n"),errors=TRUE,)
close(channel)

However, I get the following error immediately after sqlQuery command:

Error in odbcQuery(channel, query, rows_at_time) :
  'getCharCE' must be called on a CHARSXP

I believe my connection is good because I used the following commands to 
successfully view the columns:

sqlColumns(channel, TableXYZ)

There doesn't seem to be much info on "getCharCE" and/or "CHARSXP".  Any 
guidance the group could provide this very new user to R would be greatly 
appreciated.




[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Simple R server for SQL Server?

2011-12-14 Thread Mike Beddo
I have a big SQL Server application that needs some help from R, and this R 
needs to load a large workspace in order to service the database request. I'd 
like to keep an R process running continually because loading the workspace 
takes about 1 minute. The data in the workspace doesn't change very often.

Rserve doesn't seem to align with my needs - I don't want a new workspace with 
each new connection, because I just have to load the large workspace I need 
each time.

So I want an R process with its workspace pre-loaded, sitting and waiting for 
requests from SQL Server. These would be simple requests, such as 
"make-forecast". The R process crunches and puts the answers back into the 
database using RODBC.

I think I need two pieces?


1)  Some R code to set up a small server listening for the "make-forecast" 
command

2)  An intermediate program that SQL Server talks to, which in turn talks 
to the running R process. I don't know if SQL server can communicate to R 
through a socket, that's why I think I need an intermediate program.

All of this (R, SQL Server) would run on the same Windows server box.

Some small code examples for 1) would be helpful, but advice would be better.
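
(A minimal sketch of piece 1 using base R sockets; the port number, the 
workspace file name, and the one-line "make-forecast" protocol are all 
assumptions based on the description above:)

load("big_workspace.RData")                # hypothetical pre-built workspace
repeat {
    con <- socketConnection(port = 6011, server = TRUE,
                            blocking = TRUE, open = "r+")
    cmd <- readLines(con, n = 1)           # wait for one command line
    if (identical(cmd, "make-forecast")) {
        # ... crunch here, write results back to the database via RODBC ...
        writeLines("done", con)            # acknowledge to the caller
    }
    close(con)                             # one request per connection
}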


-  Mike Beddo

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Fwd: tcplot documentation in evd package

2011-12-13 Thread Mike Harwood
This issue occurs only when both the evd and ismev packages are loaded.
 Please retract this posting, if possible.  Thank you in advance!

Mike

-- Forwarded message --
From: Mike Harwood 
Date: Mon, Dec 12, 2011 at 7:47 PM
Subject: tcplot documentation in evd package
To: r-help@r-project.org


Hello, and please advise regarding any errors/omissions on my part.
However, the documentation for R's tcplot function in the evd package
appears to contain an error.  I am using evd version 2.2-4 in R 2.14.0
with Windows XP.

> data(portpirie)
> mrlplot(portpirie)  ## No Error

> tlim <- c(3.6, 4.2)

> tcplot(portpirie, tlim) ## Line from documentation
Error in fpot(data, u[1], model = model, cmax = cmax, r = r, ulow =
ulow,  :
 `x' must be a non-empty numeric vector


> tcplot(portpirie, tlim=c(3.6, 4.2)) ## explicitly specifying the
threshold limits
Error in fpot(data, u[1], model = model, cmax = cmax, r = r, ulow =
ulow,  :
 `x' must be a non-empty numeric vector

tcplot(portpirie$SeaLevel, tlim) ## Resolves Issue

gpd.fitrange(portpirie$SeaLevel, 3.6, 4.2) ## An alternative, still
naming the SeaLevel vector

Please advise.  Thanks!

Mike

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] tcplot documentation in evd package

2011-12-12 Thread Mike Harwood
Hello, and please advise regarding any errors/omissions on my part.
However, the documentation for R's tcplot function in the evd package
appears to contain an error.  I am using evd version 2.2-4 in R 2.14.0
with Windows XP.

> data(portpirie)
> mrlplot(portpirie)  ## No Error

> tlim <- c(3.6, 4.2)

> tcplot(portpirie, tlim) ## Line from documentation
Error in fpot(data, u[1], model = model, cmax = cmax, r = r, ulow =
ulow,  :
  `x' must be a non-empty numeric vector


> tcplot(portpirie, tlim=c(3.6, 4.2)) ## explicitly specifying the threshold 
> limits
Error in fpot(data, u[1], model = model, cmax = cmax, r = r, ulow =
ulow,  :
  `x' must be a non-empty numeric vector

tcplot(portpirie$SeaLevel, tlim) ## Resolves Issue

gpd.fitrange(portpirie$SeaLevel, 3.6, 4.2) ## An alternative, still
naming the SeaLevel vector

Please advise.  Thanks!

Mike

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] any updates w.r.t. lapply, sapply, apply retaining classes

2011-11-03 Thread Mike Williamson
Hi Joshua,

Thank you for the input!

I agree that it is non-trivial to solve the cases you & I have posed.
 However, I would wholeheartedly support having an error spit back for any
function that does not explicitly support a class.  In this case, if I
attempt to do   sapply(x, class), and 'x' is of class "difftime", then I
should receive an error "sapply cannot function upon class 'difftime' ".
 Why do I take this stance?  There are at least 2 strong reasons:

   - Most importantly, an incorrect answer is far more dangerous than no
   answer.  E.g., if I ask "what is 3 + 3?", I would far prefer to receive "I
   don't know" than "5".  The former lets me know I need to choose another
   path, the latter mistakenly makes me think I have an answer, when I do not,
   and I continue with analyses on the assumption that answer is correct.  In
   the case of dates, this happens often.  E.g., is the numeric that is
   returned from sapply, for instance, the # of seconds since 1970-01-01, or
   the number of days since 1970-01-01.  This depends upon how 'R' internally
   attempts to fix any incongruities.
   - But also very significantly, an error will get me in the habit of
   avoiding any marginalized class types.  I keep thinking, for instance, that
   I can use the "Dates" class, since 'R' says that it supports them.  But if
   I got into the habit of converting all dates into numerics myself
   beforehand (maybe counting the number of seconds from 1970-01-01, since
   that seems a magic date), then I would be guaranteed that a function will
   either (a) cause an error (e.g., if I try a character function on it), or
   (b) function properly.  However, since I don't overtly receive errors when
   attempting to use dates (or difftimes, or factors, or whatever), I keep
   using them, instead of relying solely upon the true & trusted classes.
  - the trickiest here is really factors.  Factors are, by most
  accounts, considered a core class.  In some cases, you can only use
  factors, e.g., when you want some sort of ordinal categorical variable.
  Therefore, the fact that factors also barf similarly to other classes
  like difftime (albeit much more rarely) is especially dangerous.

There are, of course, habits that we can create to make ourselves
better programmers, and I will recognize that I can improve.  However, this
issue of functions generating "wrong" answers is such a *huge* problem with
'R', and other languages are catching up to 'R' so quickly, as far as their
capability to handle higher level math, that this issue is making 'R' a
less desirable language to use, as time progresses.  I don't mean to claim
that my opinion is the end-all-be-all, but I would like to hear others
chime in, whether this is a large concern, or whether there is a very small
minority of folks impacted by it.

  Regards,
 Mike

---
XKCD <http://www.xkcd.com>



On Thu, Nov 3, 2011 at 2:51 PM, Joshua Wiley  wrote:

> Hi Mike,
>
> This isn't really an answer to your question, but perhaps will serve
> to continue discussion.  I think that there are some fundamental
> issues when working special classes.  As a thought example, suppose I
> wrote a class, "posreal", which inherits from the numeric class.  It
> is only valid for positive, real numbers.  I use it in a package, but
> do not develop methods for it.  A user comes along and creates a
> vector, x that is a posreal.  Then tries: mean(x * -3).  Since I never
> bothered to write a special method for mean for my class, R falls back
> to the inherited numeric, but gives a value that is clearly not valid
> for posreal.  What should happen?  S3 methods do not really have
> validation, so in principle, one could write a function like:
>
> f <- function(x) {
>  vclass <- class(x)
>  res <- mean(x)
>  class(res) <- vclass
>  return(res)
> }
>
> which "retains" the appropriate class, but in name only.  R core
> cannot possibly know or imagine all classes that may be written that
> inherit from more basic types but with possible special aspects and
> requirements.  I think the inherited is considered to be more generic
> and that is returned.  It is usually up to the user to ensure that the
> function (whose methods were not specific to that special class but
> the inherited) is valid for that class and can manually convert it
> back:
>
> res <- as.posreal(res)
>
> What about lapply and sapply?  Neither are generic or have methods for
> difftime, and so do some unexpected/desirable things.  Again, without
> methods defined for a particular c

[R] any updates w.r.t. lapply, sapply, apply retaining classes

2011-11-03 Thread Mike Williamson
Hi All,

I don't have a "I need help" question, so much as a query into any
update whether 'R' has made any progress with some of the core functions
retaining classes.  As an example, because it's one of the cases that most
egregiously impacts me & my work and keeps pushing me away from 'R' and
into other numerical languages (such as NumPy in python), I will use sapply
/ lapply to demonstrate, but this behavior is ubiquitous throughout 'R'.

Let's say I have a class which is theoretically supported, but not one
of the core "numeric" or "character" classes (and, to some degree, "factor"
classes).  Many of the basic functions will convert my desired class into
either numeric or character, so that my returned answer is gibberish.

E.g.:

test= as.difftime(c(1, 1, 8, 0.25, 8, 1.25), units= "days")  ## create a
small array of time differences
class(test)  ## this will return the proper class, "difftime"
class(test[1] ) ## this will also return the proper class, "difftime"
sapply(test, class)  ## this will return *numerics* for all of the classes.
 Ack!!
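
(A sketch of the usual workaround: index with "[", which dispatches to the 
difftime method, instead of letting the *apply family unwrap elements with 
"[[":)

sapply(test, class)                                  # "numeric" for each element
sapply(seq_along(test), function(i) class(test[i]))  # "difftime" for each element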

In the example I give above, the impact might seem small, but the
implications are *huge*.  This means that I am, in effect, not allowed to
use *any* of the vectoring functions in 'R', which avoid performing loops
thereby speeding up process time extraordinarily.  Many can sympathize that
'R' is ridiculously slow with "for" loops, compared to other languages.
 But that's theoretically OK, a good statistician or data analyst should be
able to work comfortably with matrices and vectors.  However, *'R' cannot
work comfortably* with matrices or vectors, *unless* they are using the
numeric or character classes.  Many of the classes suffer the problem I
just described, although I only used "difftime" in the example.  Factors
seem a bit more "comfortable", and can be handled most of the time, but not
as well as numerics, and at times functions working on factors can return
the numerical representation of the factor instead of the original factor.

Is there any progress in guaranteeing that all core functions either
(a) ideally return exactly the classes, and hierarchy of classes, that they
received (e.g., a list of data frames with difftimes & dates & characters
would return a list of data frames with difftimes & dates & characters), or
(b) barring that, the function should at least error out with a clear error
explaining that sapply, for example, cannot vectorize on the class being
used?  Returning incorrect answers is far worse than returning an error,
from a perspective of stability.

This is, by far, the largest Achilles' heel to 'R'.  Personally, as my
career advances and I work on more technical things, I am finding that I
have to leave 'R' by the wayside and use other languages for robust
numerical calculations and programming.  This saddens me, because there are
so many wonderful packages developed by the community.  The example above
came up because I am using the "forecast" library to great effect in
predicting how long our product cycle time will be.  However, I spend much
of my time fighting all these class & typing bugs in 'R' (and we have to
start recognizing that they are bugs, otherwise they may never get
resolved), such that many of the improvements in my productivity due to all
the wonderful computational packages are entirely offset by the time
I spend fighting this issue of poor classes.

 Thanks & Regards!
  Mike

---
XKCD <http://www.xkcd.com>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R shell line width

2011-10-06 Thread Mike P
Thank you very much for your responses! This is exactly what I needed.

On Fri, Sep 16, 2011 at 8:13 PM, Joshua Wiley  wrote:
> Hi Mike,
>
> Look at ?options  particularly something like:
>
> options(width = 120)
>
> 80 is the default, I believe.  On 1920 pixels I can comfortably get
> around 220 (depending on the overhead of the program, full screen,
> etc.).  I imagine it would also be possible to run into limitations
> from the terminal R is running in, though I do not know that for a
> fact.
>
> Cheers,
>
> Josh
>
> On Fri, Sep 16, 2011 at 12:48 PM, Mike P  wrote:
>> Hi,
>>
>> I want to apologize in advance if this has already been asked. I
>> wasn't able to find any information, either on google or from local
>> list search.
>>
>> I'm running an R shell from a linux command line, in an xterm window.
>> Whenever I print a data frame, only the first couple of columns are
>> printed side-by-side, the others are being repositioned below them. It
>> seems something is limiting the line width of the output, even though
>> there is enough horizontal space to fit each row on a single line.
>>
>> For example, this command:
>>
>>> data.frame(matrix(1:30,nrow=1))
>>
>> prints columns 1-21 on the first line, and the rest 22-30 on the second.
>>
>> Is there a way I can configure R to increase the width of my output?
>>
>> Thanks.
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>
>
> --
> Joshua Wiley
> Ph.D. Student, Health Psychology
> Programmer Analyst II, ATS Statistical Consulting Group
> University of California, Los Angeles
> https://joshuawiley.com/
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] any way to convert back to DateTime class when "accidental" conversion to numeric?

2011-10-06 Thread Mike Williamson
Hi Dr. Ripley, All,

Thanks for the succinct advice!  Perfectly what I needed!

Jeff,

Absolutely I agree that this is a dangerous path, and I would never
consider doing it for something that needs to be robust.  But in 'R' type
casting is a bit messed up, so I've come to accept that sometimes something
I called a string might become a factor, a date might become a numeric, etc.  Then
I stick in enough catches throughout my functions to deal with it, worst
case.  It is ugly, but I cannot figure out a way to have tight types and
still use 'R'.  Yet 'R' has so many cool functions (and more added every
day); I'd be silly to throw the baby out with the bathwater.
So my typical best-case or robust solution is to write a parent script
in python (I simply know python best) that handles all data typing, etc.,
then call 'R' once I know that everything is clean.  In the particular case
I was asking about above, this is really exploratory work.  Once I get a
solution, I will likely handle typing outside of 'R'.

Thanks for the advice!

  Regards,
 Mike


---
XKCD <http://www.xkcd.com>



On Wed, Oct 5, 2011 at 11:41 PM, Prof Brian Ripley wrote:

> A more portable way (that function only works in some versions of R) is
>
> as.POSIXct(1317857320, origin="1970-01-01")
>
> possibly with a 'tz' argument if you need to restore the timezone.
>
>
> On Wed, 5 Oct 2011, jim holtman wrote:
>
>  Here is what I use:
>>
>> unix2POSIXct(1317857320)
>> [1] "2011-10-05 19:28:40 EDT"
>>
>>
>> unix2POSIXct  <-  function (time) structure(time, class = c("POSIXt",
>> "POSIXct"))
>>
>>
>> On Wed, Oct 5, 2011 at 7:38 PM, Mike Williamson 
>> wrote:
>>
>>> Hi,
>>>
>>>In short, I would like to know if there is any way to convert a
>>> numeric
>>> into a date, similar to how strptime() can convert a string to a date
>>> time
>>> class?
>>>
>>>There are some functions, etc. which don't work well with dates, and
>>> tend to force them into numerics.  I understand that the number it spits
>>> back is the number of seconds since the beginning of 1970 (see the first
>>> few
>>> sentences of the "Details" portion of ?DateTimeClasses).
>>>However, it's a bit of a hassle to convert that by hand.  I can create
>>> a
>>> function to do this, and it isn't so hard, but I found it hard to believe
>>> such a function didn't already exist, so I wanted to ask the community.
>>>
>>>As an example, today (Oct 5th 2011 at approximately 4:30pm, Pacific
>>> time) is approximately 1317857320 as a numeric, but I would like to know
>>> how
>>> to go from that number back to the "2011-10-05 16:28:39 PDT" date time
>>> class
>>> which originally generated it.
>>>
>>>Thanks!
>>>  Mike
>>>
>>> ---
>>> XKCD <http://www.xkcd.com>
>>>
>>>[[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>>
>>
>>
>> --
>> Jim Holtman
>> Data Munger Guru
>>
>> What is the problem that you are trying to solve?
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
> --
> Brian D. Ripley,  rip...@stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] any way to convert back to DateTime class when "accidental" conversion to numeric?

2011-10-05 Thread Mike Williamson
Hi,

In short, I would like to know if there is any way to convert a numeric
into a date, similar to how strptime() can convert a string to a date time
class?

There are some functions, etc. which don't work well with dates, and
tend to force them into numerics.  I understand that the number it spits
back is the number of seconds since the beginning of 1970 (see the first few
sentences of the "Details" portion of ?DateTimeClasses).
However, it's a bit of a hassle to convert that by hand.  I can create a
function to do this, and it isn't so hard, but I found it hard to believe
such a function didn't already exist, so I wanted to ask the community.

As an example, today (Oct 5th 2011 at approximately 4:30pm, Pacific
time) is approximately 1317857320 as a numeric, but I would like to know how
to go from that number back to the "2011-10-05 16:28:39 PDT" date time class
which originally generated it.
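
A minimal round trip based on the as.POSIXct() advice in the reply above
(origin is the documented argument for going back from the numeric):

now <- Sys.time()
num <- as.numeric(now)                  # seconds since 1970-01-01 (UTC)
as.POSIXct(num, origin = "1970-01-01")  # recovers the date-time, in the local timezone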

    Thanks!
  Mike

---
XKCD <http://www.xkcd.com>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] read .csv from web from password protected site

2011-10-03 Thread Mike Pfeiff
Yes, I meant to reply to all (sorry still new at asking for help)

1. No, I am not able to open the file when I insert "userID:password@" between
"http://" and "www"


http://userid:passw...@www.frontierweather.com/degreedays/L15N15PowerRegionAverages_10weeks.txt
(replacing userid and password with my actual information that is known
to work)

2. Yes, the webpage where the data is stored does require me to enter my
userid and password, and gives me the option to remember the password for
future use, which I have selected.  Subsequent visits to the website do not
require me to re-enter the info.



-Original Message-
From: Sarah Goslee [mailto:sarah.gos...@gmail.com] 
Sent: Monday, October 03, 2011 3:07 PM
To: Mike Pfeiff; r-help
Subject: Re: [R] read .csv from web from password protected site

Hi,

I've assumed that you meant to send this to the R-help list, and not just me.

On Mon, Oct 3, 2011 at 3:54 PM, Mike Pfeiff  wrote:
> Sarah,  Thanks for the suggestion.  Although, 
> read.table("http://userid:passw...@my.url/file.csv";) did not work as it 
> returned the following:
>
>        Error in file(file, "rt") : cannot open the connection
>        In addition: Warning message:
>        In file(file, "rt") : unable to resolve 'userid'
>
> (where 'usesid is my actual userid)
>
> I've tried the following RCurl commands...
>
>        myURL = "http://www.frontierweather.com/degreedays/L15N15PowerRegionAverages_10weeks.txt"
>        h=getURL(myURL, userpw = "userid:passwod", followlocation=TRUE)
>        test=read.table(h,header=TRUE,sep=",")
>
> ..and I can't get the data to read and get the following errors:
>
>        Error in file(file, "rt") : cannot open the connection
>        In addition: Warning message:
>        In file(file, "rt") : unable to resolve 'userid'
>
> I'm at a total loss.  Any assistance anyone could provide would be greatly 
> appreciated.

You can get to the file using your web browser and that userid/password combo, 
right?

Do you have to go through a dialog box? Press a button to login? Any of those 
have the potential to complicate the task.

If so, you'll need to work through the Forms section at 
http://www.omegahat.org/RCurl/philosophy.html

Sarah

>
>
> -Original Message-
> From: Sarah Goslee [mailto:sarah.gos...@gmail.com]
> Sent: Monday, October 03, 2011 2:26 PM
> To: Mike Pfeiff
> Cc: r-help@r-project.org
> Subject: Re: [R] read .csv from web from password protected site
>
> Hi Mike,
>
> On Mon, Oct 3, 2011 at 12:31 PM, Mike Pfeiff  wrote:
>> I am very new to R and have been struggling trying to read a basic ".csv" 
>> file from a password protected site with the following code:
>>
>> myURL = "http://www.frontierweather.com/degreedays/L15N15PowerRegionAverages_10weeks.txt"
>> test2=read.table(url(myURL),header=TRUE,sep=",")
>
>
>> A 'data.frame' is returned into the workspace; however, it is not the data
>> contained in the ".csv" file.  I think this occurs because the website
>> where I am trying to retrieve the data is password protected.
>>
>> Is there a way to specify the username and password?
>
>
> I'd try first
> read.table("http://userid:passw...@my.url/file.csv";), which is the standard 
> way to do it (hint: try that form in your web browser and see whether you can 
> access the data), and if that doesn't work look into the RCurl package. The 
> list archives have a fair bit of information on this topic.
>
> Sarah
>
>
>> Any guidance would be greatly appreciated.
>>
>> Sincerely,
>>
>> Mike
>>
>

--
Sarah Goslee
http://www.functionaldiversity.org
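
For reference, a hedged sketch of the RCurl route discussed above, assuming
plain HTTP basic authentication (note the curl option name is userpwd, not
userpw, and the downloaded text needs a textConnection() before read.table()
can parse it):

library(RCurl)
myURL <- "http://www.frontierweather.com/degreedays/L15N15PowerRegionAverages_10weeks.txt"
h <- getURL(myURL, userpwd = "userid:password", followlocation = TRUE)
test <- read.table(textConnection(h), header = TRUE, sep = ",")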

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] read .csv from web from password protected site

2011-10-03 Thread Mike Pfeiff
I am very new to R and have been struggling trying to read a basic ".csv" file 
from a password protected site with the following code:

myURL <- "http://www.frontierweather.com/degreedays/L15N15PowerRegionAverages_10weeks.txt"
test2 <- read.table(url(myURL), header = TRUE, sep = ",")

A 'data.frame' is returned into the workspace; however, it is not the data
contained in the ".csv" file.  I think this occurs because the website where I
am trying to retrieve the data is password protected.

Is there a way to specify the username and password?

Any guidance would be greatly appreciated.

Sincerely,

Mike

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] need help with contourplot figure

2011-09-30 Thread Mike Gibson

I can't figure out how to add tick marks on both my X and Y axes.  For example,
my X axis ranges from 0 to 1 and there are both a tick mark and a number label
at the X-axis values of 0.2, 0.4, 0.6, and 0.8.  I want to add tick marks to the
figure at every 0.1 value.  This will help a viewer determine the values on the
x axis.
 
 
I included all of my code.  I apologize that it is very long.  I am creating a
contourplot figure that will be saved as a jpg.  I also included my notes after
the # sign so you can see what I am doing.  Any help would be greatly appreciated.
 
 
 
jpeg(file="C:/Documents and Settings/Michael/My 
Documents/Mike/amberjack/Reefs_Model/YPRlevel.jpg", width=8,height=8, 
unit="in", res=300) #location of file and size
x<-contourplot(YPR~F*Length, data=yprplot2, 
at=c(2.0,3.0,4.0,5.0,5.5,6.0,6.25,6.5,6.75,7.0,7.12,7.25,7.35,7.45,7.5), 
ylim=c(25,40), xlim=c(0,1),xlab = "Fishing Mortality Rate", ylab = "Minimum 
Size (inches)",
panel=function(...){  #This step adds the point for the current YPR value, 
... means to leave the function open to bring in other functions
panel.levelplot(...)
grid.points(0.609,30,pch=8)   #pch is the point character where 19 is a 
closed circle
grid.points(0.333,30,pch=8)   #pch is the point character where 19 is a 
closed circle
grid.points(eumetric$F, eumetric$Length,pch=18,gp=gpar(col="black", 
cex=.9))})  #add the eumetric line and make them points
#now add the text for the current ypr location
print(x)  #this brings up the figure I already made
grid.text('Fcurrent',0.62,0.35,gp=gpar(col="black", cex=1)) #0.42 and 0.40 is 
the location of the text on the figure
grid.text('Fmsy',0.38,0.35,gp=gpar(col="black", cex=1)) #0.42 and 0.40 is the 
location of the text on the figure
dev.off()#it won't send the pdf until this is added.  It turns off the pdf 
function   
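
On the tick-mark question itself, a hedged sketch (not from the thread):
lattice figures take tick positions through the scales argument, so adding
scales = list(x = list(at = seq(0, 1, by = 0.1))) to the contourplot() call
above should place a tick at every 0.1 on the x axis. A self-contained toy
demo:

library(lattice)
g <- expand.grid(F = seq(0, 1, 0.05), Length = seq(25, 40, 1))
g$YPR <- with(g, F * Length / 6)   # toy surface, just to demonstrate the axis
contourplot(YPR ~ F * Length, data = g,
            scales = list(x = list(at = seq(0, 1, by = 0.1))))
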
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] survexp with large dataframes

2011-09-28 Thread Mike Harwood
Hello, and thank you in advance.

I would like to capture the expected survival from a coxph model for
subjects in an observational study with recurrent events, but the
survexp statement is failing due to memory.  I am using R version
2.13.1 (2011-07-08) on Windows XP.

My objective is to plot the fitted survival with the Kaplan-Meier
plot.  Below is the code with output and [unfortunately] errors. Is
there something wrong in my use of cluster in generating the
proportional hazards model, or is there some syntax to pass it into
survexp?

Mike

> dim(dev)
[1] 899876 25

> mod1 <- coxph(Surv(begin.cp, end.cp, event)
+ ~ age.sex
+   + plan_type
+   + uw_load
+   + cluster(mbr_key)
+   ,data=dev
+   )
>
> summary(mod1)
Call:
coxph(formula = Surv(begin.cp, end.cp, event) ~ age.sex + plan_type +
uw_load + cluster(mbr_key), data = dev)

  n= 899876, number of events= 753324

                         coef exp(coef)  se(coef) robust se       z Pr(>|z|)
age.sex19-34_MALE   -0.821944  0.439576  0.005529  0.023298 -35.280  < 2e-16 ***
age.sex35-49_FEMALE  0.058776  1.060537  0.004201  0.018477   3.181  0.00147 **
age.sex35-49_MALE   -0.515590  0.597148  0.004634  0.019986 -25.798  < 2e-16 ***
age.sex50-64_FEMALE  0.190940  1.210386  0.004350  0.020415   9.353  < 2e-16 ***
age.sex50-64_MALE   -0.127514  0.880281  0.004487  0.021431  -5.950 2.68e-09 ***
age.sexCHILD_CHILD  -0.327522  0.720707  0.004238  0.017066 -19.192  < 2e-16 ***
plan_typeLOW        -0.165735  0.847270  0.002443  0.011080 -14.958  < 2e-16 ***
uw_load1-50          0.215122  1.240014  0.006437  0.029189   7.370 1.71e-13 ***
uw_load101-250       0.551042  1.735060  0.003993  0.018779  29.344  < 2e-16 ***
uw_load251+          0.981660  2.668884  0.003172  0.017490  56.126  < 2e-16 ***
uw_load51-100        0.413464  1.512046  0.006216  0.027877  14.832  < 2e-16 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

                    exp(coef) exp(-coef) lower .95 upper .95
age.sex19-34_MALE      0.4396     2.2749    0.4200    0.4601
age.sex35-49_FEMALE    1.0605     0.9429    1.0228    1.0996
age.sex35-49_MALE      0.5971     1.6746    0.5742    0.6210
age.sex50-64_FEMALE    1.2104     0.8262    1.1629    1.2598
age.sex50-64_MALE      0.8803     1.1360    0.8441    0.9180
age.sexCHILD_CHILD     0.7207     1.3875    0.6970    0.7452
plan_typeLOW           0.8473     1.1803    0.8291    0.8659
uw_load1-50            1.2400     0.8064    1.1711    1.3130
uw_load101-250         1.7351     0.5763    1.6724    1.8001
uw_load251+            2.6689     0.3747    2.5789    2.7620
uw_load51-100          1.5120     0.6614    1.4316    1.5970

Concordance= 0.643  (se = 0 )
Rsquare= 0.205   (max possible= 1 )
Likelihood ratio test= 206724  on 11 df,   p=0
Wald test= 9207  on 11 df,   p=0
Score (logrank) test = 246358  on 11 df,   p=0,   Robust = 4574  p=0

  (Note: the likelihood ratio and score tests assume independence of
 observations within a cluster, the Wald and robust score tests do
not).

> dev.fit <- survexp( ~ 1, ratetable=mod1, data=dev)
Error in survexp.cfit(cbind(as.numeric(X), R), Y, conditional,
FALSE,  :
  cannot allocate memory block of size 15.2 Gb

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R shell line width

2011-09-16 Thread Mike P
Hi,

I want to apologize in advance if this has already been asked. I
wasn't able to find any information, either on google or from local
list search.

I'm running an R shell from a linux command line, in an xterm window.
Whenever I print a data frame, only the first couple of columns are
printed side-by-side, the others are being repositioned below them. It
seems something is limiting the line width of the output, even though
there is enough horizontal space to fit each row on a single line.

For example, this command:

> data.frame(matrix(1:30,nrow=1))

prints columns 1-21 on the first line, and the rest 22-30 on the second.

Is there a way I can configure R to increase the width of my output?

Thanks.
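
A hedged sketch of the fix given in the reply elsewhere in this digest
(options(width) is the documented knob; the COLUMNS lookup is an extra
convenience and assumes a shell that exports that variable):

options(width = 120)   # widen printed output
# or track the terminal automatically, if COLUMNS is exported:
cols <- Sys.getenv("COLUMNS")
if (nzchar(cols)) options(width = as.integer(cols))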

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Problem looping a function to create/add to new dataframe

2011-09-15 Thread Mike Treglia

Dear List Members,

I have created a function to run a simulation based on a given set of 
values within a vector, such that I run the function like this:


new.data<-sapply(vector, function)

In which I run 'function' on every value within a vector that I created. 
The result is a matrix of 1000 rows, and as many columns as the length 
of the vector.


I modified the function to utilize two given values, such that it is: 
function(x,y), and I want to run this function for not only a range of 
values of 'x', but a range of values of 'y' as well, such that for each 
value of 'y', I'd create another 1000 rows in the matrix. I was trying 
to loop this, but in doing so, it looks like I just keep creating 
datasets and replacing them with datasets for subsequent 'y' values, 
rather than adding new rows. The structure of my current loop is below. 
Any suggestions on what to change to accomplish what I want? Would it be 
good to create a dataframe first and then somehow add these new rows to 
the dataframe? Also, is it appropriate to have the 'i' in the sapply 
statement?


for (i in c(seq(10,100,10))){
new.data<-sapply(vector, function, i)
}
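
A hedged sketch of one common fix ('vec' and 'FUN' are placeholders for the
poster's vector and function; collect the per-y blocks in a list and bind
them once at the end instead of overwriting new.data on each pass):

pieces   <- lapply(seq(10, 100, 10),
                   function(y) sapply(vec, FUN, y))  # one 1000-row block per y
new.data <- do.call(rbind, pieces)                   # stack the blocks by row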

Please let me know if more detail on my code would be helpful- I was 
just trying to keep it simple and focus on what I saw as the problem at 
hand for now.

Thank you for your help.
Sincerely,
Mike Treglia

--
Michael Treglia
Applied Biodiversity Sciences NSF-IGERT Program
Wildlife and Fisheries Sciences
Texas A&M University
Lab: (979)862-7245
ml...@tamu.edu
http://people.tamu.edu/~mlt35

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] storage and single-precision

2011-09-08 Thread Mike Miller

On Thu, 8 Sep 2011, William Dunlap wrote:

Use gzcon() to make a compressed connection and any function that write 
to a connection will write compressed data.  E.g.,


 > con <- gzcon(file("tempfile.junk", "wb"))
 > x <- as.integer(rep(c(-127, 1, 127), c(3,2,1)))
 > writeBin(x, con, size=1)
 > close(con)
 > q("no")
 bill:158% zcat tempfile.junk | od --format d1
 0000000 -127 -127 -127    1    1  127
 0000006

(In this tiny example the gzip'ed file is bigger than the equivalent 
uncompressed one, but it is gzip'ed.)



That's a great function.  Thanks for the tip.  Apparently it works in both 
directions.  That is, I would also use gzcon for readBin:


Description:

 ‘gzcon’ provides a modified connection that wraps an existing
 connection, and decompresses reads or compresses writes through
 that connection.  Standard ‘gzip’ headers are assumed.

Perfect.  Thanks again.
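
Putting the two together, a hedged round-trip sketch for the 0.000-2.000 data
(file name hypothetical; values are scaled to integers 0-2000 and written as
2-byte integers through a gzip'ed connection):

vals <- c(0.005, 1.234, 2.000)                           # data in [0, 2], 3 decimals
con <- gzcon(file("phenos.bin.gz", "wb"))
writeBin(as.integer(round(vals * 1000)), con, size = 2)  # 0..2000 fits in 16 bits
close(con)

con <- gzcon(file("phenos.bin.gz", "rb"))
back <- readBin(con, integer(), n = length(vals), size = 2) / 1000
close(con)
all.equal(vals, back)                                    # TRUE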

Mike

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] storage and single-precision

2011-09-08 Thread Mike Miller

On Thu, 8 Sep 2011, Duncan Murdoch wrote:


On 11-09-07 6:25 PM, Mike Miller wrote:

I'm getting the impression from on-line docs that R cannot work with 
single-precision floating-point numbers, but that it has a pseudo-mode 
for single precision for communication with external programs.


I don't mind that R is using doubles internally, but what about 
storage? If all I need to store is single-precision (32-bit), can I do 
that?  When it is read back into R it can be converted from single to 
double (back to 64-bit).


Furthermore, the data are numbers from 0.000 to 2.000 with no missing 
values that could be stored just as accurately as unsigned 16-bit 
integers from 0 to 2000.  That would be the best plan for me.



writeBin is quite flexible in converting between formats if you just 
want to store them on disk.  To use nonstandard formats in memory will 
require external support; it's not easy.



Thanks.  I can see now that writeBin will store unsigned 16-bit integers, 
which is what I want.  There is one other issue -- with save() I'm allowed 
to use compression (e.g., gzip), but there doesn't seem to be a 
compression option in writeBin.  Is there a way to get the best of both 
worlds?  The data are highly nonrandom and at most 11 bits will be used 
per integer, so the compression ratio should be pretty good, if I can have 
one.


Mike

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] storage and single-precision

2011-09-07 Thread Mike Miller
I'm getting the impression from on-line docs that R cannot work with 
single-precision floating-point numbers, but that it has a pseudo-mode for 
single precision for communication with external programs.


I don't mind that R is using doubles internally, but what about storage? 
If all I need to store is single-precision (32-bit), can I do that?  When 
it is read back into R it can be converted from single to double (back to 
64-bit).


Furthermore, the data are numbers from 0.000 to 2.000 with no missing 
values that could be stored just as accurately as unsigned 16-bit integers 
from 0 to 2000.  That would be the best plan for me.


It looks like the ff package allows for additional formats, so I might try 
to use ff, but I still would like to get a better understanding of R's 
native capabilities in regard to representations of numbers both in RAM 
and in stored data files.


Thanks in advance.

Best,
Mike

--
Michael B. Miller, Ph.D.
Minnesota Center for Twin and Family Research
Department of Psychology
University of Minnesota

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Convert CSV file to FASTA

2011-08-31 Thread Mike Marchywka

> Date: Wed, 31 Aug 2011 01:36:51 -0700
> From: oliviacree...@gmail.com
> To: r-help@r-project.org
> Subject: [R] Convert CSV file to FASTA
>
> Hi there,
>
> I have large excel files which I can save as CSV files.
>
> Each excel file contains two columns. One contains the chromosome number and
> the second contains a DNA sequence.
>
> I need to convert this into a fasta file that looks like this
> >chromosomenumber
> CGTCGAGCGTCGAGCGGAGCG
>
> Can anyone show me an R script to do this?
>

If you can post a few lines of your "csv" someone can probably give you a bash
script to do it. It may be possible in R but sed/awk probably work better.
IIRC, fasta is just a name line followed by the sequence. If your csv looks
like "name, XX" it may be possible to change the comma to a space and
use awk with something like
print ">"$1"\n"$2
etc.
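
A minimal R sketch along the same lines (file and column names hypothetical;
assumes a headerless two-column CSV of chromosome number and sequence):

seqs <- read.csv("chromosomes.csv", header = FALSE,
                 col.names = c("chrom", "sequence"),
                 stringsAsFactors = FALSE)
# FASTA is a ">name" line followed by the sequence
writeLines(paste0(">", seqs$chrom, "\n", seqs$sequence), "chromosomes.fasta")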



> Many thanks
>
> x
>
> --
> View this message in context: 
> http://r.789695.n4.nabble.com/Convert-CSV-file-to-FASTA-tp3780498p3780498.html
> Sent from the R help mailing list archive at Nabble.com.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
  
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] questions about "metafor" package

2011-08-22 Thread Mike Cheung
Hi, Emilie.

For your second question. You may check Gleser and Olkin (2009). They gave
several formulas to estimate the sampling covariance for dependent effect
sizes. One of them can be applied in your case.

Gleser, L. J., & Olkin, I. (2009). Stochastically dependent effect sizes. In
H. Cooper, L. V. Hedges, & J. C. Valentine (Eds.), The handbook of research
synthesis and meta-analysis. (2nd ed., pp. 357-376). New York: Russell Sage
Foundation.

Regards,
Mike
-- 
-----
 Mike W.L. Cheung   Phone: (65) 6516-3702
 Department of Psychology   Fax:   (65) 6773-1843
 National University of Singapore
 http://courses.nus.edu.sg/course/psycwlm/internet/
-

On Sat, Aug 20, 2011 at 10:19 PM, Michael Dewey wrote:

> At 16:21 17/08/2011, Emilie MAILLARD wrote:
>
>> Hello,
>>
>> I would like to do a meta-analysis with the package "metafor".
>> Ideally I would like to use a mixed model because I'm interested to see
>> the effect of some moderators. But the data set I managed to collect from
>> the literature presents two limits.
>>
>> - Firstly, for each observation, I have means for a treatment and for a
>> control, but I don't always have corresponding standard deviations (52 of
>> a total of 93 observations don't have standard deviations). Nevertheless
>> I have the sample sizes for all observations, so I wonder if it is
>> possible to weight observations by sample size in the package "metafor".
>> - Secondly, some observations are probably not independent, as I
>> sometimes have several relevant observations for the same design. More
>> precisely, for these cases, the control mean is identical but treatment
>> means vary. Ideally, I would not like to do a weighted average for these
>> non-independent observations, because these observations represent levels
>> of a moderator. I know that the package "metafor" is not designed for the
>> analysis of correlated outcomes. What are the dangers of using the
>> package even if the observations are not really independent?
>>
>
> Emilie,
> I am not sure whether this is the answer to your problem of observations
> which are not independent but you might also look at the metaSEM package
> http://courses.nus.edu.sg/course/psycwlm/internet/metaSEM/
> I am still trying to understand his paper on this (see link for reference)
> but he is trying to embed meta-analysis within the structural equation
> framework and it may be possible to cope with lack of independence in that
> way. But as I say I am still trying to come to grips with the paper.
>
>
>
>>
>> Thank you for your help,
>>
>> Émilie.
>>    [[alternative HTML version deleted]]
>>
>
> Michael Dewey
> i...@aghmed.fsnet.co.uk
> http://www.aghmed.fsnet.co.uk/home.html
>
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] write.table extra column

2011-08-15 Thread Mike Hunter
Thanks!

On Mon, Aug 15, 2011 at 12:55 AM, Jeff Newmiller
 wrote:
> Yup.
>
> Read ?write.csv and note the row.names argument.
> ---
> Jeff Newmiller The . . Go Live...
> DCN: Basics: ##.#. ##.#. Live Go...
> Live: OO#.. Dead: OO#.. Playing
> Research Engineer (Solar/Batteries O.O#. #.O#. with
> /Software/Embedded Controllers) .OO#. .OO#. rocks...1k
> ---
> Sent from my phone. Please excuse my brevity.
>
> Mike Hunter  wrote:
>>
>> In the following data.frame there are 6 columns, but 7 are written to
>> the CSV file.
>>
>> install.packages("pmlr")
>> library(pmlr)
>> data(enzymes)
>> write.table(enzymes, sep=",", eol="\n",file="albert.csv")
>>
>> 
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] write.table extra column

2011-08-15 Thread Mike Hunter
In the following data.frame there are 6 columns, but 7 are written to
the CSV file.

install.packages("pmlr")
library(pmlr)
data(enzymes)
write.table(enzymes, sep=",", eol="\n",file="albert.csv")

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] S4 classes, some help with the basics

2011-08-09 Thread Mike Williamson
Thanks Duncan, Martin,

You both provided exactly what I needed!

   Regards,
  Mike



---
XKCD <http://www.xkcd.com>



On Mon, Aug 8, 2011 at 5:21 PM, Duncan Murdoch wrote:

> On 08/08/2011 8:04 PM, Mike Williamson wrote:
>
>> Hi All,
>>
>> I have tried to find an answer within documentation, but I cannot:
>>
>> o How can I call a class "slot" without knowing the name a priori?
>>
>
> See ?slot.
>
>
>
>> E.g., let's say I use the "pcaMethods" library to create a "pcaRes"
>> object.  How can I call parts of that object without using the specific
>> names of the object in the call?
>>
>> example code:
>>
>> library(pcaMethods)
>> myMatrix<- matrix(runif(1e4), ncol=100)  ## silly, but sufficient for
>> example
>> myPCA<- pca(myMatrix)  ## creates a "pcaRes" S4 class
>>
>> for (i in slotNames(myPCA) ) {
>>  summary(myPCA@i)  ### I know this doesn't work, but this is the
>> question... what grammar could I use?
>>
>
> summary(slot(myPCA, i))
>
> Duncan Murdoch
>
>  }
>>
>> 
>>
>> I would like to be able to print out the summary for each of the
>> components
>> of the class "pcaRes" without knowing a priori the names of those
>> components.  I could, for example in the above case, type in
>> summary(myPCA@completeObs) to get a summary of the input matrix.  But I
>> HAVE
>> TO TYPE "@completeObs".  In the non-S4 world, I could type myPCA[[i]] for
>> lists, where "i" could be looping through either the list names, or the
>> list
>> indices.  Similarly, I could type myPCA[i] for arrays, where "i" again can
>> be either a numeric index or the name.
>>
>> Without this ability to identify portions within an array / loop context,
>> it
>> becomes exceedingly difficult to work in "S4 land".  How is this sort of
>> thing done?
>>
>> Thank you!
>>  Mike
>>
>>
>>
>> ---
>> XKCD<http://www.xkcd.com>
>>
>>[[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.
>>
>
>

[[alternative HTML version deleted]]
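
Pulling Duncan's answer back into the loop from the question, a minimal
sketch (slot() and slotNames() are the programmatic forms of "@"):

for (nm in slotNames(myPCA)) {
  cat("==", nm, "==\n")
  print(summary(slot(myPCA, nm)))  # equivalent to summary(myPCA@<nm>)
}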

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] S4 classes, some help with the basics

2011-08-08 Thread Mike Williamson
Hi All,

I have tried to find an answer within documentation, but I cannot:

o How can I call a class "slot" without knowing the name a priori?

E.g., let's say I use the "pcaMethods" library to create a "pcaRes"
object.  How can I call parts of that object without using the specific
names of the object in the call?

example code:

library(pcaMethods)
myMatrix <- matrix(runif(1e4), ncol=100)  ## silly, but sufficient for
example
myPCA <- pca(myMatrix)  ## creates a "pcaRes" S4 class

for (i in slotNames(myPCA) ) {
 summary(myPCA@i)  ### I know this doesn't work, but this is the
question... what grammar could I use?
}



I would like to be able to print out the summary for each of the components
of the class "pcaRes" without knowing a priori the names of those
components.  I could, for example in the above case, type in
summary(myPCA@completeObs) to get a summary of the input matrix.  But I HAVE
TO TYPE "@completeObs".  In the non-S4 world, I could type myPCA[[i]] for
lists, where "i" could be looping through either the list names, or the list
indices.  Similarly, I could type myPCA[i] for arrays, where "i" again can
be either a numeric index or the name.

Without this ability to identify portions within an array / loop context, it
becomes exceedingly difficult to work in "S4 land".  How is this sort of
thing done?

Thank you!
 Mike



---
XKCD <http://www.xkcd.com>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] running slope

2011-07-29 Thread Byerly, Mike M (DFG)
Is there a function in R that will calculate a running linear slope
similar to the way the function filter() will calculate a moving
average?
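
A hedged sketch of one common approach (not from this thread; assumes the zoo
package is installed):

library(zoo)
y <- cumsum(rnorm(100))   # example series
# slope of a least-squares fit over a sliding 9-point window
runslope <- rollapply(y, width = 9,
                      FUN = function(z) coef(lm(z ~ seq_along(z)))[2])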

 

Mike Byerly

Fisheries Biologist

Commercial Fisheries Division

Alaska Dept. of Fish and Game

3298 Douglas Place

Homer, AK 99603

 

mike.bye...@alaska.gov <mailto:mike.bye...@alaska.gov> 

PH: 907-235-1745

FX: 907-235-2448


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is R the right choice for simulating first passage times of random walks?

2011-07-27 Thread Mike Marchywka
Top posting cuz hotmail decided not to highlight...

Personally I would tend to use java or c++ for the inner loops,
but you could of course later make an R package out of that.
This is especially true if your code will be used elsewhere
in a performance-critical system. For example, I wrote some
c++ code for dealing with graphs; nothing fancy, but it
let me play with some data structure ideas and I could
then build it into standalone programs or perhaps an R
package. 

Many of these things slow down due to memory incoherence
or IO long before you use up the processor. With c++,
in principle anyway, you have a lot of control over these things.

Once you have your results and want to analyze them, that's
when I would use R. Dumping simulation samples to a text file
is easy and also lets you use other things like sed/grep or
vi to explore as needed. 
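
On the R side, a vectorised sketch of the quoted simulation (one walk per
column; apply(..., 2, cumsum) is the usual stand-in for the missing matrix
cumsum, and the two shifted slices compare s[i] with s[i+1] across all walks
at once):

n     <- 1e4                                       # number of simulated walks
len   <- 100                                       # length of each walk
steps <- matrix(sign(rnorm(n * len)), nrow = len)  # one walk per column
walks <- apply(steps, 2, cumsum)                   # columnwise cumsum
# for each step i, count the walks that cross from negative to >= 0
crossings <- rowSums(walks[-len, ] < 0 & walks[-1, ] >= 0)
plot(seq_len(len - 1), crossings / n)
curve(x^(-0.5), add = TRUE)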

From: paulepan...@users.sourceforge.net
To: r-help@r-project.org
Date: Thu, 28 Jul 2011 02:00:13 +0200
Subject: Re: [R] Is R the right choice for simulating first passage times of 
random walks?


Dear R folks,


Am Donnerstag, den 28.07.2011, 01:36 +0200 schrieb Paul Menzel:

> I need to simulate first passage times for iterated partial sums. The
> related papers are for example [1][2].
>
> As a start I want to simulate how long a simple random walk stays
> negative, which should show that it behaves like n^(-½). My code looks
> like this.
>
>  8<  code  >8 
> n = 10 # number of simulations
>
> length = 10 # length of iterated sum
> z = rep(0, times=length + 1)
>
> for (i in 1:n) {
>   x = c(0, sign(rnorm(length)))
>   s = cumsum(x)
>   for (i in 1:length) {
>   if (s[i] < 0 && s[i + 1] >= 0) {
>   z[i] = z[i] + 1
>   }
>   }
> }
> plot(1:length(z), z/n)
> curve(x**(-0.5), add = TRUE)
>  8<  code  >8 

Of course the program above is not complete, because it only checks for
the first passage from negativ to positive. `if (s[2] < 0) {}` should be
added before the for loop.

> This code already runs for over half an hour on my system¹.
>
> Reading about the for loop [3] it says to try to avoid loops and I
> probably should use a matrix where every row is a sample.
>
> Now my first problem is that there is no matrix equivalent for
> `cumsum()`. Can I use matrices to avoid the for loop?

I mean the inner for loop.

Additionally I wonder if `cumsum` is really faster or if I should sum
the elements by myself and check after every step if the walk gets
non-negative/0. With a length of 100 this should save some cycles.
On the other hand adding numbers should be really fast and adding checks
in between could potentially be slower.

> My second question is, is R the right choice for such simulations? It
> would be great when R can also give me a confidence interval(?) and also
> try to fit a curve through the result and give me the rule of
> correspondence(?) [4]. Do you have pointers for those?
>
> I glanced at simFrame [5] and read `?simulate` but could not understand
> it right away and think that this might be overkill.
>
> Do you have any suggestions?


Thanks,

Paul


> ¹ AMD Athlon(tm) X2 Dual Core Processor BE-2350, 2,1 GHz
>
>
> [1] http://www-stat.stanford.edu/~amir/preprints/irw.ps
> [2] http://arxiv.org/abs/0911.5456
> [3] http://cran.r-project.org/doc/manuals/R-intro.html#Repetitive-execution
> [4] https://secure.wikimedia.org/wikipedia/en/wiki/Function_(mathematics)
> [5] http://finzi.psych.upenn.edu/R/library/simFrame/html/runSimulation.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
  
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Life Cycle Assessment with R.

2011-07-26 Thread Mike Marchywka
Date: Mon, 25 Jul 2011 19:03:08 +0100
From: jbustosm...@yahoo.es
To: r-help@r-project.org
Subject: [R] Life Cycle Assessment with R.


Hello everyone,
There's something really important about climate change: many institutions
around the globe are looking for software solutions to meet their (and
everyone's) need to improve living conditions across the planet.
Currently, there are many commercial software packages working on this
important topic, known as "Life Cycle Assessment", monitoring carbon
emissions.  But as many of you may know, commercial software controlling or
managing our planet could be another big mistake.
To sum up briefly, it could be a good idea to create an R package doing Life
Cycle Assessment (if one has not been created already) in order to gain a
better understanding, and to make these important decisions about global
warming and how we as humanity can control how the Carbon Footprint is
measured, for commercial or non-commercial purposes.
Does anyone know of people working on Life Cycle Assessment (carbon
emissions) with R?  Or if there's someone interested in writing a package
about it, please let me know!
I explain this here because of the R philosophy.  Best wishes!
José Bustos
Chilean Biostatistician
 www.aespro.cl


[ my text below ]
Well, generally R packages are more general-purpose tools than specific
applications such as this - although there may be an iphone save-the-world
app LOL.
I have no idea, but usually the issue here is getting the required
data in an open form. Many govt agencies think excel is "open" and that you
would not want to do an analysis that wasn't supported in such popular
software. This comes up with financial data all the time; I even asked for
account information in csv and I got a reply, "thank you for asking about
exporting to excel", and a detailed explanation that was completely
irrelevant. Modelling of complicated systems is often, well, complicated, and
predictions about the future involve assumptions that are often made to suit
the needs of the immediate analyst. On topics which involve money or emotion,
getting unbiased analysis is impossible and getting data can be very
difficult.  This list is not designed for advocacy or even discussion of
analysis results, but the ability to get data in a form usable by R may be of
more general interest to those seeking help on R.



  
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Is there an R program that produces optimal solution/mix of multiple samples' varying volumes and values

2011-07-26 Thread Mike Marchywka
> Date: Mon, 25 Jul 2011 11:39:22 -0700
> From: lukescore...@gmail.com
> To: r-help@r-project.org
> Subject: [R] Is there an R program that produces optimal solution/mix of 
> multiple samples' varying volumes and values
>
> Sorry about the lengthy subject line.
> Anyone know of an R' program that can look at several sources' varying
> available volumes/amounts and their individual set of values, compared
> to a "target" range/curve for these values, to find the optimal
> mixture(s) of these possible sources for the desired curve, and for a
> specified amount? I hope that makes sense as a reader.

Well, whatever you are talking about you need to write some error as a function
of your fit parameters and decide if you can minimize it analytically or not.
If you can do it analytically, then there are probably matrix functions you 
want.
If not, there are several optimizers that should come up on google.

If you are trying to express some data in terms of a nice basis set, that
may be different than an unknown collection of things. For example, if you
have sin/cos curves fft may work; for a polynomial, something else, etc. Often
when people ask questions like this, they try to fit a collection of things
they think may work, and then the optimizer gets stuck since it can't
optimize a and b for a*x + b*x, etc., so make sure your error function has a
specific "best" fit. Generally coding mistakes can be addressed on the list
if you get that far.



>
> Thanks for your time.
> Luke

  
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] xml2-config issues

2011-07-23 Thread Mike Marchywka
Date: Fri, 22 Jul 2011 20:06:34 -0600
From: abmathe...@gmail.com
To: r-help@r-project.org
Subject: [R] xml2-config issues


I'm trying to install the XML package on Ubuntu 10.10, and I keep getting
a warning message that XML could not be installed and had a non-zero exit
status. How can I fix this problem?

> install.packages()
Loading Tcl/Tk interface ... done
--- Please select a CRAN mirror for use in this session ---
Installing package(s) into ‘/home/amathew/R/i686-pc-linux-gnu-library/2.13’
(as ‘lib’ is unspecified)
trying URL '
http://streaming.stat.iastate.edu/CRAN/src/contrib/XML_3.4-0.tar.gz'
Content type 'application/x-gzip' length 896195 bytes (875 Kb)
opened URL
==
downloaded 875 Kb

* installing *source* package ‘XML’ ...
checking for gcc... gcc
checking for C compiler default output file name... a.out
checking whether the C compiler works... yes
checking whether we are cross compiling... no
checking for suffix of executables...
checking for suffix of object files... o
checking whether we are using the GNU C compiler... yes
checking whether gcc accepts -g... yes
checking for gcc option to accept ISO C89... none needed
checking how to run the C preprocessor... gcc -E
No ability to remove finalizers on externalptr objects in this verison of R
checking for sed... /bin/sed
checking for pkg-config... /usr/bin/pkg-config
checking for xml2-config... no
Cannot find xml2-config
ERROR: configuration failed for package ‘XML’
* removing ‘/home/amathew/R/i686-pc-linux-gnu-library/2.13/XML’

[ my text, hotmail won't highlight original  ]

you probably need to get libxml2, and then you should be able to run
xml2-config from the command line to verify it is there. This is a specific
pkg-config ( man pkg-config for details ). For some non-R things
yum or apt-get has not gotten the most recent versions, but generally
your package manager should be just fine. 
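
On Debian/Ubuntu the usual culprit is that only the runtime library is
installed; xml2-config ships in the development package, so something like
this (assuming apt) should supply it:

sudo apt-get install libxml2-dev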




The downloaded packages are in
‘/tmp/Rtmp2V4huR/downloaded_packages’
Warning message:
In install.packages() :
  installation of package 'XML' had non-zero exit status


Thank You,
Abraham

[[alternative HTML version deleted]]


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
  
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cox model approximaions (was "comparing SAS and R survival....)

2011-07-22 Thread Mike Marchywka
> From: thern...@mayo.edu
> To: abouesl...@gmail.com
> Date: Fri, 22 Jul 2011 07:04:15 -0500
> CC: r-help@r-project.org
> Subject: Re: [R] Cox model approximaions (was "comparing SAS and R 
> survival)
>
> For time scale that are truly discrete Cox proposed the "exact partial
> likelihood". I call that the "exact" method and SAS calls it the
> "discrete" method. What we compute is precisely the same, however they
> use a clever algorithm which is faster. To make things even more
> confusing, Prentice introduced an "exact marginal likelihood" which is
> not implemented in R, but which SAS calls the "exact" method.
>
> Data is usually not truly discrete, however. More often ties are the
> result of imprecise measurement or grouping. The Efron approximation
> assumes that the data are actually continuous but we see ties because of
> this; it also introduces an approximation at one point in the
> calculation which greatly speeds up the computation; numerically the
> approximation is very good.
> In spite of the irrational love that our profession has for anything
> branded with the word "exact", I currently see no reason to ever use
> that particular computation in a Cox model. I'm not quite ready to
> remove the option from coxph, but certainly am not going to devote any
> effort toward improving that part of the code.
>
> The Breslow approximation is less accurate, but is the easiest to
> program and therefore was the only method in early Cox model programs;
> it persists as the default in many software packages because of history.
> Truth be told, unless the number of tied deaths is quite large the
> difference in results between it and the Efron approx will be trivial.
>
> The worst approximation, and the one that can sometimes give seriously
> strange results, is to artificially remove ties from the data set by
> adding a random value to each subject's time.

Care to elaborate on this at all? First, of course, I would agree that doing
anything to the data, or making up data, and then handing it to an analysis
tool that doesn't know you manipulated it can be a problem (often called
interpolation or something with a legitimate name LOL).  However, it is not
unreasonable to do a sensitivity analysis by adding noise and checking the
results.  Presumably adding noise to remove things the algorithm doesn't
happen to like would work, but you would need to take many samples and
examine stats on how you broke the ties. Now if the model is bad to begin
with, or the data is so coarsely binned that you can't get much out of it,
then ok.

I guess in this case, having not thought about it too much, ties would be
most common either with lots of data, or if hazards spiked over time scales
similar to your measurement precision, or if the measurement resolution is
not comparable to the hazard rate. In the latter 2 cases of course the
approach is probably quite limited.  Consider turning exponential curves
into step functions, for example. 

>
> Terry T
>
>
> --- begin quote --
> I didn't know precisely the specificities of each approximation method.
> I thus came back to section 3.3 of Therneau and Grambsch, Extending the
> Cox
> Model. I think I now see things more clearly. If I have understood
> correctly, both "discrete" option and "exact" functions assume "true"
> discrete event times in a model approximating the Cox model. Cox partial
> likelihood cannot be exactly maximized, or even written, when there are
> some
> ties, am I right ?
>
> In my sample, many of the ties (those within a single observation of
> the
> process) are due to the fact that continuous event times are grouped
> into
> intervals.
>
> So I think the logistic approximation may not be the best for my problem
> despite the estimate on my real data set (shown on my previous post) do
> give
[[elided Hotmail spam]]
> I was thinking about distributing the events uniformly in each interval.
> What do you think about this option ? Can I expect a better
> approximation
> than directly applying Breslow or Efron method directly with the grouped
> event data ? Finally, it becomes a model problem more than a
> computationnal
> or algorithmic one I guess.
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
  
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Different result of multiple regression in R and SPSS

2011-07-19 Thread Mike Marchywka
> From: dwinsem...@comcast.net
> To: seoulseoulse...@gmail.com
> Date: Tue, 19 Jul 2011 18:45:47 -0400
> CC: r-help@r-project.org
> Subject: Re: [R] Different result of multiple regression in R and SPSS
>
>
> On Jul 19, 2011, at 6:29 PM, J. wrote:
>
> > Thanks for the answer.
> >
> > However, I am still curious about which result I should use? The
> > result from
> > R or the one from SPSS?
>
> It is becoming apparent that you do not know how to use the results
> from either system. The progress of science would be safer if you get
> some advice from a person that knows what they are doing.
>
> > Why the results from two programs are different?
>
> Different parametrizations. If I had to guess I would bet that the
> gender coefficient is R is exactly twice that of the one from SPSS.
> They are probably both correct in the context of their respective
> codings.

I guess I would also suggest, again: run some samples with known data sets
and see what you get (RSSWKDSASWYG). You would want to do this anyway to
ensure your real data is being used reasonably. You still need some way to
check the opinion of the expert mentioned above, and known data will help
there too.  A factor of 2 often shows up just from looking at pictures once
you have some intuition. I've often been wrong on intuition, but chasing it
down and proving it helps you learn a lot :)
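
A hedged toy illustration of how a factor of 2 can come purely from coding
(assuming one program uses 0/1 dummy coding, R's default for a two-level
factor, and the other uses -1/+1 effect coding):

set.seed(1)
g01 <- rbinom(100, 1, 0.5)       # 0/1 dummy coding
y   <- 1 + 2 * g01 + rnorm(100)
coef(lm(y ~ g01))["g01"]         # about 2
gpm <- ifelse(g01 == 1, 1, -1)   # -1/+1 effect coding of the same factor
coef(lm(y ~ gpm))["gpm"]         # about 1: exactly half the dummy-coded slope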




>
> --
> David Winsemius, MD
> West Hartford, CT
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
  
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

