[R] smoothScatter problems

2009-07-27 Thread Jeroen van der Ham

Hello,

I'm having some trouble getting a good result for a smoothScatter plot.
I have some data that I want to log-plot, but when I use smoothScatter
the result is not correct.
The problem seems to be that with the log="x" argument smoothScatter
calculates the bins linearly, so the plot is skewed towards the right.

See for example:

load(url("http://dckd.nl/~jeroen/drop/example.rdata"))
smoothScatter(d, log="x")
smoothScatter(log(d$x), d$y)

I could also use the latter way to produce the result, but I would like
to use the original units on the x-axis, not the log units.

I also want to use these results in a publication, but the PDFs saved
from these results are not very nice. With Acrobat there are lots of
blocks, and with other viewers there are many white lines through the
blue colours.

Thanks,
Jeroen.
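
One possible workaround, sketched here under assumptions: smooth on the log10 scale (so the bins are computed in log units) and then draw the x-axis by hand with labels in the original units. The data frame `d` is simulated as a stand-in for the example file, and pdf(NULL) is only there so the sketch runs without opening a window.

```r
# Hypothetical stand-in for the poster's data frame `d`
set.seed(1)
d <- data.frame(x = 10^runif(2000, 0, 4), y = rnorm(2000))

lx <- log10(d$x)                                  # smooth in log units
ticks <- seq(floor(min(lx)), ceiling(max(lx)))    # whole powers of ten

pdf(NULL)                                         # null device; drop this line to plot on screen
smoothScatter(lx, d$y, xaxt = "n",                # suppress the default x-axis
              xlab = "x (original units)", ylab = "y")
axis(1, at = ticks, labels = 10^ticks)            # relabel ticks in original units
dev.off()
```

The density is then estimated on evenly spaced log bins, while the reader still sees 1, 10, 100, ... on the axis.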

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] ROC curve using epicalc (after logistic regression) (re-sent)

2009-07-27 Thread Clifford Long
Dear R-help,

I am resending as I believe I screwed up the e-mail address to R-help
earlier.  Sorry for my lack of attention to detail, and for any
inconvenience.

I have also sent the question to the package maintainer, as suggested
in the posting guide.

Regards,

Cliff



-- Forwarded message --
From: Clifford Long gnolff...@gmail.com
Date: Sun, Jul 26, 2009 at 8:46 PM
Subject: Fwd: ROC curve using epicalc (after logistic regression)
To: cvira...@medicine.psu.ac.th


Dear Virasakdi Chongsuvivatwong,

After sending the message below to the R-help mailing list, it
occurred to me that I probably should also have sent a copy to you,
per R posting guidance.

I would be interested in any thoughts or suggestions that you might
have regarding my difficulty using the ROCR routine in the epicalc
package.  (I've used this before, and find it to be a very helpful
package ... thanks.)

Is my issue related to the way the data is structured for the glm
routine - meaning not with individual cases, but instead by counts
(per DOE treatment) of pass, fail, and total?

Or perhaps I've made another error?

I'll understand if you don't have the time to look this over.  In case
you do, any direction/guidance will be appreciated.

Thank you for your time, and for this excellent package.

Regards,

Cliff Long




-- Forwarded message --
From: Clifford Long gnolff...@gmail.com
Date: Sun, Jul 26, 2009 at 3:52 PM
Subject: ROC curve using epicalc (after logistic regression)
To: R-help@r-project.org


Dear R-help list,

I'm attempting to use the ROC routine from the epicalc package after
performing a logistic regression analysis.  My code is included after
the sessionInfo() result.  The datafile (GasketMelt1.csv) is attached.
 I updated both R and the epicalc packages and tried again before
sending this request.

sessionInfo result:

R version 2.9.1 (2009-06-26)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] splines   stats     graphics  grDevices utils     datasets  methods
[8] base

other attached packages:
[1] caret_4.19      lattice_0.17-25 epicalc_2.9.1.2 survival_2.35-4
[5] foreign_0.8-36

loaded via a namespace (and not attached):
[1] grid_2.9.1  tools_2.9.1


Header information from package 'epicalc':
Package:            epicalc
Version:            2.9.1.2
Date:               2009-07-14


My code ...

#
#  Logistic Regression   (the model result is as expected)
#

dfile = 'GasketMelt1.csv'
gmelt.df = read.csv(dfile, header = TRUE, as.is = TRUE)
names(gmelt.df)

gmelt.df$p = gmelt.df$Pass / gmelt.df$Total

gmelt.glm = glm(p ~ Time + Temperature + Depth
                + Time*Temperature + Time*Depth + Temperature*Depth,
                family = binomial(link = logit), data = gmelt.df,
                weights = Total)
summary(gmelt.glm)

#
#  ROC
#
library(epicalc)

lroc(gmelt.glm, graph = TRUE, line.col = "red")


The error message:

> lroc(gmelt.glm, graph = TRUE, line.col = "red")
Error in dimnames(x) <- dn :
  length of 'dimnames' [2] not equal to array extent



Have I overlooked something?


Many thanks to anyone who might have a suggestion.

Cliff
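
If the grouped data layout is indeed the cause (this is only a guess, nothing here confirms how lroc handles weighted proportions), one thing that could be tried is expanding the counts to one row per unit and refitting on a 0/1 response. The sketch below uses a small made-up table in place of GasketMelt1.csv and a reduced model; only the column names Time, Temperature, Pass and Total come from the post.

```r
# Hypothetical stand-in for GasketMelt1.csv (values are made up)
gmelt.df <- data.frame(Time        = c(1, 1, 2, 2),
                       Temperature = c(100, 120, 100, 120),
                       Pass        = c(3, 5, 6, 9),
                       Total       = c(10, 10, 10, 10))

rows <- seq_len(nrow(gmelt.df))
pass.rows <- gmelt.df[rep(rows, gmelt.df$Pass), ]                   # one row per passing unit
fail.rows <- gmelt.df[rep(rows, gmelt.df$Total - gmelt.df$Pass), ]  # one row per failing unit
pass.rows$y <- 1
fail.rows$y <- 0
cases <- rbind(pass.rows, fail.rows)

# Same model fitted on grouped proportions and on individual cases
grouped.glm <- glm(Pass / Total ~ Time + Temperature, family = binomial,
                   data = gmelt.df, weights = Total)
case.glm    <- glm(y ~ Time + Temperature, family = binomial, data = cases)

cbind(grouped = coef(grouped.glm), cases = coef(case.glm))
```

Both fits give the same coefficients, so the case-level model could be passed to the ROC routine instead of the grouped one.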


[R] [R-pkgs] Version 0.7 of package tsDyn, nonlinear time series

2009-07-27 Thread Matthieu Stigler

Hi

Version 0.7 of package tsDyn presented at useR! 2009  is now on CRAN, 
extended with several new features.


The package tsDyn is aimed at estimating nonlinear time series models 
which exhibit regime specific properties. The regime switching dynamics 
can either be described by smooth transition (STAR and LSTAR) or 
threshold effects (SETAR). The package furthermore offers nonlinear 
models such as neural networks (NNET), and additive autoregressive (AAR) 
models.


The version 0.7 enhances the functionalities for the threshold 
autoregression models (SETAR) by extending the estimation techniques, 
providing a complete testing framework and finally allowing its use in a 
multivariate framework (generalizing VAR and VECM models).
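
To make the idea of a threshold autoregression concrete, here is a minimal base-R sketch. It does not call tsDyn, and it is not claimed to be how the package estimates its models. It simulates a two-regime series and picks the threshold by a grid search over conditional least squares fits; all names and parameter values are illustrative.

```r
set.seed(123)
n <- 500
y <- numeric(n)
for (t in 2:n) {
  phi <- if (y[t - 1] <= 0) 0.8 else -0.5   # AR coefficient depends on the lagged value
  y[t] <- phi * y[t - 1] + rnorm(1)
}

ylag <- y[-n]   # y_{t-1}
ynow <- y[-1]   # y_t

# Candidate thresholds: observed lagged values, trimmed to the central 70%
cand <- sort(ylag[ylag >= quantile(ylag, 0.15) & ylag <= quantile(ylag, 0.85)])

# Sum of squared residuals of separate AR(1) fits below and above each candidate
ssr <- sapply(cand, function(th) {
  low <- ylag <= th
  sum(resid(lm(ynow[low] ~ ylag[low]))^2) +
    sum(resid(lm(ynow[!low] ~ ylag[!low]))^2)
})

th.hat <- cand[which.min(ssr)]   # estimated threshold
th.hat
```

The trimming keeps a minimum share of observations in each regime, which appears to be the role of the trim argument mentioned in the feature list.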


Those new features are particularly interesting in economic applications 
as they generalize the concept of cointegration: with threshold 
cointegration, variables are still meant to share a long-run 
relationship, but their adjustment need not occur instantaneously but 
only after the deviation exceeds some critical threshold. This allows 
the model to take into account possible effects of transaction costs or 
stickiness of prices. Further, it permits us to capture asymmetries in 
the adjustment process, where positive or negative deviations are not 
corrected to the same extent.


A second vignette available at 
http://code.google.com/p/tsdyn/wiki/ThresholdCointegration offers an 
overview of the threshold cointegration framework and describes the 
implemented functions. It can serve as an ideal introduction for people 
interested in discovering this field.



Main new features include:
-added possibility to have two thresholds and hence three regimes in setar 
and selectSETAR (arg nthresh)

-new functions for unit roots tests: KapShinTest() and BBCTest()
-new functions for estimating VAR and VECM: lineVar
-new function for estimating TVECM: function TVECM()
-new function for estimating TVAR: function TVAR()
-new function to test for setar: function setarTest()
-new function to test for TVAR: function TVAR.LRtest()
-new function to test for TVECM: functions TVECM.SeoTest() and 
HanSeo_TVECM()

-new function to simulate/bootstrap a TVAR: function TVAR.sim()
-new function to simulate/bootstrap a TVECM: function TVECM.sim()
-new function to simulate/bootstrap a setar: function setar.sim()
-new function to estimate regime-specific variance in setar: function 
resVar()
-new function to extend a bootstrap replication in setarTest: function 
extendBoot()
-added in selectSETAR() and setar() following args: include, common, 
model, trim, MM, ML, MH, model, restriction
-added in selectSETAR(): criterion SSR (sum of squares residual) and 
argument max.iter
-extended arg th in selectSETAR to search inside an interval or around 
a point or on the whole grid


Note that minor fixes/improvements are expected soon; those will be 
reported only on the dedicated mailing list:  ts...@googlegroups.com


Any comments/remarks, suggestions and bug reports are very welcome!

Matthieu Stigler

___
R-packages mailing list
r-packa...@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-packages



[R] (no subject)

2009-07-27 Thread NIVEEN samy
 
 







Hi,

I am Niveen Samy. I am interested in the linear programming problem, and I would 
like a program that solves it, written in any language such as Java, Pascal, C++, 
the Miranda functional programming language, or any other language I can learn. 
If you already have a solution to this problem, please send it to me and explain 
how I can use it. I work with Linux and Windows.

Thank you very much.
Please reply to me.
With best wishes


  


Re: [R] FW: Qury Related With R

2009-07-27 Thread bed.si...@oracle.com
Hi Romain,

  

  Attached is my R script. I put the script into the R workspace, call it 
with the source("RScriptToCallJava.R") command, and my Java application is 
executed.

  Is this the proper way to call the Java application? If not, could you 
please explain the directory structure that a Java application needs?

  In Java I use a property file, but when I try to load it with the 
.jaddClassPath("D:/R_BTE_Jar/BTE/app.properties") command, it is not loaded.  

  

Thanks Romain for your response. 

 

Bed Singh 

 

-Original Message-
From: Romain Francois [mailto:romain.franc...@dbmail.com] 
Sent: Saturday, July 25, 2009 6:32 PM
To: bed.si...@oracle.com
Cc: r-help@r-project.org
Subject: Re: [R] FW: Qury Related With R

 

Hi,

 

The file did not make it through the mailing list. Maybe you are looking

for ?read.dcf

Can you describe the way your application interacts with R.

Romain

 

On 07/25/2009 10:35 AM, bed.si...@oracle.com wrote:

 Hi,

 

   I am using the R-2.9.1 with Window XP.

  Queries:

 1.   I am running the java application which needs to load property file 
 in R.

  So can you please tell me how I can load my property file in R 
 session so that my application can find that property file?

  Attached is my property file for sample.

 2.   Is there any directory structure required for java application in R 
 format?

 

 Thanks & Regards,

 Bed Singh

 

 From: ericdov...@gmail.com [mailto:ericdov...@gmail.com] On Behalf Of Eric 
 Doviak

 Sent: Friday, July 24, 2009 8:42 PM

 To: bed.si...@oracle.com

 Subject: Re: Qury Related With R

 

 

 

 Hi Bed,

 

 I'm sorry. I simply don't know.

 Your best bet would be to ask on R-help:  r-help@r-project.org

 

 

 

 Good luck,

 - Eric

 

 bed.si...@oracle.com wrote:

 

 Hi Eric,

 

   I am using the R-2.9.1 with Window XP.

 

   Queries:

 

 I am running the java application which needs to load app.property file in R.

 

 So can you please tell me how I can load my property file in R session so 
 that my application can found that property file?

 

 Attached is my property file.

 

 Is there any directory structure required for java application in R format?

 Please help me. Thanks in Advance Eric.

 

Regards,

 


 Bed Singh 

 Oracle Financial Services PrimeSourcing

 Mumbai, India

 Oracle Financial Services Software Limited was formerly i-flex solutions 
 limited.

 

Romain Francois

Independent R Consultant

+33(0) 6 28 91 30 30

http://romainfrancois.blog.free.fr

|- http://tr.im/tlNb : RGG#155, 156 and 157

|- http://tr.im/rw0p : useR! slides

`- http://tr.im/rw0b : RGG#154: demo of atomic functions

 

 



[R] [R-pkgs] Beta Version of tikzDevice Released!

2009-07-27 Thread Cameron Bracken
The tikzDevice package provides a new graphics device for R which enables
direct output of graphics in a LaTeX-friendly way. The device output
consists of files containing instructions for the TikZ graphics language and
may be imported directly into LaTeX documents using the \input{} command.

The beta version of tikzDevice is now available here:
https://r-forge.r-project.org/R/?group_id=440

An additional location for downloading source tarballs and Windows binaries
is: http://github.com/Sharpie/RTikZDevice/downloads

There are many significant improvements compared to the alpha version:

Features:

- Rd documentation
- A vignette
- Proper string placement (because of string width and character metric
calculations via latex)
- Custom LaTeX headers, footers and typesetting engines
- R-Level Annotation of graphics with TikZ commands (see
http://www.texample.net for great examples of using TikZ commands)



Limitations:

- ASCII character support only
- No recognition of the R symbol font (i.e. no plotmath symbols)
- A bevy of other quirks and personality traits that will make themselves
known in time

The device requires a working installation of LaTeX and the TikZ package in
order to function. This is because font metrics are currently calculated
through direct calls to the LaTeX compiler. Unfortunately, this results in
some significant computational overhead: it may take several seconds to
create a plot that contains a lot of text. In an attempt to offset this
behavior, the tikzDevice uses the filehash package to store font metrics
that it has already computed. Hopefully the more the device is used, the
faster it will be. We suggest reviewing the package vignette, especially the
section "R Options That Affect Package Behavior", for more information on how
the caching process works.

We think the package is quite usable as it is, but there are surely many
bugs that we don't know about. We welcome bug reports at our R-Forge
tracker: https://r-forge.r-project.org/tracker/?group_id=440

Enjoy!

- The tikzDevice Team



Re: [R] FW: Qury Related With R

2009-07-27 Thread Romain Francois


Hi,

I think you can just read the properties into R and then use the 
parameters argument of the .jinit function. Something like this perhaps:


props <- readLines( "app.properties" )
props <- strsplit( gsub( "\\\t", "", grep( "=", props, value = TRUE ) ), "=" )

params <- sapply( props, function(x){
    sprintf( "-D%s=%s", x[1], x[2] )
} )
library(rJava)
.jinit(classpath="D:/R_BTE_Jar/BTE/BackTestingApp.jar",
    parameters= c( "-Xmx512m", params ) )

Let me know if this works.

Romain
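
The parsing step can be checked on its own, without Java or rJava. The sketch below is only an illustration: the property names and values are made up, it splits on the first "=" only (so values containing "=" survive), and it skips comment lines, which is an assumption about the file format rather than something taken from the original script.

```r
# Write a small, hypothetical properties file to a temporary location
prop.file <- tempfile(fileext = ".properties")
writeLines(c("# sample settings (hypothetical)",
             "db.host = localhost",
             "db.port=1521",
             ""), prop.file)

lines <- trimws(readLines(prop.file))
# keep non-empty, non-comment lines that contain a key/value separator
lines <- lines[nzchar(lines) & !grepl("^[#!]", lines) & grepl("=", lines, fixed = TRUE)]

key   <- trimws(sub("=.*$", "", lines))      # text before the first "="
value <- trimws(sub("^[^=]*=", "", lines))   # text after the first "="
params <- sprintf("-D%s=%s", key, value)     # JVM system-property flags
params
```

The resulting character vector has the shape expected by the parameters argument of .jinit.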


On 07/27/2009 06:02 AM, bed.si...@oracle.com wrote:

Hi Romain,

Attached is my R script. I put the script into the R workspace, call it
with the source("RScriptToCallJava.R") command, and my Java application is
executed.

Is this the proper way to call the Java application? If not, could you
please explain the directory structure that a Java application needs?

In Java I use a property file, but when I try to load it with the
.jaddClassPath("D:/R_BTE_Jar/BTE/app.properties") command, it is not
loaded.

Thanks Romain for your response.

Bed Singh


--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/tlNb : RGG#155, 156 and 157
|- http://tr.im/rw0p : useR! slides
`- http://tr.im/rw0b : RGG#154: demo of atomic functions



[R] help about package reldist (Relative Distribution)

2009-07-27 Thread 謝逸芝
Dear R users:

I am trying to use the package reldist to measure the wage distribution.

In package reldist:

y is the sample from the comparison distribution

yo is the sample from the reference distribution

but I would like to compare more than two years (fifteen years in total:
1979, 1981, 1983, ..., 2007).

How should I change my program so that I can compare the wage distributions
of all fifteen years?


fig2b <- reldist(y=mu1981$b1, yo=mu1979$b1, ci=FALSE, smooth=0.4,
  yowgt=mu1979$weight2, ywgt=mu1981$weight2,
  bar=TRUE,
  yolabs=seq(-1,3,by=0.5),
  ylim=c(0,2.5), cex=0.8,
  ylab="Relative Density",
  xlab="Proportion of the Original Cohort")
title(main="Fig2(b)", cex=0.6)


Any help would be much appreciated!


Regards, Hsieh
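
One way to handle many years is to keep the yearly samples in a list and loop over it, comparing each year with the 1979 reference. The sketch below is a base-R illustration of that loop and does not call reldist(): it uses simulated log wages, ignores the survey weights, and summarises each comparison by the share of the comparison year falling into each decile of the reference year. With reldist the same loop would presumably call reldist(y = ..., yo = ...) once per year.

```r
set.seed(7)
years <- seq(1979, 2007, by = 2)          # the fifteen survey years

# Hypothetical log-wage samples, one vector per year
wages <- lapply(seq_along(years), function(i) rnorm(400, mean = 2 + 0.02 * i, sd = 0.5))
names(wages) <- years

ref <- ecdf(wages[["1979"]])              # reference-year CDF
deciles <- seq(0, 1, by = 0.1)

# For every later year: rank each wage within 1979, then tabulate by decile
rel <- sapply(wages[-1], function(y) {
  r <- ref(y)                             # relative ranks in [0, 1]
  as.vector(table(cut(r, deciles, include.lowest = TRUE))) / length(y)
})

round(rel[, c("1981", "2007")], 2)        # decile shares for two of the years
```

Each column of rel sums to one; a column close to 0.1 throughout would mean that year looks like the reference year.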



Re: [R] smoothScatter problems

2009-07-27 Thread Jeroen van der Ham
My humble apologies for double posting. I made an error with my 
subscription and erroneously thought that my message was not sent to the 
list.


Jeroen.



Re: [R] 64 bit compiled version of R on windows

2009-07-27 Thread huang min
Did anybody try the REvolution R Enterprise 2.0? Is it worthwhile to buy for
the 64bit windows OS? Thanks.

Huang

On Tue, Mar 31, 2009 at 2:43 AM, Duncan Murdoch murd...@stats.uwo.ca wrote:

 On 3/30/2009 12:46 PM, Vadlamani, Satish {FLNA} wrote:

 Hi:
 1) Does anyone have experience with 64 bit compiled version of R on
 windows? Is this available or one has to compile it oneself?
 2) If we do compile the source in 64 bit, would we then need to compile
 any additional modules also in 64 bit?


 R for Windows is compiled using the MinGW port of gcc, and the 64 bit
 version of that compiler is not really ready for general use yet, so
 compiling for 64 bits is not completely straightforward.  Revolution
 Computing has announced on the R-devel list that they are beta testing a
 build, with some information at

 http://www.revolution-computing.com/products/windows-64bit.php

 The page says it is scheduled for release at the end of March, so there
 should be something available soon.

 Duncan Murdoch



 I am just trying to prepare for the time when I will get larger datasets
 to analyze. Each of the datasets is about 1 GB in size and I will try to
 bring in about 16 of them in memory at the same time. At least that is the
 plan.

 I asked a related question in the past and someone recommended the product
 RevolutionR - I am looking into this also. If you can think of any other
 options, please mention. I have not been doing low level programming for a
 while now and therefore, the self compilation on windows would be the least
 preferable (and then I have to worry about how to compile any modules that I
 need). Thanks.

 Thanks.
 Satish



Re: [R] problems hist() and density

2009-07-27 Thread Jan Teichmann
Thank you. I'm sorry for my question. Of course, I have to integrate a density
function to get the probabilities... I didn't notice the small bin widths,
and that's why I was confused. I expected a histogram with probabilities
for the realizations.
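
The arithmetic can be checked directly from the counts and breaks reported by hist() in the quoted message. This is a small sketch; the variable names are illustrative.

```r
counts <- c(10, 4, 15, 19, 8, 5, 2, 5, 0, 0, 1)   # $counts from the quoted output
breaks <- 0.14 + 0.02 * (0:11)                     # $breaks: 0.14, 0.16, ..., 0.36
width  <- diff(breaks)                             # every bin is 0.02 wide

density <- counts / (sum(counts) * width)          # how hist() defines $density

sum(density)           # about 50: the heights alone do not sum to 1
sum(density * width)   # 1: the bar areas do
```

With a bin width of 0.02 the heights sum to 1/0.02 = 50, which is the value that caused the confusion.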

On Sunday, 26.07.2009, at 14:27 -0400, jim holtman wrote:
 sums to one  I should have said.
 
 On Sun, Jul 26, 2009 at 7:43 AM, Jan
 Teichmannjan.teichm...@googlemail.com wrote:
  Hello,
 
  I have a problem with the hist() function and showing densities. The
  densities sum to 50 and not to 1! I use R version 2.9.1 (2009-06-26) and
  I load the seqinR library.
 
  My data is the following vector:
  [1] 0.140 0.200 0.220 0.2828283 0.160 0.160
  0.360
  [8] 0.160 0.220 0.260 0.200 0.300 0.220
  0.2342342
  [15] 0.180 0.220 0.160 0.230 0.200 0.220
  0.240
  [22] 0.200 0.220 0.220 0.260 0.200 0.160
  0.220
  [29] 0.2342342 0.200 0.220 0.200 0.200 0.140
  0.180
  [36] 0.220 0.160 0.160 0.140 0.220 0.200
  0.2871287
  [43] 0.290 0.200 0.1836735 0.200 0.200 0.290
  0.240
  [50] 0.220 0.280 0.200 0.2745098 0.220 0.230
  0.180
  [57] 0.230 0.180 0.260 0.220 0.222 0.220
  0.260
  [64] 0.220 0.220 0.260 0.220 0.200 0.220
 
  I use the following command:
  tmp <- hist(data, freq=FALSE, plot=FALSE)
 
  and that's the result:
  $breaks
   [1] 0.14 0.16 0.18 0.20 0.22 0.24 0.26 0.28 0.30 0.32 0.34 0.36
 
  $counts
   [1] 10  4 15 19  8  5  2  5  0  0  1
 
  $intensities
   [1]  7.2463754  2.8985507 10.8695652 13.7681159  5.7971014  3.6231884
   [7]  1.4492754  3.6231884  0.000  0.000  0.7246377
 
  $density
   [1]  7.2463754  2.8985507 10.8695652 13.7681159  5.7971014  3.6231884
   [7]  1.4492754  3.6231884  0.000  0.000  0.7246377
 
  $mids
   [1] 0.15 0.17 0.19 0.21 0.23 0.25 0.27 0.29 0.31 0.33 0.35
 
  $xname
  [1] "data"
 
  $equidist
  [1] TRUE
 
  attr(,"class")
  [1] "histogram"
 


[R] Strange Memory issue

2009-07-27 Thread Noah Silverman
Hi,

I am testing out some things with the kernlab library.

The dataframe is 22,000 rows of 32 columns.

The command I execute is:

model <- ksvm(label ~ ., data = traindata, type = "C-svc", kernel = "rbfdot",
    class.weights = c("0" = 1, "1" = 3), kpar = "automatic", C = 10,
    cross = 3, prob.model = TRUE)


I have both a Macintosh and a Linux machine running Fedora.

The Macintosh has 2GB of RAM.

On the Macintosh, this command runs and completes without any error.

The Linux machine has 4GB of RAM

On the Linux machine, I get a memory error:  Error: cannot allocate 
vector of size 988.7 Mb


Can anybody help me figure out why??

Thanks..





Re: [R] [Rd] How to create a permanent dataset in R.

2009-07-27 Thread Petr PIKAL
Hi

I use a saved R file, stand.R, which looks like this:

stand <-
list(   b110 = c(49.45000, 21.46, 11.468333, 24.33500,  28.112240),
b120 = c(49.77333, 19.386667,  7.736667, 20.87500,  21.753788),
b130 = c(49.60833, 18.365000,  5.708333, 19.23167,  17.265409),
b140 = c(50.61333, 15.528333,  2.02, 15.66167,   7.419889),
b160 = c(51.25833, 14.026667,  0.50, 14.03667,   2.032885),
b180 = c(55.10167,  9.076667, -3.281667,  9.66000, -19.887711),
b222 = c(56.14500, 14.52,  3.425000, 14.91667,  13.271665),
b225 = c(53.43833, 16.168333,  4.695000, 16.8,  16.189711),
e7898 =c(50.44200, 17.496000,  5.238000, 18.26400,  16.66),
lab.nk  = c(50.7, 20.29000, 14.04, 24.67, 34.68), 
mapico = c(65.89, 13.065, 21.07, 24.79, 58.195), 
tp100.pr = c(47.88250, 21.09500, 12.697500),
tp100old = c(49.61,22.71,13.93,26.64,31.52),
tp100 = c(49.08,22.24,11.19,24.89402335,26.71196071),
tp200 = c(49.46, 18.5, 8.88, 20.52082844, 25.64100582),
tp200ro = c(49.32, 18.60, 9.85, 21.05, 27.90),
tp302 = c(49.5, 19.7, 10.8, 22.47, 28.73), 
tp303.00 = c(49.41000, 16.76000,  6.39, 17.9368, 20.8701), 
tp303.40 = c(48.72000, 15.09000, 5.43))

It is in my personal function package, and I call data(stand) in
Rprofile.site to recover it in each session.

Regards
Petr
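
For a dataset that is not (yet) part of a package, a simpler route is to write the object to a file once and load it in later sessions. This is a minimal sketch with shortened, illustrative values and a temporary file; in practice the file would live in a fixed location, and the load() call could go into a startup file such as .Rprofile.

```r
# Shortened, illustrative version of the list
stand <- list(b110 = c(49.45, 21.46), b120 = c(49.77, 19.39))

f <- file.path(tempdir(), "stand.RData")   # use a permanent path in real use
save(stand, file = f)                      # write the object to disk

rm(stand)                                  # simulate a fresh session
load(f)                                    # restores the object under its old name
str(stand)
```

save() stores the object together with its name, so load() brings back `stand` without any assignment.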

r-help-boun...@r-project.org wrote on 24.07.2009 20:06:57:

 (redirecting to r-help; it seems more appropriate for such a question)
 Hello,
 
 On Fri, Jul 24, 2009 at 8:11 AM, Albert EINstEIN sateeshvar...@gmail.com wrote:
  Actually, we know that If we create a dataset in R ,after closing the
  session the dataset automatically is closed. I tried for creating dataset
  permanently in R.I am facing the problem of creating permanent dataset.
  can we create dataset permanently If possible?
 
 I am not sure what exactly you want to do, but you could use
 save.image() to save R's workspace and re-load it when re-opening R
 via load(). Otherwise you can use Rcmdr or JGR to graphically, that is
 via point-and-click, perform these operations.
 Liviu
 


[R] Open/Close Console

2009-07-27 Thread Haeru Naemolaes
Dear list,

one of my procedures is highly memory intensive (time-dependent spatially 
distributed data). Unfortunately, I could only tackle this problem with the 
dull idea of opening a new console window, running the procedure, and closing 
the new window.

#previous procedure
#open new console
    gui_path <- "C:/Programs/R/R-2.9.0/bin/Rgui.exe"
    shell.exec(gui_path)
    memory.limit(size = 4000)
    #run the memory intensive procedure
    quit(save = "no")
#subsequent procedure

It works, when I send my code line-by-line from TINN-R to R. But it fails 
(=executes the code in the old window) when I send the entire code at once. I 
am grateful for any hints!

Regards,
Haeru
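
An alternative that does not depend on which window receives the code is to run the heavy step in a separate, non-interactive R process and pass the result back through a file. The following is a sketch under assumptions: the child script and its matrix computation are placeholders for the real procedure, and the file names are temporary.

```r
script <- tempfile(fileext = ".R")     # code for the child process
result <- tempfile(fileext = ".rds")   # where the child stores its result

writeLines(c(
  "args <- commandArgs(trailingOnly = TRUE)",
  "big <- matrix(runif(1e6), nrow = 1000)   # stand-in for the memory-heavy step",
  "saveRDS(colMeans(big), args[1])"
), script)

# Start a fresh R process; its memory is released when it exits
rscript <- file.path(R.home("bin"), "Rscript")
status <- system2(rscript, c(shQuote(script), shQuote(result)))

out <- readRDS(result)                 # pick up the result in the main session
length(out)
```

system2() waits for the child to finish by default, so the subsequent procedure only starts once the result file exists.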


  


Re: [R] labelling points plotted in a 2D plan

2009-07-27 Thread Petr PIKAL
Hi

r-help-boun...@r-project.org wrote on 25.07.2009 23:15:04:

 Thanks for the answer Tal!
 But I can't get it to work correctly! :(
 Please bear with me this is the first time I am using R! and I am in a rush
 to correct a paper
 in fact on the plane I am plotting a table
  fullpointed=read.table("fullpoints_backup.txt",h=F)
   plot(range(-2.5,0.95),range(0.00,1.00),type="n",axes=TRUE)

Most probably you do not want the axes labelled with range(-2.5,.95). Look at 
the xlab and ylab parameters and also at the xlim and ylim parameters:

plot(0, 0, xlim=range(-2.5,0.95), ylim=range(0.00,1.00), xlab="bla", 
ylab="blabla", type="n", axes=TRUE)

 and in this table there are 300 points
 I want to label the first 175 points with A and the others with S
 I couldn't figure how to configure correctly labels.to.plot <-
 sample(c("A","B"), 100, replace = T) and text(x, y , labels =
 labels.to.plot) ?
 
 for instance:
 0,48875 0,142857143  the point plotted will be labelled a
 0,409 0,142857143  the point plotted will be labelled a
 0,45611 0,25 labelled a
 0,49833 0,2  labelled a

It is hard to tell what type of data you really have. Output from

str(fullpointed)
could help.

In case you have numeric values

text(fullpointed[,1], fullpointed[,2], labels=c(rep("A", 175), 
rep("S",125)))

(untested) could do what you want. But maybe I am completely wrong because 
I do not know what first 175 points means.

Regards
Petr
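
Put together, the whole thing could look like the sketch below. The 300 points are simulated here in place of fullpoints_backup.txt, so the column names V1 and V2 are assumptions. If the real file uses decimal commas, as the sample rows in the question suggest, reading it with read.table(..., dec = ",") should give numeric columns.

```r
# Simulated stand-in for the 300 points in fullpoints_backup.txt
set.seed(42)
fullpointed <- data.frame(V1 = runif(300, -2.5, 0.95), V2 = runif(300))

# "A" for the first 175 rows, "S" for the remaining 125
labs <- rep(c("A", "S"), times = c(175, 125))

pdf(NULL)                                   # null device; drop this line to plot on screen
plot(fullpointed$V1, fullpointed$V2, type = "n",
     xlim = c(-2.5, 0.95), ylim = c(0, 1), xlab = "x", ylab = "y")
text(fullpointed$V1, fullpointed$V2, labels = labs)   # letters instead of symbols
dev.off()
```

The empty plot (type = "n") sets up the axes, and text() then places one letter at each point.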



  #the first 175
 
 0,61158 0,125labelled S
 0,5709 0,125labelled S
 0,53266 0,125labelled S
 
 # the remaining
 Regards
 
 
 
 
 On Sat, Jul 25, 2009 at 5:32 PM, Tal Galili tal.gal...@gmail.com wrote:
 
  Sure,
 
  Here is an example:
 
  # get some random data to play with
  x <- runif(100)
  y <- runif(100)
  labels.to.plot <- sample(c("A","B"), 100, replace = T)
 
  # set up the window, play them one by one to see what they do
  plot.new()
  plot.window(ylim = c(0,1), xlim = c(0,1))
  axis(1)
  axis(2)
  box()
 
  # plot the things you wished to plot, where you wanted them plotted
  text(x, y , labels = labels.to.plot)
 
 
 
  Cheers,
  Tal
 
 
 
 
 
 
 
 
 
 
  On Sat, Jul 25, 2009 at 7:20 PM, Khaled OUANES koua...@gmail.com wrote:
 
  hey
  thanks for the answer but I couldn't achieve it? would you explain a 
bit
  more?
  I have like 300 points to label!
  thanks
 
 
 
 
  --
  --
 
 
  My contact information:
  Tal Galili
  Phone number: 972-50-3373767
  FaceBook: Tal Galili
  My Blogs:
  http://www.r-statistics.com/
  http://www.talgalili.com
  http://www.biostatistics.co.il
 
 
 
 


Re: [R] [Rd] How to create a permanent dataset in R.

2009-07-27 Thread Liviu Andronic
On 7/27/09, Albert EINstEIN sateeshvar...@gmail.com wrote:
can we create our own packages in R. It would be very helpful for us, if
  you provide any information regarding this.

Yes. See this [1].
Liviu

[1] http://cran.r-project.org/doc/manuals/R-exts.pdf



[R] can we create our own packages in R?

2009-07-27 Thread Albert EINstEIN

Hi,
Can someone provide information on the following? At present we reload our
data every time we start R. R ships with some default packages that contain
datasets. Can we store a dataset permanently in a package, so that we can get
the dataset without reloading it, like creating a dataset in a permanent
library in SAS?

Can we create our own packages in R? It would be very helpful for us if
someone could provide any information regarding this.

Thank you in advance


Re: [R] [Rd] How to create a permanent dataset in R.

2009-07-27 Thread Albert EINstEIN


Hi,

Thank you very much all of you for giving me valuable solution


[R] creating and populating an environment

2009-07-27 Thread Christian Prinoth
Hi, I often work with R by writing long(ish) Excel-VBA macros
interspersed with calls to R via RExcel. A typical example of this would
be:

Sub VBAMacro()
'fetch some data from an excel sheet
'do some basic stuff on said data
'transfer data from vba to R
'run some R statements
'get data back to vba
'show results on the excel sheet
'clean R by deleting all vars that were created: rrun
rm(a,b,c,)
end sub

This has two obvious disadvantages, as I have to make sure:
1) not to use R variable names which may already exist
2) to remove all variables (garbage collection)

In order to overcome these issues I was wondering if I should execute
all R statements inside the R macro in a separate namespace. I have
looked at new.env() but am not really sure how it is supposed to be
used. If I type temp <- new.env(), how do I make sure that all variables
declared from then on end up in the temp environment? Once I am done,
is rm(temp) sufficient to get rid of all its content?

Basically, I would like to replace the above example with:
Sub VBAMacro()
rrun "A <- new.env()"
'fetch some data from an excel sheet
'do some basic stuff on said data
'transfer data from vba to R
'run some R statements
'get data back to vba
'show results on the excel sheet
rrun "rm(A)"
end sub

Thanks
Christian Prinoth
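
A minimal sketch of the environment approach asked about above, in plain R only (the RExcel side is left out, and the names scratch, a, b and result are illustrative, not from the original post). Creating an environment with new.env() does not by itself redirect later assignments: each statement has to be evaluated in it, for instance with evalq(), or variables have to be set with assign(). Removing the last reference with rm() leaves the contents to the garbage collector.

```r
# Hypothetical scratch environment for one macro run
scratch <- new.env()

# Evaluate a block of statements inside the environment;
# the assignments to a and b stay in `scratch`, not in the workspace
evalq({
  a <- 1:10
  b <- mean(a)
}, envir = scratch)

# Variables can also be set and fetched explicitly
assign("label", "run-1", envir = scratch)
result <- get("b", envir = scratch)

ls(scratch)   # names of everything created in the environment

# Dropping the only reference makes the contents collectable
rm(scratch)
```

Every statement sent from the macro would need the same wrapping, since a plain assignment at top level still lands in the global workspace.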


[R] Forecasting Inflation

2009-07-27 Thread Dilip Bayas
Dear All,

I want to forecast inflation for the Indian economy. Please advise which
techniques to use after variable selection (WPI, CPI, money supply,
IIP, interest rate and so on), and how I can use R for this.




[R] read binary file seek()

2009-07-27 Thread Andreas Posch
I want to read in a binary file using the readBin() function. In order to
skip uninformative parts of the file I use the seek() function, but I need to
specify the number of bits to skip rather than the number of bytes to skip.

 

E.g. seek(to.read, origin="current", blockSize)

with blockSize giving the number of bits 

 

Does anybody know if this works? Any help would be highly appreciated. Best,
A.

 

 




[R] dumping data objects

2009-07-27 Thread William Q Meeker


I am using R version 2.9.1 (2009-06-26) on Windows.

I am trying to dump some data objects from R so that I can 
subsequently import them into S-PLUS (version 6.2). Using dump("foo") 
appends Ls to integers, as explained in the documentation for deparseOpts.


Here is a simple example. I tried using

> .deparseOpts(control="S_compatible")
[1] 128
> dump("foo")

But still get:

C:\wqm\Run>cat dumpdata.R
foo <-
structure(list(Kilocycles = c(94L, 96L, 99L), Status = structure(c(2L,
2L, 1L), .Label = c("Censored", "Failed"), class = "factor"),
Weight = c(1L, 1L, 2L)), .Names = c("Kilocycles", "Status",
"Weight"), class = "data.frame", row.names = c("1", "2", "3"))

I have also tried a number of different combinations of options to 
.deparseOpts, but the resulting file always has the appended Ls.


What am I missing?

Bill Meeker





William Q. Meeker
Department of Statistics
2109 Snedecor Hall
Iowa State University
Ames, Iowa 50011
Phone: 515-294-5336
Fax: 515-294-4040
Home Fax: 515-232-1323
www.public.iastate.edu/~wqmeeker
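
One possible reading of the problem above, offered as a sketch rather than a confirmed fix: .deparseOpts() only translates option names into their integer code (the 128 printed in the post) and does not change any setting, whereas dump() has its own control argument. The example below passes a control value that leaves out "keepInteger", which is the option that adds the L suffix. The object and the temporary file name are illustrative, and whether S-PLUS 6.2 accepts the resulting file has not been checked here.

```r
# Same structure as the object shown in the post
foo <- data.frame(
  Kilocycles = c(94L, 96L, 99L),
  Status = factor(c("Failed", "Failed", "Censored")),
  Weight = c(1L, 1L, 2L)
)

out_file <- tempfile(fileext = ".R")   # illustrative output path

# "showAttributes" keeps the structure(...) wrapper; omitting
# "keepInteger" is what drops the L suffix from integer values
dump("foo", file = out_file, control = "showAttributes")

# Show what was written
writeLines(readLines(out_file))
```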



Re: [R] can we create our own packages in R?

2009-07-27 Thread Philipp Pagel
On Mon, Jul 27, 2009 at 02:32:24AM -0700, Albert EINstEIN wrote:
can we create our own packages in R. It would be very helpful for us, if
 someone provide any information regarding this.

Yes, you can. There is an entire manual documenting this:

http://cran.r-project.org/doc/manuals/R-exts.html

cu
Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
85350 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/



[R] Conversion a ts time to another class.

2009-07-27 Thread Chuse chuse
Dear R colleagues,

I am trying to change a ts time such as 2009.004 to a string or POSIX
class such as 2009-01-01.
Is there any function or method to do it? Thank you beforehand.

Chuse.



Re: [R] read binary file seek()

2009-07-27 Thread jim holtman
readBin reads in bytes.  If you want to read starting at a variable
'bit' location, then you will have to write a function that will read
in the bytes and then shift the data the corresponding number of bits.
 Exactly what are you reading in and what processing do you want to do
on it.
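
A rough sketch of the "read whole bytes, then shift" idea described above. The helper name read_bits and the demo file are made up for illustration. It seeks to the byte that contains the first wanted bit, reads enough bytes to cover the request, expands them with rawToBits() and discards the leading bits. Bits are returned most significant bit first within each byte, which is an assumption about the file layout.

```r
# Hypothetical helper: return `n_bits` bits that start `skip_bits` bits
# into the file, most significant bit of each byte first
read_bits <- function(path, skip_bits, n_bits) {
  con <- file(path, "rb")
  on.exit(close(con))
  # seek() positions in bytes, so move to the byte holding the first bit
  seek(con, where = skip_bits %/% 8, origin = "start")
  offset <- skip_bits %% 8
  n_bytes <- ceiling((offset + n_bits) / 8)
  bytes <- readBin(con, what = "raw", n = n_bytes)
  # rawToBits() gives 8 values per byte, least significant bit first
  bits <- as.integer(rawToBits(bytes))
  # flip each byte so that the most significant bit comes first
  bits <- as.vector(matrix(bits, nrow = 8)[8:1, , drop = FALSE])
  bits[offset + seq_len(n_bits)]
}

# Small demo file holding the bytes 0xF0 0x0F
demo <- tempfile()
writeBin(as.raw(c(0xF0, 0x0F)), demo)
read_bits(demo, skip_bits = 2, n_bits = 4)
```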

On Mon, Jul 27, 2009 at 7:34 AM, Andreas Posch andreas.po...@tugraz.at wrote:
 I want to read in a binary file using the readBin() function. In order to
 skip uninformative parts of the file I use the seek() function, I need to
 specify the number of bits to skip rather than the number of bytes to skip.



 E.g. seek(to.read, origin="current", blockSize)

 with blockSize giving the number of bits



 Does anybody know if this works? Any help would be highly appreciated. Best,
 A.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



Re: [R] normal mixture model

2009-07-27 Thread Christian Hennig

Hi Cindy,

you need the summary function

mclustsummary <- summary(mclustBICoutputobject,data)
to get all the information. Some (like best model) is given if you just 
print out the summary object. Some other information (like 
estimated parameter values) are accessible as components of the summary 
object, like

mclustsummary$parameters$...
Try

str(mclustsummary)

to see what's there (unfortunately this is not fully documented).

For more detail see the help pages.

Hope this helps,
Christian
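
The steps above can be sketched end to end as follows. This assumes the mclust package is installed; the simulated data and the object names bic and fit_summary are illustrative only.

```r
library(mclust)  # assumes the mclust package is installed

set.seed(42)
# Illustrative data: two well-separated normal components
x <- c(rnorm(100, mean = 0), rnorm(100, mean = 6))

# BIC over the candidate models and numbers of components
bic <- mclustBIC(x)

# Summary for the best model by BIC, including parameter estimates
fit_summary <- summary(bic, x)

fit_summary$G                 # chosen number of components
fit_summary$parameters$mean   # estimated component means

# Overview of everything stored in the summary object
str(fit_summary, max.level = 1)
```

Inside a loop, the two calls mclustBIC() and summary() can be repeated on each new data set without manual model choice.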

On Sun, 26 Jul 2009, cindy Guo wrote:


Hi, Christian,

Thank you for the reply. I just tried. Does the function mclustBIC only give
the best model, or does it also do EM to get the cluster means and variances
according to the best model it picks? I didn't find it.  Is there a way to
automatically select the best number of components and do EM? Because I need
to do the normal mixture model in a loop (one EM at an iteration), so I want
it to do everything automatically.
Thanks,

Cindy

On Sun, Jul 26, 2009 at 3:46 PM, Christian Hennig chr...@stats.ucl.ac.uk wrote:


You can use mclustBIC in package mclust (uses the BIC for deciding about
the number of components and hierarchical clustering for initialisation).

Christian


On Sun, 26 Jul 2009, cindy Guo wrote:

  Hi, All,


I want to fit a normal mixture model. Which package in R is best for this?
I
was using the package 'mixdist', but I need to group the data into groups
before fitting model, and different groupings seem to lead to different
results. What other package can I use which is stable? And are there
packages that can automatically determine the number of components?

Thank you,

Cindy



*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chr...@stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche





*** --- ***
Christian Hennig
University College London, Department of Statistical Science
Gower St., London WC1E 6BT, phone +44 207 679 1698
chr...@stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche



[R] Question about rpart decision trees (being used to predict customer churn)

2009-07-27 Thread Terry Therneau
-- begin included message ---
Hi,

I am using rpart decision trees to analyze customer churn. I am finding that
the decision trees created are not effective because they are not able to
recognize factors that influence churn. I have created an example situation
below. What do I need to do to for rpart to build a tree with the variable
experience? My guess is that this would happen if rpart used the loss matrix
while creating the tree.

> experience <- as.factor(c(rep("good",90), rep("bad",10)))
> cancel <- as.factor(c(rep("no",85), rep("yes",5), rep("no",5),
+ rep("yes",5)))
> table(experience, cancel)
          cancel
experience no yes
      bad   5   5
      good 85   5
> rpart(cancel ~ experience)
n= 100
node), split, n, loss, yval, (yprob)
      * denotes terminal node
1) root 100 10 no (0.900 0.100) *

I tried the following commands with no success.
rpart(cancel ~ experience, control=rpart.control(cp=.0001))
rpart(cancel ~ experience, parms=list(split='information'))
rpart(cancel ~ experience, parms=list(split='information'),
control=rpart.control(cp=.0001))
rpart(cancel ~ experience, parms=list(loss=matrix(c(0,1,1,0), nrow=2,
ncol=2)))

--- end inclusion 

  The program works fine with rpart(as.numeric(cancel) ~ experience), which does
a fit to try and predict the probability of cancellation rather than a YES/NO
decision for each node.  I usually find this more informative, particularly for
early analysis.  Breiman et al. in the original CART book refer to this as odds
regression.  In this analysis, if a split leads to one child with 30% cancellation
and another with 5% cancellation, the split is successful.  When using a factor as
the y variable, this split is scored as useless, since the parent and both
children are scored as NO.
  By adjusting the losses to be just right you can get your data to split.
You need to make them such that 85/5 is predicted as 'no cancel' and 5/5 as
'yes cancel'; 1:2 losses would suffice.  In the example where you set losses to
1:1 both nodes are scored as a 'yes'.

Terry Therneau
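
The two suggestions above can be tried on the example data as follows. This is a sketch: the object names churn, fit_rate, fit_loss and loss are illustrative. In the loss matrix, rows are taken to be the true class and columns the predicted class (levels in the order "no", "yes"), so a missed "yes" is charged twice as much as a false alarm.

```r
library(rpart)

# Example data from the question
experience <- factor(c(rep("good", 90), rep("bad", 10)))
cancel <- factor(c(rep("no", 85), rep("yes", 5), rep("no", 5), rep("yes", 5)))
churn <- data.frame(experience, cancel)

# Regression on the numeric coding of cancel (no = 1, yes = 2):
# a split is rewarded when the children differ in cancellation rate
fit_rate <- rpart(as.numeric(cancel) ~ experience, data = churn)

# Classification with 1:2 losses:
# true "no" predicted "yes" costs 1, true "yes" predicted "no" costs 2
loss <- matrix(c(0, 2, 1, 0), nrow = 2)
fit_loss <- rpart(cancel ~ experience, data = churn,
                  parms = list(loss = loss))

print(fit_rate)
print(fit_loss)
```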



Re: [R] Conversion a ts time to another class.

2009-07-27 Thread jim holtman
try this:

> x <- 2009.004
> # get the number of days in the year
> days <- unclass(as.POSIXct(paste(floor(x)+1, '-1-1', sep='')) -
+ as.POSIXct(paste(floor(x), '-1-1', sep='')))
> # now compute the number of seconds from start of year
> current <- as.POSIXct(paste(floor(x), '-1-1', sep='')) + days * (x %% 1) * 86400
> current
[1] 2009-01-02 11:02:23 EST
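
If only the calendar date is needed, the same calculation can be wrapped in a small function that works on Date objects. This is a sketch; the function name decimal_year_to_date is made up, and the result is truncated to whole days.

```r
# Hypothetical helper: convert a decimal year (as used by ts) to a Date
decimal_year_to_date <- function(x) {
  year <- floor(x)
  start <- as.Date(paste0(year, "-01-01"))
  end <- as.Date(paste0(year + 1, "-01-01"))
  days_in_year <- as.numeric(end - start)   # 365 or 366
  # the fraction of the year elapsed, expressed in whole days
  start + floor((x - year) * days_in_year)
}

decimal_year_to_date(2009.004)
format(decimal_year_to_date(c(2009, 2009.5)))
```

Because it is vectorised, it can be applied directly to time(z) for a ts object z.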



On Mon, Jul 27, 2009 at 8:00 AM, Chuse chusechus...@gmail.com wrote:
 Dear R collegues,

 I am trying to change a ts time such as 2009.004 to a str or POSIX
 class as 2009-01-01.
 Is there any function or method to do it?. Thank you beforehand.

 Chuse.





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem that you are trying to solve?



[R] what to do about face 1 at size 16 could not be loaded

2009-07-27 Thread Oliver Kullmann
Hi,

on some machines (all Linux, same behaviour with versions 2.9.0 and 2.9.1)
I get errors

> plot(E2)
Error in text.default(x, y, txt, cex = cex, font = font) :
  X11 font -adobe-helvetica-%s-%s-*-*-%d-*-*-*-*-*-*-*, face 1 at size 16 could not be loaded

(but not on others; the R installation is always the same (from sources), and
the Linux distribution should also be similar; the above failure occurs on SUSE 11.0).

What to do about such errors?

Is it possible to tell R to ignore such errors (likely this is only about
some text, but by using options like ann=F I didn't succeed to get it
working; E2 above by the way is a data frame)?

Or could I make a complete installation of R, which includes all the fonts etc.
R expects to find? (Apparently there is no standard what fonts should be there,
and so failures in this area are to be expected, or?)

Thanks for your consideration.

Oliver



Re: [R] [Rd] How to create a permanent dataset in R.

2009-07-27 Thread Albert EINstEIN


Thank you very much for your reply.

Yes. See this [1].
Liviu

[1] http://cran.r-project.org/doc/manuals/R-exts.pdf





[R] create dataset permanently in package (i.e. default or our own package)

2009-07-27 Thread Albert EINstEIN

Hi,
When we open the R console or R Commander we see packages such as car and
datasets, which come with default datasets, for example women and prestige.
I have created a sales dataset by importing from an Excel, XML or text file,
and I want to store that dataset permanently in one of the packages mentioned
above (car or datasets). After closing my R session and later reopening the
R console and R Commander, I do not want to create the sales dataset again:
it should be found when I select one of those packages.
If possible, please give me the code; it will be very helpful for us.

Thanks in advance.
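
A minimal sketch of one way to keep a dataset between sessions without touching the car or datasets packages: write it to a file once and read it back later. The sales data and the file location are made up for illustration; in practice the file would live in a permanent directory rather than tempdir(). For a dataset that should ship inside your own package, the Writing R Extensions manual describes placing a saved data file in the package's data/ directory.

```r
# Made-up sales data standing in for the imported dataset
sales <- data.frame(region = c("north", "south"),
                    amount = c(120, 95))

# Illustrative location; use a permanent path for real work
store <- file.path(tempdir(), "sales.rds")

# Write the dataset once
saveRDS(sales, store)

# In a later session: read it back instead of re-importing
sales_again <- readRDS(store)
all.equal(sales, sales_again)
```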




Re: [R] FW: Qury Related With R

2009-07-27 Thread bed.si...@oracle.com
Hi Romain,

 This is exactly the script I need. Thanks a lot for helping.
 Romain, is there any way to modify a particular line in the property
file through an R script? If yes, then please explain how.

Cheers! 
BS
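
One way the question above could be approached with base R, as a sketch: read the file with readLines(), replace the line whose key matches, and write it back with writeLines(). The helper name set_property, the sample keys and the temporary file are all hypothetical, and the code assumes simple key=value lines.

```r
# Hypothetical helper: set `key` to `value` in a key=value properties file
set_property <- function(path, key, value) {
  lines <- readLines(path)
  has_eq <- grepl("=", lines, fixed = TRUE)
  # text before the first "=" with surrounding blanks removed
  keys <- trimws(sub("=.*$", "", lines))
  hit <- has_eq & keys == key
  lines[hit] <- paste0(key, "=", value)
  writeLines(lines, path)
  invisible(sum(hit))   # number of lines that were changed
}

# Demo on a throw-away file
props_file <- tempfile(fileext = ".properties")
writeLines(c("# sample", "db.host=localhost", "db.port=1521"), props_file)
set_property(props_file, "db.port", "1522")
readLines(props_file)
```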

-Original Message-
From: Romain Francois [mailto:romain.franc...@dbmail.com] 
Sent: Monday, July 27, 2009 12:53 PM
To: bed.si...@oracle.com
Cc: r-help@r-project.org
Subject: Re: [R] FW: Qury Related With R


Hi,

I think you can just do something like read the parameters into R, and
then use the parameters argument of the .jinit function. Something
like this perhaps:

props <- readLines( "app.properties" )
props <- strsplit( gsub( "\\\t", "", grep( "=", props, value = TRUE )
), "=" )
params <- sapply( props, function(x){
sprintf( "-D%s=%s", x[1], x[2] )
} )
library(rJava)
.jinit(classpath="D:/R_BTE_Jar/BTE/BackTestingApp.jar",
parameters= c( "-Xmx512m", params ) )

Let me know if this works.

Romain


On 07/27/2009 06:02 AM, bed.si...@oracle.com wrote:
 Hi Romain,

 Attached is my R script. I put the script into the R workspace and call it
 with the source("RScriptToCallJava.R") command, and my
 Java application is executed.

 Is this the proper way to call the Java application? If not, can you please
 explain the directory structure that a Java application needs?

 In Java I use the property file, but when I load the property file
 using the .jaddClassPath("D:/R_BTE_Jar/BTE/app.properties") command, it is
 not loaded.

 Thanks Romain for your response.

 Bed Singh

 -Original Message-
 From: Romain Francois [mailto:romain.franc...@dbmail.com]
 Sent: Saturday, July 25, 2009 6:32 PM
 To: bed.si...@oracle.com
 Cc: r-help@r-project.org
 Subject: Re: [R] FW: Qury Related With R

 Hi,

 The file did not make it through the mailing list. Maybe you are looking

 for ?read.dcf

 Can you describe the way your application interacts with R.

 Romain

 On 07/25/2009 10:35 AM, bed.si...@oracle.com wrote:

   Hi,

  

   I am using the R-2.9.1 with Window XP.

   Queries:

   1. I am running the java application which needs to load property
 file in R.

   So can you please tell me how I can load my property file in R
 session so that my application can find that property file?

   Attached is my property file for sample.

   2. Is there any directory structure required for java application in
 R format?

  

   Thanks Regards,

   Bed Singh

  

   From: ericdov...@gmail.com [mailto:ericdov...@gmail.com] On Behalf Of
 Eric Doviak

   Sent: Friday, July 24, 2009 8:42 PM

   To: bed.si...@oracle.com

   Subject: Re: Qury Related With R

  

  

  

   Hi Bed,

  

   I'm sorry. I simply don't know.

   Your best bet would be to ask on R-help: r-help@r-project.org

  

  

  

   Good luck,

   - Eric

  

   bed.si...@oracle.com wrote:

  

   Hi Eric,

  

   I am using the R-2.9.1 with Window XP.

  

   Queries:

  

   I am running the java application which needs to load app.property
 file in R.

  

   So can you please tell me how I can load my property file in R
 session so that my application can found that property file?

  

   Attached is my property file.

  

   Is there any directory structure required for java application in R
 format?

   Please help me. Thanks in Advance Eric.

  

 Regards,

  

   Oracle logo.gif

   Bed Singh

   Oracle Financial Services PrimeSourcing

   Mumbai, India

   Oracle Financial Services Software Limited was formerly i-flex
 solutions limited.




--
Romain Francois
Independent R Consultant
+33(0) 6 28 91 30 30
http://romainfrancois.blog.free.fr
|- http://tr.im/tlNb : RGG#155, 156 and 157
|- http://tr.im/rw0p : useR! slides
`- http://tr.im/rw0b : RGG#154: demo of atomic functions



[R] nnet library and FANN package'm

2009-07-27 Thread lucmoulinier

Hello !

I'd like to know to which of the FANN package network corresponds the R nnet
network ?
In more details, what is the R nnet activation function, what is the
training algorithm (rprop, quickprop, ...) ? Also, it seems that the R nnet
decay parameter in nnet corresponds to the learning_rate parameter in
FANN. Correct ?

Many thanks in advance !

Luc Moulinier
IGBMC Strasbourg


Re: [R] downsampling

2009-07-27 Thread Jan M. Wiener
Dear Philipp and R-Users,

thank you very much for the help.

However, both approx() and spline() seem to select the number of
required data points from the original data (at the correct positions,
of course) and ignore the remaining data points, as the following
example demonstrates:

 a= c(1,0,2,1,0)

 approx(a,n=3)
$x
[1] 1 3 5

$y
[1] 1 2 0

Essentially, what approx has done (spline does the same) is to simply
select the first, third, and fifth entry (as we want to downsample a 5
point vector into a three point vector). The second and fourth data
point are completely ignored. This can result in quite dramatic changes
of your data, if the data points selected by approx() or spline() happen
to be outliers and if you downsample data by a rather strong factor.

Best,
Jan



Philipp Pagel wrote:
 On Fri, Jul 24, 2009 at 03:16:58AM -0600, Warren Young wrote:
   
 Michael Knudsen wrote:
 
 On Fri, Jul 24, 2009 at 9:32 AM, Jan Wiener jan.wie...@tuebingen.mpg.de
 wrote:

   
 x=sample(1:5, 115, replace=TRUE)

 How do I downsample this vector to 100 entries? Are there any R
 functions or packages that provide such functionality.
 
 What exactly do you mean by downsampling?
   
 It means that the original 115 points should be treated as a
 continuous function of x, or t, or whatever the horizontal axis is,
 with new values coming from this function at 100 evenly-spaced
 points along this function.
 

 There probably is a proper function for that and some expert will
 point it out. Until then I'll share my thoughts:

 # make up some data
 foo <- data.frame(x= 1:115, y=jitter(sin(1:115/10), 1000))
 plot(foo)

 # use approx for interpolation
 bar <- approx(foo, n=30)
 lines(bar, col='red', lwd=2)

 # or use spline for interpolation
 bar <- spline(foo, n=30)
 lines(bar, col='green', lwd=2)

 # or fit a loess curve
 # had to play with span to make it look ok
 model <- loess(y~x, foo, span=1/2)
 x <- seq(1, 115, length.out=30)
 bar <- predict(model, newdata=data.frame(x=x, y=NA))
 lines(x, bar, col='blue', lwd=2)


 Jan, does that help a little?

 cu
   Philipp

   


-- 
Dr. Jan M. Wiener
Centre for Cognitive Science
University of Freiburg, Institute of Computer Science and Social Research (IIG)
Friedrichstr. 50, D-79098 Freiburg, GERMANY
-
e-mail: m...@jan-wiener.net
phone: ++49 (0)761 203 4951
url: www.jan-wiener.net



[R] numbers on barplot

2009-07-27 Thread Mohsen Jafarikia
Hello all,
I have this simple barplot code:

ifn <- "id.dat"
dat <- read.table(ifn)
ofn <- "id.png"

bitmap(ofn, type = "png256", width = 30, height = 30, pointsize = 30, bg =
"white",res=50)
par(mar=c(5, 5, 3, 2),lwd=5)
par(cex.main=1.6,cex.lab=1.6,cex.axis=1.6)

names(dat)<-c("NumberOfPeople","Average")
Graph<-barplot(dat$Average)
dev.off()

and here is the data (id.dat):

15 0.08
 6 0.09
 7 0.37

I want to write down the NumberOfPeople on top of each of the bars. Can
anybody help me on this?

Thanks,
Mohsen



Re: [R] numbers on barplot

2009-07-27 Thread Nutter, Benjamin
The only thing you're missing is the midpoints of the bars.  Since you
specified

> Graph <- barplot(dat$Average)

You can get the midpoints from the Graph object.  So to put the number
on top of each bar you might use something like:

 text(Graph, dat$Average, dat$Average)
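
A self-contained variant of the idea above that writes the NumberOfPeople column, as asked in the original question, rather than the bar heights. The data frame assumes a two-column reading of the posted data, and the PDF device and the 15% head-room are arbitrary choices for the sketch.

```r
# Assumed two-column reading of the posted data
dat <- data.frame(NumberOfPeople = c(15, 6, 7),
                  Average = c(0.08, 0.09, 0.37))

pdf(file = tempfile(fileext = ".pdf"))   # any graphics device would do

# Extra head-room so the label above the tallest bar is not clipped
mids <- barplot(dat$Average, ylim = c(0, max(dat$Average) * 1.15))

# barplot() returns the bar midpoints; pos = 3 puts each label just above
text(mids, dat$Average, labels = dat$NumberOfPeople, pos = 3)

dev.off()
```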


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Mohsen Jafarikia
Sent: Monday, July 27, 2009 10:02 AM
To: r-h...@stat.math.ethz.ch
Subject: [R] numbers on barplot

Hello all,
I have this simple barplot code:

ifn <- "id.dat"
dat <- read.table(ifn)
ofn <- "id.png"

bitmap(ofn, type = "png256", width = 30, height = 30, pointsize = 30, bg
=
"white",res=50)
par(mar=c(5, 5, 3, 2),lwd=5)
par(cex.main=1.6,cex.lab=1.6,cex.axis=1.6)

names(dat)<-c("NumberOfPeople","Average")
Graph<-barplot(dat$Average)
dev.off()

and here is the data (id.dat):

15 0.08
 6 0.09
 7 0.37

I want to write down the NumberOfPeople on top of each of the bars.
Can
anybody help me on this?

Thanks,
Mohsen



Re: [R] downsampling

2009-07-27 Thread Philipp Pagel
On Mon, Jul 27, 2009 at 02:42:33PM +0200, Jan M. Wiener wrote:
 However, both approx() and spline() seem to select the number of
 required data points from the original data (at the correct positions,
 of course) and ignore the remaining data points, as the following
 example demonstrates:
 
  a= c(1,0,2,1,0)
 
  approx(a,n=3)
 $x
 [1] 1 3 5
 
 $y
 [1] 1 2 0
 
 Essentially, what approx has done (spline does the same) is to simply
 select the first, third, and fifth entry (as we want to downsample a 5
 point vector into a three point vector). The second and fourth data
 point are completely ignored.

That seems to be what Warren described as the 'degenerate case'
where approx will 'just throw away every other sample'. If you choose
a different n (e.g. n=4) interpolation does happen.

 This can result in quite dramatic changes
 of your data, if the data points selected by approx() or spline() happen
 to be outliers and if you downsample data by a rather strong factor.

Yes, that could affect your downsampled data. For more
robustness it would probably be better to fit a proper model (if you
have one) or a lowess curve (or smooth.spline) and go from there.
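
A sketch of the "smooth first, then sample" suggestion, using the same kind of made-up data as earlier in the thread. The object names and the lowess span f = 1/3 are arbitrary choices; both variants use all 115 points to build the curve before evaluating it at 30 positions.

```r
set.seed(1)
x <- 1:115
y <- jitter(sin(x / 10), 1000)          # noisy made-up signal

# Coarser grid with 30 evenly spaced positions
new_x <- seq(1, 115, length.out = 30)

# Variant 1: smoothing spline, evaluated at the coarser grid
fit <- smooth.spline(x, y)
down_spline <- predict(fit, new_x)$y

# Variant 2: lowess smooth, then linear interpolation of the smooth
smooth <- lowess(x, y, f = 1/3)
down_lowess <- approx(smooth, xout = new_x)$y
```

Single outliers then pull the result only as far as the smoother lets them, instead of being copied through unchanged.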

cu
Philipp

-- 
Dr. Philipp Pagel
Lehrstuhl für Genomorientierte Bioinformatik
Technische Universität München
Wissenschaftszentrum Weihenstephan
85350 Freising, Germany
http://webclu.bio.wzw.tum.de/~pagel/



[R] offset and poisson regression

2009-07-27 Thread Renaud Scheifler
Not sure that the list is the best place for this question, but we are
going mad with this... We are trying to fit a Poisson regression to
count data, e.g. the number of fledged young of blue tits (NPe) as a
function of the clutch size (GPc) and other environmental variables. Here
are the original data (dumped); we omit the environmental variables
to simplify:


tab <-
structure(list(NPe = c(3L, 5L, 2L, 6L, NA, 4L, 4L, 4L, 3L, NA,
NA, 4L, 5L, 2L, 0L, 5L, NA, 1L, NA, 2L, 5L, 4L, 0L, 4L, NA, NA,
6L, 4L, 0L, 4L, 4L, 0L, 6L, 5L, 6L, 3L, NA, 6L, 5L, 3L, 6L, 7L,
NA, 7L, 6L, 4L, NA, 1L, NA, NA, 7L, 6L, NA, 5L, NA, NA, NA, 0L,
0L, NA, NA, 5L, NA, 3L, NA, NA, NA, 5L, NA, NA, 6L, NA, NA, NA,
0L, 6L, NA, NA, NA, NA, 5L, 5L, 4L, NA, 4L, 0L, 4L, 5L, 5L, 4L,
0L, 0L, 5L, 6L, 5L, 1L, NA, 0L, 7L, 0L, 0L, 3L, 3L, 7L, NA, 0L,
6L, 4L, 4L, 5L, 0L, 5L, 4L, 7L, 4L, 7L, 5L, 5L, 0L, NA, 5L, 7L,
NA, 8L, 7L, 5L, 0L), GPc = c(5L, 6L, 6L, 7L, NA, 5L, 6L, 5L,
6L, 6L, 4L, 5L, 5L, 6L, 6L, 6L, 4L, 4L, 4L, 3L, 5L, 6L, 3L, 5L,
5L, 7L, 6L, 5L, 5L, 5L, 4L, 5L, 6L, 5L, 6L, 5L, 5L, 7L, 6L, 4L,
7L, 8L, 9L, 7L, 7L, 7L, 4L, 5L, 5L, 4L, 7L, 6L, 5L, 5L, 6L, 2L,
7L, 6L, 8L, NA, NA, 7L, 6L, 6L, NA, 6L, 6L, 5L, 5L, 5L, 7L, 7L,
6L, 6L, 6L, 6L, 7L, 5L, 5L, 7L, 7L, 6L, 6L, 8L, 6L, 7L, 5L, 5L,
8L, 8L, 7L, 7L, 6L, 7L, 6L, 5L, 6L, 7L, 8L, 6L, 7L, 7L, 5L, 7L,
6L, 5L, 9L, 5L, 4L, 7L, 6L, 6L, 5L, 8L, 5L, 7L, 6L, 7L, 7L, 7L,
6L, 7L, 5L, 8L, 7L, 7L, 6L)), .Names = c("NPe", "GPc"), class = "data.frame",
row.names = c(NA, -127L))

It seems logical to insert clutch size as an offset term, since we are
actually interested in the ratio of fledged young to clutch size. However,
the final results are quite surprising:


modsr0 <- glm(NPe ~ offset(GPc), family = poisson, data = tab)

If we compute the predictions, we get numbers that look like a gross
overestimate of reality (e.g. 14.6, 39.7, etc.), which would also imply
that one can have more fledged young than eggs:


 [1]  0.7  2.0  2.0  5.4  0.7  2.0  0.7  2.0  0.7  0.7  2.0  2.0  2.0  0.3  0.1  0.7  2.0
[18]  0.1  0.7  2.0  0.7  0.7  0.7  0.3  0.7  2.0  0.7  2.0  0.7  5.4  2.0  0.3  5.4 14.6
[35]  5.4  5.4  5.4  0.7  5.4  2.0  0.7  2.0 14.6  5.4  2.0  0.7  5.4  2.0  2.0  5.4  2.0
[52]  2.0  2.0  5.4  0.7  0.7 14.6 14.6  5.4  5.4  2.0  5.4  2.0  0.7  5.4 14.6  2.0  5.4
[69]  5.4  0.7  5.4  0.7 39.7  0.7  0.3  5.4  2.0  2.0  0.7 14.6  0.7  5.4  2.0  5.4  5.4
[86]  2.0  5.4 14.6  5.4  5.4  2.0

Otherwise, if clutch size is inserted as a variable (and not as an
offset), predictions are much more realistic, with no extreme values:


modsr0 <- glm(NPe ~ GPc, family = poisson, data = tab)
round(exp(predict(modsr0)), 1)
 [1] 3.2 3.7 3.7 4.4 3.2 3.7 3.2 3.7 3.2 3.2 3.7 3.7 3.7 2.7 2.2 3.2 3.7 2.2 3.2 3.7 3.2 3.2
[23] 3.2 2.7 3.2 3.7 3.2 3.7 3.2 4.4 3.7 2.7 4.4 5.3 4.4 4.4 4.4 3.2 4.4 3.7 3.2 3.7 5.3 4.4
[45] 3.7 3.2 4.4 3.7 3.7 4.4 3.7 3.7 3.7 4.4 3.2 3.2 5.3 5.3 4.4 4.4 3.7 4.4 3.7 3.2 4.4 5.3
[67] 3.7 4.4 4.4 3.2 4.4 3.2 6.2 3.2 2.7 4.4 3.7 3.7 3.2 5.3 3.2 4.4 3.7 4.4 4.4 3.7 4.4 5.3
[89] 4.4 4.4 3.7

Can any sound statistician provide a hint about what to do or how to
interpret this?


Thanks in advance,

Renaud and Patrick






Re: [R] numbers on barplot

2009-07-27 Thread John Kane

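names(dat) <- c("NumberOfPeople", "Average")
Graph <- barplot(dat$Average)                           # returns the bar midpoints
barplot(dat$Average, ylim = c(0, max(dat[, 2]) + 0.2))  # redraw with headroom
text(Graph, dat[, 2], dat[, 1], pos = 3)                # labels just above the bars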

The reason for the ylim is so that the number for the righthand bar does not go 
outside the plot area.
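
For a version that runs without the input file, here is a minimal
self-contained sketch. It assumes the three rows listed for id.dat in the
original question and builds the data frame inline instead of reading the
file; the variable name mids is an illustrative choice.

```r
# Values copied from the id.dat listing in the question
dat <- data.frame(NumberOfPeople = c(15, 6, 7),
                  Average = c(0.08, 0.09, 0.37))

# A single call draws the bars and returns their midpoints on the x axis
mids <- barplot(dat$Average, ylim = c(0, max(dat$Average) + 0.2))

# pos = 3 places each label just above the top of its bar
text(mids, dat$Average, labels = dat$NumberOfPeople, pos = 3)
```

Capturing the return value of the same barplot() call that sets ylim
avoids drawing the plot twice.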

--- On Mon, 7/27/09, Mohsen Jafarikia jafari...@gmail.com wrote:

 From: Mohsen Jafarikia jafari...@gmail.com
 Subject: [R] numbers on barplot
 To: r-h...@stat.math.ethz.ch
 Received: Monday, July 27, 2009, 10:01 AM
 Hello all,
 I have this simple barplot code:
 
 ifn <- "id.dat"
 dat <- read.table(ifn)
 ofn <- "id.png"
 
 bitmap(ofn, type = "png256", width = 30, height = 30,
        pointsize = 30, bg = "white", res = 50)
 par(mar = c(5, 5, 3, 2), lwd = 5)
 par(cex.main = 1.6, cex.lab = 1.6, cex.axis = 1.6)
 
 names(dat) <- c("NumberOfPeople", "Average")
 Graph <- barplot(dat$Average)
 dev.off()
 
 and here is the data (id.dat):
 
 15    0.08
  6    0.09
  7    0.37
 
 I want to write down the NumberOfPeople on top of each
 of the bars. Can
 anybody help me on this?
 
 Thanks,
 Mohsen
 


[R] pairs plot

2009-07-27 Thread Jose Narillos de Santos
Hi all,

I want to use pairs() to plot a matrix with 4 columns. I want three plots
in one graph: columns 2, 3 and 4 on the y axis, each against column 1 on
the x axis.

That is, a plot with three panels, omitting the pair of column 1 with
itself and showing only the pairs of column 1 with the others, not all pairs.

For example, this matrix:

4177 289390 8740 17220
3907 301510 8530 17550
3975 316970 8640 17650
3651 364220 9360 21420
3031 387390 9960 23410
2912 430180 11040 25820
3018 499930 12240 27620
2685 595010 13800 31670
2884 661870 14760 37170

Thanks in advance.



[R] How should i change the SAS Codes into R Codes?

2009-07-27 Thread zhijie zhang
Dear R users,
  I have some SAS code with several loops in it, and I hope to use R to do the
same task. The SAS code is as follows:
/*to generate the dataset*/
DATA Single_Simulation;
 DO se=0 to 1 by 0.01;
  DO sp=0 to 1 by 0.01;
   DO DR=0 to 1 by 0.01;
TR=(DR+sp-1)/(se+sp-1+1.0e-12);
Adjust_Factor=TR/(DR+1.0e-12);
OUTPUT;
   END;
  END;
 END;
RUN;

/*to select some data*/
DATA sampledata;
 SET Single_Simulation;
 IF DR=0.02 & sp=1;
RUN;

# I tried the following code in R, but it failed
num <- seq(0, 1, by = 0.01)
for (se in num) {
 for (sp in num) {
   for (DR in num) {
TR=(DR+sp-1)/(se+sp-1+1.0e-12)
Adjust_Factor=TR/(DR+1.0e-12)
  }
 }
}

My questions are:
1. What is the correct R code for a similar task?
2. Sometimes the simulated datasets are very large, so I need to put the
generated dataset in a file on a disk with plenty of space, not in memory.
How can I do this? SAS can do it with the 'libname' statement; does R have a
similar method?
  Thanks a lot.



Re: [R] How should i change the SAS Codes into R Codes?

2009-07-27 Thread Matt Aldridge

singleSim <- expand.grid(se = 0:100/100, sp = 0:100/100, DR = 0:100/100)
singleSim <- within(singleSim, {
  TR <- (DR + sp - 1) / (se + sp - 1 + 1.0e-12)
  AdjustFactor <- TR / (DR + 1.0e-12)
})
sampleData <- subset(singleSim, DR == .02 & sp == 1)
write.csv(sampleData, "output.csv")
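
For the second question (datasets too large to hold in memory), one
possible approach is to build the grid one slice at a time and append each
slice to a file on disk. The sketch below is an illustration under
assumptions, not part of the reply above: it uses a coarse step of 0.1 so
it runs quickly, and the output path out_file is a hypothetical temporary
file.

```r
grid <- seq(0, 1, by = 0.1)             # coarse step; the thread uses 0.01
out_file <- tempfile(fileext = ".csv")  # hypothetical output location

first <- TRUE
for (se in grid) {
  # Build only the slice for this value of se, so memory use stays small
  chunk <- expand.grid(sp = grid, DR = grid)
  chunk$se <- se
  chunk$TR <- (chunk$DR + chunk$sp - 1) / (chunk$se + chunk$sp - 1 + 1.0e-12)
  chunk$Adjust_Factor <- chunk$TR / (chunk$DR + 1.0e-12)

  # Write the header once, then append the remaining slices
  write.table(chunk, out_file, sep = ",", row.names = FALSE,
              col.names = first, append = !first)
  first <- FALSE
}

# Read the file back only to confirm the row count
n_rows <- nrow(read.csv(out_file))
```

Only one slice is in memory at any time, so the same loop should scale to
the 0.01 step at the cost of a larger file.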

Hope this helps

Matt
mangosolutions - R and S consultants
Tel   +44 (0)1249 767700 




Re: [R] offset and poisson regression

2009-07-27 Thread Renaud Lancelot
You should use offset(log(GPc)) instead of offset(GPc):

> options(width = 65)
> fm <- glm(NPe ~ 1 + offset(log(GPc)), family = poisson, data = tab)
> fitted(fm)
       1        2        3        4        6        7        8
3.181818 3.818182 3.818182 4.454545 3.181818 3.818182 3.181818
       9       12       13       14       15       16       18
3.818182 3.181818 3.181818 3.818182 3.818182 3.818182 2.545455
      20       21       22       23       24       27       28
1.909091 3.181818 3.818182 1.909091 3.181818 3.818182 3.181818
      29       30       31       32       33       34       35
3.181818 3.181818 2.545455 3.181818 3.818182 3.181818 3.818182
      36       38       39       40       41       42       44
3.181818 4.454545 3.818182 2.545455 4.454545 5.090909 4.454545
      45       46       48       51       52       54       58
4.454545 4.454545 3.181818 4.454545 3.818182 3.181818 3.818182
      59       62       64       68       71       75       76
5.090909 4.454545 3.818182 3.181818 4.454545 3.818182 3.818182
      81       82       83       85       86       87       88
4.454545 3.818182 3.818182 3.818182 4.454545 3.181818 3.181818
      89       90       91       92       93       94       95
5.090909 5.090909 4.454545 4.454545 3.818182 4.454545 3.818182
      96       98       99      100      101      102      103
3.181818 4.454545 5.090909 3.818182 4.454545 4.454545 3.181818
     104      106      107      108      109      110      111
4.454545 3.181818 5.727273 3.181818 2.545455 4.454545 3.818182
     112      113      114      115      116      117      118
3.818182 3.181818 5.090909 3.181818 4.454545 3.818182 4.454545
     119      121      122      124      125      126      127
4.454545 3.818182 4.454545 5.090909 4.454545 4.454545 3.818182
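
Why the log matters: the Poisson family uses a log link, so an offset is
added on the log scale. With offset(GPc) the model is
log(mu) = b0 + GPc, which multiplies the expected count by e (about 2.72)
for every extra egg. With offset(log(GPc)) it is log(mu) = b0 + log(GPc),
i.e. mu = GPc * exp(b0), where exp(b0) is the number fledged per egg. The
sketch below checks this on a small made-up data set; the data frame toy
and its column names are hypothetical and are not the blue tit data.

```r
# Hypothetical toy data: fledged young and clutch size for six nests
toy <- data.frame(fledged = c(3, 5, 2, 6, 4, 4),
                  eggs    = c(5, 6, 6, 7, 5, 6))

# Intercept-only Poisson model with clutch size as exposure on the log scale
fit <- glm(fledged ~ 1 + offset(log(eggs)), family = poisson, data = toy)

# exp(intercept) is the estimated number fledged per egg
rate <- exp(coef(fit)[["(Intercept)"]])

# For this model the estimate equals total fledged / total eggs
pooled <- sum(toy$fledged) / sum(toy$eggs)

# Fitted values are simply rate * eggs, so they scale linearly with clutch size
fitted_by_hand <- rate * toy$eggs
```

This is consistent with the output above, where the fitted values for
clutch sizes 5, 6 and 7 (3.18, 3.82, 4.45) are in the ratio 5 : 6 : 7.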

All the best,

Renaud


Re: [R] create dataset permanently in package (i.e. default or our own package)

2009-07-27 Thread Steve Lianoglou

Mr. Einstein,

On Jul 27, 2009, at 7:27 AM, Albert EINstEIN wrote:


Hi,
Actually, when opening the R console and R Commander, we see some
packages like car and datasets. These packages have default datasets
available, for example women and Prestige. Now I have created a sales
dataset by importing from an Excel, XML or text file, and I want to store
that dataset permanently in one of the packages I mentioned above (car or
datasets). Suppose I close my R session and later reopen the R console
and R Commander. I do not want to create the sales dataset again; when I
click on one of the packages, the sales dataset should be found there.
If possible, please give me the code; it would be very helpful for us.


I'm not sure if this is exactly what you're asking, but this is the  
question I'm going to answer:


How can I include/bundle a dataset with a package that I am developing?

Answer:

Simply save your dataset into an *.rda/*.RData file and place it in  
the data directory of the package you're developing.


So, for example, let's say you've done so and the name of your file is
TheoryOfRelativity.rda. Once the package is installed and loaded, you
can load the dataset by calling:


data(TheoryOfRelativity)



I'm not sure that saving your own data file into *some random*
package's data directory is a good idea, or what you're asking, but
you can do that, too. Just specify the package you need to load it from.


For example, if I saved 'TheoryOfRelativity.rda' in the 'nnet'
package's data directory (for some weird reason), I could load it like
so:


data(TheoryOfRelativity, package='nnet')
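
The save step described above can be sketched as follows. This is a
minimal illustration under assumptions: the package source tree mypkg
(created under a temporary directory) and the sales data frame are
hypothetical, and building and installing the package is not shown. The
sketch only writes the .rda file into the data directory and reads it
back with load().

```r
# Hypothetical dataset standing in for the imported sales data
sales <- data.frame(region = c("north", "south"), units = c(120L, 95L))

# Hypothetical package source tree; a real package would live elsewhere
pkg_data_dir <- file.path(tempdir(), "mypkg", "data")
dir.create(pkg_data_dir, recursive = TRUE, showWarnings = FALSE)

# Store the object under its own name in the package's data directory
rda_path <- file.path(pkg_data_dir, "sales.rda")
save(sales, file = rda_path)

# Simulate a fresh session: remove the object, then restore it from disk
rm(sales)
load(rda_path)
```

After the package is built and installed, data(sales, package = "mypkg")
would be the usual way to load it, with mypkg replaced by the real
package name.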

HTH,
-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



[R] Creating new data frame using loop

2009-07-27 Thread Andrew Aldersley

Hi all,

I sent a request round last week asking for help with using a for loop to
read and separate a large dataset. The response I got worked great, but now I
have another problem with using my loop.

Basically I have a number of different files containing columned data. There
are 132 datasets, named such that I have something in the form...

precip_colxxx.txt

...where xxx is a number ranging from 1 to 132. What I want to do is read in
every 13th table and extract the third column, and then place this in a new
dataset. The new dataset will thus consist of 11 columns of data. I have
written the following bit of script to read in every 13th table separately;
however, I'm not sure how to do the next step of creating a new data frame and
dumping the third column of my tables into this data frame. Is there a
chance I will have to do a nested loop?

for (i in seq(1,120,13)) {
  nm <- sprintf('precip_col%03d.txt', i)
  precip <- read.table(nm, header=T)
}

Thanks very much in advance.

Andy



Re: [R] numbers on barplot

2009-07-27 Thread Greg Snow
Unless you are intentionally trying to distort your data and make the graph 
harder to read (you don't want to do that), it is better to put the numbers in 
the margin rather than at the top of the bars.  Try the following line after 
the barplot:

 mtext( dat$NumberOfPeople, side=1, line=1.5, at=Graph, cex=1.6 )



-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111




Re: [R] pairs plot

2009-07-27 Thread Greg Snow
Look at the pairs2 function in the TeachingDemos package.
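
As a base R alternative (a sketch, not the pairs2 approach suggested
above), the three panels can also be drawn with a loop over columns 2 to
4. The matrix values are copied from the question; the object name m, the
stacked layout and the axis labels are illustrative choices.

```r
# Matrix from the question, entered row by row
m <- matrix(c(
  4177, 289390,  8740, 17220,
  3907, 301510,  8530, 17550,
  3975, 316970,  8640, 17650,
  3651, 364220,  9360, 21420,
  3031, 387390,  9960, 23410,
  2912, 430180, 11040, 25820,
  3018, 499930, 12240, 27620,
  2685, 595010, 13800, 31670,
  2884, 661870, 14760, 37170
), ncol = 4, byrow = TRUE)

# Three stacked panels: columns 2, 3 and 4 on the y axis against column 1
op <- par(mfrow = c(3, 1), mar = c(4, 4, 1, 1))
for (j in 2:4) {
  plot(m[, 1], m[, j], xlab = "column 1", ylab = paste("column", j))
}
par(op)  # restore the previous graphics settings
```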

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111




Re: [R] How should i change the SAS Codes into R Codes?

2009-07-27 Thread John Kane

I see Matt Aldridge has given you the answers to your specific questions.

If you are used to SAS, you might find Bob Muenchen's book useful:
Muenchen, R. A. (2008). R for SAS and SPSS Users (1st ed.). Springer.
A shorter version is available as a PDF at http://rforsasandspssusers.com/.
The PDF is a very useful guide for anyone beginning to use R.



Re: [R] normal mixture model

2009-07-27 Thread cindy Guo
Hi, Christian,

Yes, it works. Thank you very much. It's really helpful.

Cindy

On Mon, Jul 27, 2009 at 5:39 AM, Christian Hennig chr...@stats.ucl.ac.uk wrote:

 Hi Cindy,

 you need the summary function

 mclustsummary <- summary(mclustBICoutputobject, data)

 to get all the information. Some (like best model) is given if you just
 print out the summary object. Some other information (like estimated
 parameter values) are accessible as components of the summary object, like
 mclustsummary$parameters$...
 Try

 str(mclustsummary)

 to see what's there (unfortunately this is not fully documented).

 For more detail see the help pages.

 Hope this helps,

 Christian

 On Sun, 26 Jul 2009, cindy Guo wrote:

   Hi, Christian,

 Thank you for the reply. I just tried. Does the function mclustBIC only
 give the best model, or does it also do EM to get the cluster means and
 variances according to the best model it picks? I didn't find it. Is
 there a way to automatically select the best number of components and do
 EM? I need to do the normal mixture model in a loop (one EM at each
 iteration), so I want it to do everything automatically.
 Thanks,

 Cindy

 On Sun, Jul 26, 2009 at 3:46 PM, Christian Hennig chr...@stats.ucl.ac.uk
 wrote:

   You can use mclustBIC in package mclust (uses the BIC for deciding
 about
 the number of components and hierarchical clustering for initialisation).

 Christian


 On Sun, 26 Jul 2009, cindy Guo wrote:

  Hi, All,


 I want to fit a normal mixture model. Which package in R is best for
 this? I was using the package 'mixdist', but I need to group the data
 into groups before fitting the model, and different groupings seem to
 lead to different results. What other package can I use which is stable?
 And are there packages that can automatically determine the number of
 components?

 Thank you,

 Cindy



 *** --- ***
 Christian Hennig
 University College London, Department of Statistical Science
 Gower St., London WC1E 6BT, phone +44 207 679 1698
 chr...@stats.ucl.ac.uk, www.homepages.ucl.ac.uk/~ucakche





Re: [R] Creating new data frame using loop

2009-07-27 Thread Steve Lianoglou

Hi Andy,


You can do this by building up your columns into a list, then using  
a combo of do.call and cbind.


For example:

mydata <- list()
for (i in seq(1,120,13)) {
  nm <- sprintf('precip_col%03d.txt', i)
  precip <- read.table(nm, header=T)
  mydata[[nm]] <- precip[,3]  # index by file name so the list has no empty slots
}

mydata <- do.call(cbind, mydata)

The first param in do.call is the function you want to call, the  
second param is a *list* of parameters you'd like to pass into the  
function.
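
A self-contained sketch of the same idea follows. It is an illustration
under assumptions: the input files are small made-up tables written to a
temporary directory, and the column labels col001, col014, ... are
hypothetical. Note that seq(1, 120, 13) yields 10 file numbers, while
seq(1, 132, 13) yields the 11 mentioned in the question, so the sketch
uses the latter.

```r
# Write hypothetical stand-in files so the example runs anywhere
dir <- tempdir()
ids <- seq(1, 132, 13)  # 1, 14, ..., 131: eleven files
for (i in ids) {
  toy <- data.frame(a = 1:4, b = 5:8, precip = i + (1:4) / 10)
  write.table(toy, file.path(dir, sprintf("precip_col%03d.txt", i)),
              row.names = FALSE)
}

# Read each file and keep only its third column
cols <- lapply(ids, function(i) {
  path <- file.path(dir, sprintf("precip_col%03d.txt", i))
  read.table(path, header = TRUE)[, 3]
})
names(cols) <- sprintf("col%03d", ids)

# Bind the columns side by side and convert to a data frame
combined <- as.data.frame(do.call(cbind, cols))
dim(combined)
```

No nested loop is needed: one pass collects the columns and do.call()
binds them in a single step.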


-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



[R] Should nlme's augPred work with a factor covariate?

2009-07-27 Thread Gavin Kelly
I'm having difficulty getting the augPred function from the nlme
package to work when the primary covariate is a factor.  I don't know
if it is intended to work in these situations, but I can't immediately
see anything in the documentation that forbids this - ideally I'd like
to be able to plot the results just like I can for numeric primary
covariates.
Many thanks - Gavin Kelly (Cancer Research UK, Bioinformatics & Biostatistics)

> library(nlme)
> fm <- lme(ergoStool)
> augPred(fm)
Error in Summary.factor(c(1L, 2L, 3L, 4L, 1L, 2L, 3L, 4L, 1L, 2L, 3L,  :
  min not meaningful for factors

> sessionInfo()
R version 2.9.1 (2009-06-26)
i386-pc-mingw32

locale:
LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United
Kingdom.1252;LC_MONETARY=English_United
Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

other attached packages:
[1] nlme_3.1-92

loaded via a namespace (and not attached):
[1] grid_2.9.1  lattice_0.17-25



Re: [R] Determine the dimension-names of an element in an array in R

2009-07-27 Thread Sauvik De
Hi there,

Thanks again for your reply. I know a for loop is always a solution to my
problem, and I had already coded it using for loops. But the number of levels
for each dimension is large in the actual problem, and hence it was
time-consuming.
So I was just wondering if there are any alternative ways of solving my
problem. That's why I tried the apply functions (sapply), assuming that this
might work out faster, even fractionally, compared to a for loop.

Cheers,
Sauvik

On Mon, Jul 27, 2009 at 12:28 AM, Poersching poerschin...@web.de wrote:

  Sauvik De schrieb:

 Hi:
 Lots of thanks for your valuable time!

 But I am not sure how you would like to use the function in this situation.

 As I had mentioned that the first element of my output array should be
 like:


 cor(DataArray_1[dimnames(Correl)[[1]][1], dimnames(Correl)[[2]][1], dimnames(Correl)[[4]][1], ],
     DataArray_2[dimnames(Correl)[[1]][1], dimnames(Correl)[[3]][1], dimnames(Correl)[[4]][1], ],
     use = "pairwise.complete.obs")

 in my below code.

 and

 the output array of correlation I wish to get using sapply as follows:

 Correl = sapply(Correl,function(d) cor(DataArray_1[...],DataArray_2[...],
 use="pairwise.complete.obs"))

 So it would be of great help if you could kindly specify how to utilise
 your function findIndex in ...

 Apologies for all this!

 Thanks & Regards,
 Sauvik

  Hey,
 sorry, I didn't understand your problem last time, but I hope this solution
 solves it. :-)
 It's only a for loop, but an apply function may work too. I will think
 about this, but for now...  ;-)

 la <- length(a)
 lb <- length(b)
 lc <- length(c)
 ld <- length(d)
 for (ia in 1:la) {
   for (ib in 1:lb) {
     for (ic in 1:lc) {
       for (id in 1:ld) {
         Correl[ia, ib, ic, id] <- cor(
           DataArray_1[dimnames(Correl)[[1]][ia],
                       dimnames(Correl)[[2]][ib],
                       dimnames(Correl)[[4]][id], ],
           DataArray_2[dimnames(Correl)[[1]][ia],
                       dimnames(Correl)[[3]][ic],
                       dimnames(Correl)[[4]][id], ],
           use = "pairwise.complete.obs")
       }
     }
   }
 }
 ## with the function findIndex you can find the dimensions of,
 ## e.g., cor values greater than 0.5 or smaller than -0.5, like:
 findIndex(Correl,Correl[Correl > 0.5])
 findIndex(Correl,Correl[Correl < (-0.5)])

 I have changed the line in the function findIndex that contains:
 el[j] <- which(is.element(data,element[j]))

 Regards,
 Christian
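
For readers following the thread, one possible way to avoid writing the nested loops by hand is to enumerate every index combination with expand.grid() and let mapply() do the iteration. This is only a sketch under assumptions: the label vectors (lab_a ... lab_e) and the simulated data are made up to match the array shapes described in the thread, and it is not necessarily faster than the loop, since cor() is still called once per cell.

```r
set.seed(1)
# Hypothetical dimension labels, mirroring a, b, c, d, e from the thread
lab_a <- paste0("A", 1:5)
lab_b <- paste0("B", 1:3)
lab_c <- paste0("C", 1:4)
lab_d <- paste0("D", 1:2)
lab_e <- paste0("E", 1:8)

DataArray_1 <- array(rnorm(240), dim = c(5, 3, 2, 8),
                     dimnames = list(lab_a, lab_b, lab_d, lab_e))
DataArray_2 <- array(rnorm(320), dim = c(5, 4, 2, 8),
                     dimnames = list(lab_a, lab_c, lab_d, lab_e))

# One row per cell of the output array. expand.grid() varies the first
# column fastest, which matches R's column-major array layout.
idx <- expand.grid(a = lab_a, b = lab_b, c = lab_c, d = lab_d,
                   stringsAsFactors = FALSE)

# Correlate over the last dimension (e) for every (a, b, c, d) combination
vals <- mapply(function(ia, ib, ic, id) {
  cor(DataArray_1[ia, ib, id, ], DataArray_2[ia, ic, id, ],
      use = "pairwise.complete.obs")
}, idx$a, idx$b, idx$c, idx$d)

Correl <- array(unname(vals), dim = c(5, 3, 4, 2),
                dimnames = list(lab_a, lab_b, lab_c, lab_d))
```

Because the dimension names are carried in idx, the function never has to ask which cell it is working on, which was the sticking point with sapply(Correl, ...).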


 On Sun, Jul 26, 2009 at 3:54 PM, Poerschingpoerschin...@web.de wrote:
  Sauvik De schrieb:
 
  Hi Gabor:
  Many thanks for your prompt reply!
  The code is fine. But I need it in more general form as I had mentioned
 that
  I need to input any 0 to find its dimension-names.
 
  Actually, I was using sapply to calculate correlation and this idea was
  required in the middle of correlation calculation.
  I am providing the way I tried my calculation.
 
  a = c("A1","A2","A3","A4","A5")
  b = c("B1","B2","B3")
  c = c("C1","C2","C3","C4")
  d = c("D1","D2")
  e = c("E1","E2","E3","E4","E5","E6","E7","E8")
 
  DataArray_1 = array(c(rnorm(240)),dim=c(length(a),length(b),
  length(d),length(e)),dimnames=list(a,b,d,e))
  DataArray_2 = array(c(rnorm(320)), dim=c(length(a),length(c),
  length(d),length(e)),dimnames=list(a,c,d,e))
 
  #Defining an empty array which will contain the correlation values (output array)
  Correl = array(NA, dim=c(length(a),length(b),
  length(c),length(d)),dimnames=list(a,b,c,d))
 
  #Calculating Correlation between attributes b & c over values of e
  Correl = sapply(Correl,function(d) cor(DataArray_1[...],DataArray_2[...],
  use="pairwise.complete.obs"))
 
  This is where I get stuck.
  In the above, d is acting as an element in the Correl array. Hence I
 need
  to get the dimension-names for d.
 
  #The first element of Correl will be:
 
 cor(DataArray_1[dimnames(Correl)[[1]][1],dimnames(Correl)[[2]][1],dimnames(Correl)[[4]][1],],DataArray_2[dimnames(Correl)[[1]][1],dimnames(Correl)[[3]][1],dimnames(Correl)[[4]][1],],use="pairwise.complete.obs")
 
  So my problem boils down to extracting the dim-names in terms of
 element(d)
  and not in terms of Correl (that I have mentioned as ... in the above
  code)
 
  My sincere thanks for your valuable time & suggestions.
 
  Many Thanks & Kind Regards,
  Sauvik
 
 
  On Sun, Jul 26, 2009 at 5:26 AM, Gabor Grothendieck 
 ggrothendi...@gmail.com
 
 
  wrote:
 
 
 
 
  Try this:
 
 
 
  ix <- c(1, 3, 4, 2)
  mapply("[", dimnames(mydatastructure), ix)
 
  [1] "S1" "T3" "U4" "V2"
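
The mapply() idea above needs the subscripts of the element. If only the linear position of an element is known (as happens inside sapply() over an array), arrayInd() can convert it to subscripts first. A minimal sketch, assuming a 5 x 3 x 4 x 2 array whose labels (S, T, U, V) are guessed from the output shown above:

```r
# Hypothetical array; the values are just the linear positions 1..120
arr <- array(seq_len(120), dim = c(5, 3, 4, 2),
             dimnames = list(paste0("S", 1:5), paste0("T", 1:3),
                             paste0("U", 1:4), paste0("V", 1:2)))

k <- 100                                  # linear position of the element
ix <- as.vector(arrayInd(k, dim(arr)))    # one subscript per dimension

# Pick the ix[i]-th name from the i-th dimnames component
element_names <- mapply(function(nms, i) nms[i], dimnames(arr), ix)
element_names
```

which(arr == value, arr.ind = TRUE) returns the same kind of subscript matrix when the search is by value rather than by position.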
 
 
  On Sat, Jul 25, 2009 at 5:12 PM, Sauvik Desauvik.s...@gmail.com wrote:
 
 
  Hi:
  How can I extract the dimension-names of a pre-defined element in a
  multidimensional array in R ?
 
  A toy example is provided below:
  I have a 4-dimensional array with each dimension having certain length.
 
 
  In
 
 
  the below example, mydatastructure explains the structure of my data.
 
  mydatastructure = array(0,
 
 
  dim=c(length(b),length(z),length(x),length(d)),
 
 
  dimnames=list(b,z,x,d))
 
  where,
  b=c("S1","S2","S3","S4","S5")
  

[R] plotting a PNG from an in-memory object

2009-07-27 Thread Rajarshi Guha
Hi, I have code which, via rJava, can bring up a JFrame to display an image.
What I'd like to be able to do is capture that image and make an R plot out
of it (analogous to plotting a PNG file, but not from an actual file).

I can write Java code that could be called from R to take a snapshot of the
window, but is it at all possible to somehow return that data to the R side to
display via plot()?

Any pointers would be appreciated

-- 
Rajarshi Guha



[R] probability on a barplot

2009-07-27 Thread Erin Hodgess
Dear R People:

I have a barplot created from a table.

What is the best way to set up the barplot such that it shows
probabilities rather than totals, please?

I've tried plot also but it shows horizontal bars rather than vertical bars.

Thanks,
Erin


-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com
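
One common way to do this, sketched here with made-up data since the original table was not posted, is to convert the counts to proportions with prop.table() before calling barplot():

```r
x <- c("a", "b", "b", "c", "c", "c")   # made-up categorical data
counts <- table(x)
probs <- prop.table(counts)            # each count divided by the total
barplot(probs, ylab = "Probability")   # vertical bars, heights sum to 1
```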



[R] probability on a barplot

2009-07-27 Thread Erin Hodgess
Please ignore the previous email

I figured it out.


-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com



[R] dataframe to list conversion

2009-07-27 Thread voidobscura

Hi all, I have been experimenting with writing my own matrix column sum
function.  I want it to return a list.

csum <- function(m)
{
a = data.frame(m)
s = lapply(a,sum)
return(s)
}

I wish to use the same code up until the return(s) that I have listed above. 
The problem is that s, I believe, is a data frame (looks like this:)

$X1
[1] 148

$X2
[1] 156

$X3
[1] 164

$X4
[1] 172

$X5
[1] 180

I would like a vector with these values (148,156,etc).  How may I do this? 
tia!


[R] How to deal with this random variable?

2009-07-27 Thread Manuel Ramon

Hello to everybody,
I have a data frame with 100 measures of quality for 3 variables: A, B and
C. These quality variables are measured at different times during the
production process. My data come from 5 experiments (5 replicates with 20
measures per replicate). I also have a final measure (Z), but just one
measure for each unit, that is, for the 20 units that are measured in each
replicate.

My objective is to study the relationship between the 3 quality parameters
and the final measure, that is:
 
  lm(Z ~ A+B+C, data=mydata)

I have found significant differences between replicates for each quality
parameter (A, B and C), and I would like to include the replicate effect as a
random effect:

  lme(Z ~ A+B+C, data=mydata, random=~1|replica)

And here is my problem. I know that there are significant differences
between replicates, but since the final measure, Z, is the same for each
replicate I do not know how to deal with it.

Can you help me? How could I take into account the variability due to the
replicate when I want to study the effects of variables A, B and C on the
final result of a production process?

Thank you in advance.

-
Manuel Ramón Fernández
Group of Reproductive Biology (GBR)
University of Castilla-La Mancha (Spain)
mra...@jccm.es


[R] Superscripts and rounding

2009-07-27 Thread ehux

I am new to the world of R/programming so this may be a really easy question.
I thank you for your patience and help in advance 

I would like the characters km^2 to be displayed on the plot subtitle as km
squared - two as a superscript. 

I would also like to have the numbers from the data set for longitude and
latitude to be rounded to four decimal places.

Thank you.

plot (
  decade[['date']],
  decade[['value']],

  type = 'l',
  col = 'lightsteelblue4',
  ylab = 'Discharge [cms]',
  main = sprintf('%s [%s]', stn[['metadata']][['name']],
stn[['metadata']][['id']]),
  km^2 - expression
  sub = sprintf('Seasonal station with natural streamflow - Lat: %s Lon: %s
Gross Area %s km^2 - Effective Area %s km^2',
stn[['metadata']][['latitude']],
stn[['metadata']][['longitude']],stn[['metadata']][['grossarea']],
stn[['metadata']][['effectivearea']]),
  cex.sub = 1, font.sub = 3, col.sub = 'black'
  )
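
A possible approach, sketched with made-up values in place of the stn metadata: sprintf() returns plain text, so the superscript has to come from a plotmath expression instead. bquote() builds such an expression and substitutes the values wrapped in .(), and round() handles the four decimal places. The variable names below are illustrative only.

```r
# Made-up stand-ins for stn[['metadata']][['latitude']] and friends
lat <- 49.123456
lon <- -123.987654
gross <- 1520
eff <- 1310

# km^2 inside bquote() is rendered as "km" with a superscript 2
sub_label <- bquote("Lat:" ~ .(round(lat, 4)) ~ " Lon:" ~ .(round(lon, 4)) ~
                    " Gross Area" ~ .(gross) ~ km^2 ~
                    "- Effective Area" ~ .(eff) ~ km^2)

plot(1:10, type = "l", sub = sub_label, cex.sub = 1, col.sub = "black")
```

If a fixed number of printed decimals is wanted even for values such as 49.1, sprintf("%.4f", lat) can be used inside .() in place of round().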


Re: [R] dataframe to list conversion

2009-07-27 Thread Dimitris Rizopoulos
have a look at ?unlist(); you can also use sapply() in this case instead 
of lapply().
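
As a short illustration of both suggestions (the 4 x 5 matrix is made up; csum_sapply is a hypothetical name for the sapply() variant):

```r
m <- matrix(1:20, nrow = 4)     # made-up 4 x 5 matrix

csum <- function(m) {
  a <- data.frame(m)
  s <- lapply(a, sum)           # a list with one sum per column
  unlist(s)                     # flatten the list into a named vector
}

# sapply() simplifies the result to a vector on its own
csum_sapply <- function(m) sapply(data.frame(m), sum)

csum(m)
csum_sapply(m)
```

Both return a named vector (X1, X2, ...); wrapping the result in unname() drops the names if they are not wanted.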



Best,
Dimitris


voidobscura wrote:

Hi all, I have been experimenting with writing my own matrix column sum
function.  I want it to return a list.

csum <- function(m)
{
a = data.frame(m)
s = lapply(a,sum)
return(s)
}


I wish to use the same code up until the return(s) that I have listed above. 
The problem is that s, I believe, is a data frame (looks like this:)


$X1
[1] 148

$X2
[1] 156

$X3
[1] 164

$X4
[1] 172

$X5
[1] 180

I would like a vector with these values (148,156,etc).  How may I do this? 
tia!


--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014



Re: [R] dataframe to list conversion

2009-07-27 Thread Jorge Ivan Velez
Hi voidobscura,
Try either

csum2 <- function(m){
   a = data.frame(m)
   s = lapply(a,sum)
   do.call(c, s)
}

or

colSums(m)

See ?do.call and ?colSums for more details.

HTH,

Jorge


On Mon, Jul 27, 2009 at 11:03 AM, voidobscura nshah...@gmail.com wrote:


 Hi all, I have been experimenting with writing my own matrix column sum
 function.  I want it to return a list.

 csum <- function(m)
 {
a = data.frame(m)
s = lapply(a,sum)
return(s)
 }

 I wish to use the same code up until the return(s) that I have listed
 above.
 The problem is that s, I believe, is a data frame (looks like this:)

 $X1
 [1] 148

 $X2
 [1] 156

 $X3
 [1] 164

 $X4
 [1] 172

 $X5
 [1] 180

 I would like a vector with these values (148,156,etc).  How may I do this?
 tia!




Re: [R] How to deal with this random variable?

2009-07-27 Thread Bert Gunter
This sounds way too complicated for this forum, which is designed to provide
help to users on the  use of the R language, not remote statistical
consulting. While you may receive replies, I would argue that you would do
better to find a local statistical expert with whom to work -- not least
because they should probably have a deep understanding of how your
experiment was conducted, data gathered, measurements made, etc. to be able
to give you worthwhile advice.

Long distance consulting based on incomplete understanding is very risky.
Caveat emptor!

Bert Gunter
Genentech Nonclinical Biostatistics


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Manuel Ramon
Sent: Monday, July 27, 2009 9:54 AM
To: r-help@r-project.org
Subject: [R] How to deal with this random variable?


Hello to everybody,
I have a data frame with 100 measures of quality for 3 variables: A, B and
C. These quality variables are measured at different times during the
production process. My data come from 5 experiments (5 replicates with 20
measures per replicate). I also have a final measure (Z), but just one
measure for each unit, that is, for the 20 units that are measured in each
replicate.

My objective is to study the relationship between the 3 quality parameters
and the final measure, that is:
 
  lm(Z ~ A+B+C, data=mydata)

I have found significant differences between replicates for each quality
parameter (A, B and C), and I would like to include the replicate effect as a
random effect:

  lme(Z ~ A+B+C, data=mydata, random=~1|replica)

And here is my problem. I know that there are significant differences
between replicates, but since the final measure, Z, is the same for each
replicate I do not know how to deal with it.

Can you help me? How could I take into account the variability due to the
replicate when I want to study the effects of variables A, B and C on the
final result of a production process?

Thank you in advance.

-
Manuel Ramón Fernández
Group of Reproductive Biology (GBR)
University of Castilla-La Mancha (Spain)
mra...@jccm.es


[R] skip plot/blank plot on purpose (multi-plot question)

2009-07-27 Thread Mark Knecht
Hi,
   Say that I've got a function that has the following code in it:

X11(width=10, height=10)
layout(rbind(c(1,1,1,2,2,2), c(3,4,5,6,7,8), c(9,10,11,12,13,14)),
height=c(3,1,1))
layout.show(14)

Sometimes when I call this function it will turn out by design that
one or more of the data sets that I use to create the plots in
positions 3-14 are empty. As there is a day of the week relationship
between 3-8 and 9-14, and say that 12 is an empty set, how can I skip
12, leave it blank or make it blank, and then make the next data set
plot in position 13?

1) Is there some generic way to call plot and have it plot, but it
plots nothing so I don't see anything at all in position 12? This
could be a blank plot function I call when I notice the data set is
empty.

2) Is there some generic way to specify the position number I want the
next plot to use so that I'd not plot 12 but would specify 13?

Thanks,
Mark



Re: [R] how to correlate nominal variables?

2009-07-27 Thread Daniel Malter

Benoit Vaillant made me aware of an indexing mistake in the computation of
Cramer's V. The col.sum indexes rows instead of columns. This is a
correction of the code:

cramers.v=function(x){
  tab=table(as.data.frame(x))   # contingency table of the two columns
  n=sum(tab)                    # number of observations
  row.sum=rowSums(tab)
  col.sum=colSums(tab)
  chisq=0
  for(k in 1:nrow(tab)){
    for(l in 1:ncol(tab)){
      expected=row.sum[k]*col.sum[l]/n
      chisq=chisq+(tab[k,l]-expected)^2/expected
    }
  }
  # return the statistic explicitly; a for loop itself evaluates to NULL
  unname(sqrt(chisq/(n*(min(dim(tab))-1))))
}
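
As a cross-check on the hand-rolled loop, the chi-square statistic can also be taken from chisq.test(). The sketch below is an alternative formulation, not the code posted in this thread; the name cramers_v and its two-vector interface are assumptions made for illustration. The continuity correction is switched off because chisq.test() applies it to 2 x 2 tables by default.

```r
# Hypothetical helper: Cramer's V for two categorical vectors
cramers_v <- function(x, y) {
  tab <- table(x, y)
  # Pearson chi-square without Yates' continuity correction
  chi2 <- suppressWarnings(chisq.test(tab, correct = FALSE)$statistic)
  unname(sqrt(chi2 / (sum(tab) * (min(dim(tab)) - 1))))
}

# Perfectly associated test data, as in toanalyze2 from the earlier message
x <- rep(c(0, 1), each = 50)
y <- x
cramers_v(x, y)
```

For perfectly associated variables the statistic equals the sample size, so V comes out as 1, which makes a convenient sanity check.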


Daniel Malter wrote:
 
 You can copy the code below to your R-code editor. For Yule's Q, the data
 is expected in two vectors. For cramer's phi, the data is expected in
 separate columns of a matrix or dataframe.
 
 ##Run this code
 yule.Q=function(x,y){(table(x,y)[1,1]*table(x,y)[2,2]-table(x,y)[1,2]*table(x,y)[2,1])/(table(x,y)[1,1]*table(x,y)[2,2]+table(x,y)[1,2]*table(x,y)[2,1])}
 
 ##create test data
 vector.one=rbinom(100,1,0.4)
 vector.two=rbinom(100,1,0.8)
 table(vector.one,vector.two)
 
 ##compute yule's Q
 yule.Q(vector.one,vector.two)  
 ##just put your two vector names there
 
 
 
 
 ##Cramer's V
 
 ##Run this code
 cramers.v=function(x){
 x=as.data.frame(x)
 chisq=0
 row.sum=NULL
 col.sum=NULL
 for(i in 1:dim(table(x))[1])
   row.sum[i]=sum(table(x)[i,])
 for(j in 1:dim(table(x))[2])
   col.sum[j]=sum(table(x)[j,])
 for(k in 1:dim(table(x))[1]){
   for(l in 1:dim(table(x))[2]){
  
 chisq=chisq+((table(x)[k,l]-(row.sum[k]*col.sum[l])/(dim(x)[1]))^2)/((row.sum[k]*col.sum[l])/(dim(x)[1]))
   cramers.v=sqrt(chisq/(dim(x)[1]*(min(dim(table(x)))-1)))
   }
 }
   }
 
 ##create test data
 toanalyze=cbind(rbinom(100,2,0.4),rbinom(100,1,0.6))
 toanalyze2=cbind(rep(c(0,1),each=50),rep(c(0,1),each=50))
 
 ##compute cramer's v for the test data 
 v1=cramers.v(toanalyze) ## just put your dataframe or matrix name
 v2=cramers.v(toanalyze2)
 
 v1 ##cramer's v
 v2 ##cramer's v
 
 
 
 Timo Stolz wrote:
 
 Dear R-Users,
 
 I need functions to calculate Yule's Y or Cramérs Index, in order to
 correlate variables that are nominally scaled?
 
 Am I wrong? Are such functions existing?
 
 Sincerely,
 Timo
 
 
 
 
 



[R] Forumla format?

2009-07-27 Thread Noah Silverman
Hi,

Quick question.

I'm working on training an SVM.

I have a dataframe with about 50 columns.  I want to train on 46 of them.

Is there a way to say All except columns 22,23,25 and 31?

It would be nice to not have to do +c1 +c2 +c3 +c4, etc for all 48 columns.

Thanks!

-N



Re: [R] Searching for specific values in a matrix

2009-07-27 Thread Mehdi Khan
i am able to return the first column, but anything else returns this:
0 rows (or 0-length row.names)

any idea?

On Tue, Jul 21, 2009 at 12:49 PM, Steve Lianoglou 
mailinglist.honey...@gmail.com wrote:


 On Jul 21, 2009, at 3:27 PM, Mehdi Khan wrote:

  I understand your explanation about the test for even numbers.  However I
 am still a bit confused as to how to go about finding a particular value.
  Here is an example data set

 col #  attr1  attr2  attr3        LON       LAT
 17209      D     NA     NA  -122.9409  38.27645
 17210     BC     NA     NA  -122.9581  38.36304
 17211      B     NA     NA  -123.6851  41.67121
 17212     BC     NA     NA  -123.0724  38.93073
 17213      C     NA     NA  -123.7240  41.84403
 17214     NA    464     NA  -122.9430  38.30988
 17215      C     NA     NA  -123.4442  40.65369
 17216     BC     NA     NA  -122.9389  38.31551
 17217      C     NA     NA  -123.0747  38.97998
 17218      C     NA     NA  -123.6580  41.59610
 17219      C     NA     NA  -123.4513  40.70992
 17220      C     NA     NA  -123.0901  39.06473
 17221     BC     NA     NA  -123.0653  38.94845
 17222     BC     NA     NA  -122.9464  38.36808
 17223     NA    464     NA  -123.0143  38.70205
 17224     NA     NA      5  -122.8609  37.94137
 17225     NA     NA      5  -122.8628  37.95057
 17226     NA     NA      7  -122.8646  37.95978


 For future reference, perhaps paste this in a way that's easy for us to
 paste into a running R session so we can use it, like so:

 df <- data.frame(
 coln=c(17209, 17210, 17211, 17212, 17213, 17214, 17215, 17216, 17217,
 17218, 17219, 17220, 17221, 17222, 17223, 17224, 17225, 17226),
 attr1=c("D","BC","B","BC","C",NA,"C","BC","C","C","C","C","BC","BC",NA,NA,NA,NA),
 attr2=c(NA,NA,NA,NA,NA,464,NA,NA,NA,NA,NA,NA,NA,NA,464,NA,NA,NA),
 attr3=c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,5,5,7),
 LON=c(-122.9409,-122.9581,-123.6851,-123.0724,-123.7240,-122.9430,
 -123.4442,-122.9389,-123.0747,-123.6580,-123.4513,-123.0901,
 -123.0653,-122.9464,-123.0143,-122.8609,-122.8628,-122.8646),
 LAT=c(38.27645,38.36304,41.67121,38.93073,41.84403,38.30988,
 40.65369,38.31551,38.97998,41.59610,40.70992,39.06473,
 38.94845,38.36808,38.70205,37.94137,37.95057,37.95978))

  If I wanted to find the row with Lat = 37.95978


 Using an indexing vector:

 R lats <- df$LAT == 37.95978
 # or with the %~% from before:
 # lats <- df$LAT %~% 37.95978
 R df[lats,]
coln attr1 attr2 attr3   LON  LAT
 18 17226  NANA 7 -122.8646 37.95978

 Using the subset function:

 R subset(df, LAT == 37.95978)
coln attr1 attr2 attr3   LON  LAT
 18 17226  NANA 7 -122.8646 37.95978

  , how would i do that?  How would  I find the rows with BC?


 R subset(df, attr1 == 'BC')
coln attr1 attr2 attr3   LON  LAT
 2  17210BCNANA -122.9581 38.36304
 4  17212BCNANA -123.0724 38.93073
 8  17216BCNANA -122.9389 38.31551
 13 17221BCNANA -123.0653 38.94845
 14 17222BCNANA -122.9464 38.36808


 If you try with an indexing vector the NA's will trip you up:

 R df[df$attr1 == 'BC',]
  coln attr1 attr2 attr3   LON  LAT
 217210BCNANA -122.9581 38.36304
 417212BCNANA -123.0724 38.93073
 NA  NA  NANANANA   NA
 817216BCNANA -122.9389 38.31551
 13   17221BCNANA -123.0653 38.94845
 14   17222BCNANA -122.9464 38.36808
 NA.1NA  NANANANA   NA
 NA.2NA  NANANANA   NA
 NA.3NA  NANANANA   NA
 NA.4NA  NANANANA   NA

 So you could do something like:

  df[df$attr1 == 'BC' & !is.na(df$attr1),]
coln attr1 attr2 attr3   LON  LAT
 2  17210BCNANA -122.9581 38.36304
 4  17212BCNANA -123.0724 38.93073
 8  17216BCNANA -122.9389 38.31551
 13 17221BCNANA -123.0653 38.94845
 14 17222BCNANA -122.9464 38.36808


 HTH,
 -steve

 --
 Steve Lianoglou
 Graduate Student: Physiology, Biophysics and Systems Biology
 Weill Medical College of Cornell University

 Contact Info: 
 http://cbio.mskcc.org/~lianos/contact







Re: [R] Forumla format?

2009-07-27 Thread Steve Lianoglou

Hi,

On Jul 27, 2009, at 3:01 PM, Noah Silverman wrote:


Hi,

Quick question.

I'm working on training an SVM.

I have a dataframe with about 50 columns.  I want to train on 46 of  
them.


Is there a way to say All except columns 22,23,25 and 31?


Assume your dataframe is called my.data:

my.data[,-c(22,23,25,31)]

Returns the data.frame w/o columns 22,23,25 and 31.
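
To connect this back to the formula question: once the unwanted columns are dropped, the "." shorthand in a formula stands for every remaining column except the response. The sketch below uses lm() and made-up data only to stay self-contained; the response name y and the dropped column positions are assumptions, and the same formula syntax is accepted by formula interfaces in general.

```r
set.seed(42)
# Made-up data: 10 predictor columns (V1..V10) plus a response y
my.data <- as.data.frame(matrix(rnorm(200), nrow = 20))
my.data$y <- rnorm(20)

drop_cols <- c(2, 3, 5)              # positions of the columns to leave out
train <- my.data[, -drop_cols]       # negative indices remove columns

# "y ~ ." means: model y on all other columns of 'train'
fit <- lm(y ~ ., data = train)
names(coef(fit))
```

An alternative that keeps the original data frame intact is to subtract terms in the formula itself, e.g. y ~ . - V2 - V3 - V5.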

-steve

It would be nice to not have to do +c1 +c2 +c3 +c4, etc for all 48  
columns.


Yes, it is nice, isn't it? :-)

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] skip plot/blank plot on purpose (multi-plot question)

2009-07-27 Thread Bert Gunter
Well, all of this can be done quite nicely with lattice graphics: ?xyplot 
(See, e.g. the skip argument)


1) Is there some generic way to call plot and have it plot, but it
plots nothing so I don't see anything at all in position 12? This
could be a blank plot function I call when I notice the data set is
empty.

-- But if you do not wish to learn lattice, please at least read the docs on
standard graphics: ?plot (the type argument) ?plot.default (the axes
argument)
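
For base graphics with layout(), one way to leave a cell empty is plot.new(), which advances to the next cell without drawing anything. A minimal sketch under assumptions: the 14 data sets are simulated, number 12 is deliberately empty, and the margins are shrunk so that all panels fit on a default-sized device.

```r
# Hypothetical input: one data set per layout cell, number 12 is empty
set.seed(1)
datasets <- lapply(1:14, function(i) rnorm(10))
datasets[[12]] <- numeric(0)

layout(rbind(c(1, 1, 1, 2, 2, 2), c(3, 4, 5, 6, 7, 8),
             c(9, 10, 11, 12, 13, 14)),
       heights = c(3, 1, 1))
par(mar = c(2, 2, 1, 1))       # small margins so 14 panels fit

drawn <- logical(14)           # records which cells received a plot
for (i in seq_along(datasets)) {
  if (length(datasets[[i]]) == 0) {
    plot.new()                 # skip this cell: nothing is drawn
  } else {
    plot(datasets[[i]], type = "l")
    drawn[i] <- TRUE
  }
}
```

Calling plot(0, type = "n", axes = FALSE, ann = FALSE) has a similar visual effect and corresponds to the type and axes arguments mentioned above.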

-- Bert Gunter
Genentech, Inc.

2) Is there some generic way to specify the position number I want the
next plot to use so that I'd not plot 12 but would specify 13?

Thanks,
Mark



[R] Working with tables with missing levels

2009-07-27 Thread Andre Nathan
Hello

I'm trying to write a function to calculate the relative entropy between
two distributions. The data I have is in table format, for example:

> t1 <- prop.table(table(c(0,0,2,4,4)))
> t2 <- prop.table(table(c(0,2,2,2,3)))
> t1

  0   2   4 
0.4 0.2 0.4 
> t2

  0   2   3 
0.2 0.6 0.2

The relative entropy is given by

  H[P||Q] = sum(p * log2(p/q))

with the conventions that 0*log2(0/q) = 0 and p*log2(p/0) = Inf.

I'm not sure about what is the best way to achieve that. Is there a way
to test if a table has a value for a given level, so that I can detect
that, for example, t1 is missing levels 1 and 3 and t2 is missing levels
1 and 4 (is level the correct terminology here?)? Simply trying to
access t1[[1]], for example, gives a subscript out of bounds error.

Another option would be to expand the tables, so that, for example, t1
becomes

  0   1   2   3   4 
0.4 0.0 0.2 0.0 0.4

Is there a way to do that?

Thanks,
Andre



[R] Cross-validating two matrices

2009-07-27 Thread Brian McCarthy

Hello,

I am trying to help a colleague with an R problem (see below) for which
I can only generate a very inelegant solution. Any advice would be
welcome.


Thanks,
Brian

If you have two matrices, say a species by trait matrix (Matrix 1  
below) and a plot by species matrix (Matrix 2 below), is there a  
straightforward way to prune one matrix so that the species list  
matches those in a second matrix? Ideally, I need to prune both  
matrices so that they include only the species found in both. In this  
case, the two matrices would therefore include only sp1,sp4,sp5.


Thank you so much for any suggestions!

E.g.:
Matrix 1:
Trait1
sp1 1.0
sp2 1.2
sp4 3.1
sp5 4.0
sp7 4.5

Matrix 2
sp1 sp3 sp4 sp5 sp6
plot1   1   2   5   1   0
plot2   3   0   1   2   5   
plot3   1   1   2   3   1
plot4   0   1   2   1   0
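
One straightforward approach, sketched with the example data above typed in by hand (the object names traits and plots are made up), is to take the intersect() of the species names and index both matrices by it:

```r
# Matrix 1: species x trait
traits <- matrix(c(1.0, 1.2, 3.1, 4.0, 4.5), ncol = 1,
                 dimnames = list(c("sp1", "sp2", "sp4", "sp5", "sp7"),
                                 "Trait1"))

# Matrix 2: plot x species
plots <- matrix(c(1, 2, 5, 1, 0,
                  3, 0, 1, 2, 5,
                  1, 1, 2, 3, 1,
                  0, 1, 2, 1, 0),
                nrow = 4, byrow = TRUE,
                dimnames = list(paste0("plot", 1:4),
                                c("sp1", "sp3", "sp4", "sp5", "sp6")))

# Species present in both matrices
common <- intersect(rownames(traits), colnames(plots))

# drop = FALSE keeps the results as matrices even with a single column
traits_pruned <- traits[common, , drop = FALSE]
plots_pruned  <- plots[, common, drop = FALSE]
```

Indexing both matrices by the same character vector also puts the species in the same order in each, which helps if the two are later multiplied or merged.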


Brian C. McCarthy, Ph.D.
Professor of Forest Ecology
Dept. of Environmental & Plant Biology
317 Porter Hall
Ohio University
Athens, OH  45701-2979  USA

T: 740-593-1615
F: 740-593-1130
E: mccar...@ohio.edu
W: 
http://www.plantbio.ohiou.edu/index.php/directory/faculty_page/brian_mccarthy/



Re: [R] Working with tables with missing levels

2009-07-27 Thread Tal Galili
Hi Andre,
Just about expanding the table,

The way you could do this is by using factors, for example:
t1 - prop.table(table(factor(c(0,0,2,4,4
t2 - prop.table(table(factor( c(0,2,2,2,3

The rest is for more knowledgeable people than me to say...
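
A sketch of how the factor idea could be carried through to the relative entropy. The levels = 0:4 argument and the rel_entropy helper are assumptions made for illustration, not code from this thread: giving both factors the same levels makes table() report a zero for every missing value, so the two tables line up element by element.

```r
x1 <- c(0, 0, 2, 4, 4)
x2 <- c(0, 2, 2, 2, 3)
lev <- 0:4                       # assumed common set of levels

p <- prop.table(table(factor(x1, levels = lev)))
q <- prop.table(table(factor(x2, levels = lev)))

# Hypothetical helper implementing H[P||Q] = sum(p * log2(p / q))
# with the conventions 0 * log2(0 / q) = 0 and p * log2(p / 0) = Inf
rel_entropy <- function(p, q) {
  p <- as.numeric(p)
  q <- as.numeric(q)
  keep <- p > 0                  # terms with p == 0 contribute 0
  if (any(q[keep] == 0)) return(Inf)
  sum(p[keep] * log2(p[keep] / q[keep]))
}

p
rel_entropy(p, q)
```

With the example data, t1 has mass on level 4 where t2 has none, so the result under the stated conventions is Inf.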





On Mon, Jul 27, 2009 at 10:21 PM, Andre Nathan an...@digirati.com.br wrote:

 Hello

 I'm trying to write a function to calculate the relative entropy between
 two distributions. The data I have is in table format, for example:

 > t1 <- prop.table(table(c(0,0,2,4,4)))
 > t2 <- prop.table(table(c(0,2,2,2,3)))
 > t1

   0   2   4
 0.4 0.2 0.4
 > t2

  0   2   3
 0.2 0.6 0.2

 The relative entropy is given by

  H[P||Q] = sum(p * log2(p/q))

 with the conventions that 0*log2(0/q) = 0 and p*log2(p/0) = Inf.

 I'm not sure about what is the best way to achieve that. Is there a way
 to test if a table has a value for a given level, so that I can detect
 that, for example, t1 is missing levels 1 and 3 and t2 is missing levels
 1 and 4 (is level the correct terminology here?)? Simply trying to
 access t1[[1]], for example, gives a subscript out of bounds error.

 Another option would be to expand the tables, so that, for example, t1
 becomes

  0   1   2   3   4
 0.4 0.0 0.2 0.0 0.4

 Is there a way to do that?

 Thanks,
 Andre





-- 
--


My contact information:
Tal Galili
Phone number: 972-50-3373767
FaceBook: Tal Galili
My Blogs:
http://www.r-statistics.com/
http://www.talgalili.com
http://www.biostatistics.co.il



Re: [R] Searching for specific values in a matrix

2009-07-27 Thread Steve Lianoglou


On Jul 27, 2009, at 2:54 PM, Mehdi Khan wrote:


i am able to return the first column, but anything else returns this:
0 rows (or 0-length row.names)

any idea?


I'm not sure what you're doing.

The result you're getting happens when no rows pass the logical test  
that you are using to index the rows of your data.frame for.


Can you show the code that you are using (based on the example data  
you gave) that is giving you the 0 rows result?
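
While waiting for that code, here is a small sketch of two situations that commonly produce this message. Both are guesses, since the failing code was not posted: an exact == test on a floating-point column whose printed value is rounded, and a test against a value that simply is not in the column. The helper approx_equal is hypothetical and the three-row data frame is a cut-down version of the example data.

```r
df <- data.frame(
  coln  = c(17222, 17223, 17226),
  attr1 = c("BC", NA, NA),
  LAT   = c(38.36808, 38.70205, 37.95978),
  stringsAsFactors = FALSE
)

# Hypothetical helper: compare doubles within a tolerance instead of ==
approx_equal <- function(x, target, tol = 1e-6) abs(x - target) < tol

df[approx_equal(df$LAT, 37.95978), ]   # matches even if digits were rounded
df[which(df$attr1 == "BC"), ]          # which() silently drops the NA rows
nrow(df[df$LAT == 99, ])               # no row passes the test: 0 rows
```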


-steve



On Tue, Jul 21, 2009 at 12:49 PM, Steve Lianoglou mailinglist.honey...@gmail.com 
 wrote:


On Jul 21, 2009, at 3:27 PM, Mehdi Khan wrote:

I understand your explanation about the test for even numbers.   
However I am still a bit confused as to how to go about finding a  
particular value.  Here is an example data set


col #  attr1  attr2  attr3        LON       LAT
17209      D     NA     NA  -122.9409  38.27645
17210     BC     NA     NA  -122.9581  38.36304
17211      B     NA     NA  -123.6851  41.67121
17212     BC     NA     NA  -123.0724  38.93073
17213      C     NA     NA  -123.7240  41.84403
17214     NA    464     NA  -122.9430  38.30988
17215      C     NA     NA  -123.4442  40.65369
17216     BC     NA     NA  -122.9389  38.31551
17217      C     NA     NA  -123.0747  38.97998
17218      C     NA     NA  -123.6580  41.59610
17219      C     NA     NA  -123.4513  40.70992
17220      C     NA     NA  -123.0901  39.06473
17221     BC     NA     NA  -123.0653  38.94845
17222     BC     NA     NA  -122.9464  38.36808
17223     NA    464     NA  -123.0143  38.70205
17224     NA     NA      5  -122.8609  37.94137
17225     NA     NA      5  -122.8628  37.95057
17226     NA     NA      7  -122.8646  37.95978

For future reference, perhaps paste this in a way that's easy for us  
to paste into a running R session so we can use it, like so:


df <- data.frame(
coln=c(17209, 17210, 17211, 17212, 17213, 17214, 17215, 17216,
17217, 17218, 17219, 17220, 17221, 17222, 17223, 17224, 17225, 17226),
attr1=c("D","BC","B","BC","C",NA,"C","BC","C","C","C","C","BC","BC",NA,NA,NA,NA),
attr2=c(NA,NA,NA,NA,NA,464,NA,NA,NA,NA,NA,NA,NA,NA,464,NA,NA,NA),
attr3=c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,5,5,7),
LON=c(-122.9409,-122.9581,-123.6851,-123.0724,-123.7240,-122.9430,
-123.4442,-122.9389,-123.0747,-123.6580,-123.4513,-123.0901,
-123.0653,-122.9464,-123.0143,-122.8609,-122.8628,-122.8646),
LAT=c(38.27645,38.36304,41.67121,38.93073,41.84403,38.30988,
40.65369,38.31551,38.97998,41.59610,40.70992,39.06473,
38.94845,38.36808,38.70205,37.94137,37.95057,37.95978))



If I wanted to find the row with Lat = 37.95978

Using an indexing vector:

R> lats <- df$LAT == 37.95978
# or with the %~% from before:
# lats <- df$LAT %~% 37.95978
R> df[lats,]
    coln attr1 attr2 attr3       LON      LAT
18 17226    NA    NA     7 -122.8646 37.95978

Using the subset function:

R> subset(df, LAT == 37.95978)
    coln attr1 attr2 attr3       LON      LAT
18 17226    NA    NA     7 -122.8646 37.95978


, how would i do that?  How would  I find the rows with BC?

R> subset(df, attr1 == 'BC')
    coln attr1 attr2 attr3       LON      LAT
2  17210    BC    NA    NA -122.9581 38.36304
4  17212    BC    NA    NA -123.0724 38.93073
8  17216    BC    NA    NA -122.9389 38.31551
13 17221    BC    NA    NA -123.0653 38.94845
14 17222    BC    NA    NA -122.9464 38.36808


If you try with an indexing vector the NA's will trip you up:

R> df[df$attr1 == 'BC',]
      coln attr1 attr2 attr3       LON      LAT
2    17210    BC    NA    NA -122.9581 38.36304
4    17212    BC    NA    NA -123.0724 38.93073
NA      NA    NA    NA    NA        NA       NA
8    17216    BC    NA    NA -122.9389 38.31551
13   17221    BC    NA    NA -123.0653 38.94845
14   17222    BC    NA    NA -122.9464 38.36808
NA.1    NA    NA    NA    NA        NA       NA
NA.2    NA    NA    NA    NA        NA       NA
NA.3    NA    NA    NA    NA        NA       NA
NA.4    NA    NA    NA    NA        NA       NA

So you could do something like:

R> df[df$attr1 == 'BC' & !is.na(df$attr1),]
    coln attr1 attr2 attr3       LON      LAT
2  17210    BC    NA    NA -122.9581 38.36304
4  17212    BC    NA    NA -123.0724 38.93073
8  17216    BC    NA    NA -122.9389 38.31551
13 17221    BC    NA    NA -123.0653 38.94845
14 17222    BC    NA    NA -122.9464 38.36808
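
As a small self-contained illustration of the NA behaviour discussed above, here is a sketch on a made-up data frame (the names toy, id and grp are hypothetical, not from the original data). It shows that which() and %in% are two other ways to keep NA comparisons from producing all-NA rows.

```r
# Toy data frame with missing values in the column being searched
toy <- data.frame(id = 1:5,
                  grp = c("BC", NA, "C", "BC", NA),
                  stringsAsFactors = FALSE)

# A logical index containing NA yields an all-NA row for every NA
with_na <- toy[toy$grp == "BC", ]

# which() keeps only the positions where the test is TRUE, dropping NA
clean <- toy[which(toy$grp == "BC"), ]

# %in% never returns NA, so it is another NA-safe option
also_clean <- toy[toy$grp %in% "BC", ]
```

Both clean and also_clean contain only the two real BC rows, whereas with_na also carries one placeholder row per NA.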


HTH,
-steve

--
Steve Lianoglou
Graduate Student: Physiology, Biophysics and Systems Biology
Weill Medical College of Cornell University

Contact Info: http://cbio.mskcc.org/~lianos/contact






--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact


Re: [R] Cross-validating two matrices

2009-07-27 Thread Steve Lianoglou

Hi,

On Jul 27, 2009, at 3:29 PM, Brian McCarthy wrote:


Hello,

I am trying to help a colleague with an R problem (see below) for  
which I can only generate a very inelegant solution. Any advice would  
be welcome.


Thanks,
Brian

If you have two matrices, say a species by trait matrix (Matrix 1  
below) and a plot by species matrix (Matrix 2 below), is there a  
straightforward way to prune one matrix so that the species list  
matches those in a second matrix? Ideally, I need to prune both  
matrices so that they include only the species found in both. In  
this case, the two matrices would therefore include only sp1,sp4,sp5.


Thank you so much for any suggestions!

E.g.:
Matrix 1:
Trait1
sp1 1.0
sp2 1.2
sp4 3.1
sp5 4.0
sp7 4.5

Matrix 2
        sp1 sp3 sp4 sp5 sp6
plot1   1   2   5   1   0
plot2   3   0   1   2   5   
plot3   1   1   2   3   1
plot4   0   1   2   1   0


keep <- intersect(rownames(matrix1), colnames(matrix2))
m1 <- matrix1[keep,]
m2 <- matrix2[,keep]
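
For anyone who wants to try this directly, here is a runnable sketch that rebuilds the two example matrices from the question (the object names matrix1 and matrix2 are assumed). The only change from the lines above is drop = FALSE, which I added so that the one-column trait matrix stays a matrix after subsetting.

```r
# Stand-ins for the two matrices described in the question
matrix1 <- matrix(c(1.0, 1.2, 3.1, 4.0, 4.5), ncol = 1,
                  dimnames = list(c("sp1", "sp2", "sp4", "sp5", "sp7"),
                                  "Trait1"))
matrix2 <- matrix(c(1, 2, 5, 1, 0,
                    3, 0, 1, 2, 5,
                    1, 1, 2, 3, 1,
                    0, 1, 2, 1, 0),
                  nrow = 4, byrow = TRUE,
                  dimnames = list(paste0("plot", 1:4),
                                  c("sp1", "sp3", "sp4", "sp5", "sp6")))

# Species present in both matrices, in the order they appear in matrix1
keep <- intersect(rownames(matrix1), colnames(matrix2))

# drop = FALSE keeps the result a matrix even when one column remains
m1 <- matrix1[keep, , drop = FALSE]
m2 <- matrix2[, keep, drop = FALSE]
```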

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Working with tables with missing levels

2009-07-27 Thread Henrique Dallazuanna
Try this:

t1 <- prop.table(table(factor(c(0,0,2,4,4), levels = 0:4)))
t2 <- prop.table(table(factor(c(0,2,2,2,3), levels = 0:4)))
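
Once both tables share the same levels, the relative entropy from the question can be computed element by element. The following is only a sketch: rel_entropy is a hypothetical helper name, and it relies on R's own arithmetic (log2 of a division by zero gives Inf) to get the p*log2(p/0) = Inf convention.

```r
# Expand both samples onto the same levels so the tables line up
t1 <- prop.table(table(factor(c(0, 0, 2, 4, 4), levels = 0:4)))
t2 <- prop.table(table(factor(c(0, 2, 2, 2, 3), levels = 0:4)))

# Hypothetical helper: H[P||Q] = sum(p * log2(p / q)).
# Terms with p == 0 are skipped (0 * log2(0 / q) = 0);
# a term with p > 0 and q == 0 evaluates to Inf.
rel_entropy <- function(p, q) {
  p <- as.numeric(p)
  q <- as.numeric(q)
  pos <- p > 0
  sum(p[pos] * log2(p[pos] / q[pos]))
}

h12  <- rel_entropy(t1, t2)                       # level 4: p > 0, q == 0
h_ok <- rel_entropy(c(0.5, 0.5), c(0.25, 0.75))   # a finite case
```

With the example data, level 4 has mass in t1 but none in t2, so h12 comes out infinite under the stated convention.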

On Mon, Jul 27, 2009 at 4:21 PM, Andre Nathan an...@digirati.com.br wrote:

 Hello

 I'm trying to write a function to calculate the relative entropy between
 two distributions. The data I have is in table format, for example:

  t1 <- prop.table(table(c(0,0,2,4,4)))
  t2 <- prop.table(table(c(0,2,2,2,3)))
  t1

  0   2   4
 0.4 0.2 0.4
  t2

  0   2   3
 0.2 0.6 0.2

 The relative entropy is given by

  H[P||Q] = sum(p * log2(p/q))

 with the conventions that 0*log2(0/q) = 0 and p*log2(p/0) = Inf.

 I'm not sure about what is the best way to achieve that. Is there a way
 to test if a table has a value for a given level, so that I can detect
 that, for example, t1 is missing levels 1 and 3 and t2 is missing levels
 1 and 4 (is level the correct terminology here?)? Simply trying to
 access t1[[1]], for example, gives a subscript out of bounds error.

 Another option would be to expand the tables, so that, for example, t1
 becomes

  0   1   2   3   4
 0.4 0.0 0.2 0.0 0.4

 Is there a way to do that?

 Thanks,
 Andre





-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40 S 49° 16' 22 O



Re: [R] Forumla format?

2009-07-27 Thread Noah Silverman
Hi,

I'm not sure that would work for the formula format of an SVM function.

the idea is normally

svm(label ~ c1 + c2 +c3, data=mydata);

It doesn't work to say

svm(label ~ -c(22,23,24), data=mydata)


On 7/27/09 12:17 PM, Steve Lianoglou wrote:
 Hi,

 On Jul 27, 2009, at 3:01 PM, Noah Silverman wrote:

 Hi,

 Quick question.

 I'm working on training an SVM.

 I have a dataframe with about 50 columns.  I want to train on 46 of 
 them.

 Is there a way to say All except columns 22,23,25 and 31?

 Assume your dataframe is called my.data:

 my.data[,-c(22,23,25,31)]

 Returns the data.frame w/o columns 22,23,25 and 31.

 -steve

 It would be nice to not have to do +c1 +c2 +c3 +c4, etc for all 48 
 columns.

 Yes, it is nice, isn't it? :-)

 -steve

 -- 
 Steve Lianoglou
 Graduate Student: Computational Systems Biology
   |  Memorial Sloan-Kettering Cancer Center
   |  Weill Medical College of Cornell University
 Contact Info: http://cbio.mskcc.org/~lianos/contact




Re: [R] skip plot/blank plot on purpose (multi-plot question)

2009-07-27 Thread Mark Knecht
On Mon, Jul 27, 2009 at 12:21 PM, Bert Gunter <gunter.ber...@gene.com> wrote:
 Well, all of this can be done quite nicely with lattice graphics: ?xyplot
 (See, e.g. the skip argument)


 1) Is there some generic way to call plot and have it plot, but it
 plots nothing so I don't see anything at all in position 12? This
 could be a blank plot function I call when I notice the data set is
 empty.

 -- But if you do not wish to learn lattice, please at least read the docs on
 standard graphics: ?plot (the type argument) ?plot.default (the axes
 argument)


Thank you. It was the axes argument in ?plot.default that I was looking
for. I already knew about type="n".

Cheers,
Mark
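
For the archive, a minimal sketch of what an empty panel can look like with base graphics. It assumes a 2 x 2 layout set with par(mfrow) rather than the real 12-plus panel layout, and draws to a null PDF device so that it runs without opening a window. The par(mfg) call at the end is my own addition for the "specify the position" question, not something suggested in the thread.

```r
# Draw to a null PDF device so the sketch runs without opening a window
pdf(NULL)
par(mfrow = c(2, 2))

plot(1:10)     # panel 1
plot.new()     # panel 2: advances to the next panel and draws nothing
# An "empty" panel produced through plot() itself
plot(0, 0, type = "n", axes = FALSE, xlab = "", ylab = "")   # panel 3
plot(10:1)     # panel 4

# par(mfg = c(row, col)) selects the panel the next plot will use
par(mfg = c(1, 2))
current_panel <- par("mfg")[1:2]
dev.off()
```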


 -- Bert Gunter
 Genentech, Inc.

 2) Is there some generic way to specify the position number I want the
 next plot to use so that I'd not plot 12 but would specify 13?

 Thanks,
 Mark



Re: [R] Forumla format?

2009-07-27 Thread Steve Lianoglou

Hi,

On Jul 27, 2009, at 3:47 PM, Noah Silverman wrote:


Hi,

I'm not sure that would work for the formula format of an SVM  
function.


the idea is normally

svm(label ~ c1 + c2 +c3, data=mydata);

It doesn't work to say

svm(label ~ -c(22,23,24), data=mydata)


You're quite right. Sorry, I misunderstood the question ... I'm  
actually not sure if/how you could do that as I don't use the formula  
formulation too much.


Is it possible to build up your formula as a string, and then convert  
to formula w/ as.formula?


ie.

f <- as.formula(sprintf('label ~ %s',
                        paste(colnames(my.data)[-c(22,23,24)], collapse=' + ')))


Then use `f` in your call to svm? Maybe?
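
Here is a runnable sketch of that idea on made-up data. The column names and the dropped positions are hypothetical, and lm() stands in for svm() only so the example needs no extra package. It uses reformulate(), which builds the formula from a character vector, and also shows the alternative of subsetting the data and using '.' on the right-hand side.

```r
# Toy data: a response column plus five predictors (names are made up)
set.seed(1)
my.data <- data.frame(label = rnorm(20), c1 = rnorm(20), c2 = rnorm(20),
                      c3 = rnorm(20), c4 = rnorm(20), c5 = rnorm(20))

# Columns to leave out of the model, given by position (here c2 and c4)
drop_idx <- c(3, 5)

# Build the right-hand side from the remaining column names,
# also excluding the response itself
predictors <- setdiff(colnames(my.data)[-drop_idx], "label")
f <- reformulate(predictors, response = "label")

# Alternative: subset the data and let '.' stand for every other column
fit <- lm(label ~ ., data = my.data[, -drop_idx])
```

Whether a given svm() implementation accepts label ~ . depends on the package, so treat the second form as something to check rather than a guarantee.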

-steve




[snip]



--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Working with tables with missing levels

2009-07-27 Thread Andre Nathan
On Mon, 2009-07-27 at 16:34 -0300, Henrique Dallazuanna wrote:
 Try this:
 
 t1 <- prop.table(table(factor(c(0,0,2,4,4), levels = 0:4)))
 t2 <- prop.table(table(factor(c(0,2,2,2,3), levels = 0:4)))

Is there a way to do this given an already existing table? The problem
is that I actually build the distributions as I read data from files,
something like

  distr <- NULL
  for (file in files) {
    x <- as.matrix(read.table(file))
    t <- c(distr, table(x))
    distr <- tapply(t, names(t), sum)
  }
  distr <- prop.table(distr)

So I only know the maximum level after the distributions are created.
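
One possible way to expand an already-built table, sketched under the assumption that the level names are integers stored as strings (expand_levels is a hypothetical helper, not an existing R function):

```r
# A proportion table that is missing some levels (here 1 and 3)
distr <- prop.table(table(c(0, 0, 2, 4, 4)))

# Hypothetical helper: fill in zero for every level in `levels`
# that does not occur in the named table `tab`
expand_levels <- function(tab, levels) {
  out <- setNames(numeric(length(levels)), levels)
  out[names(tab)] <- as.numeric(tab)
  out
}

# The level range is only known once the table exists
observed <- as.numeric(names(distr))
all_levels <- seq(min(observed), max(observed))
full <- expand_levels(distr, all_levels)

# Testing whether a given level is present in the original table
has_level_1 <- "1" %in% names(distr)
```

To compare two distributions, the same helper could be called on both with the union of their level ranges.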

Thanks,
Andre



Re: [R] Searching for specific values in a matrix

2009-07-27 Thread Steve Lianoglou


On Jul 27, 2009, at 3:50 PM, Mehdi Khan wrote:

the problem is, it works with the example data i gave.  however, it  
does NOT work with the data set i have, which is 600,000 rows.  the  
class is still a data frame.


So the problem must be in your data, or what you think is in your  
data. Somehow you're constructing a boolean query that returns false  
for every row. As long as you're not getting any memory errors, the  
size of your data doesn't change the mechanics of how this would work.


I suspect you're not getting 0 rows for every possible query you can  
come up with, right?


Look at the first 10 lines of your dataset and try to select some rows  
from your entire data.frame by using values you can see in the first  
10 rows you've just looked at.


I'm expecting this would work, in which case I'm not sure how much  
more help I can provide.


-steve


[snip]

Re: [R] probability on a barplot

2009-07-27 Thread Nair, Murlidharan T
Kindly post the solution to the problem, so that it will benefit others. An 
example would be great. 
Cheers../Murli


-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On 
Behalf Of Erin Hodgess
Sent: Monday, July 27, 2009 1:33 PM
To: R help
Subject: [R] probability on a barplot

Please ignore the previous email

I figured it out.


-- 
Erin Hodgess
Associate Professor
Department of Computer and Mathematical Sciences
University of Houston - Downtown
mailto: erinm.hodg...@gmail.com



Re: [R] computing the radius of an arc

2009-07-27 Thread Nair, Murlidharan T
Alex Brenning, the developer of the RSAGA package, told me (and I quote): 
"The RSAGA package (which uses functions from the free geographical information 
system [GIS] SAGA GIS) has a curvature function that is designed to calculate 
the curvature of surfaces, in particular raster (i.e. gridded) digital 
elevation models. I am not aware of a function in SAGA GIS or other GIS that 
would calculate curvatures along a line, especially not in 3D."

I shall try to develop it and if I am successful I shall make it available.

Cheers../Murli
 

-Original Message-
From: Greg Snow [mailto:greg.s...@imail.org] 
Sent: Friday, July 24, 2009 12:41 PM
To: Nair, Murlidharan T; Hans W Borchers; r-h...@stat.math.ethz.ch; Bert 
Gunter; 'Gabor Grothendieck'
Subject: RE: [R] computing the radius of an arc

There is a function rsaga.local.morphometry in the RSAGA package that says it 
computes curvature (among other things).  It looks like that function was 
designed for a different type of data than yours is, but it may work, or if 
not, then you may be able to adapt some of the code to work with your data.

Another idea: since the curvature is a function of the radius of the circle 
similar to the curve (or possibly sphere), you could find the least squares 
estimate of the circle/sphere that best fits the data (or a weighted subset 
along the lines of loess) then take the curvature from the radius of that 
circle/sphere.
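
A rough sketch of the least-squares circle idea in 2D, assuming the simple algebraic fit (solving x^2 + y^2 = 2*a*x + 2*b*y + c in the least-squares sense) is adequate for the data. The function name fit_circle is made up, and this is not the RSAGA code.

```r
# Hypothetical helper: algebraic least-squares circle fit.
# (x - a)^2 + (y - b)^2 = r^2 is rewritten as
#   x^2 + y^2 = 2*a*x + 2*b*y + c,  with  c = r^2 - a^2 - b^2,
# which is linear in (a, b, c).
fit_circle <- function(x, y) {
  A <- cbind(2 * x, 2 * y, 1)
  sol <- qr.solve(A, x^2 + y^2)     # least-squares solution
  a <- sol[1]
  b <- sol[2]
  r <- sqrt(sol[3] + a^2 + b^2)
  list(center = c(a, b), radius = r, curvature = 1 / r)
}

# Points on a quarter arc of a circle with centre (1, 2) and radius 3
theta <- seq(0, pi / 2, length.out = 25)
x <- 1 + 3 * cos(theta)
y <- 2 + 3 * sin(theta)
fit <- fit_circle(x, y)
```

For a local curvature estimate along a curve, the same fit could be applied to a sliding window of neighbouring points, in the spirit of the loess-style weighting mentioned above.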

Hope this helps,

-- 
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
greg.s...@imail.org
801.408.8111


 -Original Message-
 From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-
 project.org] On Behalf Of Nair, Murlidharan T
 Sent: Friday, July 24, 2009 8:53 AM
 To: Hans W Borchers; r-h...@stat.math.ethz.ch; Bert Gunter; 'Gabor
 Grothendieck'
 Subject: Re: [R] computing the radius of an arc
 
 Thanks, for all the suggestions. Indeed I am interested in computing
 curvature (http://en.wikipedia.org/wiki/Curvature) of the curve that is
 given as a discrete set of points. I have the curve in 3D but I can
 certainly write it in 2D as well. Code is given at the bottom. I will
 try with the suggestions you have given, but if anyone has done this
 before, I would appreciate your help.
 Cheers../Murli
[snip]



Re: [R] create dataset permanently in package (i.e. default or our own package)

2009-07-27 Thread cls59


Albert EINstEIN wrote:
 
 Hi,
 When opening the R console or R Commander we see packages such as car and
 datasets, and these packages ship with default datasets (for example, women
 and prestige). I have created a sales dataset by importing it from an Excel,
 XML or text file, and I want to store that dataset permanently in one of
 the packages mentioned above (car or datasets). After I close my R session
 and later reopen the R console or R Commander, I do not want to create the
 sales dataset again: it should be found when I select the package.
 If possible, please give me the code; it would be very helpful for us.
 
 Thanks in advance.
 
 

The steps you need to follow are:

1. Open a new R session.

2. Create your 'sales' dataset and any other datasets you want to preserve.
Make sure these are the only objects in your workspace- i.e. they are the
only names that come up when you use ls().

3. Create a new package using package.skeleton('myPackage')

This will create a new folder called myPackage that contains a folder called
data. Inside data will be a .rda file for your sales dataset and any other
datasets you had in your environment at the time package.skeleton was run.

Now you need to install your package. This can be done by:

system('R CMD INSTALL myPackage')

Or from the command line:

R CMD INSTALL myPackage

Note that if you are using Windows, you will need to install Duncan
Murdoch's Rtools package located at:

http://www.murdoch-sutherland.com/Rtools/

If the installer asks anything about modifying your PATH, allow it to do so.

Once the package has been installed, you can load your dataset using:

library(myPackage)
data(sales)

If you want to add additional data sets, save them to individual .rda files
using:

save('myFirstNewDataset',file='myFirstNewDataset.rda')
save('mySecondNewDataset',file='mySecondNewDataset.rda')

Then move the .rda files to the data folder inside the myPackage folder and
re-run R CMD INSTALL
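
To make the save step concrete, here is a small sketch that writes a made-up sales data frame into a data folder and reads it back. It uses a temporary directory in place of the real myPackage folder, so the paths are assumptions; it does not build or install anything.

```r
# Stand-in for the imported sales data (values are made up)
sales <- data.frame(month = c("Jan", "Feb", "Mar"),
                    units = c(120, 95, 143))

# Mimic the myPackage/data folder inside a temporary directory
data_dir <- file.path(tempdir(), "myPackage", "data")
dir.create(data_dir, recursive = TRUE, showWarnings = FALSE)
rda_file <- file.path(data_dir, "sales.rda")
save(sales, file = rda_file)

# Reload into a fresh environment to confirm the object round-trips
restored <- new.env()
loaded_names <- load(rda_file, envir = restored)
```

After R CMD INSTALL, data(sales) does essentially the same load from the installed package's data folder.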

Hope that helps!

-Charlie

-
Charlie Sharpsteen
Undergraduate
Environmental Resources Engineering
Humboldt State University
-- 
View this message in context: 
http://www.nabble.com/create-dataset-permanently-in-package-%28i.e.-default-or-our-own-package%29-tp24679076p24688214.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] how to correlate nominal variables?

2009-07-27 Thread Mark Difford

Hi Timo,

 I need functions to calculate Yule's Y or Cramérs Index... Are such
 functions existing?

Also look at assocstats() in package vcd.
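
If installing vcd is not an option, Cramér's V can also be sketched in base R from the chi-squared statistic, V = sqrt(chi2 / (n * (min(rows, cols) - 1))). The helper name cramers_v is made up, and this sketch skips the continuity correction and any small-sample checks.

```r
# Hypothetical helper: Cramér's V for a two-way contingency table
cramers_v <- function(tab) {
  chi2 <- unname(chisq.test(tab, correct = FALSE)$statistic)
  n <- sum(tab)
  sqrt(chi2 / (n * (min(dim(tab)) - 1)))
}

# Perfect association between two binary variables
perfect <- matrix(c(10, 0, 0, 10), nrow = 2)
# No association at all
none <- matrix(c(5, 5, 5, 5), nrow = 2)

v_perfect <- cramers_v(perfect)
v_none <- cramers_v(none)
```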

Regards, Mark.


Timo Stolz wrote:
 
 Dear R-Users,
 
 I need functions to calculate Yule's Y or Cramérs Index, in order to
 correlate variables that are nominally scaled?
 
 Am I wrong? Are such functions existing?
 
 Sincerely,
 Timo
 
 
 

-- 
View this message in context: 
http://www.nabble.com/how-to-correlate-nominal-variables--tp18441195p24688304.html
Sent from the R help mailing list archive at Nabble.com.



[R] Reordering the columns of my dataframe

2009-07-27 Thread Mark Na
Hi R-helpers,

I have written this line of code:

 data <- cbind(data[,1],data[,2:6],data[,18],data[,7:17])

to reorder the columns of my dataframe, but I'm losing the column names of
my 1st and 18th columns (they are now named data[,1] and data[,18]
respectively).

Can I use cbind to do this (without losing my column names) or is there
another way?

Many thanks,

Mark Na



Re: [R] Reordering the columns of my dataframe

2009-07-27 Thread Rolf Turner


On 28/07/2009, at 9:45 AM, Mark Na wrote:


Hi R-helpers,

I have written this line of code:


data <- cbind(data[,1],data[,2:6],data[,18],data[,7:17])


to reorder the columns of my dataframe, but I'm losing the column  
names of

my 1st and 18th columns (they are now named data[,1] and data[,18]
respectively).

Can I use cbind to do this (without losing my column names) or is  
there

another way?


Just do

data <- data[,c(1:6,18,7:17)]
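
A quick check on a made-up 18-column data frame (the names dat and v1..v18 are hypothetical). It also shows why the cbind version loses names: data[,1] on a single column returns a bare vector, and adding drop = FALSE keeps it a one-column data frame.

```r
# Toy data frame with 18 named columns
dat <- as.data.frame(matrix(1:36, nrow = 2,
                            dimnames = list(NULL, paste0("v", 1:18))))

# Reorder by indexing the columns directly; names travel with the columns
reordered <- dat[, c(1:6, 18, 7:17)]

# The cbind route also works if single columns are kept as data frames
via_cbind <- cbind(dat[, 1, drop = FALSE], dat[, 2:6],
                   dat[, 18, drop = FALSE], dat[, 7:17])
```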

cheers,

Rolf Turner



Re: [R] Searching for specific values in a matrix

2009-07-27 Thread Steve Lianoglou
no luck, it's okay, i will figure it out!  i might isolate and  
recombine all the columns, maybe that will work.  thanks for the help!


No, wait .. no luck in being able to select out rows from your  
data.frame using values you see somewhere in the top 10 rows?


Can you just paste in some key lines in your session so we can see?

For instance, let's assume your data is in my.data, I'd like to see  
the results for:


# Replace the column values (1:5) with other columns
# you want to use for selection
R> my.data[1:10,1:5]

# Now show me your query and its result that returns
# a 0-row data.frame, for example using a value
# that appears in that column from the previous query
R> my.data[my.data[,1] == 'something',]
<0 rows> (or 0-length row.names)

There should be a simple answer to what's going wrong here.

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



[R] Split rownames into factors

2009-07-27 Thread jimdare

Hi Guys,

I was wondering how you would go about solving the following problem:

I have a list where the grouping information is in the row names.

Rowname [,1]

X1Jan08  324
X1Jun08  65
X1Dec08  543
X2Jan08  23
X2Jun08  54
X2Dec08  8765
X3Jan08  213
X3Jun08  43
X3Dec08  65

How can I create the following dataframe:
      Value  Date    Group
[1,]  324    Jan 08  X1
[2,]  65     Jun 08  X1
[3,]  543    Dec 08  X1
 etc.

Thanks for your help!
James

-- 
View this message in context: 
http://www.nabble.com/Split-rownames-into-factors-tp24689181p24689181.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Searching for specific values in a matrix

2009-07-27 Thread Steve Lianoglou

Ahh ..

On Jul 27, 2009, at 6:01 PM, Mehdi Khan wrote:

Even when choosing a value from the first few rows, it doesn't work.  
okay here it goes:


> rearranged[1:10, 1:5]
           x        y band1 VSCAT.001 soiltype
1  -124.3949 40.42468    NA        NA       CD
2  -124.3463 40.27358    NA        NA       CD
3  -124.3357 40.25226    NA        NA       CD
4  -124.3663 40.40241    NA        NA       CD
5  -124.3674 40.49810    NA        NA       CD
6  -124.3083 40.24744    NA       464       NA
7  -124.3017 40.31295    NA        NA        D
8  -124.3375 40.47557    NA       464       NA
9  -124.2511 40.11697     1        NA       NA
10 -124.2532 40.12640     1        NA       NA

> query <- rearranged$y == 40.42468
> rearranged[query,]
[1] x y band1 VSCAT.001 soiltype
<0 rows> (or 0-length row.names)


This isn't working because the number you see printed for y (40.42468)  
isn't precisely the value that is stored. As I mentioned before, you  
should use an almost.equal type of search for this scenario. My %~%  
function isn't working in your session because it is a function I've  
defined myself. You can of course use it; you just have to define it in  
your workspace. Paste these lines into your workspace (or save them to a  
file and source that file into your workspace).


## === almost.equal functions ===

almost.equal <- function(x, y, tolerance=.Machine$double.eps^0.5) {
  abs(x - y) < tolerance
}

"%~%" <- function(x, y) almost.equal(x, y)

## === end paste ===

Now you can use %~% once that's in. Let's use the almost.equal  
function now because I don't know if the default tolerance here is too  
strict (I suspect showing the value for rearranged$y[1] will show you  
more significant digits than you're seeing in the table(?))


query <- almost.equal(rearranged$y, 40.42468, tolerance=0.0001)
rearranged[query,]

This will get you something.
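
To see the effect of the tolerance without the real data, here is a sketch with two made-up stored values that would print as 40.42468 and 40.27358 at R's default seven significant digits:

```r
almost.equal <- function(x, y, tolerance = .Machine$double.eps^0.5) {
  abs(x - y) < tolerance
}

# Hypothetical stored values: they print like the numbers in the table,
# but the stored doubles carry more digits
y <- c(40.424681234, 40.273579876)

exact  <- y == 40.42468                                  # no exact match
strict <- almost.equal(y, 40.42468)                      # default is too tight
loose  <- almost.equal(y, 40.42468, tolerance = 0.0001)  # first value matches
```

print(y, digits = 15) is one way to see how many digits are really stored.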


query <- rearranged$VSCAT.001 == 464
except it's  a huge table (I guess I have to get rid of all rows  
with NA).


Yes, I believe I mentioned earlier that you have to axe the NA matches  
manually:


query <- rearranged$VSCAT.001 == 464 & !is.na(rearranged$VSCAT.001)
rearranged[query,]

Will get you what you want.

I tried using the %~% but R doesn't recognize it.  So maybe it has  
to do with the rounding errors?


Rounding errors won't happen with integer comparisons (and it looks  
like the VSCAT.001 column holds integers, no?).


-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact



Re: [R] Searching for specific values in a matrix

2009-07-27 Thread Bert Gunter
Nothing wrong with rolling your own, but see ?all.equal for R's built-in
almost.equal version.
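
A short sketch of how all.equal behaves, on made-up numbers. One caveat worth noting: all.equal compares whole objects and returns either TRUE or a description of the difference, so it is usually wrapped in isTRUE() and is not an element-by-element test for selecting rows.

```r
x <- 0.1 + 0.2

exact <- x == 0.3                     # FALSE: binary floating point
close <- isTRUE(all.equal(x, 0.3))    # TRUE within the default tolerance

# For row selection an explicit, vectorised tolerance test is still needed
y <- c(40.424681234, 40.273579876)    # hypothetical stored values
idx <- abs(y - 40.42468) < 1e-4
```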

Bert Gunter
Genentech Nonclinical Biostatistics

-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On
Behalf Of Steve Lianoglou
Sent: Monday, July 27, 2009 3:17 PM
To: Mehdi Khan
Cc: r-help@r-project.org
Subject: Re: [R] Searching for specific values in a matrix

Ahh ..

On Jul 27, 2009, at 6:01 PM, Mehdi Khan wrote:

 Even when choosing a value from the first few rows, it doesn't work.  
 okay here it goes:

  rearranged[1:10, 1:5]
xy band1 VSCAT.001 soiltype
 1  -124.3949 40.42468NANA   CD
 2  -124.3463 40.27358NANA   CD
 3  -124.3357 40.25226NANA   CD
 4  -124.3663 40.40241NANA   CD
 5  -124.3674 40.49810NANA   CD
 6  -124.3083 40.24744NA   464 NA
 7  -124.3017 40.31295NANAD
 8  -124.3375 40.47557NA   464 NA
 9  -124.2511 40.11697 1NA NA
 10 -124.2532 40.12640 1NA NA

  query- rearranged$y== 40.42468
  rearranged[query,]
 [1] x y band1 VSCAT.001 soiltype
 0 rows (or 0-length row.names)

This isn't working because the numbers you see for y (40.42468) isn't  
precisely what that number is. As I mentioned before you should use an  
almost.equals type of search for this scenario. My %~% function  
isn't working in your session because that is a function I've defined  
myself. You can of course use it, you just have to define it in your  
workspace. Paste these lines into your workspace (or save them to a  
file and source that file into your workspace).

## === almost.equal functions 

almost.equal - function(x, y, tolerance=.Machine$double.eps^0.5) {
   abs(x - y)  tolerance
}

%~% - function(x, y) almost.equal(x, y)

## === end paste ==

Now you can use %~% once that's in. Let's use the almost.equal  
function now because I don't know if the default tolerance here is too  
strict (I suspect showing the value for rearranged$y[1] will show you  
more significant digits than you're seeing in the table(?))

query - almost.equal(rearranged$y, 40.42468, tolerance=0.0001)
rearranged[query,]

This will get you something.

 query- rearranged$ VSCAT.001== 464
 except it's  a huge table (I guess I have to get rid of all rows  
 with NA).

Yes, I believe I mentioned earlier that you have to axe the NA matches  
manually:

query - rearranged$VSCAT.001 == 464  !is.na(rearranged$VSCAT.001)
rearranged[query,]

Will get you what you want.

> I tried using the %~% but R doesn't recognize it.  So maybe it has
> to do with the rounding errors?

Rounding errors won't happen with integer comparisons (and it looks
like the VSCAT.001 column holds integers, no?).

-steve

--
Steve Lianoglou
Graduate Student: Computational Systems Biology
   |  Memorial Sloan-Kettering Cancer Center
   |  Weill Medical College of Cornell University
Contact Info: http://cbio.mskcc.org/~lianos/contact

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



[R] Splitting matrix into several small matrices

2009-07-27 Thread kathie

Dear R users...

I need to split this matrix(or dataframe), for example,

z <- matrix(c(13,1,1,1,1,12,0,0,0,0,8,1,0,1,1,8,0,1,0,0,
  10,1,1,1,1,3,0,1,0,0,3,1,0,1,1,6,1,1,1,1),8,5,byrow = T)


> z
     [,1] [,2] [,3] [,4] [,5]
[1,]   13    1    1    1    1
[2,]   12    0    0    0    0
[3,]    8    1    0    1    1
[4,]    9    0    1    0    0
[5,]   10    1    1    1    1
[6,]    3    0    1    0    0
[7,]    3    1    0    1    1
[8,]    6    1    1    1    1


(actually, z matrix is big, about 1000*15 matrix)

to 4 matrices like this way,


#- 1st matrix--
13 1 1 1 1
10 1 1 1 1
 6 1 1 1 1

#- 2nd matrix--
12 0 0 0 0


#- 3rd matrix--
 8 1 0 1 1
 3 1 0 1 1


#- 4th matrix--
 9 0 1 0 0
 3 0 1 0 0



Any comments will be greatly appreciated.
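
One possible approach, sketched here under the assumption that rows belong together when columns 2 to 5 are identical (that is how the four example matrices appear to be formed). The names key and parts are made up for illustration, and z is entered as printed above.

```r
# z as printed in the question (8 x 5).
z <- matrix(c(13,1,1,1,1, 12,0,0,0,0, 8,1,0,1,1, 9,0,1,0,0,
              10,1,1,1,1, 3,0,1,0,0, 3,1,0,1,1, 6,1,1,1,1),
            nrow = 8, ncol = 5, byrow = TRUE)

# One key per row, built from every column except the first.
key <- apply(z[, -1, drop = FALSE], 1, paste, collapse = "_")

# Group the row indices by key, then pull each group out as a matrix.
parts <- lapply(split(seq_len(nrow(z)), key),
                function(i) z[i, , drop = FALSE])

length(parts)       # number of distinct patterns
parts[["1_1_1_1"]]  # the rows whose columns 2 to 5 are all 1
```

The result is a named list of matrices, so the same code should carry over to a 1000 x 15 matrix without changes; drop = FALSE keeps single-row groups as matrices.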

Kathryn Lord
-- 
View this message in context: 
http://www.nabble.com/Splitting-matrix-into-several-small-matrices-tp24689585p24689585.html
Sent from the R help mailing list archive at Nabble.com.



[R] Draw plot.table axis on right hand side

2009-07-27 Thread Sean Carmody
With an ordinary plot, to customise the axis it is possible to suppress
drawing the axis and then call Axis. I have been trying to change the
location of the y-axis on a plot.table plot to the right hand side, but
cannot even work out how to suppress drawing the labels.

Here is a toy example of the sort of plot I am working with. Any suggestions
as to how to have the axis on the right hand side not the left hand side
would be appreciated.

set.seed(2)
data <- data.frame(x= floor(runif(80)*5)+1, y=floor(runif(80)*5)+1)
data$a <- c("Cow", "Dog", "Fish", "Mouse", "Frog")[data$x]
data$b <- c("Banana", "Apple", "Pear", "Orange", "Melon")[data$y]
plot(table(data$a, data$b), col=rainbow(5), las=1, main="")

Regards,
Sean.

-- 
Sean Carmody

The Stubborn Mule
http://www.stubbornmule.net
http://twitter.com/seancarmody






[R] Double Truncation Fit??

2009-07-27 Thread Vivek Ayer
Hey guys,

Do you all know of a function that provides fitting for double-sided
truncation? truncreg accounts for one-sided truncation, but not two,
or at least I don't know how to do it. Our outlier values are -115 on
the left side and -55 on the right.

Help appreciated,
Vivek



Re: [R] Split rownames into factors

2009-07-27 Thread Christopher Bare
Hi,

I'm not an R expert, but I thought I'd give your question a shot anyway.

First, it looks like you're starting with a matrix, rather than a
list. Let's hope I guessed that right:

> m = matrix(c(324, 65, 543, 23, 54, 8765, 213, 43, 65))
> rownames(m) = c('X1Jan08', 'X1Jun08', 'X1Dec08', 'X2Jan08', 'X2Jun08',
+                 'X2Dec08', 'X3Jan08', 'X3Jun08', 'X3Dec08')
> m
   [,1]
X1Jan08  324
X1Jun08   65
X1Dec08  543
X2Jan08   23
X2Jun08   54
X2Dec08 8765
X3Jan08  213
X3Jun08   43
X3Dec08   65

You can pull the individual values out of the compound thing in
row names using regular expressions like so:

> gsub('([A-Z]\\d+)([A-Za-z]+)(\\d+)', '\\2 \\3', rownames(m), perl=T)

With that, we can make a data.frame:

> df = data.frame(Value=m[,1],
+     Date=gsub('([A-Z]\\d+)([A-Za-z]+)(\\d+)', '\\2 \\3', rownames(m), perl=T),
+     Group=gsub('([A-Z]\\d+)([A-Za-z]+)(\\d+)', '\\1', rownames(m), perl=T))

The old compound row names hold over from the matrix, but we can cure
that easily enough:

> rownames(df) = NULL

> df
  Value   Date Group
1   324 Jan 08    X1
2    65 Jun 08    X1
3   543 Dec 08    X1
4    23 Jan 08    X2
5    54 Jun 08    X2
6  8765 Dec 08    X2
7   213 Jan 08    X3
8    43 Jun 08    X3
9    65 Dec 08    X3

Both Date and Group will be coerced to factors, which is probably what
you want with Group and maybe not with Date.
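
If you would rather control the factor conversion yourself, a variation along these lines may help. It is only a sketch: rn, pattern and parts are names made up here, and the pattern is an anchored, POSIX-style rewrite of the expression above. Note that since R 4.0.0 data.frame() no longer converts strings to factors by default, so the explicit argument mainly matters on older versions.

```r
rn <- c("X1Jan08", "X1Jun08", "X2Dec08")

# Group id, month name, two-digit year; anchored so partial matches are not rewritten.
pattern <- "^([A-Z][0-9]+)([A-Za-z]+)([0-9]+)$"

parts <- data.frame(
  Group = sub(pattern, "\\1", rn),
  Date  = sub(pattern, "\\2 \\3", rn),
  stringsAsFactors = FALSE   # keep both columns as plain character
)

# Convert only the column that should be a factor.
parts$Group <- factor(parts$Group)

parts
```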

If I'm wrong and you really have a list, it's not that different.
First, get a vector of values:

> data.list = list(X1Jan08=324, X1Jun08=65, X1Dec08=543, X2Jan08=23,
+                  X2Jun08=54, X2Dec08=8765, X3Jan08=213, X3Jun08=43, X3Dec08=65)

> values = as.vector(data.list, mode="integer")

The rest is very similar to what's above. I hope this helps,

-chris

On Mon, Jul 27, 2009 at 3:10 PM, jimdare <jamesdar...@gmail.com> wrote:

 Hi Guys,

 I was wondering how you would go about solving the following problem:

 I have a list where the grouping information is in the row names.

 Rowname [,1]

 X1Jan08  324
 X1Jun08  65
 X1Dec08  543
 X2Jan08  23
 X2Jun08  54
 X2Dec08  8765
 X3Jan08  213
 X3Jun08  43
 X3Dec08  65

 How can I create the following dataframe:
       Value    Date    Group
 [1,]  324      Jan 08    X1
 [2,]  65       Jun 08    X1
 [3,]  543     Dec 08    X1
  etc.

 Thanks for your help!
 James



[R] frequent sequences

2009-07-27 Thread jgant

Hello,
I'm having a few issues mining frequent sequences. I've read the
documentation and played around with arules and arulesSequences with little
success. For example if I have a vector,

a = t(t(c(1,2,3,0,1,2,3,5,6,7)));

I'd like to be able to mine the association rules {1,2}-->{3}, {2}-->{3},
{1}-->{2}, etc. Can anyone point me in the right direction here? From what
I've read, arules doesn't seem to take a single vector and extract
sequences from within that vector to mine. Any help would be
appreciated!

Thanks,
John
-- 
View this message in context: 
http://www.nabble.com/frequent-sequences-tp24687821p24687821.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Non-Linear Regression with two Predictors

2009-07-27 Thread Berlinerfee
Hi and thank you for your reply,

in my new regression formula the parameter delta is inserted:
fit <- nls(dataset$V2~(( alpha / ( 1 + exp( beta - gamma* dataset$V1 ) )
) + (dataset$V6*delta)),data=dataset,start=startparam)

The idea is that dataset$V6 is a dummy variable that represents
German reunification. I expect delta in the regression to be about
1,000,000 because of reunification. The logistic function thus has a
jump at this point. But I would like to get the exact parameter value
for delta (the jump) as well as the parameters of the logistic growth
function (alpha to gamma). The partial derivative with respect to delta
would be like a step function: it is 0 until 1990 and 1 thereafter.
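
To make the model concrete, here is a minimal sketch on simulated data. Everything in it is an assumption made for illustration: the series is generated, not the real data, the parameter values are invented, and time is centred on 1990 (yr) so that beta and gamma stay on a manageable scale. It only shows that a logistic curve plus a dummy-controlled jump can be written as a single nls formula.

```r
set.seed(1)
year  <- 1950:2008
yr    <- year - 1990                 # centred time (an assumption, not in the original model)
dummy <- as.numeric(year >= 1990)    # 0 before 1990, 1 from 1990 on

# Simulated series: logistic growth plus a jump of 1e6 in 1990, with small noise.
y <- 3e7 / (1 + exp(0.5 - 0.1 * yr)) + 1e6 * dummy + rnorm(length(yr), sd = 5e3)
sim <- data.frame(y = y, yr = yr, dummy = dummy)

# Column names are used directly in the formula, with data= supplying them.
fit <- nls(y ~ alpha / (1 + exp(beta - gamma * yr)) + delta * dummy,
           data  = sim,
           start = list(alpha = 2.9e7, beta = 0.45, gamma = 0.11, delta = 8e5))

coef(fit)["delta"]   # estimated size of the jump
```

Whether the fit converges on real data depends heavily on the starting values, so it may be worth estimating the three-parameter logistic first and reusing its coefficients as the start for the four-parameter model.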

Any idea? Thank you!

Regards

Moshe Olshansky wrote:
 Hi,

 I believe that since delta does not appear in the function you are
 optimizing, its partial derivative with respect to delta is always 0 and so
 the gradient is singular.
 Why do you need delta at all?

 --- On Mon, 27/7/09, Berlinerfee berliner...@yahoo.de wrote:

   
 From: Berlinerfee berliner...@yahoo.de
 Subject: [R] Non-Linear Regression with two Predictors
 To: r-h...@stat.math.ethz.ch
 Received: Monday, 27 July, 2009, 2:52 AM
 Hello there,

 I am using nls for the first time, for a non-linear regression
 with a logistic growth function:
 startparam <- c(alpha=3e+07,beta=4000,gamma=2)
 fit <- nls(dataset$V2~(( alpha / ( 1 + exp( beta - gamma
 * dataset$V1 ) ) ) ),data=dataset,start=startparam)

 Everything works fine and I get good results. Now I would
 like to improve the results using my dummy variable
 (dataset$V6), which is 0 for half of the time and then 1. This is
 my new nls:
 startparam <-
 c(alpha=3e+07,beta=4000,gamma=2,delta=100)
 fit <- nls(dataset$V2~(( alpha / ( 1 + exp( beta - gamma
 * dataset$V1 ) ) ) + (dataset$V6*dataset$V1*delta)
 ),data=dataset,start=startparam)

 I get a "singular gradient matrix" error. Can anyone give me the
 right nls function for this problem?

 Regards



Re: [R] Searching for specific values in a matrix

2009-07-27 Thread Mehdi Khan
the problem is, it works with the example data i gave.  however, it does NOT
work with the data set i have, which is 600,000 rows.  the class is still a
data frame.

On Mon, Jul 27, 2009 at 12:15 PM, Steve Lianoglou 
mailinglist.honey...@gmail.com wrote:


 On Jul 27, 2009, at 2:54 PM, Mehdi Khan wrote:

  i am able to return the first column, but anything else returns this:
 <0 rows> (or 0-length row.names)

 any idea?


 I'm not sure what you're doing.

 The result you're getting happens when no rows pass the logical test that
 you are using to index the rows of your data.frame for.

 Can you show the code that you are using (based on the example data you
 gave) that is giving you the 0 rows result?

 -steve



 On Tue, Jul 21, 2009 at 12:49 PM, Steve Lianoglou 
 mailinglist.honey...@gmail.com wrote:

 On Jul 21, 2009, at 3:27 PM, Mehdi Khan wrote:

 I understand your explanation about the test for even numbers.  However I
 am still a bit confused as to how to go about finding a particular value.
  Here is an example data set

 col #  attr1  attr2  attr3        LON       LAT
 17209      D     NA     NA  -122.9409  38.27645
 17210     BC     NA     NA  -122.9581  38.36304
 17211      B     NA     NA  -123.6851  41.67121
 17212     BC     NA     NA  -123.0724  38.93073
 17213      C     NA     NA  -123.7240  41.84403
 17214     NA    464     NA  -122.9430  38.30988
 17215      C     NA     NA  -123.4442  40.65369
 17216     BC     NA     NA  -122.9389  38.31551
 17217      C     NA     NA  -123.0747  38.97998
 17218      C     NA     NA  -123.6580  41.59610
 17219      C     NA     NA  -123.4513  40.70992
 17220      C     NA     NA  -123.0901  39.06473
 17221     BC     NA     NA  -123.0653  38.94845
 17222     BC     NA     NA  -122.9464  38.36808
 17223     NA    464     NA  -123.0143  38.70205
 17224     NA     NA      5  -122.8609  37.94137
 17225     NA     NA      5  -122.8628  37.95057
 17226     NA     NA      7  -122.8646  37.95978

 For future reference, perhaps paste this in a way that's easy for us to
 paste into a running R session so we can use it, like so:

 df <- data.frame(
   coln=c(17209, 17210, 17211, 17212, 17213, 17214, 17215, 17216, 17217,
          17218, 17219, 17220, 17221, 17222, 17223, 17224, 17225, 17226),
   attr1=c("D","BC","B","BC","C",NA,"C","BC","C","C","C","C","BC","BC",
           NA,NA,NA,NA),
   attr2=c(NA,NA,NA,NA,NA,464,NA,NA,NA,NA,NA,NA,NA,NA,464,NA,NA,NA),
   attr3=c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,5,5,7),
   LON=c(-122.9409,-122.9581,-123.6851,-123.0724,-123.7240,-122.9430,
         -123.4442,-122.9389,-123.0747,-123.6580,-123.4513,-123.0901,
         -123.0653,-122.9464,-123.0143,-122.8609,-122.8628,-122.8646),
   LAT=c(38.27645,38.36304,41.67121,38.93073,41.84403,38.30988,
         40.65369,38.31551,38.97998,41.59610,40.70992,39.06473,
         38.94845,38.36808,38.70205,37.94137,37.95057,37.95978))


 If I wanted to find the row with Lat = 37.95978

 Using an indexing vector:

 R> lats <- df$LAT == 37.95978
 # or with the %~% from before:
 # lats <- df$LAT %~% 37.95978
 R> df[lats,]
     coln attr1 attr2 attr3       LON      LAT
 18 17226  <NA>    NA     7 -122.8646 37.95978

 Using the subset function:

 R> subset(df, LAT == 37.95978)
     coln attr1 attr2 attr3       LON      LAT
 18 17226  <NA>    NA     7 -122.8646 37.95978


 , how would i do that?  How would  I find the rows with BC?

 R> subset(df, attr1 == 'BC')
     coln attr1 attr2 attr3       LON      LAT
 2  17210    BC    NA    NA -122.9581 38.36304
 4  17212    BC    NA    NA -123.0724 38.93073
 8  17216    BC    NA    NA -122.9389 38.31551
 13 17221    BC    NA    NA -123.0653 38.94845
 14 17222    BC    NA    NA -122.9464 38.36808


 If you try with an indexing vector the NA's will trip you up:

 R> df[df$attr1 == 'BC',]
       coln attr1 attr2 attr3       LON      LAT
 2    17210    BC    NA    NA -122.9581 38.36304
 4    17212    BC    NA    NA -123.0724 38.93073
 NA      NA  <NA>    NA    NA        NA       NA
 8    17216    BC    NA    NA -122.9389 38.31551
 13   17221    BC    NA    NA -123.0653 38.94845
 14   17222    BC    NA    NA -122.9464 38.36808
 NA.1    NA  <NA>    NA    NA        NA       NA
 NA.2    NA  <NA>    NA    NA        NA       NA
 NA.3    NA  <NA>    NA    NA        NA       NA
 NA.4    NA  <NA>    NA    NA        NA       NA

 So you could do something like:

 R> df[df$attr1 == 'BC' & !is.na(df$attr1),]
     coln attr1 attr2 attr3       LON      LAT
 2  17210    BC    NA    NA -122.9581 38.36304
 4  17212    BC    NA    NA -123.0724 38.93073
 8  17216    BC    NA    NA -122.9389 38.31551
 13 17221    BC    NA    NA -123.0653 38.94845
 14 17222    BC    NA    NA -122.9464 38.36808


 HTH,
 -steve

 --
 Steve Lianoglou
 Graduate Student: Physiology, Biophysics and Systems Biology
 Weill Medical College of Cornell University

 Contact Info: http://cbio.mskcc.org/~lianos/contact





 --
 Steve Lianoglou
 Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center


Re: [R] Searching for specific values in a matrix

2009-07-27 Thread Mehdi Khan
no luck, it's okay, i will figure it out!  i might isolate and recombine all
the columns, maybe that will work.  thanks for the help!

On Mon, Jul 27, 2009 at 1:00 PM, Steve Lianoglou 
mailinglist.honey...@gmail.com wrote:


 On Jul 27, 2009, at 3:50 PM, Mehdi Khan wrote:

  the problem is, it works with the example data i gave.  however, it does
 NOT work with the data set i have, which is 600,000 rows.  the class is
 still a data frame.


 So the problem must be in your data, or what you think is in your data.
 Somehow you're constructing a boolean query that returns false for every
 row. As long as you're not getting any memory errors, the size of your data
 doesn't change the mechanics of how this would work.

 I suspect you're not getting 0 rows for every possible query you can come
 up with, right?

 Look at the first 10 lines of your dataset and try to select some rows from
 your entire data.frame by using values you can see in the first 10 rows
 you've just looked at.

 I'm expecting this would work, in which case I'm not sure how much more
 help I can provide.

 -steve




Re: [R] Searching for specific values in a matrix

2009-07-27 Thread Mehdi Khan
it worked! thank you so much!!



[R] Making a sub data.frame

2009-07-27 Thread desper

Dear all,

I have a data.frame like this

ID    VAR1
11    blaaal
121   blalda
121   adada
234   baada
231   ddaaa
231   baada
...   ...

and I have another vector of IDs, say, c(121,234,231).
How can I collect all the observations whose ID appears in c(121,234,231)?
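
A minimal sketch of one way to do this, assuming the goal is to keep every row whose ID is one of the listed values. The data frame d is retyped from the excerpt above, and ids and sub_d are names made up for illustration.

```r
d <- data.frame(ID   = c(11, 121, 121, 234, 231, 231),
                VAR1 = c("blaaal", "blalda", "adada", "baada", "ddaaa", "baada"))
ids <- c(121, 234, 231)

# %in% returns TRUE for each row whose ID is found in ids (never NA).
sub_d <- d[d$ID %in% ids, ]

sub_d
```

Unlike ==, %in% never returns NA, so rows with a missing ID are simply dropped instead of showing up as all-NA rows.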


Thanks

All the Best,
Desper
-- 
View this message in context: 
http://www.nabble.com/Making-a-sub-data.frame-tp24687873p24687873.html
Sent from the R help mailing list archive at Nabble.com.



Re: [R] Searching for specific values in a matrix

2009-07-27 Thread Mehdi Khan
Even when choosing a value from the first few rows, it doesn't work. okay
here it goes:

> rearranged[1:10, 1:5]
           x        y band1 VSCAT.001 soiltype
1  -124.3949 40.42468    NA        NA       CD
2  -124.3463 40.27358    NA        NA       CD
3  -124.3357 40.25226    NA        NA       CD
4  -124.3663 40.40241    NA        NA       CD
5  -124.3674 40.49810    NA        NA       CD
6  -124.3083 40.24744    NA       464     <NA>
7  -124.3017 40.31295    NA        NA        D
8  -124.3375 40.47557    NA       464     <NA>
9  -124.2511 40.11697     1        NA     <NA>
10 -124.2532 40.12640     1        NA     <NA>

> query <- rearranged$y == 40.42468
> rearranged[query,]
[1] x         y         band1     VSCAT.001 soiltype
<0 rows> (or 0-length row.names)

hmm it seems to be working for the whole number...

> query <- rearranged$VSCAT.001 == 464
except it's a huge table (I guess I have to get rid of all rows with NA). I
tried using the %~% but R doesn't recognize it.  So maybe it has to do with
the rounding errors?


On Mon, Jul 27, 2009 at 2:56 PM, Steve Lianoglou 
mailinglist.honey...@gmail.com wrote:

  no luck, it's okay, i will figure it out!  i might isolate and recombine
 all the columns, maybe that will work.  thanks for the help!


 No, wait .. no luck in being able to select out rows from your data.frame
 using values you see somewhere in the top 10 rows?

 Can you just paste in some key lines in your session so we can see?

 For instance, let's assume your data is in my.data, I'd like to see the
 results for:

 # Replace the column values (1:5) with other columns
 # you want to use for selection
 R> my.data[1:10,1:5]

 # Now show me your query and its result that returns
 # a 0-row data.frame, for example using a value
 # that appears in that column from the previous query
 R> my.data[my.data[,1] == 'something',]
 <0 rows> (or 0-length row.names)

 There should be a simple answer to what's going wrong here.

 -steve

 --
 Steve Lianoglou

 Graduate Student: Computational Systems Biology
  |  Memorial Sloan-Kettering Cancer Center
  |  Weill Medical College of Cornell University
 Contact Info: http://cbio.mskcc.org/~lianos/contact


