Re: [R] shapiro wilk normality test

2008-07-13 Thread C.H.
You may consider the nortest package.

http://cran.r-project.org/web/packages/nortest/index.html
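
For example, a minimal sketch (assuming the package installs cleanly; ad.test()
and lillie.test() are tests provided by nortest):

    install.packages("nortest")
    library(nortest)
    x <- rnorm(5000)
    ad.test(x)       # Anderson-Darling test for normality
    lillie.test(x)   # Lilliefors (Kolmogorov-Smirnov) test for normality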

Regards,

CH

On Sat, Jul 12, 2008 at 11:30 PM, Bunny, lautloscrew.com
[EMAIL PROTECTED] wrote:
 Hi everybody,

 somehow I don't get the Shapiro-Wilk test for normality. I just can't find
 what the H0 is.

 i tried :

  shapiro.test(rnorm(5000))

Shapiro-Wilk normality test

 data:  rnorm(5000)
 W = 0.9997, p-value = 0.6205


 If normality is the H0, the test says it's probably not normal, doesn't it?

 5000 is the biggest n allowed by the test...

 are there any other tests? (I know qqnorm already ;)

 thanks in advance

 matthias
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
CH Chan
Research Assistant - KWH
http://www.macgrass.com
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Excel Trend Function

2008-07-13 Thread paulandpen

Hi Felipe,

Daniel mentions imputation is a disputed practice.  There are 
recommendations and rules of thumb for its use.  I am not sure that 
imputation is disputed.  I would be interested to see some links to articles 
recommending against its use.


Paul


- Original Message - 
From: Felipe Carrillo [EMAIL PROTECTED]

To: [EMAIL PROTECTED]
Sent: Sunday, July 13, 2008 5:46 AM
Subject: [R] Excel Trend Function



Hi:
I have a dataset and need to interpolate for missing days. In Excel I 
either average from sampled days from above and below the missing days or 
use the TREND function to make up for the missing values. I have been 
reading about na.approx; is this function similar to the TREND function? 
What is the recommended way to fill in the missing data?

Here's my dataset: weeks 17,18,26 and 46 have 0 daysSamp.

Year Week daysSamp Lower TotalPD Upper varTotalPD
2006 47 6 126988 188259 249530 1045878675
2006 48 7 189155 253350 317545 1148102355
2006 49 7 103300 132741 162182 241480186
2006 50 6 11801 252576 493352 16151006813
2006 51 7 2348 3671 4994 487926
2006 52 5 2606 29901 57196 215454181
2006 2 7 2968 4513 6058 664723
2006 3 7 1128 1889 2650 161231
2006 4 7 479 963 1447 65196
2006 5 7 2819 4413 6007 708094
2006 6 6 -1009 3128 7264 4766743
2006 7 7 -5239 10769 26777 71387835
2006 8 7 150 503 856 34685
2006 9 7 1858 2989 4120 356562
2006 10 7 193 494 795 25281
2006 11 7 125 346 567 13627
2006 12 7 432 767 1102 31189
2006 13 7 1229 1867 2505 113569
2006 14 7 813 1339 1865 77140
2006 15 4 -66 124 315 10105
2006 16 7 152 903 1654 157242
2006 17 0
2006 18 0
2006 19 5 0 0 0 0
2006 20 4 0 0 0 0
2006 21 5 0 0 0 0
2006 22 6 0 0 0 0
2006 23 7 -65 285 635 34112
2006 24 6 0 0 0 0
2006 25 7 0 0 0 0
2006 26 0
2006 27 4 228 931 1634 137726
2006 28 4 801 2231 3662 569977
2006 29 4 4544 9242 13939 6147522
2006 30 5 15798 28465 41131 44697915
2006 31 5 25398 41049 56701 68245523
2006 32 5 48197 82216 116235 322416917
2006 33 5 142980 230411 317841 2129630128
2006 34 5 227141 360468 493794 4952314336
2006 35 5 467244 756325 1045405 23281569629
2006 36 5 281049 463331 645614 9256900449
2006 37 2 227636 620330 1013023 42961663047
2006 38 3 478990 983472 1487954 70903343603
2006 39 7 539690 846522 1153354 26228718974
2006 40 7 320959 457866 594773 5221891252
2006 41 7 427561 582452 737343 6683813344
2006 42 7 271788 351103 430418 1752614293
2006 43 7 165019 208853 252687 535301133
2006 44 7 91514 117390 143266 186537178
2006 45 7 59061 79187 99313 112842787
2006 46 0
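
A minimal sketch of the na.approx route (assuming the table above has been read
into a data frame called d, a made-up name, with the empty cells read as NA;
linear interpolation is only one of several defensible choices here):

    library(zoo)
    # interpolate the missing TotalPD values linearly along the row order
    d$TotalPD <- na.approx(d$TotalPD, x = seq_len(nrow(d)), na.rm = FALSE)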

Felipe D. Carrillo
Supervisory Fishery Biologist
Department of the Interior
US Fish  Wildlife Service
 California, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.



__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] a warning message from lmer

2008-07-13 Thread Lan Wei
Thanks for reminding me. But, the problem still exists when I combine these two 
'joy' levels.

Quoting Douglas Bates [EMAIL PROTECTED]:

 By the way, did you notice that the levels of Emotion include both
 "joy" and "joy "?  You may want to correct that.

 On Sat, Jul 12, 2008 at 7:47 AM, Douglas Bates [EMAIL PROTECTED] wrote:
 On Sat, Jul 12, 2008 at 6:23 AM, Lan Wei [EMAIL PROTECTED] wrote:
 Hi all,

 I have a problem when running lmer.
 In my data set, Agree is a binary(0/1) response. WalkerID and ObsID is
 the identification number of the subjects. the description of the
 other variables are as follows:

 levels(regdat$Display)

 [1] Dynamic Static

 levels(regdat$Survey)

 [1] HM1_A HM1_B HM1_C HM2_A HM2_B HM2_C ST_A  ST_B
 ST_C

 levels(regdat$Emotion)

 [1] aneu ang  con  joy  joy  sad

 levels(regdat$ObsGender)

 [1] F M

 levels(regdat$WalkerGender)

 [1] F M

 the warning is:

 fit1 <- lmer(Agree ~ Display + Survey + Emotion + WalkerGender + ObsGender + (1|WalkerID) + (1|ObsID), family=binomial(link='logit'), data=regdat)
 Warning message:
 In mer_finalize(ans, verbose) : gr cannot be computed at initial par
 (65)

 Does anybody have some hint to solve this problem? I'd very much appreciate
 it!

 In situations like this it is best to add the argument

 verbose = TRUE

 in the call to lmer so that you can see the progress of the
 iterations.  (Also, you may want to call glmer directly.  When you
 call lmer with a non-gaussian family it simply calls glmer.  You can
 avoid the extra step.)

 This call is returning a warning about evaluation of the gradient at
 the initial values of the parameters.  I'm not sure if it then goes on
 to optimize the approximated deviance.

 If the approximated deviance is not being minimized for this model you
 may want to start with a simpler model, omitting some of the terms in
 the fixed effects.
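
 A sketch of the call with those two suggestions applied (glmer() and the
 verbose argument as described above; regdat as in the original post):

     library(lme4)
     fit1 <- glmer(Agree ~ Display + Survey + Emotion + WalkerGender + ObsGender +
                     (1 | WalkerID) + (1 | ObsID),
                   family = binomial(link = "logit"), data = regdat,
                   verbose = TRUE)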





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Position in a vector of the last value n - *SOLVED*

2008-07-13 Thread Thaden, John J
Yes, your version (func2) is quick, quickest for longer vectors:
 m <- matrix(rexp(6e6, rate=0.05), nrow=50000) # 120 cols
 m[m < 20] <- 20
 func1 <- function(v, cut=20)  max(which(v > cut))
 func2 <- function(v, cut=20) {
     x <- which(v > cut)
     x[length(x)]
 }
 func3 <- function(v, cut=20) tail(which(v > cut), 1)
 system.time(apply(m, 2, func1))
   user  system elapsed 
   0.58    0.01    0.59 
 system.time(apply(m, 2, func2))
   user  system elapsed 
   0.48    0.04    0.53 
 system.time(apply(m, 2, func3))
   user  system elapsed 
   0.55    0.00    0.56
-John Thaden
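
As a quick check that the three functions agree, here they are on the example
vector from the original post (the expected answer is index 6):

    v <- c(20, 134, 45, 20, 24, 500, 20, 20, 20)
    func1(v)   # 6
    func2(v)   # 6
    func3(v)   # 6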

-Original Message-
From: jim holtman [mailto:[EMAIL PROTECTED] 
Sent: Saturday, July 12, 2008 6:56 AM
To: Thaden, John J
Cc: r-help@r-project.org
Subject: Re: [R] Position in a vector of the last value  n - *SOLVED*

A slight modification gives the equivalent results instead of using 'tail'

 m <- matrix(rexp(6e6, rate=0.05), nrow=600) # 10,000 cols
 m[m < 20] <- 20
 func1 <- function(v, cut=20)  max(which(v > 20))
 func2 <- function(v, cut=20) {
     x <- which(v > 20)
     x[length(x)]
 }
 system.time(apply(m, 2, func1))
   user  system elapsed
   1.33    0.05    1.47
 #   user  system elapsed
 #   0.40    0.02    0.42
 system.time(apply(m, 2, func2))
   user  system elapsed
   1.31    0.08    1.44
 #   user  system elapsed
 #   0.70    0.05    0.75


Here is another view using Rprof on the first version.  You can see
that 'tail' takes a fair amount of time; accounts for the differences
in timing:

/cygdrive/c: perl perf/bin/readrprof.pl tempxx.txt
  0   2.7 root
  1.1.8 system.time
  2. .1.7 eval
  3. . .1.7 eval
  4. . . .1.7 apply
  5. . . . |1.5 FUN
  6. . . . | .0.8 tail
  7. . . . | . .0.5 which
  8. . . . | . . .0.1 
  8. . . . | . . .0.0 
  8. . . . | . . .0.0 !
  7. . . . | . .0.3 tail.default
  8. . . . | . . .0.2 stopifnot
  9. . . . | . . . .0.1 eval
  9. . . . | . . . .0.0 match.call
  9. . . . | . . . .0.0 any
  6. . . . | .0.5 which
  7. . . . | . .0.1 
  7. . . . | . .0.1 
  7. . . . | . .0.0 names<-
  7. . . . | . .0.0 is.na
  5. . . . |0.1 aperm
  5. . . . |0.0 unlist
  6. . . . | .0.0 lapply
  5. . . . |0.0 is.null
  2. .0.1 gc
  1.0.8 matrix
  2. .0.7 as.vector
  3. . .0.6 rexp
  1.0.1 
/cygdrive/c:


On Fri, Jul 11, 2008 at 12:23 PM, Thaden, John J [EMAIL PROTECTED]
wrote:
 I had written asking for a simple way to extract the
 Index of the last value in a vector greater than some
 cutoff, e.g., the index, 6, for a cutoff of 20 and this
 example vector:

 v <- c(20, 134, 45, 20, 24, 500, 20, 20, 20)

 Thank you, Alain Guillet, for this simple solution sent
 to me offlist:

 max(which(v > 20))

 Also, thank you Lisa Readdy for a lengthier solution.

 Other offerings yielded the value instead of the index
 (the phrasing of my question apparently was misleading):

 v[max(which(v > 20))]  (Henrique Dallazuanna)

 tail(v[v > 20], 1)     (Jim Holtman)

 Jim's use of tail() suggests a variant to Alain's
 solution

 tail(which(v > 20), 1)

 This is faster than the max() version with long vectors,
 but, to my surprise, slower (on my WinXP Lenovo T61 laptop)
 in a rough mockup of my column-wise apply() usage:

 m <- matrix(rexp(3e6, rate=0.05), nrow=600) # 5,000 cols
 m[m < 20] <- 20
 func1 <- function(v, cut=20)  max(which(v > 20))
 func2 <- function(v, cut=20) tail(which(v > 20), 1)
 system.time(apply(m, 2, func1))
 #   user  system elapsed
 #   0.40    0.02    0.42
 system.time(apply(m, 2, func2))
 #   user  system elapsed
 #   0.70    0.05    0.75

 Thank you again, Alain and others.
 John

 

 On Thu, Jul 10, 2008 at 9:41 AM, John Thaden wrote:
 This shouldn't be hard, but it's just not
 coming to me:
 Given a vector, e.g.,
 v <- c(20, 134, 45, 20, 24, 500, 20, 20, 20)
 how can I get the index of the last value in
 the vector that has a value greater than n, in
 the example, with n = 20?  I'm looking for
 an efficient function I can use on very large
 matrices, as the FUN argument in the apply()
 command.


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R crash with ATLAS precompiled Rblas.dll on Windows XP Core2 Duo

2008-07-13 Thread Prof Brian Ripley
Yes, that Rblas.dll is known to be faulty, and the person who built it is 
unable to re-build it.  It needs to be removed from CRAN.


(I've also tried to build on Core 2 Duo, and my Cygwin installation has a 
compiler crash during the build.)


On Tue, 8 Jul 2008, Law, Jason wrote:


I noticed a problem using R 2.7.1 on Windows XP SP2 with the precompiled
Atlas Rblas.dll.  Running the code below causes R to crash.  I started R
using Rgui --vanilla and am using the precompiled Atlas Rblas.dll from
cran.fhcrc.org dated 17-Jul-2007 05:04 for Core2 Duo.

The code that causes the crash:

x <- rnorm(100)
y <- rnorm(100)
z <- rnorm(100)
loess(z ~ x * y)

loess(z ~ x) does not cause a crash using the Atlas BLAS and neither does
running the above code with the Rblas.dll that came with R 2.7.1.  In
addition, the code runs fine using the Atlas BLAS under R 2.6.2.

The windows error information that is printed to the screen when R closes:

AppName: rgui.exe   AppVer: 2.71.45970.0   ModName: rblas.dll
ModVer: 2.51.42199.0 Offset: 000501cc


sessionInfo returns:

R version 2.7.1 (2008-06-23)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

I checked the R FAQ, R for Windows FAQ, and the README associated with the
Atlas BLAS on CRAN and couldn't find any information related to possible
crash causes.  I've used the ATLAS BLAS for about 6 months on this machine
(it's a new machine) with R 2.6.2.

Using debug(stats:::simpleLoess), I've found that the crash occurs on the
first iteration of the line:

z <- .C("R_loess_raw", as.double(y), as.double(x),
    as.double(weights), as.double(robust), as.integer(D),
    as.integer(N), as.double(span), as.integer(degree),
    as.integer(nonparametric), as.integer(order.drop.sqr),
    as.integer(sum.drop.sqr), as.double(span * cell),
    as.character(surf.stat), fitted.values = double(N),
    parameter = integer(7), a = integer(max.kd),
    xi = double(max.kd), vert = double(2 * D),
    vval = double((D + 1) * max.kd), diagonal = double(N),
    trL = double(1), delta1 = double(1), delta2 = double(1),
    as.integer(surf.stat == "interpolate/exact"))

After that, I'm kind of stuck in terms of tracking it down.

Thanks for any input,

Jason Law
City of Portland, OR

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Brian D. Ripley,  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford, Tel:  +44 1865 272861 (self)
1 South Parks Road, +44 1865 272866 (PA)
Oxford OX1 3TG, UKFax:  +44 1865 272595

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] How to build a package which loads Rgraphviz (if installed)...

2008-07-13 Thread Uwe Ligges



Søren Højsgaard wrote:

The tricky part is not getting it through the checks on my computer. It is when I upload to CRAN I 
get the problems, because their computers need Rgraphviz as well... 
(Suggests does not seem to be the solution...)



Søren,

CRAN (which means my machine where Windows is concerned) has a running 
version of Rgraphviz nowadays (since a week or so). If you like, I can 
trigger updates of your packages.


Best wishes,
Uwe



Cheers
Søren



Fra: Duncan Murdoch [mailto:[EMAIL PROTECTED]
Sendt: sø 13-07-2008 00:36
Til: Søren Højsgaard
Cc: William Revelle; [EMAIL PROTECTED]
Emne: Re: [R] How to build a package which loads Rgraphviz (if installed)...



On 12/07/2008 6:27 PM, Søren Højsgaard wrote:

Bill,
Thanks for the suggestion, but it does not solve the problem; I get the same 
warning from rcmd check. I suspect that rcmd check actually checks that any 
package referred to in require() is declared in the DESCRIPTION file. From the 
version numbers from your 'psych' package I guess you are stuck with the same 
problem???


There are varying degrees of dependence.  Probably Suggests is what
you want.

Note that *you* need to have RGraphViz to make it through the checks,
but other users won't need it.

Duncan Murdoch
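
For what it's worth, the usual pattern for such a soft dependency looks roughly
like this (a sketch only; plotGraph and g are made-up names, not from Søren's
package): declare the package under Suggests and guard its use at run time.

    ## DESCRIPTION
    Suggests: Rgraphviz

    ## R code: guarded use of the suggested package
    plotGraph <- function(g, ...) {
      if (require("Rgraphviz", quietly = TRUE)) {
        plot(g, ...)    # Rgraphviz supplies the plot method for graph objects
      } else {
        stop("the 'Rgraphviz' package (Bioconductor) is needed for this function")
      }
    }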

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R crash with ATLAS precompiled Rblas.dll on Windows XP Core2 Duo

2008-07-13 Thread Uwe Ligges



Prof Brian Ripley wrote:
Yes, that Rblas.dll is known to be faulty, and the person who built it 
is unable to re-build it.  It needs to be removed from CRAN.



Whoops, I forgot to remove it and will do so this afternoon.

Uwe


(I've also tried to build on Core 2 Duo, and my Cygwin installation has 
a compiler crash during the build.)


On Tue, 8 Jul 2008, Law, Jason wrote:


I noticed a problem using R 2.7.1 on Windows XP SP2 with the precompiled
Atlas Rblas.dll.  Running the code below causes R to crash.  I started R
using Rgui --vanilla and am using the precompiled Atlas Rblas.dll from
cran.fhcrc.org dated 17-Jul-2007 05:04 for Core2 Duo.

The code that causes the crash:

x <- rnorm(100)
y <- rnorm(100)
z <- rnorm(100)
loess(z ~ x * y)

loess(z ~ x) does not cause a crash using the Atlas BLAS and neither does
running the above code with the Rblas.dll that came with R 2.7.1.  In
addition, the code runs fine using the Atlas BLAS under R 2.6.2.

The windows error information that is printed to the screen when R 
closes:


AppName: rgui.exe AppVer: 2.71.45970.0 ModName: rblas.dll
ModVer: 2.51.42199.0 Offset: 000501cc


sessionInfo returns:

R version 2.7.1 (2008-06-23)
i386-pc-mingw32

locale:
LC_COLLATE=English_United States.1252;LC_CTYPE=English_United
States.1252;LC_MONETARY=English_United
States.1252;LC_NUMERIC=C;LC_TIME=English_United States.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base

I checked the R FAQ, R for Windows FAQ, and the README associated with 
the

Atlas BLAS on CRAN and couldn't find any information related to possible
crash causes.  I've used the ATLAS BLAS for about 6 months on this 
machine

(it's a new machine) with R 2.6.2.

Using debug(stats:::simpleLoess), I've found that the crash occurs on the
first iteration of the line:

z <- .C("R_loess_raw", as.double(y), as.double(x),
    as.double(weights), as.double(robust), as.integer(D),
    as.integer(N), as.double(span), as.integer(degree),
    as.integer(nonparametric), as.integer(order.drop.sqr),
    as.integer(sum.drop.sqr), as.double(span * cell),
    as.character(surf.stat), fitted.values = double(N),
    parameter = integer(7), a = integer(max.kd),
    xi = double(max.kd), vert = double(2 * D),
    vval = double((D + 1) * max.kd), diagonal = double(N),
    trL = double(1), delta1 = double(1), delta2 = double(1),
    as.integer(surf.stat == "interpolate/exact"))

After that, I'm kind of stuck in terms of tracking it down.

Thanks for any input,

Jason Law
City of Portland, OR

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide 
http://www.R-project.org/posting-guide.html

and provide commented, minimal, self-contained, reproducible code.





__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Installing RWinEdt

2008-07-13 Thread Uwe Ligges



[EMAIL PROTECTED] wrote:

From the R console I invoke:


install.packages("RWinEdt")

and get:

Warning in install.packages("RWinEdt") :
  argument 'lib' is missing: using 'F:\Users\Kevin\Documents/R/win-library/2.7'
--- Please select a CRAN mirror for use in this session ---
trying URL 
'http://streaming.stat.iastate.edu/CRAN/bin/windows/contrib/2.7/RWinEdt_1.8-0.zip'
Content type 'application/zip' length 361598 bytes (353 Kb)
opened URL
downloaded 353 Kb

package 'RWinEdt' successfully unpacked and MD5 sums checked

The downloaded packages are in
F:\Users\Kevin\AppData\Local\Temp\RtmpOIlW0F\downloaded_packages
updating HTML package descriptions

So it seems to have worked. But when I use the 'library' command I get:


library(RWinEdt)

Error in file(file, "r") : cannot open the connection
In addition: Warning message:
In file(file, "r") :
  cannot open file 'F:\Program Files (x86)\WinEdt Team\WinEdt\R.ver': No such 
file or directory
Error : .onAttach failed in 'attachNamespace'
Error: package/namespace load failed for 'RWinEdt'

Any ideas on how I can install this package?


With administrator privileges, since it needs to write some files into 
the WinEdt directory.


Best wishes,
Uwe Ligges



Thank you.

Kevin

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] shapiro wilk normality test

2008-07-13 Thread Marta Colombo
Hi! 
Well, if you look at the output:
shapiro.test(rnorm(5000))

        Shapiro-Wilk normality test

 data:  rnorm(5000)
 W = 0.9997, p-value = 0.6205

You can see that the p-value is 0.6205, so you can't reject the normality 
hypothesis. 
H0: normal data    vs    H1: not normal
So the Shapiro-Wilk test is saying that your data are normal, and it's correct!
Bye
Marta
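
For contrast, a quick sketch of what a rejection looks like (clearly non-normal
input; the exact numbers will vary from run to run):

    shapiro.test(rexp(5000))   # exponential data: expect a very small p-value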


- Original Message -
From: C.H. [EMAIL PROTECTED]
To: Bunny, lautloscrew.com [EMAIL PROTECTED]
Cc: r-help@r-project.org
Sent: Sunday, 13 July 2008, 7:27:43
Subject: Re: [R] shapiro wilk normality test

You may consider the nortest package.

http://cran.r-project.org/web/packages/nortest/index.html

Regards,

CH

On Sat, Jul 12, 2008 at 11:30 PM, Bunny, lautloscrew.com
[EMAIL PROTECTED] wrote:
 Hi everybody,

 somehow I don't get the Shapiro-Wilk test for normality. I just can't find
 what the H0 is.

 i tried :

  shapiro.test(rnorm(5000))

        Shapiro-Wilk normality test

 data:  rnorm(5000)
 W = 0.9997, p-value = 0.6205


 If normality is the H0, the test says it's probably not normal, doesn't it?

 5000 is the biggest n allowed by the test...

 are there any other tests? (I know qqnorm already ;)

 thanks in advance

 matthias
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
CH Chan
Research Assistant - KWH
http://www.macgrass.com
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Assoociative array?

2008-07-13 Thread jim holtman
The reason for the empty levels was I did not put drop=TRUE on the
split to remove unused levels.  Here is the revised script:

 set.seed(1)  # start with a known number
 x <- data.frame(cat=sample(LETTERS[1:3], 20, TRUE), a=sample(letters[1:4], 20, TRUE),
                 b=runif(20))
 x
   cat a          b
1    A d 0.82094629
2    B a 0.64706019
3    B c 0.78293276
4    C a 0.55303631
5    A b 0.52971958
6    C b 0.78935623
7    C a 0.02333120
8    B b 0.47723007
9    B d 0.73231374
10   A b 0.69273156
11   A b 0.47761962
12   A c 0.86120948
13   C b 0.43809711
14   B a 0.24479728
15   C d 0.07067905
16   B c 0.09946616
17   C d 0.31627171
18   C a 0.51863426
19   B c 0.66200508
20   C b 0.40683019
 # drop unused groups from the split
 (z <- split(x, list(x$cat, x$a), drop=TRUE))
$B.a
   cat a         b
2    B a 0.6470602
14   B a 0.2447973

$C.a
   cat a          b
4    C a 0.55303631
7    C a 0.02333120
18   C a 0.51863426

$A.b
   cat a         b
5    A b 0.5297196
10   A b 0.6927316
11   A b 0.4776196

$B.b
  cat a         b
8   B b 0.4772301

$C.b
   cat a         b
6    C b 0.7893562
13   C b 0.4380971
20   C b 0.4068302

$A.c
   cat a         b
12   A c 0.8612095

$B.c
   cat a          b
3    B c 0.78293276
16   B c 0.09946616
19   B c 0.66200508

$A.d
  cat a         b
1   A d 0.8209463

$B.d
  cat a         b
9   B d 0.7323137

$C.d
   cat a          b
15   C d 0.07067905
17   C d 0.31627171

 # access the value ('b' in this instance); two ways- should be the same
 z[[1]]$b
[1] 0.6470602 0.2447973
 z$B.a$b
[1] 0.6470602 0.2447973
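
To pull that column out of every group at once, something like this works
(a small addition, base R only):

    lapply(z, function(d) d$b)         # the 'b' values for each cat/a combination
    sapply(z, function(d) mean(d$b))   # or a named vector of per-group means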






On Sun, Jul 13, 2008 at 1:26 AM,  [EMAIL PROTECTED] wrote:
 This is almost it. Maybe it is as good as can be expected. The only problem 
 that I see is that this seems to form a Category/SubCategory pair where none 
 existed in the original data. For example, A might have two sub-categories a 
 and b, and B might have two sub-categories c and d. As far as I can tell the 
 method that you outlined forms a Category/SubCategory pair like B a or B b 
 where none existed. This results in a lot of empty lists and it seems to take 
 a long time to generate. But if that is as good as it gets then I can live 
 with it.

 I know that I said one more question. But I have run into a problem. c <- 
 split(x, x$Category) returns a vector of the rows in each of the categories. 
 Now I would like to access the Quantity column within this split vector. I 
 can see it listed. I just can't access it. I have tried c[1]$Quantity and 
 c[1,2], both of which give me errors. Any ideas?

 Sorry this is so hard for me. I am more used to C type arrays and C type 
 arrays of structures. This seems to be somewhat different.

 Thank you.

 Kevin
  jim holtman [EMAIL PROTECTED] wrote:
 Is this something like what you were asking for?  The output of a
 'split' will be a list of the dataframe subsets for the categories you
 have specified.

  x <- data.frame(g1=sample(LETTERS[1:2], 30, TRUE),
                  g2=sample(letters[1:2], 30, TRUE),
                  g3=1:30)
  y <- split(x, list(x$g1, x$g2))
  str(y)
 List of 4
  $ A.a:'data.frame':   7 obs. of  3 variables:
   ..$ g1: Factor w/ 2 levels "A","B": 1 1 1 1 1 1 1
   ..$ g2: Factor w/ 2 levels "a","b": 1 1 1 1 1 1 1
   ..$ g3: int [1:7] 3 4 6 8 9 13 24
  $ B.a:'data.frame':   7 obs. of  3 variables:
   ..$ g1: Factor w/ 2 levels "A","B": 2 2 2 2 2 2 2
   ..$ g2: Factor w/ 2 levels "a","b": 1 1 1 1 1 1 1
   ..$ g3: int [1:7] 10 11 16 17 18 20 25
  $ A.b:'data.frame':   6 obs. of  3 variables:
   ..$ g1: Factor w/ 2 levels "A","B": 1 1 1 1 1 1
   ..$ g2: Factor w/ 2 levels "a","b": 2 2 2 2 2 2
   ..$ g3: int [1:6] 2 12 23 26 27 29
  $ B.b:'data.frame':   10 obs. of  3 variables:
   ..$ g1: Factor w/ 2 levels "A","B": 2 2 2 2 2 2 2 2 2 2
   ..$ g2: Factor w/ 2 levels "a","b": 2 2 2 2 2 2 2 2 2 2
   ..$ g3: int [1:10] 1 5 7 14 15 19 21 22 28 30
  y
 $A.a
g1 g2 g3
 3   A  a  3
 4   A  a  4
 6   A  a  6
 8   A  a  8
 9   A  a  9
 13  A  a 13
 24  A  a 24

 $B.a
g1 g2 g3
 10  B  a 10
 11  B  a 11
 16  B  a 16
 17  B  a 17
 18  B  a 18
 20  B  a 20
 25  B  a 25

 $A.b
g1 g2 g3
 2   A  b  2
 12  A  b 12
 23  A  b 23
 26  A  b 26
 27  A  b 27
 29  A  b 29

 $B.b
g1 g2 g3
 1   B  b  1
 5   B  b  5
 7   B  b  7
 14  B  b 14
 15  B  b 15
 19  B  b 19
 21  B  b 21
 22  B  b 22
 28  B  b 28
 30  B  b 30

  y[[2]]
g1 g2 g3
 10  B  a 10
 11  B  a 11
 16  B  a 16
 17  B  a 17
 18  B  a 18
 20  B  a 20
 25  B  a 25
 
 
 


 On Sat, Jul 12, 2008 at 8:51 PM,  [EMAIL PROTECTED] wrote:
  OK. Now I know that I am dealing with a data frame. One last question on 
  this topic. a <- read.csv() gives me a dataframe. If I have 'c <- split(x, 
  x$Category)', then what is returned by split in this case? c[1] seems to 
  be OK but c[2] is not right in my mind. If I run ci <- split(nrow(a), 
  a$Category), then ci[1] seems to be the rows associated with the first 
  category, ci[2] is the indices/rows associated with the second category, 
  etc. But this seems different than c[1], c[2], etc.
 
  Using the techniques below I 

Re: [R] Reading Multi-value data fields for descriptive analysis

2008-07-13 Thread jim holtman
This may do what you want:

 x <- read.table("/tempxx.txt", comment="", quote="", sep="|", header=TRUE, 
  as.is=TRUE)
 # split out by name
 z <- lapply(seq(nrow(x)), function(.row){
     .result <- NULL
     # construct the data output
     for (i in c('picnic', 'food', 'other')){
         .split <- strsplit(x[.row, ][[i]], ";#")
         .result <- rbind(.result, cbind(name=x[.row, ][['name']],
                                         field=i, value=unlist(.split)))
     }
     .result
 })


 z
[[1]]
 namefieldvalue
[1,] Yogi Bear picnic Yes
[2,] Yogi Bear food   Hamburgers
[3,] Yogi Bear food   Hot Dogs
[4,] Yogi Bear food   I rely on others to bring the good stuff
[5,] Yogi Bear other  \Softball
[6,] Yogi Bear other  Blanket
[7,] Yogi Bear other  I bring boo-boo, but he hides\

[[2]]
 name  fieldvalue
[1,] Boo-Boo picnic Yes
[2,] Boo-Boo food   Potato Salad
[3,] Boo-Boo food   Cole Slaw
[4,] Boo-Boo food   whatever Yogi doesn't eat
[5,] Boo-Boo other  Lawn Chairs
[6,] Boo-Boo other  Blanket
[7,] Boo-Boo other  my running shoes

[[3]]
 name  fieldvalue
[1,] Ranger Rick picnic No
[2,] Ranger Rick food   I told you I don't picnic
[3,] Ranger Rick other  a big net and handcuffs

[[4]]
  name  fieldvalue
 [1,] Magilla Gorilla picnic Yes
 [2,] Magilla Gorilla food   Hamburgers
 [3,] Magilla Gorilla food   Hot Dogs
 [4,] Magilla Gorilla food   Potato Salad
 [5,] Magilla Gorilla food   Cole Slaw
 [6,] Magilla Gorilla food   BBQ Chicken
 [7,] Magilla Gorilla other  Softball
 [8,] Magilla Gorilla other  Volleyball
 [9,] Magilla Gorilla other  Lawn Chairs
[10,] Magilla Gorilla other  Blanket



On Sun, Jul 13, 2008 at 12:56 AM, Hohm, Dale [EMAIL PROTECTED] wrote:
 Thanks for the reply Jim.

 Here is a representation of the data I want to analyze - 10 records as 
 requested.  Each line can easily include an ID number as below.

 So I want to determine a frequency or percentage of respondents that bring 
 each of the 5 foods (Hamburgers, Hot Dogs, Potato Salad, Cole Slaw and BBQ 
 Chicken) and how many Other write-ins there are.  The same for what else is 
 brought besides food (Softball, Volleyball, Lawn Chairs and Blanket) as well 
 as a count of Other write-ins.  I'll also need to be able to discern how 
 many brought Hamburgers AND a Blanket, or how many brought a Softball AND a 
 Volleyball, etc.

 ID|Your Name|Do you picnic?|What is your favorite picnic food?|What do you 
 bring besides food?
 1|Yogi Bear|Yes|Hamburgers;#Hot Dogs;#I rely on others to bring the good 
 stuff|Softball;#Blanket;#I bring boo-boo, but he hides
 2|Boo-Boo|Yes|Potato Salad;#Cole Slaw;#whatever Yogi doesn't eat|Lawn 
 Chairs;#Blanket;#my running shoes
 3|Ranger Rick|No|I told you I don't picnic|a big net and handcuffs
 4|Magilla Gorilla|Yes|Hamburgers;#Hot Dogs;#Potato Salad;#Cole Slaw;#BBQ 
 Chicken|Softball;#Volleyball;#Lawn Chairs;#Blanket
 5|Foghorn Leghorn|Yes|Hot Dogs;#Cole Slaw;#I say, I say, BBQ 
 Chicken?|Softball;#Blanket
 6|Peter Potamus|Yes|Hamburgers;#Hot Dogs;#anything, just a lot of 
 it|Softball;#Lawn Chairs;#hot air balloon
 7|Jonny Quest|No|too busy getting into and out of trouble|Hadji and Bandit
 8|Fleegle, Bingo, Drooper and Snorky|Yes|Hamburgers;#Hot Dogs;#Potato 
 Salad;#Cole Slaw;#A banana split|a laugh track
 9|George Jetson|No|Mr. Spacely is making me work|Lawn Chairs;#Blanket;#my 
 flying car
 10|Snagglepuss|Yes|Hamburgers;#Hot Dogs;#Potato Salad;#Cole Slaw;#BBQ 
 Chicken|Softball;#Heavens to Murgatroyd!  Exit stage left!

 Thanks in advance,

 Dale

 -Original Message-
 From: jim holtman [mailto:[EMAIL PROTECTED]
 Sent: Saturday, July 12, 2008 11:32 AM
 To: Hohm, Dale
 Cc: r-help@r-project.org
 Subject: Re: [R] Reading Multi-value data fields for descriptive analysis

 Can you provide a more complete example (say 10 lines) of what the
 input is like. Does each line have a unique index that can be related
 to it?  Do you want to summarize all the multi1-n values of Col2?  Do
 you want to know the percentage of input lines that have a
 Col3/multi-value4 on them?  You could read in the data as you have
 indicated below and add a column that is the record number and
 therefore you would not have to worry about trying to say if it
 existed or not.  For example, you might have:

 Rec#|col#|value
 1|1|single
 1|2|multi1
 1|2|multi2
 1|3|multi1
 2|1|single
 3|1|single
 3|2|multi1
 

 There are a number of potential ways of representing the data, but a
 lot depends on what you want to do with it, so a more extensive
 example of the input, along with the type of output you would like
 will help in providing an answer.

 On Sat, Jul 12, 2008 at 12:37 PM, Hohm, Dale [EMAIL PROTECTED] wrote:
 Hello,

 I'm looking for help on the best approach to get multi-value data fields 
 into R for simple descriptive analysis.

 -

 I am new to this list and new to R, but I really want to get over the hump 
 and get productive with it.  Some help with how to best get the 

Re: [R] shapiro wilk normality test

2008-07-13 Thread Frank E Harrell Jr

Marta Colombo wrote:
Hi! 
Well, if you look at the output:

shapiro.test(rnorm(5000))

        Shapiro-Wilk normality test

data:  rnorm(5000)
W = 0.9997, p-value = 0.6205


You can see that the p-value is 0.6205, so you can't reject the normality hypothesis. 
H0: normal data    vs H1: not normal

So shapiro.wilk test is saying that your data are normal and it's correct!
Bye
Marta


A large P-value means nothing more than needing more data.  No 
conclusion is possible.  Please read the classic paper Absence of 
Evidence is not Evidence for Absence.


Your first sentence is correct, but not the second.

Why test for normality?  What downstream method depends on it?  If 
normality is in doubt why not use a method that doesn't require it?


Frank Harrell




- Original Message -
From: C.H. [EMAIL PROTECTED]
To: Bunny, lautloscrew.com [EMAIL PROTECTED]
Cc: r-help@r-project.org
Sent: Sunday, 13 July 2008, 7:27:43
Subject: Re: [R] shapiro wilk normality test

You may consider the nortest package.

http://cran.r-project.org/web/packages/nortest/index.html

Regards,

CH

On Sat, Jul 12, 2008 at 11:30 PM, Bunny, lautloscrew.com
[EMAIL PROTECTED] wrote:

Hi everybody,

somehow I don't get the Shapiro-Wilk test for normality. I just can't find
what the H0 is.

i tried :

  shapiro.test(rnorm(5000))

        Shapiro-Wilk normality test

data:  rnorm(5000)
W = 0.9997, p-value = 0.6205


If normality is the H0, the test says it's probably not normal, doesn't it?

5000 is the biggest n allowed by the test...

are there any other tests? (I know qqnorm already ;)

thanks in advance

matthias
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.








__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Frank E Harrell Jr   Professor and Chair   School of Medicine
 Department of Biostatistics   Vanderbilt University

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] shapiro wilk normality test

2008-07-13 Thread Ted Harding
On 13-Jul-08 13:29:13, Frank E Harrell Jr wrote:
 [...]
 A large P-value means nothing more than needing more data.  No 
 conclusion is possible.  Please read the classic paper Absence of 
 Evidence is not Evidence for Absence.

Is that ironic, Frank, or is there really a classic paper with
that title? If so, I'd be pleased to have a reference to it!

Thanks,
Ted.


E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 13-Jul-08   Time: 15:55:35
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] shapiro wilk normality test

2008-07-13 Thread Charles Annis, P.E.
http://www.bmj.com/cgi/content/full/311/7003/485

Charles Annis, P.E.

[EMAIL PROTECTED]
phone: 561-352-9699
eFax:  614-455-3265
http://www.StatisticalEngineering.com
 

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of Ted Harding
Sent: Sunday, July 13, 2008 10:56 AM
To: Frank E Harrell Jr
Cc: r-help@r-project.org
Subject: Re: [R] shapiro wilk normality test

On 13-Jul-08 13:29:13, Frank E Harrell Jr wrote:
 [...]
 A large P-value means nothing more than needing more data.  No 
 conclusion is possible.  Please read the classic paper Absence of 
 Evidence is not Evidence for Absence.

Is that ironic, Frank, or is there really a classic paper with
that title? If so, I'd be pleased to have a reference to it!

Thanks,
Ted.


E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 13-Jul-08   Time: 15:55:35
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] shapiro wilk normality test

2008-07-13 Thread Berwin A Turlach
G'day all,

On Sun, 13 Jul 2008 15:55:38 +0100 (BST)
(Ted Harding) [EMAIL PROTECTED] wrote:

 On 13-Jul-08 13:29:13, Frank E Harrell Jr wrote:
  [...]
  A large P-value means nothing more than needing more data.  No 
  conclusion is possible.  

I would have thought that "we need more data" would qualify as a
conclusion. :)

  Please read the classic paper Absence of Evidence is not Evidence
  for Absence.
 
 Is that ironic, Frank, or is there really a classic paper with
 that title? If so, I'd be pleased to have a reference to it!

Of course, I do not know for sure which paper Frank has in mind, but
Google and Google Scholar readily come up with papers/editorials that
have a nearly identical title:

http://www.bmj.com/cgi/content/full/311/7003/485
http://bmj.bmjjournals.com/cgi/content/full/328/7438/476
(see also
http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=351831)
http://www.ncbi.nlm.nih.gov/pubmed/6829975

My money is on Frank having the first of these publications in mind.

Cheers,

Berwin

=== Full address =
Berwin A TurlachTel.: +65 6516 4416 (secr)
Dept of Statistics and Applied Probability+65 6516 6650 (self)
Faculty of Science  FAX : +65 6872 3919   
National University of Singapore 
6 Science Drive 2, Blk S16, Level 7  e-mail: [EMAIL PROTECTED]
Singapore 117546http://www.stat.nus.edu.sg/~statba

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Another packaging question

2008-07-13 Thread Johannes Huesing
Uwe Ligges [EMAIL PROTECTED] [Sat, Jul 12, 2008 at 11:48:38AM CEST]:


 Johannes Huesing wrote:
 I am still trying to build a package. At the moment I am stuck with a
 file not found error message when processing R code from the tests
 subdirectory. What would be the accurate relative path for files in
 the tests directory to access files in the data directory?


 You do not need a path. Just say
   data(your_data's_name)
 (if data is not already under Lazy Loading) in the R code. I guess you  
 rely on an installed package when the tests are executed, don't you?

Yes, that did it, thank you. 

Yes, and I had to learn that I have to include library(packagename) 
in the example files.
-- 
Johannes Hüsing   There is something fascinating about science. 
  One gets such wholesale returns of conjecture 
mailto:[EMAIL PROTECTED]  from such a trifling investment of fact.  
  
http://derwisch.wikidot.com (Mark Twain, Life on the Mississippi)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] initialize a factor vector

2008-07-13 Thread Johannes Huesing
What is the least surprising way of initializing a factor with 
predefined levels and with length 0? 
as.factor(c("eins", "zwei", "drei"))[FALSE] 
does the job but looks a bit weird.

-- 
Johannes Hüsing   There is something fascinating about science. 
  One gets such wholesale returns of conjecture 
mailto:[EMAIL PROTECTED]  from such a trifling investment of fact.  
  
http://derwisch.wikidot.com (Mark Twain, Life on the Mississippi)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Another packaging question

2008-07-13 Thread Uwe Ligges



Johannes Huesing wrote:

Uwe Ligges [EMAIL PROTECTED] [Sat, Jul 12, 2008 at 11:48:38AM CEST]:


Johannes Huesing wrote:

I am still trying to build a package. At the moment I am stuck with a
file not found error message when processing R code from the tests
subdirectory. What would be the accurate relative path for files in
the tests directory to access files in the data directory?


You do not need a path. Just say
  data(your_data's_name)
(if data is not already under Lazy Loading) in the R code. I guess you  
rely on an installed package when the tests are executed, don't you?


Yes, that did it, thank you. 

Yes, and I had to learn that I have to include library(packagename) 
in the example files.


Not in the examples but in the tests, as far as I know.

Uwe

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] shapiro wilk normality test

2008-07-13 Thread Ted Harding
Many thanks to Berwin, and also to Charles Annis, for the
references. The're good!
Ted.

On 13-Jul-08 15:22:03, Berwin A Turlach wrote:
 G'day all,
 
 On Sun, 13 Jul 2008 15:55:38 +0100 (BST)
 (Ted Harding) [EMAIL PROTECTED] wrote:
 
 On 13-Jul-08 13:29:13, Frank E Harrell Jr wrote:
  [...]
  A large P-value means nothing more than needing more data.  No 
  conclusion is possible.  
 
 I would have thought that "we need more data" would qualify as a
 conclusion. :)
 
  Please read the classic paper Absence of Evidence is not Evidence
  for Absence.
 
 Is that ironic, Frank, or is there really a classic paper with
 that title? If so, I'd be pleased to have a reference to it!
 
 Of course, I do not know for sure which paper Frank has in mind, but
 Google and Google Scholar readily come up with papers/editorials that
 have a nearly identical title:
 
 http://www.bmj.com/cgi/content/full/311/7003/485
 http://bmj.bmjjournals.com/cgi/content/full/328/7438/476
 (see also
 http://www.pubmedcentral.nih.gov/articlerender.fcgi?artid=351831)
 http://www.ncbi.nlm.nih.gov/pubmed/6829975
 
 My money is on Frank having the first of these publications in mind.
 
 Cheers,
 
   Berwin
 
 === Full address =
 Berwin A TurlachTel.: +65 6516 4416 (secr)
 Dept of Statistics and Applied Probability+65 6516 6650 (self)
 Faculty of Science  FAX : +65 6872 3919   
 National University of Singapore 
 6 Science Drive 2, Blk S16, Level 7  e-mail: [EMAIL PROTECTED]
 Singapore 117546http://www.stat.nus.edu.sg/~statba
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


E-Mail: (Ted Harding) [EMAIL PROTECTED]
Fax-to-email: +44 (0)870 094 0861
Date: 13-Jul-08   Time: 18:01:51
-- XFMail --

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how do I read only specific columns using read.csv or other read function

2008-07-13 Thread Juliet Hannah
I was not able to follow the solution posted. Could you demonstrate
this technique on an example
data set. Thanks!

dat <- data.frame(a = letters[1:3], b = LETTERS[1:3], c = 1:3, d = 3:1)

On Wed, Jul 2, 2008 at 1:13 PM, Charles C. Berry [EMAIL PROTECTED] wrote:
 On Wed, 2 Jul 2008, Ben Tupper wrote:


 On Jul 2, 2008, at 6:53 AM, Philip James Smith wrote:

 Hi R people:

 I have huge files with as many as 5000 columns. I'd really like to read
 only certain columns of those files. I know column names I want to read.

 I looked at the documentation of read.csv . Although there is a col.names
 option, it allows users to specify the names of the columns, rather than to
 pick the columns of interest.

 Any suggestions on how to pick the columns I want to read only, rather
 than the entire file, would be greatly appreciated.


 There is a unix utility called 'cut' that enables stuff like

   columns.1.3.5.to.7 <- read.csv( pipe( "cut -d, -f1,3,5-7 your.file" ) )

 and using

col.pos <- match(names.of.variables.you.want,
                 scan("your.file", what=character(0), nlines=1))

 will enable you to set up the call to pipe.

 HTH,

 Chuck



 Hello,

  I think you want to explicitly set the colClasses argument such that the
  columns you *don't* want are set to "NULL" and all others are set to
  appropriate classes.

 Cheers,
 Ben
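
 A small demonstration of the colClasses approach on the example data frame
 above (a sketch; the file name is made up):

     dat <- data.frame(a = letters[1:3], b = LETTERS[1:3], c = 1:3, d = 3:1)
     write.csv(dat, "yourFrame.csv", row.names = FALSE)
     # read back only columns b and d; "NULL" drops a column entirely
     read.csv("yourFrame.csv",
              colClasses = c("NULL", "character", "NULL", "integer"))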






 Phil Smith
 Duluth, GA

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 Ben Tupper
 [EMAIL PROTECTED]

 I GoodSearch for Ashwood Waldorf School.

 Raise money for your favorite charity or school just by searching the
 Internet with GoodSearch - www.goodsearch.com - powered by Yahoo!

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


 Charles C. Berry(858) 534-2098
Dept of Family/Preventive
 Medicine
 E mailto:[EMAIL PROTECTED]   UC San Diego
 http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] mpirun question with Rmpi

2008-07-13 Thread Martin Morgan
Erin Hodgess [EMAIL PROTECTED] writes:

 Dear R People:

 I'm running Rmpi on a single machine and I have the following
 statement from the command line:

  mpirun -np 3 ./R --no-save  eek1.in stuff4.out

All three versions of eek1.in write to the same location, over-writing
one another. You happen to see the results of the third process; some
other time you might see the three outputs intermingled.

The solutions are to have eek1.in create an appropriate output file
(e.g., using mpi.comm.rank() to uniquify a base name) or to design
eek1.in so that one of the nodes collates and outputs the results from
all the others (i.e., only one node writes; a typical solution might
include mpi.gather.Robj followed by a conditional based on
mpi.comm.rank).

Hope that helps.

Martin
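
A minimal sketch of the first option (this is not Erin's actual script; the
file names are made up for illustration):

    library(Rmpi)
    rank <- mpi.comm.rank()   # process rank; the comm argument may need adjusting for mpirun-launched jobs
    sink(sprintf("stuff4_rank%d.out", rank))   # hypothetical per-process output file
    ## ... the real work from eek1.in goes here ...
    sink()
    mpi.quit(save = "no")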

 The stuff4.out file only contains the third result.  Is there a way to
 fix this such that it shows all 3 sets, please

 Thanks in advance,
 Erin

 -- 
 Erin Hodgess
 Associate Professor
 Department of Computer and Mathematical Sciences
 University of Houston - Downtown
 mailto: [EMAIL PROTECTED]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

-- 
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109

Location: Arnold Building M2 B169
Phone: (206) 667-2793

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] initialize a factor vector

2008-07-13 Thread Gavin Simpson
On Sun, 2008-07-13 at 18:47 +0200, Johannes Huesing wrote:
 What is the least surprising way of initializing a factor with 
 predefined levels and with length 0? 
  as.factor(c("eins", "zwei", "drei"))[FALSE] 
 does the job but looks a bit weird.
 

Notice that one does not need to specify any data as argument 'x' to
factor() because, by default, x = character(). Therefore, we need only
specify the levels we want:

 factor(levels = c("one", "two", "three"))
factor(0)
Levels: one two three

HTH

G
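
A quick illustration of using such an empty factor (a small sketch; assigning a
value outside the predefined levels would give NA with a warning):

    f <- factor(levels = c("eins", "zwei", "drei"))
    length(f)        # 0
    f[1] <- "zwei"   # grows the vector; the value must match one of the levels
    f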

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Excel Trend Function

2008-07-13 Thread Daniel Malter
Disputed was probably not the correct wording for it. However, imputation
means that you make assumptions regarding the distribution of your missing
data dependent on the data that is available to you. Felipe had time series
data, and it is common to predict from the past to the future in such
models. However, I tried to outline why making these assumptions may be
critical in Felipe's case due to the environment in which his data was
missing (namely that the distribution of his variable around the missing
values seemed odd and raised questions about whether or not they were measured
without error). 

The rule of thumb for imputation that I remember is: If you are not sure how
plausible the assumptions were that you would have to make - drop it. But
better, since newer, advice may be available on that. However, I cannot
remember reading a single study in my field in which data was imputed (I
don't say there is none, but if they exist, they must be very rare), meaning
that observations with missing data are typically dropped. Therefore, how
you will be received for imputing data does not only depend on the
particular application, but also on the research tradition of your field.
 
General sources that review data imputation as well as associated problems
are:

Horton and Lipsitz, 2001, The American Statistician
Schafer and Graham, 2002, Psychological Methods

Best,
Daniel

-
cuncta stricte discussurus
-

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On
Behalf Of paulandpen
Sent: Sunday, July 13, 2008 4:27 AM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: Re: [R] Excel Trend Function

Hi Felipe,

Daniel mentions imputation is a disputed practice.  There are
recommendations and rules of thumb for its use.  I am not sure that
imputation is disputed.  I would be interested to see some links to articles
recommending against its use.

Paul


- Original Message -
From: Felipe Carrillo [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Sent: Sunday, July 13, 2008 5:46 AM
Subject: [R] Excel Trend Function


 Hi:
 I have a dataset and need to interpolate for missing days. In Excel I 
 either average from sampled days from above and below the missing days or 
 use the TREND function to make up for the missing values. I have been 
 reading about na.approx, is this function similar to the TREND function? 
 Which is the best recommendable way to make up for missing data?
 Here's my dataset: weeks 17,18,26 and 46 have 0 daysSamp.

 Year Week daysSamp Lower TotalPD Upper varTotalPD
 2006 47 6 126988 188259 249530 1045878675
 2006 48 7 189155 253350 317545 1148102355
 2006 49 7 103300 132741 162182 241480186
 2006 50 6 11801 252576 493352 16151006813
 2006 51 7 2348 3671 4994 487926
 2006 52 5 2606 29901 57196 215454181
 2006 2 7 2968 4513 6058 664723
 2006 3 7 1128 1889 2650 161231
 2006 4 7 479 963 1447 65196
 2006 5 7 2819 4413 6007 708094
 2006 6 6 -1009 3128 7264 4766743
 2006 7 7 -5239 10769 26777 71387835
 2006 8 7 150 503 856 34685
 2006 9 7 1858 2989 4120 356562
 2006 10 7 193 494 795 25281
 2006 11 7 125 346 567 13627
 2006 12 7 432 767 1102 31189
 2006 13 7 1229 1867 2505 113569
 2006 14 7 813 1339 1865 77140
 2006 15 4 -66 124 315 10105
 2006 16 7 152 903 1654 157242
 2006 17 0
 2006 18 0
 2006 19 5 0 0 0 0
 2006 20 4 0 0 0 0
 2006 21 5 0 0 0 0
 2006 22 6 0 0 0 0
 2006 23 7 -65 285 635 34112
 2006 24 6 0 0 0 0
 2006 25 7 0 0 0 0
 2006 26 0
 2006 27 4 228 931 1634 137726
 2006 28 4 801 2231 3662 569977
 2006 29 4 4544 9242 13939 6147522
 2006 30 5 15798 28465 41131 44697915
 2006 31 5 25398 41049 56701 68245523
 2006 32 5 48197 82216 116235 322416917
 2006 33 5 142980 230411 317841 2129630128
 2006 34 5 227141 360468 493794 4952314336
 2006 35 5 467244 756325 1045405 23281569629
 2006 36 5 281049 463331 645614 9256900449
 2006 37 2 227636 620330 1013023 42961663047
 2006 38 3 478990 983472 1487954 70903343603
 2006 39 7 539690 846522 1153354 26228718974
 2006 40 7 320959 457866 594773 5221891252
 2006 41 7 427561 582452 737343 6683813344
 2006 42 7 271788 351103 430418 1752614293
 2006 43 7 165019 208853 252687 535301133
 2006 44 7 91514 117390 143266 186537178
 2006 45 7 59061 79187 99313 112842787
 2006 46 0

 Felipe D. Carrillo
 Supervisory Fishery Biologist
 Department of the Interior
 US Fish  Wildlife Service
  California, USA

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide 
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__

[R] rm anova

2008-07-13 Thread korhan ozkan
Dear all,
I am new to R and very happy with it so far :)
I would like to ask about an issue with rm-anova.
I have data from an experiment with 24 subjects, 3 treatments (8 replicates for 
each treatment) and 8 samplings through time. The data sheet is something like 
this (just an example, not real data):
sample id,response(tp),treatment,date
a1,119,HP,27 june
a2,120,MP,27 june
a3,150,C,27 june
a4,100,C,27 june
..
..
a1,90 HP, 7 july
a2,80,MP,7 july
a3,170,C,7 july
a4,50,C,7 july
.
.
. 
Is it correct to formulate the rm-anova as
demo <- aov(tn_mgl ~ factor(TN)*factor(prefix) + 
    Error(sample/(factor(TN)+factor(prefix))))
thanks in advance, best regards
korhan  ozkan
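
For comparison, one common way to write a design like the one described
(treatment between subjects, date as the repeated factor within subjects) is
sketched below. The column names are taken from the example data sheet and
'mydata' is a hypothetical data frame; this is an illustration only, not a
verdict on the formula above.

    # sample_id, tp, treatment, date as in the example data sheet; mydata is hypothetical
    demo <- aov(tp ~ treatment * date + Error(sample_id/date), data = mydata)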

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Quick plotmath question

2008-07-13 Thread Mike Lawrence
While the earlier solutions involving expression() and paste() work  
great, unfortunately Gabor's first suggestion doesn't display on the  
OS X default quartz device, and Gabor's second suggestion displays on  
quartz, but not to the pdf() device.


In any event, the first replies in this thread provide a sufficient  
solution for me, so thanks all!


Mike
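
For the record, one more option that stays inside plotmath (a sketch; "\u226B"
is the MUCH GREATER-THAN sign, and whether it renders depends on the device and
font):

    plot(1, main = expression(Delta * i ~ "\u226B" ~ 0))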

On 12-Jul-08, at 1:40 PM, Gabor Grothendieck wrote:


And this gives a slightly different one:

plot(1, main = "\u394i \ubb 0")


2008/7/12 Gabor Grothendieck [EMAIL PROTECTED]:

This works on my Windows Vista system:

plot(1, main = "\u394i \u300b 0")

See:
http://www.fileformat.info/info/unicode/char/300b/index.htm
http://www.fileformat.info/info/unicode/char/394/index.htm

On Sat, Jul 12, 2008 at 10:12 AM, Mike Lawrence  
[EMAIL PROTECTED] wrote:

Hi all,

Worked  looked around for a while on this to no avail. I'm trying  
to create

a plotmath expression that achieves:
Δi >> 0

and while:
expression(Delta*i > 0)

comes close, I'd prefer to have the >> (denoting very much  
greater than).

Maybe >> is a non-standard expression and therefore not supported?

Mike



--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

www.memetic.ca

The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less.
  - Piet Hein

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





--
Mike Lawrence
Graduate Student, Department of Psychology, Dalhousie University

www.memetic.ca

The road to wisdom? Well, it's plain and simple to express:
Err and err and err again, but less and less and less.
- Piet Hein

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] initialize a factor vector

2008-07-13 Thread Johannes Huesing
Gavin Simpson [EMAIL PROTECTED] [Sun, Jul 13, 2008 at 08:18:37PM CEST]:
 On Sun, 2008-07-13 at 18:47 +0200, Johannes Huesing wrote:
[...]
  as.factor(c("eins", "zwei", "drei"))[FALSE] 
  does the job but looks a bit weird.
  
[...]
  factor(levels = c("one", "two", "three"))
 factor(0)
 Levels: one two three

Ah, ok, I was unaware of factor(), only knew as.factor(). 
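
A minimal sketch of the difference, with purely illustrative level names:

f <- factor(levels = c("one", "two", "three"))
f                          # factor(0), but the levels are already in place
f[1:2] <- c("two", "one")  # assigned values must match the predefined levels
f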

Many thanks


Johannes
-- 
Johannes Hüsing   There is something fascinating about science. 
  One gets such wholesale returns of conjecture 
mailto:[EMAIL PROTECTED]  from such a trifling investment of fact.  
  
http://derwisch.wikidot.com (Mark Twain, Life on the Mississippi)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] shapiro wilk normality test

2008-07-13 Thread Johannes Huesing
Frank E Harrell Jr [EMAIL PROTECTED] [Sun, Jul 13, 2008 at 08:07:37PM CEST]:
 (Ted Harding) wrote:
 On 13-Jul-08 13:29:13, Frank E Harrell Jr wrote:
 [...]
 A large P-value means nothing more than needing more data.  No  
 conclusion is possible.  Please read the classic paper Absence of  
 Evidence is not Evidence for Absence.

[...]

 It's real.  Full text is available to all:  
 http://www.bmj.com/cgi/content/full/311/7003/485

The quotation is attributed to the late Carl Sagan, who 
seemed to have used it as a strawman argument; see 
http://oyhus.no/AbsenceOfEvidence.html.

-- 
Johannes Hüsing   There is something fascinating about science. 
  One gets such wholesale returns of conjecture 
mailto:[EMAIL PROTECTED]  from such a trifling investment of fact.  
  
http://derwisch.wikidot.com (Mark Twain, Life on the Mississippi)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how do I read only specific columns using read.csv or other read function

2008-07-13 Thread Charles C. Berry

On Sun, 13 Jul 2008, Juliet Hannah wrote:


I was not able to follow the solution posted. Could you demonstrate
this technique on an example
data set. Thanks!

dat <- data.frame(a = letters[1:3], b = LETTERS[1:3], c = 1:3, d = 3:1)


Using your example:


dat <- data.frame(a = letters[1:3], b = LETTERS[1:3], c = 1:3, d = 3:1)
write.csv(dat, file="yourFrame.csv")
col.pos <- match(c("b","d"), 
scan("yourFrame.csv", sep=',', what=character(0), nlines=1))

Read 5 items

con <- pipe( paste( "cut -d, -f", paste(col.pos, collapse=','),  
" yourFrame.csv", sep=''))
cols.b.d <- read.csv( con )
cols.b.d

  b d
1 A 3
2 B 2
3 C 1




HTH,

Chuck




On Wed, Jul 2, 2008 at 1:13 PM, Charles C. Berry [EMAIL PROTECTED] wrote:

On Wed, 2 Jul 2008, Ben Tupper wrote:



On Jul 2, 2008, at 6:53 AM, Philip James Smith wrote:


Hi R people:

I have huge files with as many as 5000 columns. I'd really like to read
only certain columns of those files. I know column names I want to read.

I looked at the documentation of read.csv . Although there is a col.names
option, it allows users to specify the names of the columns, rather than to
pick the columns of interest.

Any suggestions on how to pick the columns I want to read only, rather
than the entire file, would be greatly appreciated.



There is a unix utility called 'cut' that enables stuff like

   columns.1.3.5.to.7 <- read.csv( pipe( "cut -d, -f1,3,5-7 your.file" ) )

and using

   col.pos <- match(names.of.variables.you.want,
scan("your.file", what=character(0), nlines=1 ))

will enable you to set up the call to pipe.

HTH,

Chuck





Hello,

I think you want explicitly set the colClasses argument such that the
columns you *don't* want are set to NULL and all others are set to
appropriate classes.

Cheers,
Ben
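
A sketch of the colClasses route Ben describes, reusing the yourFrame.csv
written above (five columns: the row names plus a to d):

cc <- rep("NULL", 5)   # "NULL" drops a column entirely
cc[c(3, 5)] <- NA      # NA lets read.csv work out the class; keeps b and d
read.csv("yourFrame.csv", colClasses = cc)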







Phil Smith
Duluth, GA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Ben Tupper
[EMAIL PROTECTED]

I GoodSearch for Ashwood Waldorf School.

Raise money for your favorite charity or school just by searching the
Internet with GoodSearch - www.goodsearch.com - powered by Yahoo!

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry(858) 534-2098
   Dept of Family/Preventive
Medicine
E mailto:[EMAIL PROTECTED]   UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] (no subject)

2008-07-13 Thread Paul Adams
Hello everyone,
I am using the following code to try to calculate the mean :
dat <- read.table(file="C:\\Documents and  Settings.txt")
dat <- as.numeric(dat)
x1.m <- mean(dat)
I am getting the following error message
Error in eval.with.vis(expr, envir, enclos):
(list) object cannot be coerced to type 'double'
I do not understand what is wrong, as I thought that I had changed 
dat to a numeric. Whenever I list x1.m all I get are NA
Thank you
Paul


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] (no subject)

2008-07-13 Thread jim holtman
What does 'str(dat)' show?  the statement

dat <- as.numeric(dat)

says you are trying to make an entire dataframe numeric.  This is
probably not what you want to do.  What is it you want to do?  Have
you tried

summary(dat)

e.g.,

 x <- data.frame(a=1:10, b=101:110, c=letters[1:10])
 summary(x)
   a   b   c
 Min.   : 1.00   Min.   :101.0   a  :1
 1st Qu.: 3.25   1st Qu.:103.2   b  :1
 Median : 5.50   Median :105.5   c  :1
 Mean   : 5.50   Mean   :105.5   d  :1
 3rd Qu.: 7.75   3rd Qu.:107.8   e  :1
 Max.   :10.00   Max.   :110.0   f  :1
 (Other):4
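
If the underlying goal is a mean for each numeric column, a small sketch
(toy data as above) that avoids coercing the whole data frame:

x <- data.frame(a=1:10, b=101:110, c=letters[1:10])
sapply(x[sapply(x, is.numeric)], mean)   # means of the numeric columns only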


On Sun, Jul 13, 2008 at 4:05 PM, Paul Adams [EMAIL PROTECTED] wrote:
 Hello everyone,
 I am using the following code to try to calculate the mean :
 dat <- read.table(file="C:\\Documents and  Settings.txt")
 dat <- as.numeric(dat)
 x1.m <- mean(dat)
 I am getting the following error message
 Error in eval.with.vis(expr,envir,enclos):
 (list) object cannot be coerced to typedouble'
 I do not understand what is wrong as I thought that I have changed
 dat to a numeric.Whenever I list x1.m all I get are NA
 Thank you
 Paul



[[alternative HTML version deleted]]


 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.





-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] stem and leaf plot: how to edit the stem-values

2008-07-13 Thread Jörg Groß

Hi,

I would like to make a stem and leaf plot and I want to edit the  
category-names.


So, by doing this:

 x <- c(1,2,2,3,3,3,3,2,2,1)
 stem(x)

I get:

  1 | 00
  1 |
  2 | 0000
  2 |
  3 | 0000

First Question: Why do I get gaps between the categories?
(like in line 2 and line 4)

And second: How can I edit the categories so that I can create  
something like that:



  category A | 00
  category B | 0000
  category C | 0000

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Assoociative array?

2008-07-13 Thread rkevinburton
Thank you I will try drop=TRUE.

In the mean time do you know how I can access the members (for lack of a better 
term) of the results of a split? In the sample you provided below you have:

z <- split(x, list(x$cat, x$a), drop=TRUE)

Now I can print out 'z[1], z[2] etc'. This is nice, but what if I want to 
access/iterate through all of the members of a particular column in z? You have 
given some methods like z[[1]]$b to access the specific columns in z. I notice 
for your example z[[1]]$b prints out two values. Can I assume that z[[1]]$b is 
a vector? So if I want to find the mean I can 'mean(z[[1]]$b)' and it will give 
me the mean value of the b columns in z? (similarly sum, and range, etc.). 
Does nrows(z[[1]]$b) return two in your example below? I would like to find out 
how many elements are in z[1]. Or would it be just as fast to do 'nrows(z[1])'?

Thank you for this extended session on data frames, matrices, and vectors. I 
feel much more comfortable with the concepts now.

Kevin
 jim holtman [EMAIL PROTECTED] wrote: 
 The reason for the empty levels was I did not put drop=TRUE on the
 split to remove unused levels.  Here is the revised script:
 
  set.seed(1)  # start with a known number
  x <- data.frame(cat=sample(LETTERS[1:3],20,TRUE), a=sample(letters[1:4], 20, 
  TRUE), b=runif(20))
  x
cat a  b
 1A d 0.82094629
 2B a 0.64706019
 3B c 0.78293276
 4C a 0.55303631
 5A b 0.52971958
 6C b 0.78935623
 7C a 0.02333120
 8B b 0.47723007
 9B d 0.73231374
 10   A b 0.69273156
 11   A b 0.47761962
 12   A c 0.86120948
 13   C b 0.43809711
 14   B a 0.24479728
 15   C d 0.07067905
 16   B c 0.09946616
 17   C d 0.31627171
 18   C a 0.51863426
 19   B c 0.66200508
 20   C b 0.40683019
  # drop unused groups from the split
  (z <- split(x, list(x$cat, x$a), drop=TRUE))
 $B.a
cat a b
 2B a 0.6470602
 14   B a 0.2447973
 
 $C.a
cat a  b
 4C a 0.55303631
 7C a 0.02333120
 18   C a 0.51863426
 
 $A.b
cat a b
 5A b 0.5297196
 10   A b 0.6927316
 11   A b 0.4776196
 
 $B.b
   cat a b
 8   B b 0.4772301
 
 $C.b
cat a b
 6C b 0.7893562
 13   C b 0.4380971
 20   C b 0.4068302
 
 $A.c
cat a b
 12   A c 0.8612095
 
 $B.c
cat a  b
 3B c 0.78293276
 16   B c 0.09946616
 19   B c 0.66200508
 
 $A.d
   cat a b
 1   A d 0.8209463
 
 $B.d
   cat a b
 9   B d 0.7323137
 
 $C.d
cat a  b
 15   C d 0.07067905
 17   C d 0.31627171
 
  # access the value ('b' in this instance); two ways- should be the same
  z[[1]]$b
 [1] 0.6470602 0.2447973
  z$B.a$b
 [1] 0.6470602 0.2447973
 
 
 
 
 
 
 On Sun, Jul 13, 2008 at 1:26 AM,  [EMAIL PROTECTED] wrote:
  This is almost it. Maybe it is as good as can be expected. The only problem 
  that I see is that this seems to form a Category/SubCategory pair where 
  none existed in the original data. For example, A might have two 
  sub-categories a and b, and B might have two categories c and d. As far as 
  I can tell the method that you outlined forms a Category/SubCategory pair 
  like B a or B b where none existed. This results in a lot of empty lists and 
  it seems to take a long time to generate. But if that is as good as it gets 
  then I can live with it.
 
  I know that I said one more question. But I have run into a problem. c <- 
  split(x, x$Category) returns a vector of the rows in each of the 
  categories. Now I would like to access the Quantity column within this 
  split vector. I can see it listed. I just can't access it. I have tried 
  c[1]$Quantity and c[1,2] both which give me errors. Any ideas?
 
  Sorry this is so hard for me. I am more used to C type arrays and C type 
  arrays of structures. This seems to be somewhat different.
 
  Thank you.
 
  Kevin
   jim holtman [EMAIL PROTECTED] wrote:
  Is this something like what you were asking for?  The output of a
  'split' will be a list of the dataframe subsets for the categories you
  have specified.
 
   x <- data.frame(g1=sample(LETTERS[1:2],30,TRUE),
  + g2=sample(letters[1:2], 30, TRUE),
  + g3=1:30)
   y <- split(x, list(x$g1, x$g2))
   str(y)
  List of 4
   $ A.a:'data.frame':7 obs. of  3 variables:
..$ g1: Factor w/ 2 levels A,B: 1 1 1 1 1 1 1
..$ g2: Factor w/ 2 levels a,b: 1 1 1 1 1 1 1
..$ g3: int [1:7] 3 4 6 8 9 13 24
   $ B.a:'data.frame':7 obs. of  3 variables:
..$ g1: Factor w/ 2 levels A,B: 2 2 2 2 2 2 2
..$ g2: Factor w/ 2 levels a,b: 1 1 1 1 1 1 1
..$ g3: int [1:7] 10 11 16 17 18 20 25
   $ A.b:'data.frame':6 obs. of  3 variables:
..$ g1: Factor w/ 2 levels A,B: 1 1 1 1 1 1
..$ g2: Factor w/ 2 levels a,b: 2 2 2 2 2 2
..$ g3: int [1:6] 2 12 23 26 27 29
   $ B.b:'data.frame':10 obs. of  3 variables:
..$ g1: Factor w/ 2 levels A,B: 2 2 2 2 2 2 2 2 2 2
..$ g2: Factor w/ 2 levels a,b: 2 2 2 2 2 2 2 2 2 2
..$ g3: int [1:10] 1 5 7 14 15 19 21 22 28 30
   y
  

Re: [R] Difficultes with grep

2008-07-13 Thread Fran100681

Thank you, but this is not what i want exactly.. i would want to launch
function myfun with this script:

 table <- sample(LETTERS[1:5], 20, TRUE)

 name <- "A"

 myfun <- function(name) {
+ r <- grep (name[^0-9], table )
+ return (r) }

but if I do it, R doesn't accept this.. i want this because i have in
table (a data frame) a list of elements that are hsa-mir-N (where N is
any number)..so, if i put in the argument name this: hsa-mir-70, the function
matches hsa-mir-70, but also (for instance) hsa-mir-700, 710, 724 and so
on... in fact i insert the square brackets [^0-9] to exclude any other number
after those i have given (in name).. It's a little complex situation.. :(

p.s: i can't put in name simply hsa-mir-20 to match all hsa-mir-20 in the
data frame because i need to match hsa-mir-20 but also (for instance)
hsa-mir-20-3... or hsa-mir-20b, hsa-mir-20a, and not those elements with
another number after the 20


jholtman wrote:
 
 I think this is what you want
 
 table <- sample(LETTERS[1:5], 20, TRUE)

 name <- "A"

 myfun <- function(name) {
 + r <- grep (name, table )
 + return (r) }

 myfun(name)
 [1]  4  7 14 18
 table
  [1] E B D A B B A B E B C C C A E D
 D A D C

 
 
 On Fri, Jul 11, 2008 at 1:57 PM, Fran100681 [EMAIL PROTECTED] wrote:

 Hello everybody!

 I'm using R and I have a little problem about function grep. I 've got
 to
 make a new function in which grep is present. So the first argument of
 grep is the string we want to find,ok..but in this case I define a
 function x before , x receives an argument in a object name (for
 instance), then inside function x ,i  define a grep.. so i want to set
 as
 pattern (1st argument of grep)  what i put in name and not the string
 name... how do i do that?

 ex:
 name <- "Tom"

 myfun <- function(name) {
 r <- grep (name, table )
 return (r) }
 It returns nothing because it searches for the word "name" in table rather
 than "Tom"...
 I hope to receive some little help because this is stopping me in my
 projcet
 :/ ... i m'not able to reach a solution! Thanks a lot!
 --
 View this message in context:
 http://www.nabble.com/Difficultes-with-grep-tp18409347p18409347.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.

 
 
 
 -- 
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390
 
 What is the problem you are trying to solve?
 
 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.
 
 

-- 
View this message in context: 
http://www.nabble.com/Difficultes-with-grep-tp18409347p18428404.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Installing RWinEdt

2008-07-13 Thread rkevinburton
I checked and reapplied full access to the whole WinEdt directory. I am not the 
administrator but I am a member of the administrators group. This, coupled with 
the fact that I have given full control to the WinEdt directory, suggests that 
the problem is elsewhere. But this is not the first time that I have run into 
permission problems with Windows 2008 server.

Kevin

 Uwe Ligges [EMAIL PROTECTED] wrote: 
 
 
 [EMAIL PROTECTED] wrote:
 From the R console I invoke:
  
  install.packages("RWinEdt")
  
  and get:
  
  Warning in install.packages("RWinEdt") :
argument 'lib' is missing: using 
  'F:\Users\Kevin\Documents/R/win-library/2.7'
  --- Please select a CRAN mirror for use in this session ---
  trying URL 
  'http://streaming.stat.iastate.edu/CRAN/bin/windows/contrib/2.7/RWinEdt_1.8-0.zip'
  Content type 'application/zip' length 361598 bytes (353 Kb)
  opened URL
  downloaded 353 Kb
  
  package 'RWinEdt' successfully unpacked and MD5 sums checked
  
  The downloaded packages are in
  F:\Users\Kevin\AppData\Local\Temp\RtmpOIlW0F\downloaded_packages
  updating HTML package descriptions
  
  So it seems to have worked. But when I use the 'library' command I get:
  
  library(RWinEdt)
  Error in file(file, "r") : cannot open the connection
  In addition: Warning message:
  In file(file, "r") :
cannot open file 'F:\Program Files (x86)\WinEdt Team\WinEdt\R.ver': No 
  such file or directory
  Error : .onAttach failed in 'attachNamespace'
  Error: package/namespace load failed for 'RWinEdt'
  
  Any ideas on how I can install this package?
 
 With administrator privileges, since it needs to write some files into 
 the WinEdt directory.
 
 Best wishes,
 Uwe Ligges
 
 
  Thank you.
  
  Kevin
  
  __
  R-help@r-project.org mailing list
  https://stat.ethz.ch/mailman/listinfo/r-help
  PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
  and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] shapiro wilk normality test

2008-07-13 Thread Johannes Huesing
Ted Harding [EMAIL PROTECTED] [Sun, Jul 13, 2008 at 10:59:21PM CEST]:
 On 13-Jul-08 19:53:47, Johannes Huesing wrote:
  Frank E Harrell Jr [EMAIL PROTECTED] [Sun, Jul 13, 2008 at
  08:07:37PM CEST]:
  (Ted Harding) wrote:
  On 13-Jul-08 13:29:13, Frank E Harrell Jr wrote:
  [...]
  A large P-value means nothing more than needing more data.  No  
  conclusion is possible.  
[...]

 But absence
 of evidence, in my interpretation (which I believe is right for
 the statistical context of non-significant P-values), means that
 we do not know about A: we do not have enough information.
 

What would the p-value have to be like in your opinion to make the
null hypothesis look more likely after the experiment than before?

 The proof is, basically, given in terms of a 2-valued logic where
 every term is either TRUE or FALSE. In the real world we have at
 least a third possible value: UNKNOWN (or, as R would put it, NA).

How would the probabilities that A is NA be affected by the outcome
of an experiment like this? If this probability is affected, how
does this leave the probability that A is T or F unaffected?

Or do you assign the NA status to the data collected?

A high p-value does not always mean that you might as well have 
collected nothing but missing values. 

Of course I buy into the notion that a point estimate with a measure
of accuracy is much better suited to describe your data; but a
high p-value as a result of a test procedure that can be claimed to
be adequately powered may defensibly be taken as a hint that we
can for now stick with the null hypothesis.
-- 
Johannes Hüsing   There is something fascinating about science. 
  One gets such wholesale returns of conjecture 
mailto:[EMAIL PROTECTED]  from such a trifling investment of fact.  
  
http://derwisch.wikidot.com (Mark Twain, Life on the Mississippi)

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] any way to set defaults for par?

2008-07-13 Thread Carl Witthoft
I know how to set graphic parameters by calling par(), but what I'd like 
is a way to set the default values so that subsequent calls to par() use 
my defaults.  The reason to want this is that every time I create a new 
graphic window (I'm using quartz on OSX, and so far no answers in the 
Mac mailing list), my parameters get reset to the builtin defaults.
I read about the unexported variable .Pars, but would like to know if 
there's any way to manipulate that variable.


thanks
Carl

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Difficultes with grep

2008-07-13 Thread Charles C. Berry

On Sun, 13 Jul 2008, Fran100681 wrote:



Thank you, but this is not what i want exactly.. i would want to launch
function myfun with this script:


table - sample(LETTERS[1:5], 20,TRUE)

name - A

myfun - function(name) {

+ r - grep (name[^0-9], table )

...XX


This is not correct syntax. And likely R told you so.

If you intend name[^0-9] to be read as character, you need to quote it.

Perhaps you want a more complicated regex than the one Jim handed you, 
like


name <- "A[^0-9]"

??



+ return (r) }

but if I do it ,R doesn't accept this.. i want this because i have in
table (a data frame) ,a list of element that are hsa-mir-N (when N is
any number)..so, if i put in argument name this is: hsa-mir-70 ,function
matches hsa-mir-70, but also (for instance) hsa-mir-700, 710 724 and so
on... infact i insert a square brackets ^0-9 to exclude any other number
after those i have given (in name).. It's a little complex situation.. :(

p.s: i can't put in name simply: hsa-mir-20 to match all hsa-mir-20 in the
data frame because i need to match hsa-mir-20 but also (for instance)
hsa-mir-20-3... or hsa-mir-20b, hsa-mir-20a. and not those elements with
another number after the 20


jholtman wrote:


I think this is what you want


table - sample(LETTERS[1:5], 20,TRUE)

name - A

myfun - function(name) {

+ r - grep (name, table )
+ return (r) }


myfun(name)

[1]  4  7 14 18

table

 [1] E B D A B B A B E B C C C A E D
D A D C





On Fri, Jul 11, 2008 at 1:57 PM, Fran100681 [EMAIL PROTECTED] wrote:


Hello everybody!

I'm using R and I have a little problem about function grep. I 've got
to
make a new function in which grep is present. So the first argument of
grep is the string we want to find,ok..but in this case I define a
function x before , x receives an argument in a object name (for
instance), then inside function x ,i  define a grep.. so i want to set
as
pattern (1st argument of grep)  what i put in name and not the string
name... how do i do that?

ex:
name - Tom

myfun - function(name) {
r - grep (name, table )
return (r) }
It returns nothing because it searches the word name in table rather
Tom...
I hope to receive some little help because this is stopping me in my
projcet
:/ ... i m'not able to reach a solution! Thanks a lot!
--
View this message in context:
http://www.nabble.com/Difficultes-with-grep-tp18409347p18409347.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.





--
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.




--
View this message in context: 
http://www.nabble.com/Difficultes-with-grep-tp18409347p18428404.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Charles C. Berry(858) 534-2098
Dept of Family/Preventive Medicine
E mailto:[EMAIL PROTECTED]  UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/  La Jolla, San Diego 92093-0901

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Assoociative array?

2008-07-13 Thread jim holtman
On Sun, Jul 13, 2008 at 5:45 PM,  [EMAIL PROTECTED] wrote:
 Thank you I will try drop=TRUE.

 In the mean time do you know how I can access the members (for lack of a 
 better term) of the results of a split? In the sample you provided below you 
 have:

 z <- split(x, list(x$cat, x$a), drop=TRUE)

You can do 'str(z)' to see the structure of 'z'.  In most cases, you
should be able to reference by the keys, if they exist:

 n <- 20
 set.seed(1)
 x <- data.frame(a=sample(LETTERS[1:2], n,TRUE), b=sample(letters[1:4], n, 
 TRUE), val=runif(n))
 z <- split(x, list(x$a, x$b), drop=TRUE)
 str(z)
List of 8
 $ A.a:'data.frame':2 obs. of  3 variables:
  ..$ a  : Factor w/ 2 levels A,B: 1 1
  ..$ b  : Factor w/ 4 levels a,b,c,d: 1 1
  ..$ val: num [1:2] 0.647 0.245
 $ B.a:'data.frame':3 obs. of  3 variables:
  ..$ a  : Factor w/ 2 levels A,B: 2 2 2
  ..$ b  : Factor w/ 4 levels a,b,c,d: 1 1 1
  ..$ val: num [1:3] 0.5530 0.0233 0.5186
 $ A.b:'data.frame':3 obs. of  3 variables:
  ..$ a  : Factor w/ 2 levels A,B: 1 1 1
  ..$ b  : Factor w/ 4 levels a,b,c,d: 2 2 2
  ..$ val: num [1:3] 0.530 0.693 0.478
 $ B.b:'data.frame':4 obs. of  3 variables:
  ..$ a  : Factor w/ 2 levels A,B: 2 2 2 2
  ..$ b  : Factor w/ 4 levels a,b,c,d: 2 2 2 2
  ..$ val: num [1:4] 0.789 0.477 0.438 0.407
 $ A.c:'data.frame':3 obs. of  3 variables:
  ..$ a  : Factor w/ 2 levels A,B: 1 1 1
  ..$ b  : Factor w/ 4 levels a,b,c,d: 3 3 3
  ..$ val: num [1:3] 0.8612 0.0995 0.6620
 $ B.c:'data.frame':1 obs. of  3 variables:
  ..$ a  : Factor w/ 2 levels A,B: 2
  ..$ b  : Factor w/ 4 levels a,b,c,d: 3
  ..$ val: num 0.783
 $ A.d:'data.frame':1 obs. of  3 variables:
  ..$ a  : Factor w/ 2 levels A,B: 1
  ..$ b  : Factor w/ 4 levels a,b,c,d: 4
  ..$ val: num 0.821
 $ B.d:'data.frame':3 obs. of  3 variables:
  ..$ a  : Factor w/ 2 levels A,B: 2 2 2
  ..$ b  : Factor w/ 4 levels a,b,c,d: 4 4 4
  ..$ val: num [1:3] 0.7323 0.0707 0.3163

Here are some examples of accessing the data:

 z$B.d
   a bval
9  B d 0.73231374
15 B d 0.07067905
17 B d 0.31627171
 # or just the value (it is a vector)
 z$B.d$val
[1] 0.73231374 0.07067905 0.31627171
 # or by name
 z[["B.d"]]$val
[1] 0.73231374 0.07067905 0.31627171
 # or by absolute number
 z[[8]]$val
[1] 0.73231374 0.07067905 0.31627171
 # take the mean
 mean(z$B.d$val)
[1] 0.3730882
 # get the length
 length(z$B.d$val)
[1] 3
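
A compact way to get the same kind of summary for every group at once
(a sketch building on the same 'z'):

 sapply(z, function(d) c(n = nrow(d), mean.val = mean(d$val)))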





 Now I can print out 'z[1], z[2] etc' This is nice but what if I want the 
 access/iterate through all of the members of a particular column in z. You 
 have given some methods like z[[1]]$b to access the specific columns in z. I 
 notice for your example z[[1]]$b prints out two values. Can I assume that 
 z[[1]]$b is a vecotr? So if I want to find the mean i can 'mean(z[[1]]$b)' 
 and it will give me the mean value of the b columns in z? (similarily sum, 
 and range, etc.). Does nrows(z[[1]]$b) return two in your example below? I 
 would like to find out how many elements are in z[1]. Or would it be just as 
 fast to do 'nrows(z[1])'?

 Thank you for this extended session on data frames, matrices, and vectors. I 
 feel much more comfortable with the concepts now.

 Kevin
  jim holtman [EMAIL PROTECTED] wrote:
 The reason for the empty levels was I did not put drop=TRUE on the
 split to remove unused levels.  Here is the revised script:

  set.seed(1)  # start with a known number
  x - data.frame(cat=sample(LETTERS[1:3],20,TRUE),a=sample(letters[1:4], 
  20, TRUE), b=runif(20))
  x
cat a  b
 1A d 0.82094629
 2B a 0.64706019
 3B c 0.78293276
 4C a 0.55303631
 5A b 0.52971958
 6C b 0.78935623
 7C a 0.02333120
 8B b 0.47723007
 9B d 0.73231374
 10   A b 0.69273156
 11   A b 0.47761962
 12   A c 0.86120948
 13   C b 0.43809711
 14   B a 0.24479728
 15   C d 0.07067905
 16   B c 0.09946616
 17   C d 0.31627171
 18   C a 0.51863426
 19   B c 0.66200508
 20   C b 0.40683019
  # drop unused groups from the split
  (z - split(x, list(x$cat, x$a), drop=TRUE))
 $B.a
cat a b
 2B a 0.6470602
 14   B a 0.2447973

 $C.a
cat a  b
 4C a 0.55303631
 7C a 0.02333120
 18   C a 0.51863426

 $A.b
cat a b
 5A b 0.5297196
 10   A b 0.6927316
 11   A b 0.4776196

 $B.b
   cat a b
 8   B b 0.4772301

 $C.b
cat a b
 6C b 0.7893562
 13   C b 0.4380971
 20   C b 0.4068302

 $A.c
cat a b
 12   A c 0.8612095

 $B.c
cat a  b
 3B c 0.78293276
 16   B c 0.09946616
 19   B c 0.66200508

 $A.d
   cat a b
 1   A d 0.8209463

 $B.d
   cat a b
 9   B d 0.7323137

 $C.d
cat a  b
 15   C d 0.07067905
 17   C d 0.31627171

  # access the value ('b' in this instance); two ways- should be the same
  z[[1]]$b
 [1] 0.6470602 0.2447973
  z$B.a$b
 [1] 0.6470602 0.2447973
 
 
 
 


 On Sun, Jul 13, 2008 at 1:26 AM,  [EMAIL PROTECTED] wrote:
  This is almost it. Maybe it is as good as can be expected. The only 
  problem that I 

Re: [R] Difficultes with grep

2008-07-13 Thread jim holtman
or your function looks like this, where you dynamically create the string:

myfun <- function(name) {

r <- grep(paste(name, "[^0-9]", sep=""), table )
return (r) }
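
A quick check of that idea on some made-up identifiers; note that adding
|$ to the pattern (not in the function above) also lets the bare name match
when nothing follows it:

tbl <- c("hsa-mir-20", "hsa-mir-20a", "hsa-mir-200", "hsa-mir-203")
grep(paste("hsa-mir-20", "([^0-9]|$)", sep=""), tbl, value = TRUE)
# [1] "hsa-mir-20"  "hsa-mir-20a"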

On Sun, Jul 13, 2008 at 7:24 AM, Fran100681 [EMAIL PROTECTED] wrote:

 Thank you, but this is not what i want exactly.. i would want to launch
 function myfun with this script:

 table - sample(LETTERS[1:5], 20,TRUE)

 name - A

 myfun - function(name) {
 + r - grep (name[^0-9], table )
 + return (r) }

 but if I do it ,R doesn't accept this.. i want this because i have in
 table (a data frame) ,a list of element that are hsa-mir-N (when N is
 any number)..so, if i put in argument name this is: hsa-mir-70 ,function
 matches hsa-mir-70, but also (for instance) hsa-mir-700, 710 724 and so
 on... infact i insert a square brackets ^0-9 to exclude any other number
 after those i have given (in name).. It's a little complex situation.. :(

 p.s: i can't put in name simply: hsa-mir-20 to match all hsa-mir-20 in the
 data frame because i need to match hsa-mir-20 but also (for instance)
 hsa-mir-20-3... or hsa-mir-20b, hsa-mir-20a. and not those elements with
 another number after the 20


 jholtman wrote:

 I think this is what you want

 table - sample(LETTERS[1:5], 20,TRUE)

 name - A

 myfun - function(name) {
 + r - grep (name, table )
 + return (r) }

 myfun(name)
 [1]  4  7 14 18
 table
  [1] E B D A B B A B E B C C C A E D
 D A D C



 On Fri, Jul 11, 2008 at 1:57 PM, Fran100681 [EMAIL PROTECTED] wrote:

 Hello everybody!

 I'm using R and I have a little problem about function grep. I 've got
 to
 make a new function in which grep is present. So the first argument of
 grep is the string we want to find,ok..but in this case I define a
 function x before , x receives an argument in a object name (for
 instance), then inside function x ,i  define a grep.. so i want to set
 as
 pattern (1st argument of grep)  what i put in name and not the string
 name... how do i do that?

 ex:
 name - Tom

 myfun - function(name) {
 r - grep (name, table )
 return (r) }
 It returns nothing because it searches the word name in table rather
 Tom...
 I hope to receive some little help because this is stopping me in my
 projcet
 :/ ... i m'not able to reach a solution! Thanks a lot!
 --
 View this message in context:
 http://www.nabble.com/Difficultes-with-grep-tp18409347p18409347.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




 --
 Jim Holtman
 Cincinnati, OH
 +1 513 646 9390

 What is the problem you are trying to solve?

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide
 http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.



 --
 View this message in context: 
 http://www.nabble.com/Difficultes-with-grep-tp18409347p18428404.html
 Sent from the R help mailing list archive at Nabble.com.

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] shapiro wilk normality test

2008-07-13 Thread Ted Harding
See at end.

On 13-Jul-08 21:42:19, Johannes Huesing wrote:
 Ted Harding [EMAIL PROTECTED] [Sun, Jul 13, 2008 at
 10:59:21PM CEST]:
 On 13-Jul-08 19:53:47, Johannes Huesing wrote:
  Frank E Harrell Jr [EMAIL PROTECTED] [Sun, Jul 13, 2008 at
  08:07:37PM CEST]:
  (Ted Harding) wrote:
  On 13-Jul-08 13:29:13, Frank E Harrell Jr wrote:
  [...]
  A large P-value means nothing more than needing more data.  No  
  conclusion is possible.  
 [...]
 
 But absence
 of evidence, in my interpretation (which I believe is right for
 the statistical context of non-significant P-values), means that
 we do not know about A: we do not have enough information.
 
 
 What would the p-value have to be like in your opinion to make the
 null hypothesis look more likely after the experiment than before?
 
 The proof is, basically, given in terms of a 2-valued logic where
 every term is either TRUE or FALSE. In the real world we have at
 least a third possible value: UNKNOWN (or, as R would put it, NA).
 
 How would the probabilities that A is NA be affected by the outcome
 of an experiment like this? If this probability is affected, how
 does this leave the probability that A is T or F unaffected?
 
 Or do you assign the NA status to the data collected?
 
 A high p-value does not always equate that you might as well have 
 collected nothing but missing values. 
 
 Of course I buy into the notion that a point estimate with a measure
 of accuracy is much better suited to describe your data; but a
 high p-value as a result of a test procedure that can be claimed to
 be adequately powered may defensibly be taken as a hint that we
 can for now stick with the null hypothesis.
 -- 
 Johannes Hüsing

I shall perhaps try later to respond in more detail to specific
points above. But, for the moment, let me say that I think your
statement "a high p-value as a result of a test procedure that
can be claimed to be adequately powered may defensibly be taken
as a hint that we can for now stick with the null hypothesis"
is the main key.

The power function of a test (which of course depends on the
design of the investigation and on its size, i.e. number of
data gathered) is basically much the same (in my mind) as the
amount of evidence.

A high P-value with a very powerful test serves to exclude
all alternatives to the Null Hypothesis except those which
lie very close to the Null Hypothesis.

In that sense, we do in fact have a lot of evidence against
all hypotheses except those which are very similar to the Null.
So we are not in an "absence of evidence" situation, and we
do have "evidence of absence".
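
A rough numerical illustration of that point (the alternative distribution
and the sample sizes here are arbitrary choices, not from the thread):

set.seed(1)
# proportion of simulations in which shapiro.test does NOT reject clearly
# non-normal (exponential) data, at a small and at a larger sample size
mean(replicate(2000, shapiro.test(rexp(10))$p.value > 0.05))
mean(replicate(2000, shapiro.test(rexp(100))$p.value > 0.05))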

The basic logic of a Hypothesis Test (in its standard sense)
is the generalisation, to a logic where certainty is at best
probabilistic, of the classical-logic argument:

Given (as a matter of fact): If A, then B
Observed: B is FALSE
Conclusion: A is FALSE

Probabilistically:
Given: If A (H0), then B has high probability
Observed: B is FALSE
Conclusion: An event (not-B) has occurred which has very
small probability if A is TRUE. Hence we (as George Barnard
used to put it) apply The Principle of Disbelief in Tall Stories
and disbelieve A to the extent that we disbelieve not-B as
a possible outcome from A (H0).

In applications, the event B will be specified in terms of
a set of possible values of a Test Statistic T, devised so
as to represent an interesting measure of discrepancy between
the data and the hypothesis H0 (e.g. the t-statistic for
testing whether two samples are drawn from populations with
equal means -- if that is the case, then E(T) = 0, and the
set of values {abs(T) > T0} will be a discrepant set).

By choosing T0 to be such that Prob(abs(T) > T0) = p0, a small
value which we choose to suit ourselves, we are defining the
threshold at which we are prepared to deem that the claim
that "abs(T) > T0 is compatible with H0" is too unlikely to
be plausible.

The cleanest example in real life can be drawn from the basic
principle in criminal law for concluding that an accused person
is guilty, namely "The accused is deemed innocent until proved
guilty beyond reasonable doubt."

What constitutes reasonable doubt can become a very interesting
question, but there are some crimes for which it has a definite
statistical interpretation, typically exceeding some authorised
limit (of speed in a vehicle, of alcohol content in the blood
while driving a vehicle, of a factory plant exceeding permitted
levels of polluting emissions [which in the UK, under the
Environmental Protection Act, is a criminal offence]).

In the days when blood alcohol was determined by laboratory
analysis of a blood sample, it was possible to determine that
the margin of error corresponded to a P-value less than or
equal to 0.001 (i.e. if the lab analysis yielded a result in
excess of the legal limit + 2*SE, then the inevitable result
was a conviction unless it could be independently proved in
defence that the statutory procedures were carried out in a
flawed manner).

So, in that case, 

Re: [R] Reading Multi-value data fields for descriptive analysis

2008-07-13 Thread Hohm, Dale
Thanks Jim,

I wish I were comfortable enough with the language for the fix needed to the 
syntax to be obvious, but it is not yet.  With your example, I get:

Error in strsplit(x[.row, ][[i]], ";#") : non-character argument

x appears to be filled properly, but z is not due to the error.

Also, if you were willing to provide some brief annotation or describe the 
overall logic in the code you supplied it would help me immensely.

Thanks,

Dale

-Original Message-
From: jim holtman [mailto:[EMAIL PROTECTED]
Sent: Sunday, July 13, 2008 6:35 AM
To: Hohm, Dale
Cc: r-help@r-project.org
Subject: Re: [R] Reading Multi-value data fields for descriptive analysis

This may do what you want:

 x <- read.table("/tempxx.txt", comment="", quote="", sep="|", header=TRUE, 
 as.is=TRUE)
 # split out by name
 z <- lapply(seq(nrow(x)), function(.row){
+ .result <- NULL
+ # construct the data output
+ for (i in c('picnic', 'food', 'other')){
+ .split <- strsplit(x[.row,][[i]], ";#")
+ .result <- rbind(.result, cbind(name=x[.row,][['name']],
field=i, value=unlist(.split)))
+ }
+ .result
+ })


 z
[[1]]
 namefieldvalue
[1,] Yogi Bear picnic Yes
[2,] Yogi Bear food   Hamburgers
[3,] Yogi Bear food   Hot Dogs
[4,] Yogi Bear food   I rely on others to bring the good stuff
[5,] Yogi Bear other  \Softball
[6,] Yogi Bear other  Blanket
[7,] Yogi Bear other  I bring boo-boo, but he hides\

[[2]]
 name  fieldvalue
[1,] Boo-Boo picnic Yes
[2,] Boo-Boo food   Potato Salad
[3,] Boo-Boo food   Cole Slaw
[4,] Boo-Boo food   whatever Yogi doesn't eat
[5,] Boo-Boo other  Lawn Chairs
[6,] Boo-Boo other  Blanket
[7,] Boo-Boo other  my running shoes

[[3]]
 name  fieldvalue
[1,] Ranger Rick picnic No
[2,] Ranger Rick food   I told you I don't picnic
[3,] Ranger Rick other  a big net and handcuffs

[[4]]
  name  fieldvalue
 [1,] Magilla Gorilla picnic Yes
 [2,] Magilla Gorilla food   Hamburgers
 [3,] Magilla Gorilla food   Hot Dogs
 [4,] Magilla Gorilla food   Potato Salad
 [5,] Magilla Gorilla food   Cole Slaw
 [6,] Magilla Gorilla food   BBQ Chicken
 [7,] Magilla Gorilla other  Softball
 [8,] Magilla Gorilla other  Volleyball
 [9,] Magilla Gorilla other  Lawn Chairs
[10,] Magilla Gorilla other  Blanket



On Sun, Jul 13, 2008 at 12:56 AM, Hohm, Dale [EMAIL PROTECTED] wrote:
 Thanks for the reply Jim.

 Here is a representation of the data I want to analyze - 10 records as 
 requested.  Each line can easily include an ID number as below.

 So I want to determine a frequency or percentage of respondents that bring 
 each of the 5 foods (Hamburgers, Hot Dogs, Potato Salad, Cole Slaw and BBQ 
 Chicken) and how many Other write-ins there are.  The same for what else is 
 brought besides food (Softball, Volleyball, Lawn Chairs and Blanket) as well 
 as a count of Other write-ins.  I'll also need to be able to discern how 
 many brought Hambergers AND a Blanket or how many brought a Softball AND a 
 Vollyball etc.

 ID|Your Name|Do you picnic?|What is your favorite picnic food?|What do you 
 bring besides food?
 1|Yogi Bear|Yes|Hamburgers;#Hot Dogs;#I rely on others to bring the good 
 stuff|Softball;#Blanket;#I bring boo-boo, but he hides
 2|Boo-Boo|Yes|Potato Salad;#Cole Slaw;#whatever Yogi doesn't eat|Lawn 
 Chairs;#Blanket;#my running shoes
 3|Ranger Rick|No|I told you I don't picnic|a big net and handcuffs
 4|Magilla Gorilla|Yes|Hamburgers;#Hot Dogs;#Potato Salad;#Cole Slaw;#BBQ 
 Chicken|Softball;#Volleyball;#Lawn Chairs;#Blanket
 5|Foghorn Leghorn|Yes|Hot Dogs;#Cole Slaw;#I say, I say, BBQ 
 Chicken?|Softball;#Blanket
 6|Peter Potamus|Yes|Hamburgers;#Hot Dogs;#anything, just a lot of 
 it|Softball;#Lawn Chairs;#hot air balloon
 7|Jonny Quest|No|too busy getting into and out of trouble|Hadji and Bandit
 8|Fleegle, Bingo, Drooper and Snorky|Yes|Hamburgers;#Hot Dogs;#Potato 
 Salad;#Cole Slaw;#A banana split|a laugh track
 9|George Jetson|No|Mr. Spacely is making me work|Lawn Chairs;#Blanket;#my 
 flying car
 10|Snagglepuss|Yes|Hamburgers;#Hot Dogs;#Potato Salad;#Cole Slaw;#BBQ 
 Chicken|Softball;#Heavens to Murgatroyd!  Exit stage left!

 Thanks in advance,

 Dale

 -Original Message-
 From: jim holtman [mailto:[EMAIL PROTECTED]
 Sent: Saturday, July 12, 2008 11:32 AM
 To: Hohm, Dale
 Cc: r-help@r-project.org
 Subject: Re: [R] Reading Multi-value data fields for descriptive analysis

 Can you provide a more complete example (say 10 lines) of what the
 input is like. Does each line have a unique index that can be related
 to it?  Do you want to summarize all the multi1-n values of Col2?  Do
 you want to know the percentage of input lines that have a
 Col3/multi-value4 on them?  You could read in the data as you have
 indicated below and add a column that is the record number and
 therefore you would not have to worry about trying to say if it
 existed or not.  For example, you might have:

 Rec#|col#|value
 1|1|single
 1|2|multi1
 

Re: [R] any way to set defaults for par?

2008-07-13 Thread jim holtman
One way that I do it is to save the default parameters with the
following statement in my profile:

assign('Default.par', par(no.readonly=T))

An then I have a function which will reset them:

plotReset -
function()
{# reset plotting window
par(Default.par)
windows(width=7.5,height=4.7, record=T, pointsize=10)
}
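
For new quartz windows on OS X specifically, another sketch is a thin
wrapper that opens the device and immediately applies the preferred
settings (the par() values below are only placeholders):

myquartz <- function(...) {
    quartz(...)                        # open a new quartz device
    par(mar = c(4, 4, 2, 1), las = 1)  # then apply your own defaults
}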


On Sun, Jul 13, 2008 at 6:31 PM, Carl Witthoft [EMAIL PROTECTED] wrote:
 I know how to set graphic parameters by calling par(), but what I'd like is
 a way to set the default values so that subsequent calls to par() use my
 defaults.  The reason to want this is that every time I create a new graphic
 window (I'm using quartz on OSX, and so far no answers in the Mac mailing
 list), my parameters get reset to the builtin defaults.
 I read about the unexported variable .Pars, but would like to know if
 there's any way to manipulate that variable.

 thanks
 Carl

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.




-- 
Jim Holtman
Cincinnati, OH
+1 513 646 9390

What is the problem you are trying to solve?

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Moran's I test- Ordinal Logistic Regression Model

2008-07-13 Thread Rees, Lisa Marie (MU-Student)
Hi,
 
I am trying to do a Moran's I test on an ordinal logistic regression model.
I have a simple spatial weights matrix listed below I would like to use.
 
Y =
1   0   0   0   0   0   0   0   0
0   1   1   0   0   0   0   1   1
0   1   1   0   0   0   0   1   1
0   0   0   1   1   1   1   0   0
0   0   0   1   1   1   1   0   0
0   0   0   1   1   1   1   0   0
0   0   0   1   1   1   1   0   0
0   1   1   0   0   0   0   1   1
0   0   1   0   0   0   0   1   1

I try to run the test as follows-

moran.test(order$resid, y).  It then gives me an error- Error in 
moran.test(resid(order), y) : y is not a listw object

Can I transform my matrix into a listw object or use some other test where I 
can use my simple matrix to perform the test?  Also, is using $resid for the 
ordinal logistic regression the proper way to run the moran's I test?

Thanks for any help you can provide me.

Lisa 
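
One possible direction, sketched under the assumption that moran.test()
here comes from the spdep package: mat2listw() can convert a square 0/1
neighbour matrix into the listw object the test expects. Whether resid()
gives an appropriate residual for an ordinal logistic model is a separate
modelling question.

library(spdep)
lw <- mat2listw(y)            # y is the 0/1 matrix from the post
moran.test(resid(order), lw)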

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Reading Multi-value data fields for descriptive analysis

2008-07-13 Thread jim holtman
Is one of the rows NULL?  Do an 'str(x)' show?  The example you sent
seems to work with the code.  Are you reading in a different set of
data?  I think I know what happened.  I shortened the names on your
example so it was easier to access.  Here is the data I used:

ID|name|picnic|food|other
1|Yogi Bear|Yes|Hamburgers;#Hot Dogs;#I rely on others to bring the
good stuff|Softball;#Blanket;#I bring boo-boo, but he hides
2|Boo-Boo|Yes|Potato Salad;#Cole Slaw;#whatever Yogi doesn't eat|Lawn
Chairs;#Blanket;#my running shoes
3|Ranger Rick|No|I told you I don't picnic|a big net and handcuffs
4|Magilla Gorilla|Yes|Hamburgers;#Hot Dogs;#Potato Salad;#Cole
Slaw;#BBQ Chicken|Softball;#Volleyball;#Lawn Chairs;#Blanket
5|Foghorn Leghorn|Yes|Hot Dogs;#Cole Slaw;#I say, I say, BBQ
Chicken?|Softball;#Blanket
6|Peter Potamus|Yes|Hamburgers;#Hot Dogs;#anything, just a lot of
it|Softball;#Lawn Chairs;#hot air balloon
7|Jonny Quest|No|too busy getting into and out of trouble|Hadji and Bandit
8|Fleegle, Bingo, Drooper and Snorky|Yes|Hamburgers;#Hot
Dogs;#Potato Salad;#Cole Slaw;#A banana split|a laugh track
9|George Jetson|No|Mr. Spacely is making me work|Lawn
Chairs;#Blanket;#my flying car
10|Snagglepuss|Yes|Hamburgers;#Hot Dogs;#Potato Salad;#Cole Slaw;#BBQ
Chicken|Softball;#Heavens to Murgatroyd!  Exit stage left!


Here is the code with a few more comments.  The basic structure was to
loop through the three data columns since they had the same format.

x <- read.table("/tempxx.txt", comment="", quote="", sep="|",
header=TRUE, as.is=TRUE)
# split out by name. the 'lapply' will cycle through for each row in
# the data and
# the index of the row is passed to the '.row' parameter of the function
z <- lapply(seq(nrow(x)), function(.row){
    # this sets the result to NULL so that we can accumulate the data
    # as it is processed
    .result <- NULL
    # construct the data output
    # the three columns were shortened to the following names.  the
    # 'for' loop will
    # iterate through each of the three columns, taking the data in
    # the columns and
    # splitting them by your separator ';#'
    for (i in c('picnic', 'food', 'other')){
        # this will access the specific column (given in the variable 'i')
        .split <- strsplit(x[.row,][[i]], ";#")
        # this appends on to '.result' the contents of this column
        # after creating
        # the three columns of data which are the name, the column ID
        # and the value
        # from that column
        .result <- rbind(.result, cbind(name=x[.row,][['name']],
            field=i, value=unlist(.split)))
    }
    .result   # return the result
})


z
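
Once 'z' exists, a sketch of the kind of summaries asked about: stack the
per-respondent pieces and tabulate the multi-valued fields.

long <- as.data.frame(do.call(rbind, z), stringsAsFactors = FALSE)
table(long$value[long$field == "food"])    # how many bring each food
table(long$value[long$field == "other"])   # and each non-food item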



On Sun, Jul 13, 2008 at 7:25 PM, Hohm, Dale [EMAIL PROTECTED] wrote:
 Thanks Jim,

 I wish I were comfortable enough with the language for the fix needed to the 
 syntax to be obvious, but it is not yet.  With your example, I get:

Error in strsplit(x[.row, ][[i]], ;#) : non-character argument

 x appears to be filled properly, but z is not due to the error.

 Also, if you were willing to provide some brief annotation or describe the 
 overall logic in the code you supplied it would help me immensely.

 Thanks,

 Dale

 -Original Message-
 From: jim holtman [mailto:[EMAIL PROTECTED]
 Sent: Sunday, July 13, 2008 6:35 AM
 To: Hohm, Dale
 Cc: r-help@r-project.org
 Subject: Re: [R] Reading Multi-value data fields for descriptive analysis

 This may do what you want:

 x - read.table(/tempxx.txt, comment=, quote=, sep=|, header=TRUE, 
 as.is=TRUE)
 # split out by name
 z - lapply(seq(nrow(x)), function(.row){
 + .result - NULL
 + # construct the data output
 + for (i in c('picnic', 'food', 'other')){
 + .split - strsplit(x[.row,][[i]], ;#)
 + .result - rbind(.result, cbind(name=x[.row,][['name']],
 field=i, value=unlist(.split)))
 + }
 + .result
 + })


 z
 [[1]]
 namefieldvalue
 [1,] Yogi Bear picnic Yes
 [2,] Yogi Bear food   Hamburgers
 [3,] Yogi Bear food   Hot Dogs
 [4,] Yogi Bear food   I rely on others to bring the good stuff
 [5,] Yogi Bear other  \Softball
 [6,] Yogi Bear other  Blanket
 [7,] Yogi Bear other  I bring boo-boo, but he hides\

 [[2]]
 name  fieldvalue
 [1,] Boo-Boo picnic Yes
 [2,] Boo-Boo food   Potato Salad
 [3,] Boo-Boo food   Cole Slaw
 [4,] Boo-Boo food   whatever Yogi doesn't eat
 [5,] Boo-Boo other  Lawn Chairs
 [6,] Boo-Boo other  Blanket
 [7,] Boo-Boo other  my running shoes

 [[3]]
 name  fieldvalue
 [1,] Ranger Rick picnic No
 [2,] Ranger Rick food   I told you I don't picnic
 [3,] Ranger Rick other  a big net and handcuffs

 [[4]]
  name  fieldvalue
  [1,] Magilla Gorilla picnic Yes
  [2,] Magilla Gorilla food   Hamburgers
  [3,] Magilla Gorilla food   Hot Dogs
  [4,] Magilla Gorilla food   Potato Salad
  [5,] Magilla Gorilla food   Cole Slaw
  [6,] Magilla Gorilla food   BBQ Chicken
  [7,] Magilla Gorilla other  Softball
  [8,] Magilla Gorilla other  Volleyball
  [9,] Magilla Gorilla other  Lawn Chairs
 [10,] Magilla 

[R] Computing row means for sets of 2 columns

2008-07-13 Thread Daren Tan

Is there a better or more efficient approach than this without the use of t() ?

(m <- matrix(1:40, ncol=4))
      [,1] [,2] [,3] [,4]
 [1,]    1   11   21   31
 [2,]    2   12   22   32
 [3,]    3   13   23   33
 [4,]    4   14   24   34
 [5,]    5   15   25   35
 [6,]    6   16   26   36
 [7,]    7   17   27   37
 [8,]    8   18   28   38
 [9,]    9   19   29   39
[10,]   10   20   30   40
(groups <- rep(1:2, each=2))
[1] 1 1 2 2
(m.mean <- t(aggregate(t(m), by=list(groups), mean)))
        [,1] [,2]
Group.1    1    2
V1         6   26
V2         7   27
V3         8   28
V4         9   29
V5        10   30
V6        11   31
V7        12   32
V8        13   33
V9        14   34
V10       15   35
_
Easily edit your photos like a pro with Photo Gallery.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Computing row means for sets of 2 columns

2008-07-13 Thread Henrik Bengtsson
m <- matrix(1:40, ncol=4);
groups <- rep(1:2, each=2);
uGroups <- unique(groups);
mMeans <- matrix(NA, nrow=nrow(m), ncol=length(uGroups));
for (gg in seq(along=uGroups)) {
  mMeans[,gg] <- rowMeans(m[,groups == uGroups[gg], drop=FALSE]);
}

(Preallocation of result matrix is more memory efficient than using
cbind() or similar!)

/Henrik
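
Yet another compact base-R option, sketched on the same m and groups:

sapply(split(seq_len(ncol(m)), groups),
       function(cols) rowMeans(m[, cols, drop = FALSE]))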

On Sun, Jul 13, 2008 at 6:03 PM, Daren Tan [EMAIL PROTECTED] wrote:

 Is there a better or more efficent approach than this without the use of t() ?

 (m - matrix(1:40, ncol=4))  [,1] [,2] [,3] [,4] [1,]1   11   21   
 31 [2,]2   12   22   32 [3,]3   13   23   33 [4,]4   14   24   
 34 [5,]5   15   25   35 [6,]6   16   26   36 [7,]7   17   27   
 37 [8,]8   18   28   38 [9,]9   19   29   39[10,]   10   20   30   40
 (groups - rep(1:2, each=2))[1] 1 1 2 2
 (m.mean - t(aggregate(t(m), by=list(groups), mean)))[,1] 
 [,2]Group.112V1 6   26V2 7   27V3 8   28V4   
   9   29V510   30V611   31V712   32V813  
  33V914   34V10   15   35
 _
 Easily edit your photos like a pro with Photo Gallery.

[[alternative HTML version deleted]]

 __
 R-help@r-project.org mailing list
 https://stat.ethz.ch/mailman/listinfo/r-help
 PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
 and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] stem and leaf plot: how to edit the stem-values

2008-07-13 Thread Duncan Murdoch

On 13/07/2008 5:40 PM, Jörg Groß wrote:

Hi,

I would like to make a stem and leaf plot and I want to edit the  
category-names.


If you have a computer you can do much better histograms.  But since you 
have chosen to do this, one way is to edit the underlying C code.  It's 
available in


https://svn.r-project.org/R/src/appl/stem.c

Another way is to save the plot into a file, and manually edit the file.

The best way is to produce the whole thing by hand, using pen and paper, 
while sitting on a tropical island without access to a computer.  I 
recommend Half Moon Caye, 
http://maps.google.com/maps?f=qhl=engeocode=q=half+moon+cayesll=37.0625,-95.677068sspn=93.342821,105.292969ie=UTF8ll=17.206312,-87.532454spn=0.05854,0.051413t=hz=14 



Duncan Murdoch
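
If plain text output is enough, a rough sketch (not from the thread) is to
capture stem()'s printout and relabel the stems by hand:

x <- c(1, 2, 2, 3, 3, 3, 3, 2, 2, 1)
out <- capture.output(stem(x))
out <- sub("^(\\s*)1 \\|", "\\1category A |", out)  # illustrative relabelling only
cat(out, sep = "\n")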



So, by doing this:

  x - c(1,2,2,3,3,3,3,2,2,1)
  stem(x)

I get:

   1 | 00
   1 |
   2 | 
   2 |
   3 | 

First Question: Why do I get gaps between the categories?
(like in line 2 and line 4)

And second: How can I edit the categories so that I can create  
something like that:



   category A | 00
   category B | 
   category C | 

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.