date:20121006


Hello,

My example with 'x' was just that, an example. Inline.

On 06-10-2012 00:03, Jhope wrote:

Hi,

I have tried the script posted but received the following errors.  I hope I
copied it correctly. I'm sorry but I don't know how to alter the script
myself.
Please advise, Jean


x <- 0:30 + runif(124)
data.to.analyze$VegIndex <- cut(x, breaks = seq(0, 35, 5))

Error in `$<-.data.frame`(`*tmp*`, VegIndex, value = c(1L, 1L, 1L, 1L,  :
   replacement has 124 rows, data has 123


Use

cut(data.to.analyze$VegIndex, breaks = seq(0, 35, 5))
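
The error above comes from assigning a 124-element example vector to a 123-row data frame. A minimal sketch of applying cut() to a column of the data frame itself, assuming a hypothetical numeric column named Veg (the real column name is not shown in this thread):

```r
# Hypothetical stand-in for the poster's data: 123 rows, one numeric column.
set.seed(1)
data.to.analyze <- data.frame(Veg = runif(123, min = 0, max = 35))

# Cut the column that already lives in the data frame, so lengths always match.
data.to.analyze$VegIndex <- cut(data.to.analyze$Veg, breaks = seq(0, 35, 5))

# One count per 5-unit interval.
table(data.to.analyze$VegIndex)
```

Because the input to cut() has exactly as many elements as the data frame has rows, the replacement-length error cannot occur.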


l <- levels(data.to.analyze$VegIndex)
l1 <- sub("\\]", ")", l[1])
l2 <- as.numeric(sub("\\(([[:digit:]]+),.*", "\\1", l[-1])) + 1
l3 <- sub(".*,([[:digit:]]+).*", "\\1", l[-1])
l.new <- c(l1, paste0("(", l2, ",", l3, ")"))

Error: could not find function paste0


paste0 was introduced with R 2.15.0. Update your version of R, and in the 
meantime use


paste(...etc..., sep = "")
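
As a small illustration of that fallback, with made-up values for l2 and l3 (they are placeholders, not the poster's data):

```r
# Placeholder interval bounds, for illustration only.
l2 <- c(6, 11)
l3 <- c("10", "15")

# Works on R versions older than 2.15.0, where paste0() does not exist.
labels_old <- paste("(", l2, ",", l3, ")", sep = "")
labels_old
```

On R 2.15.0 or later, paste0("(", l2, ",", l3, ")") gives the same strings.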


Rui Barradas

levels(data.to.analyze$VegIndex) <- l.new

Error: object 'l.new' not found

str(data.to.analyze$VegIndex)

  NULL

barplot(table(data.to.analyze$VegIndex))

Error in plot.window(xlim, ylim, log = log, ...) :
   need finite 'xlim' values
In addition: Warning messages:
1: In min(w.l) : no non-missing arguments to min; returning Inf
2: In max(w.r) : no non-missing arguments to max; returning -Inf
3: In min(x) : no non-missing arguments to min; returning Inf
4: In max(x) : no non-missing arguments to max; returning -Inf



--
View this message in context: 
http://r.789695.n4.nabble.com/Creating-vegetation-distance-groups-from-one-column-tp4644970p4645230.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



Re: [R] vector is not assigned correctly in for loop

2012-10-06 Thread Berend Hasselman


On 06-10-2012, at 08:14, 周果 guo.c...@gmail.com wrote:

 Hi there,
 Here is a minimum working example:
 
 lower = 0
 upper = 1
 n_bins = 50
 interval = (upper - lower) / n_bins
 bins = vector(mode="numeric", length=n_bins)
 breaks = seq(from=lower  + interval, to=upper, by=interval)
 
 for(idx in breaks)
 {
 bins[idx / interval] = idx
 }
 
 print(bins)
 
 which outputs:
 
 [1] 0.02 0.04 0.06 0.08 0.10 0.14 0.00 0.16 0.20 0.00 0.22 0.24 0.26 0.28
 [15] 0.30 0.32 0.34 0.36 0.38 0.40 0.42 0.44 0.46 0.48 0.50 0.52 0.54 0.56
 [29] 0.58 0.60 0.62 0.64 0.66 0.68 0.70 0.72 0.74 0.76 0.78 0.80 0.82 0.84
 [43] 0.86 0.88 0.90 0.92 0.94 0.96 0.98 1.00
 
 It turns out that some elements are incorrect, such as the 6th
 element 0.14, which should be 0.12 in fact.

And the 7th is also incorrect.

 Is this a bug or I am missing something?

It is not a bug in R.
Yes you are indeed missing something. Read R FAQ 7.31.
Answer is: floating point inaccuracy.

Insert

print(formatC(idx/interval,format="f",digits=17))
print(as.integer(idx/interval))

immediately after the opening { of the for loop.
If you insist on copying breaks to bins in the way you are doing you could use 
round(idx/interval,3) for example.
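
A minimal sketch of the same diagnosis done on the whole vector at once, using the variable names from the question. It rounds to the nearest integer rather than to 3 digits, which is an alternative to the round(idx/interval,3) suggestion above, and which positions (if any) are flagged may depend on the platform:

```r
lower <- 0
upper <- 1
n_bins <- 50
interval <- (upper - lower) / n_bins
breaks <- seq(from = lower + interval, to = upper, by = interval)

# A non-integer index is truncated towards zero by `[`, like as.integer().
idx_trunc <- as.integer(breaks / interval)
# Rounding first gives the intended slot.
idx_round <- as.integer(round(breaks / interval))

# Positions where truncation lands one slot too low.
which(idx_trunc != idx_round)

bins <- numeric(n_bins)
bins[idx_round] <- breaks
```

With rounded indices every slot is written exactly once, so bins ends up equal to breaks.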

Berend


Re: [R] vector is not assigned correctly in for loop

Forgot to cc the list.

RMW

On Sat, Oct 6, 2012 at 11:29 AM, R. Michael Weylandt
michael.weyla...@gmail.com wrote:
 A case study of a good question! Would that all posters did such a good job.



 On Sat, Oct 6, 2012 at 7:14 AM, 周果 guo.c...@gmail.com wrote:
 Hi there,
 Here is a minimum working example:
 
 lower = 0
 upper = 1
 n_bins = 50
 interval = (upper - lower) / n_bins
 bins = vector(mode="numeric", length=n_bins)
 breaks = seq(from=lower  + interval, to=upper, by=interval)

 for(idx in breaks)
 {
 bins[idx / interval] = idx
 }


 Note that this could slightly more idiomatically be done as

 bins[breaks / interval] = breaks

 print(bins)
 
 which outputs:
 
  [1] 0.02 0.04 0.06 0.08 0.10 0.14 0.00 0.16 0.20 0.00 0.22 0.24 0.26 0.28
 [15] 0.30 0.32 0.34 0.36 0.38 0.40 0.42 0.44 0.46 0.48 0.50 0.52 0.54 0.56
 [29] 0.58 0.60 0.62 0.64 0.66 0.68 0.70 0.72 0.74 0.76 0.78 0.80 0.82 0.84
 [43] 0.86 0.88 0.90 0.92 0.94 0.96 0.98 1.00
 
 It turns out that some elements are incorrect, such as the 6th
 element 0.14, which should be 0.12 in fact.
 Is this a bug or I am missing something?

 Take a look at

 as.integer(breaks / interval)

 You're hitting up on floating-point issues (see the link in R FAQ 7.31
 for the definitive reference, but it's a large and complicated field
 with many little manifestations like this)

 What's basically happening is that the 7 you see in breaks / interval
 is actually 6.99... (or so), which gets printed as a 7 by
 print() but truncated to a 6 for subsetting, as mentioned in ?`[`. If
 you were to turn on more digits for printing, you'd see it's not
 really a 7.

 You'd probably rather have

 bins[round(breaks / interval)] = breaks

 Cheers and thanks again for spending so much time to make a good question,

 Michael

 And here is the output of sessionInfo():
 
 R version 2.15.0 (2012-03-30)
 Platform: x86_64-pc-mingw32/x64 (64-bit)

 locale:
 [1] LC_COLLATE=Chinese (Simplified)_People's Republic of China.936
 [2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936
 [3] LC_MONETARY=Chinese (Simplified)_People's Republic of China.936
 [4] LC_NUMERIC=C
 [5] LC_TIME=Chinese (Simplified)_People's Republic of China.936

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base

 loaded via a namespace (and not attached):
 [1] cubature_1.1-1 tools_2.15.0
 
 Thanks in advance.

 Regards,
 Guo


Re: [R] Download limit

On Sat, Oct 6, 2012 at 8:19 AM, agiani99 agian...@hotmail.com wrote:
 Hi all,
 I am trying to use in RStudio the latest code given in
 https://github.com/systematicinvestor/SIT/blob/master/R/bt.test.r,
 which seems to work fine  but with the following warning for download
 limits (one for each of the tickers).
 I searched in options() something which could be related to this
 setting, w/o success.
 Any hint for me in order to raise or remove these limits? Where is this
 limit set? I am using R 2.15-1 on Rstudio 0.96.331 in W7.
 Best
 Andrea


I don't believe this is an R or an RStudio problem as much as it is a
connectivity problem. I'd be willing to guess you're behind a firewall
of some sort?

Cheers,
Michael


tickers = spl('SPY,TLT,GLD,SHY')
data <- new.env()
getSymbols(tickers, src = 'yahoo', from = '1980-01-01', env = data, 
auto.assign = T)
 <environment: 0x0b49ba98>
 Warning messages:
 1: In download.file(paste(yahoo.URL, s=, Symbols.name, a=, from.m,  :
downloaded length 261497 != reported length 200



Re: [R] warning in summary(aov())

On Sat, Oct 6, 2012 at 9:12 AM, Jhope jeanwaij...@gmail.com wrote:
 Hi R-listers,

 I am receiving an error - see below. Aeventexhumed is the event in which
 nesting occurred, so it is defined by A, B, C. I thought as a factor was ok,
 tried to change it to as.character but it still gave me the same error. Is
 there something I should do about this error or just ignore it?

 Please advise, Jean

summary(aov(EDI ~ HTLIndex + Aeventexhumed + HTLIndex:Aeventexhumed,
 data=data.to.analyze))

                        Df  Sum Sq Mean Sq F value Pr(>F)
 HTLIndex                6   2.435 0.40575  0.2027 0.9752
 Aeventexhumed           2   4.652 2.32601  1.1619 0.3172
 HTLIndex:Aeventexhumed 11   7.941 0.72192  0.3606 0.9680
 Residuals              98 196.193 2.00197
 5 observations deleted due to missingness
 Warning message:
 In model.matrix.default(mt, mf, contrasts) :
   variable 'Aeventexhumed' converted to a factor

I think you should have only seen this when Aeventexhumed was a
character -- it's nothing to worry about, just letting you know a
factor conversion had to happen (which is almost surely what you
wanted).

If you see this when Aeventexhumed is a factor already, that's
somewhat surprising. What does str(data.to.analyze) show you?
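
A minimal sketch of making the conversion explicit before fitting, on made-up data (the data frame d and its values are hypothetical; only the column names EDI and Aeventexhumed come from the post):

```r
# Hypothetical data: a numeric response and a character grouping variable.
set.seed(1)
d <- data.frame(EDI = rnorm(30),
                Aeventexhumed = rep(c("A", "B", "C"), times = 10),
                stringsAsFactors = FALSE)

str(d$Aeventexhumed)   # character at this point

# Convert explicitly, so the model code has nothing left to convert.
d$Aeventexhumed <- factor(d$Aeventexhumed)

fit <- aov(EDI ~ Aeventexhumed, data = d)
summary(fit)
```

Checking str() before and after the conversion shows which type the model actually receives.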

Michael




 --
 View this message in context: 
 http://r.789695.n4.nabble.com/warning-in-summary-aov-tp4645253.html
 Sent from the R help mailing list archive at Nabble.com.


Re: [R] vector is not assigned correctly in for loop


Hello,

This seems to be a case for FAQ 7.31 Why doesn't R think these numbers 
are equal?


See this example:

3/5 - 1/5 - 2/5  # not zero
3/5 - (1/5 + 2/5)  # not zero, different from above
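
A short sketch of the usual workaround, comparing with a tolerance instead of testing for exact equality (assuming ordinary IEEE-754 double arithmetic):

```r
a <- 3/5 - 1/5 - 2/5
b <- 3/5 - (1/5 + 2/5)

a == 0                              # exact comparison fails
b == 0                              # so does this one
isTRUE(all.equal(3/5, 1/5 + 2/5))   # tolerance-based comparison succeeds
```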

In your case, try

for(idx in breaks){
print(idx / interval, digits = 16)  # see problem indices
bins[idx / interval] = idx
}
b2 <- breaks

identical(bins, b2)  # FALSE

What happens is that instead of 7, the value of idx/interval is 
6.999... with integer part 6. So bins[6] is assigned twice, first 0.12, 
then this value is overwritten by 0.14, and bins[7] is never written to. 
The same goes for indices 9 and 10.


Avoid this type of indexing. And if possible use the vectorized 
instruction b2 <- breaks.


Hope this helps,

Rui Barradas


Re: [R] Calculating the mean in one column with empty cells

On Sat, Oct 6, 2012 at 9:11 AM, fxen3k f.seha...@gmail.com wrote:
 Hi,

 the first command was bringing the numbers into R directly:
 testdata <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262,
 0.0264198664609803, 0.0200581303857603, -0.2971754213679500,
 -0.2353086361784190, 0.0667195538296534, 0.1755852636926560)
 mean(testdata)
 [1] 0.0161584

 Here I tried to calculate the mean with the same numbers as given above, but
 taken from my dataset.

 str(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
  num [1:9] 0.2 0.13 0.06 0.03 0.02 -0.3 -0.24 0.07 0.18
 mean(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
 [1] 0.0167

 It seems that in the second case R calculates the mean with rounded numbers
 (0.2 and not 0.20061601085...)
 Could it be that R imports only the rounded numbers?
 How can I build a CSV file with numbers showing all decimal places? Because
 I think my current CSV file only has numbers with 2 decimal places.

That's something you need to figure out with whatever software is
writing the csv.
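
A rough check that R itself does not round on import, using the nine values from the question and a temporary file (the file and the column name x are only for illustration):

```r
testdata <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262,
              0.0264198664609803, 0.0200581303857603, -0.2971754213679500,
              -0.2353086361784190, 0.0667195538296534, 0.1755852636926560)

mean(testdata)             # mean of the full-precision values
mean(round(testdata, 2))   # mean after rounding to 2 decimals, as in the CSV

# Round trip through a CSV written by R.
tmp <- tempfile(fileext = ".csv")
write.csv(data.frame(x = testdata), tmp, row.names = FALSE)
back <- read.csv(tmp)$x
unlink(tmp)

all.equal(back, testdata)
```

If the values survive this round trip but not the one through the original CSV, the rounding happened when that CSV was written.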

Cheers,
Michael



 Kind Regards,
 Felix




 --
 View this message in context: 
 http://r.789695.n4.nabble.com/Calculating-the-mean-in-one-column-with-empty-cells-tp4645135p4645252.html
 Sent from the R help mailing list archive at Nabble.com.


Re: [R] svyhist

2012-10-06 Thread Anthony Damico

?ylim says numeric vectors of length 2  - so just the beginning and end.

?svyhist doesn't specifically mention the ylim parameter, meaning you
should look for a ... in the arguments list and click through to the page
for ?hist

?hist has an example that shows the ylim parameter only containing the
beginning and end values.

try using

ylim = c( 0 , 0.030 )

if you're looking to set the tick marks, look at ?axis   ;)
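
A minimal base-graphics sketch of the two ideas, with simulated data standing in for the survey variable (x, y_limits and the tick positions are assumptions; svyhist passes extra arguments on in a similar way, but that is not shown here):

```r
set.seed(42)
x <- rnorm(500, mean = 70, sd = 10)   # simulated ages, for illustration only

y_limits <- c(0, 0.06)                # ylim takes just the lower and upper end

pdf(NULL)                             # draw to a null device, no file is written
h <- hist(x, freq = FALSE, ylim = y_limits, axes = FALSE,
          main = "", xlab = "Age at Death Distribution")
axis(1)                                    # default x axis
axis(2, at = seq(0, 0.06, by = 0.01))      # tick marks chosen separately
invisible(dev.off())
```

The range of the axis and the position of its tick marks are set by different arguments, which is why a longer vector in ylim does not work.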


On Fri, Oct 5, 2012 at 11:18 PM, Muhuri, Pradip (SAMHSA/CBHSQ) 
pradip.muh...@samhsa.hhs.gov wrote:

 Dear Anthony and David,

 Sorry- the earlier-sent plots were mislabeled, which I have corrected and
 attached.  But, the y-lim issue is yet to be resolved.

 Thanks,

 Pradip Muhuri


 
 From: Anthony Damico [ajdam...@gmail.com]
 Sent: Friday, October 05, 2012 7:29 PM
 To: David Winsemius
 Cc: Muhuri, Pradip (SAMHSA/CBHSQ); R help
 Subject: Re: [R] svyhist

 this worked for me -- and doesn't require removing the PSUs from the
 design  :)

 options( survey.lonely.psu = "adjust" )
 svyhist (~dthage,
 subset (nhis, xspd2=='No SPD'), breaks=MyBreaks, main=" ",
 col="grey80",
 xlab="Age at Death Distribution"
 )
 lines (svysmooth(~dthage, bandwidth=5,subset(nhis, xspd2=='No SPD')),
 lwd=2)


 Dr. Lumley has written quite a bit about single-PSU strata here:
 http://faculty.washington.edu/tlumley/survey/exmample-lonely.html



 On Fri, Oct 5, 2012 at 7:16 PM, David Winsemius dwinsem...@comcast.net wrote:

 On Oct 5, 2012, at 3:33 PM, Muhuri, Pradip (SAMHSA/CBHSQ) wrote:

  Hello,
 
  I was trying to draw histograms of age at death  and got the following
 2 error messages:
 
 
  1)  Error in tapply(1:NROW(x), list(factor(strata)), function(index) { :
 
   arguments must have same length

 This is the top of the output of str applied to the data argument you
 offered to svyhist:


  str(subset (nhis, xspd2==2) )
 List of 9
  $ cluster   :'data.frame': 0 obs. of  1 variable:
   ..$ psu: Factor w/ 47 levels 109.1,115.2,..:
   ..- attr(*, terms)=Classes 'terms', 'formula' length 2 ~psu
   .. .. ..- attr(*, variables)= language list(psu)
   .. .. ..- attr(*, factors)= int [1, 1] 1
   .. .. .. ..- attr(*, dimnames)=List of 2
   .. .. .. .. ..$ : chr psu
   .. .. .. .. ..$ : chr psu

 At least one problem seems pretty clear. No data. That can be corrected by
 wrapping as.numeric() around the factor on which you are subsetting in two
 places.
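
 A small sketch of why the subset came back empty, using a toy factor (the label "SPD" is a guess; only "No SPD" appears in this thread):

```r
# Toy factor with labelled levels, standing in for nhis$xspd2.
xspd2 <- factor(c(1, 2, 2, 1, 2), levels = 1:2, labels = c("SPD", "No SPD"))

sum(xspd2 == 2)               # compares labels with "2": nothing matches
sum(as.numeric(xspd2) == 2)   # compares the underlying integer codes
sum(xspd2 == "No SPD")        # compares against the label directly
```

 Comparing against the label is usually the safer of the two working forms, since the integer codes change if the level order changes.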

 Another problem may arise when you restrict to one class only, namely
 there won't be any design to work with. All the clusters (there would be
 only one) no longer have any multiplicity, and svyhist apparently
 isn't built to handle that situation, at least with that design argument.

 Error in onestrat(x[index, , drop = FALSE], clusters[index],
 nPSU[index][1],  :
   Stratum (2) has only one PSU at stage 1

 Taking the 'stratum' argument out of the design() spec allows it to
 proceed, but I do not know if that is introducing invalidity in the
 analysis.
 --
 David.

 
 
  2)  Error in findInterval(mm[, i], gx) : 'vec' contains NAs
 
  In addition: Warning messages:
 
  1: In min(x) : no non-missing arguments to min; returning Inf
 
  2: In max(x) : no non-missing arguments to max; returning -Inf
 
 
 
  I would appreciate if someone could help me resolve these issues.
 
 
 
  Below is reproducible example.
 
  Thanks,
 
  Pradip Muhuri
 
 
 
  setwd ("E:/RDATA")
  options(width = 120)
  library (survey)
  library (KernSmooth)
  xd1 <-
  dthage ypll_75 xspd2 psu stratum wt8
56  19 2   2  33 1512.7287
86   0 2   2 129 1830.6400
81   0 2   1  67  536.1400
47  28 2   1  17  519.8350
71   4 1   1 225  254.4087
72   3 1   1 238  424.4787
75   0 2   2 115  407.0987
83   0 2   2  46  622.5137
79  -4 2   1 300  509.1212
78  -3 2   1 133  517.3325
71   4 2   2 328 1179.3063
64  11 2   1   2  301.5250
78  -3 2   1  62  253.9025
65  10 2   2 260  932.6575
75   0 2   1 247  145.5900
63  12 2   2 156  247.0650
71   4 2   1 146  829.4787
76  -1 2   2 234  432.5437
76   0 2   1 109  859.6888
68   7 2   1 228 1236.2975
64  11 2   2 167  347.5788
62  13 2   2 312  354.0500
77   0 2   2 275  882.1938
78  -3 2   1  28  481.5975
81   0 2   1 180 1285.5425
79   0 2   2 205  576.
70   5 2   1 173  128.3725
75   0 2   2 189  359.3863
78   0 2   1 332  512.8062
74   1 2   2  14  449.0800
77   0 2   1 242  283.0013
92   0 2   1 152  915.3200
69   6 2

Re: [R] arrange data


Hello,

Using Arun's data example: instead of creating a factor, convert to 
four-digit years.


set.seed(1)
dat1 <- data.frame(Tahun=rep(c(98:99,00),each=36),
Bahun=rep(rep(1:12,times=3),each=3),
x=sample(1:500,108,replace=TRUE))

dat2 <- dat1  # operate on a copy
dat2$Tahun <- with(dat2, ifelse(Tahun < 71, 2000 + Tahun, 1900 + Tahun))

agg_dt1 <- aggregate(x=dat2[,3],by=dat2[,c(1,2)],FUN=sum)
head(agg_dt1)
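
A tiny check of the conversion rule on its own, with a few illustrative two-digit years (the cutoff 71 assumes, as in the original question, that the data start in 1971):

```r
yy <- c(98, 99, 0, 71)

# Two-digit years below 71 belong to the 2000s, the rest to the 1900s.
yyyy <- ifelse(yy < 71, 2000 + yy, 1900 + yy)

yyyy
sort(yyyy)   # numeric sorting now puts 2000 last
```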

Hope this helps,

Rui Barradas
On 06-10-2012 03:38, arun wrote:

Hi,

I hope this helps you.
  I created a small dataset: 3 replications per month for 1998:2000.

set.seed(1)
dat1 <- data.frame(Tahun=rep(c(98:99,00),each=36),Bahun=rep(rep(1:12,times=3),each=3),
 x=sample(1:500,108,replace=TRUE))
dat2 <- within(dat1,{Tahun <- factor(Tahun,levels=c(98,99,0))})


agg_dt1 <- aggregate(x=dat2[,3],by=dat2[,c(1,2)],FUN=sum)
head(agg_dt1)
#  Tahun Bahun    x
#1    98     1 1252
#2    99     1  680
#3     0     1  687
#4    98     2  761
#5    99     2  860
#6     0     2  786
I guess this is what you wanted.


In addition, you can also use ddply(), which groups differently but gives 
the same result.
library(plyr)
dd_dt1 <- ddply(dat2,.(Tahun,Bahun),summarize, sum(x))
head(dd_dt1)
#  Tahun Bahun  ..1
#1    98     1 1252
#2    98     2  761
#3    98     3  440
#4    98     4  597
#5    98     5  987
#6    98     6  692
tail(dd_dt1)
#   Tahun Bahun  ..1
#31     0     7  685
#32     0     8  504
#33     0     9  633
#34     0    10  553
#35     0    11  914
#36     0    12 1039

A.K.






- Original Message -
From: Roslina Zakaria zrosl...@yahoo.com
To: r-help@r-project.org r-help@r-project.org
Cc:
Sent: Friday, October 5, 2012 8:09 PM
Subject: [R] arrange data

Dear r-users,

I have daily rainfall data from year 1971 to 2000. I use aggregate to form 
monthly rainfall data.  What I don't understand is why the data for the year 
2000 appear at the top, instead of year 1971.  Here is some code and output:


agg_dt1 <- aggregate(x=dt1[,4],by=dt1[,c(1,2)],FUN=sum)


head(agg_dt1,20); tail(agg_dt1,20)

   Tahun Bulan     x
1      0     1 398.6
2     71     1 934.9
3     72     1 107.2
4     73     1 236.4
5     74     1  10.5
6     75     1 744.6
7     76     1   9.2
8     77     1 108.7
9     78     1 251.5
10    79     1 197.3
11    80     1 144.1
12    81     1 104.5
13    82     1  17.7
14    83     1 151.8
...

Thank you so much for your help.

Roslina

Re: [R] Dúvida função Anova pacote car - Medidas repetidas


Hello,

Yes, your Spanish is close enough to Portuguese for you to understand it.
I thought it was homework and didn't read until the end. Apologies to 
Diego, and thanks to John.


Rui Barradas
On 05-10-2012 22:48, John Fox wrote:

Dear Diego,

This is close enough to Spanish for me to understand it (I think).

Using Anova() in the car package for repeated-measures designs requires a
multivariate linear model for all of the responses, which in turn requires
that the data set be in wide format, with each response as a variable. In
your case, there are two crossed within-subjects factors and no
between-subjects factors. If this understanding is correct (but see below),
then you could proceed as follows, where the crucial step is reshaping the
data from long to wide:

- snip --

Pa2$type.day <- with(Pa2, paste(Type, Day, sep="."))
(Wide <- reshape(Pa2, direction="wide", v.names="logbiovolume",
idvar="Replicate", timevar="type.day", drop=c("Type", "Day")))

day <- ordered(rep(c(0, 2, 4), each=2))
type <- factor(rep(c("c", "t"), 3))
(idata <- data.frame(day, type))

mod <- lm(cbind(logbiovolume.c.0, logbiovolume.t.0, logbiovolume.c.2,
logbiovolume.t.2, logbiovolume.c.4, logbiovolume.t.4) ~ 1, data=Wide)

Anova(mod, idata=idata, idesign=~day*type)

- snip --

This serves to analyze the data that you showed; you'll have to adapt it for
the full data set.

I'm assuming that the replicates are independent units, and that the
design is therefore entirely within replicate. If that's wrong, then the
analysis I've suggested is also incorrect.
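
For readers who want to run the reshaping step on its own, here is a base-R sketch that rebuilds the 18 rows quoted further down in this message and stops before the car-specific Anova() call. The way the data frame is constructed with rep() is an assumption about the layout; the values are copied from the post:

```r
Pa2 <- data.frame(
  Day = rep(c(0, 2, 4), each = 6),
  Type = rep(rep(c("c", "t"), each = 3), times = 3),
  Replicate = rep(c("A", "B", "C"), times = 6),
  logbiovolume = c(19.34, 18.27, 18.56, 18.41, 18.68, 18.86,
                   18.81, 18.84, 18.52, 18.29, 17.91, 17.67,
                   19.16, 18.85, 19.36, 19.05, 19.09, 18.26)
)

# One column per Type/Day combination, one row per replicate.
Pa2$type.day <- with(Pa2, paste(Type, Day, sep = "."))
Wide <- reshape(Pa2, direction = "wide", v.names = "logbiovolume",
                idvar = "Replicate", timevar = "type.day",
                drop = c("Type", "Day"))
Wide

# Intercept-only multivariate linear model on the six responses.
mod <- lm(cbind(logbiovolume.c.0, logbiovolume.t.0, logbiovolume.c.2,
                logbiovolume.t.2, logbiovolume.c.4, logbiovolume.t.4) ~ 1,
          data = Wide)
coef(mod)   # column means of the six responses
```

The wide data frame has 3 rows and 7 columns, which is the shape the multivariate model expects.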

I hope this helps,
  John

---
John Fox
Senator McMaster Professor of Social Statistics
Department of Sociology
McMaster University
Hamilton, Ontario, Canada





-Original Message-
From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
On Behalf Of Diego Pujoni
Sent: Friday, October 05, 2012 9:57 AM
To: r-help@r-project.org
Subject: [R] Dúvida função Anova pacote car - Medidas repetidas

Hi everyone, I am running a repeated-measures ANOVA using the
Anova function from the car package.

I measured the biovolume of algae every two days for 10 days (the data set
below only includes up to day 4). I have 2 treatments (c, t) and the
experiment was run in triplicate (A, B, C).


Pa2

   Day Type Replicate logbiovolume
1    0    c         A        19.34
2    0    c         B        18.27
3    0    c         C        18.56
4    0    t         A        18.41
5    0    t         B        18.68
6    0    t         C        18.86
7    2    c         A        18.81
8    2    c         B        18.84
9    2    c         C        18.52
10   2    t         A        18.29
11   2    t         B        17.91
12   2    t         C        17.67
13   4    c         A        19.16
14   4    c         B        18.85
15   4    c         C        19.36
16   4    t         A        19.05
17   4    t         B        19.09
18   4    t         C        18.26
.
.
.

Pa2.teste = within(Pa2,{group = factor(Type)
time = factor(Day)
id = factor(Replicate)})
matrix =
with(Pa2.teste,cbind(Pa2[,"VAR"][group=="c"],Pa2[,"VAR"][group=="t"]))
matrix
[,1]  [,2]
  [1,] 19.34 18.41
  [2,] 18.27 18.68
  [3,] 18.56 18.86
  [4,] 18.81 18.29
  [5,] 18.84 17.91
  [6,] 18.52 17.67
  [7,] 19.16 19.05
  [8,] 18.85 19.09
  [9,] 19.36 18.26
[10,] 19.63 18.96
[11,] 19.94 18.06
[12,] 19.54 18.37
[13,] 19.98 17.96
[14,] 20.99 17.93
[15,] 20.45 17.74
[16,] 21.12 17.60
[17,] 21.66 17.33
[18,] 21.51 18.12
  model <- lm(matrix ~ 1)
  design <- factor(c("c","t"))

  options(contrasts=c("contr.sum", "contr.poly"))
  aov <- Anova(model, idata=data.frame(design), idesign=~design,
type="III")
  summary(aov, multivariate=F)

Univariate Type III Repeated-Measures ANOVA Assuming Sphericity

                 SS num Df Error SS den Df         F    Pr(>F)
(Intercept) 12951.2      1   6.3312     17 34775.336 < 2.2e-16 ***
design         19.1      1  17.3901     17    18.697 0.0004606 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


The problem is that I think this function is not taking into account
the days or the replicates. How do I include them in the analysis?
Do you know of a corresponding non-parametric function for this test?
Something like a Friedman test with two groups (treatment and replicate)
and one block (time)?

Thank you very much

Diego PJ


Re: [R] Dúvida função Anova pacote car - Medidas repetidas


Sorry,

Phone, daughter, forgot to sign.

Rui Barradas

Re: [R] Download limit

2012-10-06 Thread agiani99


Hi Michael,
I am not behind a firewall, but I noticed that when I set setInternet2=FALSE 
the problem disappears.

setInternet2=TRUE is required to download Systematic Investor Toolbox (SIT).
I don't know why or whether it makes sense, but yes it seems a 
connection problem and no
it seems to have something to do with R or RStudio. Thanks for your 
time, though.

Best
Andrea




Re: [R] LaTeX consistent publication graphics from R and Comparison of GLE and R

2012-10-06 Thread Frank Harrell

Hi Marc,

It would be interesting to compare with tikz for ease of use.

As an aside I've been wishing that someone would write an R function for
creating clinical trial disposition charts using tikz or pstricks ...

Best,
Frank

Marc Schwartz-3 wrote
On Oct 5, 2012, at 3:32 PM, clangkamp <christian.langkamp@> wrote:

Hi Everyone

I am at the moment preparing my thesis and am looking at producing a few
Organigrams / Flow charts (unrelated to the calculations in R) as well as a
range of charts (barcharts, histograms, ...) based on calculations in R.

For the Organigrams I am looking at an open-source package called GLE at
SourceForge, which produces the text part in LaTeX figures, which is very
neat and also in the same style as the thesis, which I wrote in LaTeX. It
also offers a range of graphical features, and I am quite tempted.

It also produces barcharts and histograms with the options of legends etc. I
have done most of my graphs so far with R, but with Organigrams and flow
charts I am at a loss (a pointer here would also be very welcome). For some
charts I have used MS Visio, but it would be convenient to use just one
program for graphing throughout the thesis (i.e. same colour coding etc.).

Does anybody have any experience with GLE, ideally working with it with CSV
tables generated within R? Or does there exist another way to generate
'visually LaTeX consistent' graphics within R?

Any takers?

If you are comfortable in LaTeX, I would suggest that you look at
PSTricks:

http://tug.org/PSTricks/main.cgi

I use that for creating subject disposition flow charts for clinical
trials with Sweave. I can then use \Sexpr{}'s to fill in various
annotations in the boxes, etc. so that all content is programmatically
created in a reproducible fashion.
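
A rough sketch of that idea, not Marc's actual code: the counts and box names
below are made up for illustration, and box_label() is a hypothetical helper.
The point is only that the text for each box can be computed in R first and
then pulled into the chart.

```r
# Hypothetical disposition counts; in practice these would be computed
# from the trial data rather than typed in.
disp <- c(Screened = 120L, Randomized = 100L, Completed = 87L)

# Build one text label per flow-chart box.
box_label <- function(name, n) sprintf("%s (n = %d)", name, n)
labels <- mapply(box_label, names(disp), disp)

print(labels)
```

In a Sweave (.Rnw) file, a PSTricks node could then contain something like
\Sexpr{labels[["Randomized"]]}, so the numbers in the chart always follow the
data.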

There are some examples of flow charts and tree diagrams here:

http://tug.org/PSTricks/main.cgi?file=pst-node/psmatrix/psmatrix#flowchart

and there are various other online resources for using PSTricks.

Keep in mind that since this is PostScript based, you need to use a latex
+ dvips + ps2pdf sequence, rather than just pdflatex.

Regards,

Marc Schwartz


-
Frank Harrell
Department of Biostatistics, Vanderbilt University

Re: [R] vector is not assigned correctly in for loop

2012-10-06 Thread Bert Gunter

But the OP should not be doing this **at all.** He apparently has not
bothered to read the Intro to R tutorial as he appears not to know
about vectorized calculations.

-- Bert

On Sat, Oct 6, 2012 at 3:29 AM, R. Michael Weylandt
michael.weyla...@gmail.com wrote:
 Forgot to cc the list.

 RMW

 On Sat, Oct 6, 2012 at 11:29 AM, R. Michael Weylandt
 michael.weyla...@gmail.com wrote:
 A case study of a good question! Would that all posters did such a good job.

On Sat, Oct 6, 2012 at 7:14 AM, 周果 guo.c...@gmail.com wrote:
 Hi there,
 Here is a minimum working example:
 
 lower = 0
 upper = 1
 n_bins = 50
 interval = (upper - lower) / n_bins
 bins = vector(mode="numeric", length=n_bins)
 breaks = seq(from=lower  + interval, to=upper, by=interval)

 for(idx in breaks)
 {
 bins[idx / interval] = idx
 }


 Note that this could slightly more idiomatically be done as

 bins[breaks / interval] = breaks

 print(bins)
 
 which outputs:
 
  [1] 0.02 0.04 0.06 0.08 0.10 0.14 0.00 0.16 0.20 0.00 0.22 0.24 0.26 0.28
 [15] 0.30 0.32 0.34 0.36 0.38 0.40 0.42 0.44 0.46 0.48 0.50 0.52 0.54 0.56
 [29] 0.58 0.60 0.62 0.64 0.66 0.68 0.70 0.72 0.74 0.76 0.78 0.80 0.82 0.84
 [43] 0.86 0.88 0.90 0.92 0.94 0.96 0.98 1.00
 
 It turns out that some elements are incorrect, such as the 6th
 element 0.14, which should be 0.12 in fact.
 Is this a bug or I am missing something?

 Take a look at

 as.integer(breaks / interval)

 You're hitting up on floating-point issues (see the link in R FAQ 7.31
 for the definitive reference, but it's a large and complicated field
 with many little manifestations like this)

 What's basically happening is that the 7 you see in breaks / interval
 is actually a value just below 7 (6.99... or so), which gets printed as a 7
 by print() but truncated to a 6 for subsetting, as mentioned in ?`[`. If
 you were to turn on more digits for printing, you'd see it's not
 really a 7.

 You'd probably rather have

 bins[round(breaks / interval)] = breaks
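
 To see the effect described above, here is a minimal sketch using the same
 numbers as the original example. The names idx_trunc and idx_round are only
 for illustration. It compares the truncated indices that `[` would use with
 the rounded ones, and then fills bins the vectorized way.

```r
lower <- 0
upper <- 1
n_bins <- 50
interval <- (upper - lower) / n_bins
breaks <- seq(from = lower + interval, to = upper, by = interval)

# Indices as `[` sees them (truncated) versus rounded to the nearest integer.
idx_trunc <- as.integer(breaks / interval)
idx_round <- round(breaks / interval)

# Positions where truncation and rounding disagree; these are the
# elements that ended up in the wrong bin in the loop version.
which(idx_trunc != idx_round)

# Vectorized fill using the rounded indices.
bins <- numeric(n_bins)
bins[idx_round] <- breaks
print(bins)
```

 Since the rounded indices are just 1 to n_bins in order, bins <- breaks
 would give the same result here without any indexing at all.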

 Cheers and thanks again for spending so much time to make a good question,

 Michael

 And here is the output of sessionInfo():
 
 R version 2.15.0 (2012-03-30)
 Platform: x86_64-pc-mingw32/x64 (64-bit)

 locale:
 [1] LC_COLLATE=Chinese (Simplified)_People's Republic of China.936
 [2] LC_CTYPE=Chinese (Simplified)_People's Republic of China.936
 [3] LC_MONETARY=Chinese (Simplified)_People's Republic of China.936
 [4] LC_NUMERIC=C
 [5] LC_TIME=Chinese (Simplified)_People's Republic of China.936

 attached base packages:
 [1] stats graphics  grDevices utils datasets  methods   base

 loaded via a namespace (and not attached):
 [1] cubature_1.1-1 tools_2.15.0
 
 Thanks in advance.

 Regards,
 Guo




-- 

Bert Gunter
Genentech Nonclinical Biostatistics

Internal Contact Info:
Phone: 467-7374
Website:
http://pharmadevelopment.roche.com/index/pdb/pdb-functional-groups/pdb-biostatistics/pdb-ncb-home.htm


[R] Presence/ absence data from matrix to single column

2012-10-06 Thread agoijman

I've been trying to reshape this database but haven't succeeded at it. I tried
using loops but can't get it right. I just want to reshape my database from
this matrix, to the one below, with only one column of data. 

Year    Route   Point   Sp1 Sp2 Sp3
2004    123     123-1   0   1   0
2004    123     123-2   0   1   1
2004    123     123-10  1   1   0

What I want:

Year    Route   Point
2004    123     123-1   Sp1 0
2004    123     123-2   Sp1 0
2004    123     123-10  Sp1 1
2004    123     123-1   Sp2 1
2004    123     123-2   Sp2 1
2004    123     123-10  Sp2 1
2004    123     123-1   Sp3 0
2004    123     123-2   Sp3 1
2004    123     123-10  Sp3 0





Re: [R] Presence/ absence data from matrix to single column

Hi,
Try this:
dat1 <- read.table(text="
Year    Route    Point    Sp1    Sp2    Sp3
2004    123    123-1    0    1    0
2004    123    123-2    0    1    1
2004    123    123-10    1    1    0
",header=TRUE,sep="",stringsAsFactors=FALSE)

library(reshape)
melt(dat1,id=c("Year","Route","Point"))
  Year Route  Point variable value
1 2004   123  123-1      Sp1     0
2 2004   123  123-2      Sp1     0
3 2004   123 123-10      Sp1     1
4 2004   123  123-1      Sp2     1
5 2004   123  123-2      Sp2     1
6 2004   123 123-10      Sp2     1
7 2004   123  123-1      Sp3     0
8 2004   123  123-2      Sp3     1
9 2004   123 123-10      Sp3     0

A.K.






Re: [R] Presence/ absence data from matrix to single column

Try the reshape2 package. You will probably have to install the package.
install.packages("reshape2")

with your data as xx :
library(reshape2)
melt(xx, id = c("Year", "Route", "Point"))

seems to do what you want.
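
If installing a package is not an option, the same long layout can be built
in base R. This is only a sketch under assumptions: the data are retyped from
the original post, and the column names variable and value are chosen to
mimic melt()'s output, not produced by it.

```r
dat1 <- read.table(text = "
Year Route Point Sp1 Sp2 Sp3
2004 123 123-1 0 1 0
2004 123 123-2 0 1 1
2004 123 123-10 1 1 0
", header = TRUE, stringsAsFactors = FALSE)

id_cols <- c("Year", "Route", "Point")
sp_cols <- setdiff(names(dat1), id_cols)

# Repeat the id columns once per species column, then stack the
# species columns into a single value column.
long <- data.frame(
  dat1[rep(seq_len(nrow(dat1)), times = length(sp_cols)), id_cols],
  variable = rep(sp_cols, each = nrow(dat1)),
  value = unlist(dat1[sp_cols], use.names = FALSE),
  stringsAsFactors = FALSE
)
rownames(long) <- NULL

print(long)
```

The rows come out species by species (all Sp1 rows, then Sp2, then Sp3),
which is the order asked for in the original question.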

John Kane
Kingston ON Canada



Re: [R] arrange data

Hi Roslina,

Extending Rui's solution if you want only the last two digits for Year.
 

agg_dt1$Tahun <- as.numeric(gsub("\\d{2}(\\d+)","\\1",agg_dt1$Tahun))
head(agg_dt1)
#  Tahun Bahun    x
#1    98     1  607
#2    99     1  814
#3     0     1  580
#4    98     2 1006
#5    99     2  941
#6     0     2 1075

A.K.





- Original Message -
From: Rui Barradas ruipbarra...@sapo.pt
To: arun smartpink...@yahoo.com
Cc: Roslina Zakaria zrosl...@yahoo.com; R help r-help@r-project.org
Sent: Saturday, October 6, 2012 7:22 AM
Subject: Re: [R] arrange data

Hello,

Using Arun's data example, instead of creating a factor, convert to
4-digit years.

set.seed(1)
dat1 <- data.frame(Tahun=rep(c(98:99,00),each=36),
             Bahun=rep(rep(1:12,times=3),each=3),
             x=sample(1:500,108,replace=TRUE))

dat2 <- dat1  # operate on a copy
dat2$Tahun <- with(dat2, ifelse(Tahun < 71, 2000 + Tahun, 1900 + Tahun))

agg_dt1 <- aggregate(x=dat2[,3],by=dat2[,c(1,2)],FUN=sum)
head(agg_dt1)
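
A small illustration of why the conversion fixes the ordering, using made-up
two-digit years (the vector tahun is hypothetical; the cut-off of 71 is taken
from the data range 1971 to 2000 mentioned in the original question):

```r
# Two-digit years as they might appear in the data: 0 stands for 2000.
tahun <- c(98, 99, 0, 71, 85)

# Sorting the raw values puts year 0 (i.e. 2000) first.
sort(tahun)

# Anything below 71 is taken to be 20xx, everything else 19xx.
full_year <- ifelse(tahun < 71, 2000 + tahun, 1900 + tahun)

# Now 2000 sorts last, after 1971 to 1999.
sort(full_year)
```

aggregate() orders its groups by the grouping values, so once the years are
four-digit numbers the rows for 2000 come after those for 1971 onwards.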

Hope this helps,

Rui Barradas
On 06-10-2012 03:38, arun wrote:
 Hi,

 I hope this helps you.
   I created a small dataset: 3 replications per month for 1998:2000.

 set.seed(1)
 dat1 <- data.frame(Tahun=rep(c(98:99,00),each=36),Bahun=rep(rep(1:12,times=3),each=3),
  x=sample(1:500,108,replace=TRUE))
 dat2 <- within(dat1,{Tahun <- factor(Tahun,levels=c(98,99,0))})


 agg_dt1 <- aggregate(x=dat2[,3],by=dat2[,c(1,2)],FUN=sum)
   head(agg_dt1)
 #  Tahun Bahun    x
 #1    98     1 1252
 #2    99     1  680
 #3     0     1  687
 #4    98     2  761
 #5    99     2  860
 #6     0     2  786
 I guess this is what you wanted.


 In addition, you can also use ddply() with a different way of grouping: but 
 with the same result.
 library(plyr)
   dd_dt1 <- ddply(dat2,.(Tahun,Bahun),summarize, sum(x))
   head(dd_dt1)
 #  Tahun Bahun  ..1
 #1    98     1 1252
 #2    98     2  761
 #3    98     3  440
 #4    98     4  597
 #5    98     5  987
 #6    98     6  692
   tail(dd_dt1)
 #   Tahun Bahun  ..1
 #31     0     7  685
 #32     0     8  504
 #33     0     9  633
 #34     0    10  553
 #35     0    11  914
 #36     0    12 1039

 A.K.






 - Original Message -
 From: Roslina Zakaria zrosl...@yahoo.com
 To: r-help@r-project.org r-help@r-project.org
 Cc:
 Sent: Friday, October 5, 2012 8:09 PM
 Subject: [R] arrange data

 Dear r-users,

 I have daily rainfall data from year 1971 to 2000. I use aggregate to form
 monthly rainfall data.  What I don't understand is that the data for the year
 2000 ends up on top, instead of year 1971.  Here is some code and output:


 agg_dt1     <- aggregate(x=dt1[,4],by=dt1[,c(1,2)],FUN=sum)

 head(agg_dt1,20); tail(agg_dt1,20)
     Tahun Bulan     x
 1      0     1 398.6
 2     71     1 934.9
 3     72     1 107.2
 4     73     1 236.4
 5     74     1  10.5
 6     75     1 744.6
 7     76     1   9.2
 8     77     1 108.7
 9     78     1 251.5
 10    79     1 197.3
 11    80     1 144.1
 12    81     1 104.5
 13    82     1  17.7
 14    83     1 151.8
 ...

 Thank you so much for your help.

 Roslina

Re: [R] Presence/ absence data from matrix to single column

Hi John,

Thanks for your comments.

I have both packages.  I am using R 2.15.  Maybe reshape is out of date.  I
don't load reshape2 that much (maybe I'm too lazy to add the 2 at the end),
except when I need dcast().  I tried the code with only reshape2 loaded, and
I get the same result.


A.K.



- Original Message -
From: John Kane jrkrid...@inbox.com
To: arun smartpink...@yahoo.com
Cc: 
Sent: Saturday, October 6, 2012 11:24 AM
Subject: Re: [R] Presence/ absence data from matrix to single column

I think reshape is out of date.  reshape2 has been out for about a year I think.

John Kane
Kingston ON Canada



Re: [R] Calculating the mean in one column with empty cells

Where is the csv data coming from?  If it is an export from a spreadsheet,
Excel (and others?) has a nasty habit of exporting numbers as displayed
rather than at full precision, as its default.
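
To illustrate with the numbers from the question: rounding the full-precision
values to two decimals, which is what an "as displayed" export would do,
reproduces the second mean reported there. The name rounded is just for
illustration.

```r
testdata <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262,
              0.0264198664609803, 0.0200581303857603, -0.2971754213679500,
              -0.2353086361784190, 0.0667195538296534, 0.1755852636926560)

# Mean of the full-precision values (about 0.0161584, as in the post).
mean(testdata)

# What a two-decimal export would contain, and its mean (about 0.0167).
rounded <- round(testdata, 2)
rounded
mean(rounded)
```

So the difference comes from the two-decimal values in the file, not from
how R computes the mean.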

John Kane
Kingston ON Canada


 -Original Message-
 From: f.seha...@gmail.com
 Sent: Sat, 6 Oct 2012 01:11:11 -0700 (PDT)
 To: r-help@r-project.org
 Subject: Re: [R] Calculating the mean in one column with empty cells
 
 Hi,
 
 the first command was bringing the numbers into R directly:
 * testdata <- c(0.2006160108532920, 0.1321167173880490,
 0.0563941428921262,
 0.0264198664609803, 0.0200581303857603, -0.2971754213679500,
 -0.2353086361784190, 0.0667195538296534, 0.1755852636926560)
 mean(testdata)
 [1] 0.0161584*
 
 Here I tried to calculate the mean with the same numbers as given above,
 but
 taken from my dataset.
 *
 str(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
  num [1:9] 0.2 0.13 0.06 0.03 0.02 -0.3 -0.24 0.07 0.18
 mean(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
 [1] 0.0167
 *
 
 It seems that in the second case he calculates the mean with rounded
 numbers
 (0.2 and not 0.20061601085...)
 Could it be that R imports only the rounded numbers?
 How can I build a CSV-file with numbers showing all decimal places?
 Because
 I think my current CSV-file only has numbers with 2 decimal places.
 
 
 Kind Regards,
 Felix
 
 
 
 

Re: [R] Download limit

On Sat, Oct 6, 2012 at 12:38 PM, agiani99 agian...@hotmail.com wrote:
 Hi Michael,
 I am not behind a firewall, but I noticed that when I call setInternet2(FALSE) the
 problem disappears.
 setInternet2(TRUE) is required to download the Systematic Investor Toolbox (SIT).
 I don't know why, or whether it makes sense, but yes, it seems to be a connection
 problem, and no, it does not seem to have anything to do with R or RStudio.
 Thanks for your time, though.
 Best
 Andrea


I think this is one of those situations where non-Windows folks just
shake their heads and sigh. I'm afraid I don't know enough about
Windows internet settings to comment (though BDR, Duncan M, Uwe, or
many of the other folks on this list much smarter than I could likely
explain it) but for now, I'm just happy to hear you got it working.

Cheers,
Michael


Re: [R] smoothScatter plot

Hi Zhengyu,

You might want to have a look at 
http://gallery.r-enthusiasts.com/graph/Scatterplots_with_smoothed_densities_color_representation,139
which seems to be showing a smoothScatter() that seems like what you want.  

I've never used the function myself, so I am probably not much help.

Something else that I thought of, late yesterday, was the ggplot2 approach 
shown using this code.  
d <- ggplot(diamonds, aes(carat, price))
d + geom_point()  # graph all points with similar colour
d + geom_point(alpha = 1/10)  # graph points with transparency setting

The alpha settings may give you something similar to smoothScatter() but 
probably without the colours though a question on the google groups ggplot2 
group might help.
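
For the "darker blue where the points are denser" look asked about in this
thread, a base R sketch may be enough. The data here are simulated, not the
poster's; smoothScatter() and densCols() ship with R but rely on the
KernSmooth package being installed, which is assumed here.

```r
set.seed(1)
x <- rnorm(2000)
y <- x + rnorm(2000)

# Smoothed 2D density drawn as shades of blue (darker means denser).
smoothScatter(x, y)

# Alternative: keep the individual points but colour each one by the
# local density around it.
dcols <- densCols(x, y)
plot(x, y, col = dcols, pch = 20)
```

The first call gives the smoothed cloud; the second keeps every point
visible, which can be easier to read for small data sets.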

Good luck,

John Kane
Kingston ON Canada

-Original Message-
From: zhyjiang2...@hotmail.com
Sent: Sat, 6 Oct 2012 01:01:41 +0800
To: jrkrid...@inbox.com
Subject: RE: [R] smoothScatter plot

Hi John,

Thanks for your link. Those plots look pretty but way too complicated in terms 
of making R code.  

Maybe my description is not clear.  But could you take a look at the attached 
png? I saw several publications showing smoothed plots like this but not sure 
how to make one...

Thanks,
Best,
Zhengyu

Date: Fri, 5 Oct 2012 06:36:38 -0800
From: jrkrid...@inbox.com
Subject: RE: [R] smoothScatter plot
To: zhyjiang2...@hotmail.com
CC: r-help@r-project.org

 In line 

John Kane
Kingston ON Canada

-Original Message-
From: zhyjiang2...@hotmail.com
Sent: Fri, 5 Oct 2012 05:41:29 +0800
To: jrkrid...@inbox.com
Subject: RE: [R] smoothScatter plot

 Hi John,

Thanks for your email. Your way works good. 

However, I was wondering if you can help with a smoothed scatter plot that has 
shadows with different darker blue color representing higher density of points.

Zhengyu 

Do you mean something like what is being discussed here? 
http://andrewgelman.com/2012/08/graphs-showing-uncertainty-using-lighter-intensities-for-the-lines-that-go-further-from-the-center-to-de-emphasize-the-edges/
 

If so I think there has been some discussion and accompanying ggplot2 code on 
google groups ggplot2 site.  

Otherwise can you explain a bit more clearly?

Date: Thu, 4 Oct 2012 05:46:46 -0800
From: jrkrid...@inbox.com
Subject: RE: [R] smoothScatter plot
To: zhyjiang2...@hotmail.com
CC: r-help@r-project.org

Hi,

Do you mean something like this?  
=
    scatter.smooth(x,y)
=

It looks like invoking dcols <- densCols(x,y) is calling in some package
that is masking graphics::smoothScatter() and applying some other version of
smoothScatter, but I am not expert enough to be sure.

Another way to get the same result as mine with smoothScatter is to use the 
ggplot2 package.  it looks a bit more complicated but it is very good and in 
some ways easier to see exactly what is happening.

To try it you would need to install the ggplot2 package
(install.packages("ggplot2")), then with your original x and y data frames
===
library(ggplot2)
xy  <-  cbind(x, y)
names(xy)  <-  c("xx", "yy")

p  <-  ggplot(xy , aes(xx, yy )) + geom_point( ) +
 geom_smooth( method="loess", se =FALSE)
p


Thanks for the data set.  However it really is easier to use dput()

To use dput() simply issue the command dput(myfile) where myfile is the file 
you are working with.  It will give you something like this:
==
1> dput(x)
structure(c(0.4543462924, 0.2671718761, 0.1641577016, 1.1593356462, 
0.0421177346, 0.3127782861, 0.4515537795, 0.5332559665, 0.0913911528, 
0.1472054054, 0.1340672893, 1.2599304224, 0.3872026125, 0.0368560053, 
0.0371828779, 0.3999714282, 0.0175815783, 0.8871547761, 0.2706762487, 
0.7401904063, 0.0991320236, 0.2565567348, 0.5854167363, 0.7515717421, 
0.7220388222, 1.3528297744, 0.9339971349, 0.0128652431, 0.4102527051
), .Dim = c(29L, 1L), .Dimnames = list(NULL, "V1"))

1> dput(y)
structure(list(V1 = c(0.8669898448, 0.6698647266, 0.1641577016, 
0.4779091929, 0.2109900366, 0.2915241414, 0.2363116664, 0.3808731568, 
0.379908928, 0.2565868263, 0.1986675964, 0.7589866876, 0.6496236922, 
0.1327986663, 0.4196107999, 0.3436442638, 0.1910728051, 0.5625817464, 
0.1429791079, 0.6441837334, 0.1477153617, 0.369079266, 0.3839842979, 
0.39044223, 0.4186374286, 0.7611640016, 0.446291999, 0.2943343355, 
0.3019098386)), .Names = "V1", class = "data.frame", row.names = c(NA, 
-29L))
1 

===

That is your x and y in dput() form.  You just copy it from the R terminal
and paste it into your email message.  It is handy if you add the x <- and
y <- to the output.
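
A tiny sketch of the round trip, with a made-up three-row data frame standing
in for the real data: the text printed by dput() can be assigned straight
back to a name by whoever reads the email.

```r
# Small stand-in for the poster's data.
y <- data.frame(V1 = c(0.87, 0.67, 0.16))

# This is what would be pasted into the message.
dput(y)

# Simulate the reader's side: capture the printed text and evaluate it.
txt <- paste(capture.output(dput(y)), collapse = "\n")
y2 <- eval(parse(text = txt))

identical(y, y2)
```

In practice the reader just pastes the structure(...) text after y <- at the
prompt; capture.output() is used here only to imitate that step in a script.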

Your method works just fine but it's a bit more cumbersome with a lot of data.

Also, please reply to the R-help list as well.  It is a source of much more

Re: [R] Expected number of events, Andersen-Gill model fit via coxph in package survival

2012-10-06 Thread David Winsemius


On Oct 5, 2012, at 8:48 PM, Omar De la Cruz C. wrote:

 Hello,
 
 I am interested in producing the expected number of events, in a
 recurring events setting. I am using the Andersen-Gill model, as fit
 by the function coxph in the package survival.
 
 I need to produce expected numbers of events for a cohort,
 cumulatively, at several fixed times. My ultimate goal is: To fit an
 AG model to a reference sample, then use that fitted model to generate
 expected numbers of events for a new cohort; then, comparing the
 expected vs. the observed numbers of events would give us some idea of
 whether the new cohort differs from the reference one.
 
 From my reading of the documentation and the text by Therneau and
 Grambsch, it seems that the function survexp is what I need. But
 using it I am not able to obtain expected numbers of events that match
 reasonably well the observed numbers *even for the same reference
 population.* So, I think I am misunderstanding something quite badly.
 
 Below is an example that illustrates the situation. At the end I
 include the sessionInfo().
 
 Thank you!
 
 Omar.
 
 
 
 
 # Example of unexpected behavior in computing estimated number of events
 # in using package survival for fitting the Andersen-Gill model
 
 require(survival)
 
 head(bladder2)  # this is the data, in interval format
 
 # Fit Andersen-Gill model
 cphfit = coxph(Surv(start,stop,event)~rx+number+size+cluster(id),data=bladder2)
 
 # Choose some arbitrary time horizons
 t.horiz = seq(min(bladder2$start),max(bladder2$stop),length=6)
 
 # Compute the cohort expected survival
 s = survexp(~1,data=bladder2,ratetable=cphfit,times=t.horiz)
 # These are the expected survival values:
 s$surv
 
 # We are interested in the rate of events
 e.r = as.vector( 1 - s$surv )
 

Rates are events/n-exposed/time, so those are not rates as I understand the 
term.  And I do not see any accounting for the length of intervals at risk in 
the rest of your code. That vector does not even calculate interval event 
expectations as I read it.
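
To make the distinction concrete, here is a sketch on made-up counting-process
data (the data frame toy and the horizons are hypothetical, not the bladder2
data). A rate divides events by total time at risk, whereas the cumulative
observed counts are plain sums up to each horizon.

```r
# Hypothetical (start, stop] intervals for three subjects.
toy <- data.frame(
  id    = c(1, 1, 2, 3, 3, 3),
  start = c(0, 5, 0, 0, 3, 9),
  stop  = c(5, 12, 10, 3, 9, 15),
  event = c(1, 0, 0, 1, 1, 0)
)

# Event rate: events per unit of time at risk, summed over all intervals.
events      <- sum(toy$event)
person_time <- sum(toy$stop - toy$start)
rate        <- events / person_time

# Cumulative observed events at a few time horizons (vectorized
# version of the loop in the original post).
t.horiz  <- c(5, 10, 15)
observed <- sapply(t.horiz, function(t) sum(toy$event[toy$stop <= t]))

rate
observed
```

Note that 1 - S(t) from a survival curve is a probability of at least one
event by time t, which is neither of these two quantities.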

-- 
David


 # How does this compare to the actual number of events, cumulative at
 # each time horizon?
 
 observed = numeric(length(t.horiz))
 
 for (i in 1:length(t.horiz)){
 
 observed[i] = sum(bladder2$event[bladder2$stop <= t.horiz[i]])
 
 }
 
 print(observed)
 
 # We would like to compute expected numbers of events that approximately
 # match these observed values.
 
 # We should multiply the expected survival rates by the number of individuals.
 
 # Now, one would think that this is the number of at-risk individuals:
 s$n.risk
 
 # But that is actually the total number of rows in the data. In any case,
 # these numbers do not match:
 
 rbind(expected = s$n.risk*e.r,observed=observed)
 
 # What if we multiply by the number of individuals?
 
 rbind(expected = length(unique(bladder2$id))*e.r,observed=observed)
 
 # This does not work either! The required factor seems to be about 133, but
 # I don't see an explanation for that.
 
 # In this example, multiplying by 133.182 gives a good match between observed
 # and expected values, but in other examples even the shape of the curves
 # are different.
 
 # Multiplying by a number of individuals at risk at each time point
 # (number of individuals
 # for which there is a time interval containing the time horizon) does
 # not work either.
 
 #
 
 sessionInfo()
 R version 2.15.1 (2012-06-22)
 Platform: i386-apple-darwin9.8.0/i386 (32-bit)
 
 locale:
 [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
 
 attached base packages:
 [1] splines   stats graphics  grDevices utils datasets
 methods   base
 
 other attached packages:
 [1] survival_2.36-14
 
 loaded via a namespace (and not attached):
 [1] tools_2.15.1
 

David Winsemius, MD
Alameda, CA, USA


[R] sample

2012-10-06 Thread solafah bh

Hello
If I have x=c(3,2,6,1) and n=length(x), are the following two calls equivalent?
sample(x,1,replace=TRUE)    and   sample(x,1,replace=TRUE,prob=rep(1/n, n))
Regards

Re: [R] Calculating the mean in one column with empty cells

2012-10-06 Thread David Winsemius


On Oct 6, 2012, at 1:11 AM, fxen3k wrote:

 Hi, 
 
 The first command entered the numbers into R directly:
 testdata <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262,
 0.0264198664609803, 0.0200581303857603, -0.2971754213679500,
 -0.2353086361784190, 0.0667195538296534, 0.1755852636926560)
 mean(testdata)
 [1] 0.0161584
 
 Here I tried to calculate the mean of the same numbers as above, but
 taken from my dataset:
 str(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
 num [1:9] 0.2 0.13 0.06 0.03 0.02 -0.3 -0.24 0.07 0.18
 mean(dataSet2$ac_bhar_60d_4d_after_ann[1:9])
 [1] 0.0167

This is something that happened in your data processing, before the values
reached R. With full-precision input, read.csv2() gives the expected mean:

dat <- read.csv2(text="0,2006160108532920
0,1321167173880490
0,0563941428921262
0,0264198664609803
0,0200581303857603
-0,2971754213679500
-0,2353086361784190
0,0667195538296534
0,1755852636926560
", header=FALSE)
mean(dat[[1]])
# [1] 0.0161584

 

 It seems that in the second case R calculates the mean with rounded numbers
 (0.2 and not 0.20061601085...).
 Could it be that R imports only the rounded numbers?
 How can I build a CSV file with numbers showing all decimal places? I think
 my current CSV file only has numbers with 2 decimal places.
 

That is more likely the fault of Excel than it is something R is responsible 
for.

-- 

David Winsemius, MD
Alameda, CA, USA
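
One way to check where the rounding happens is to look at the raw text of the file before R parses it. The sketch below is only an illustration: it writes two of the values above to a temporary file (the file and the column name v are made up for the example), prints the raw lines, and reads them back with read.csv2(). If the raw lines of your own file show only two decimals, the digits were lost before R ever saw them.

```r
# Two of the full-precision values from the thread
testdata <- c(0.2006160108532920, 0.1321167173880490)

# Hypothetical temporary file, written in the semicolon/comma-decimal
# convention that read.csv2() expects
tmp <- tempfile(fileext = ".csv")
write.csv2(data.frame(v = testdata), tmp, row.names = FALSE)

# Inspect the raw text: this is what any program reading the file will see
readLines(tmp)

# Read it back and compare with the original values
back <- read.csv2(tmp)
all.equal(back$v, testdata)
```

For your own data, readLines("yourfile.csv", n = 5) on the exported file would be the equivalent first check.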


Re: [R] sample

Yes and no. Same effect, but you won't get the same random numbers
because -- I believe -- a different algorithm is used. grep the source
for sample and sample2 if you're interested.

Cheers,
Michael
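
A rough way to see the "same effect" part is to draw many values with each call and compare the relative frequencies. This is only a sketch: the number of draws N and the seed are arbitrary choices for the illustration, and it says nothing about which internal algorithm is used.

```r
x <- c(3, 2, 6, 1)
n <- length(x)
N <- 20000  # number of draws, chosen arbitrarily for the illustration

set.seed(123)
draws_default <- sample(x, N, replace = TRUE)
draws_prob    <- sample(x, N, replace = TRUE, prob = rep(1/n, n))

# Relative frequency of each value; both should be close to 1/n = 0.25
freq_default <- prop.table(table(draws_default))
freq_prob    <- prop.table(table(draws_prob))
round(freq_default, 3)
round(freq_prob, 3)
```

The individual draws differ between the two calls, but the frequencies should agree up to sampling noise.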


Re: [R] sample

Hi,
They give different results, even with the same seed:
x <- c(3, 2, 6, 1)
n <- length(x)
set.seed(1)
sample(x, 1, replace = TRUE)
#[1] 2
set.seed(1)
sample(x, 1, replace = TRUE, prob = rep(1/n, n))
#[1] 6

identical(sample(x, 1, replace = TRUE), sample(x, 1, replace = TRUE, prob = rep(1/n, n)))
#[1] FALSE
A.K.
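
A note on the identical() comparison: it compares two separate random draws, so it can return FALSE even when both arguments are the very same call. The sketch below (not part of the original post; the seed and the number of replications are arbitrary) separates "reproducible with the same seed" from "two independent draws happen to agree".

```r
x <- c(3, 2, 6, 1)

# Same call, same seed: the draw is reproducible
set.seed(1); a <- sample(x, 1, replace = TRUE)
set.seed(1); b <- sample(x, 1, replace = TRUE)
identical(a, b)
#[1] TRUE

# Same call, no reseeding: two independent draws, which should agree
# only about one time in four for four equally likely values
set.seed(1)
agree <- replicate(5000, identical(sample(x, 1, replace = TRUE),
                                   sample(x, 1, replace = TRUE)))
mean(agree)
```

So a single FALSE from identical() does not by itself show that the two calls sample from different distributions.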




Re: [R] LaTeX consistent publication graphics from R and Comparison of GLE and R

2012-10-06 Thread Marc Schwartz

Hi Frank,

I have not used tikz, so am not sure.

I have been hand coding the TeX markup in the .Rnw files to date, since each 
study has been somewhat different in terms of various characteristics and the 
sponsors, in some cases, have requested some customizations to the flow charts. 
That has typically been done with psmatrix constructs 
(http://tug.org/PSTricks/main.cgi?file=pst-node/psmatrix/psmatrix).

I have also used PSTricks, with pst-tree constructs 
(http://tug.org/PSTricks/main.cgi?file=pst-tree/pst-tree), to create branching 
trees for stratified randomization flow charts. So you have a top level with 
all enrolled subjects, then branches from there showing each stratification 
level, each box showing the sample size (using \Sexpr{}s) within each stratum. 
It is a similar concept to the matrix-like orgchart style used for disposition 
charts, but a different implementation, which allows for an imbalance in 
the tree structure (e.g., differing strata in each arm based upon various 
criteria).

I suppose that if one were to think about it conceptually, R's list structures 
would be a suitable substrate for creating an object that could be passed to a 
print method of sorts and generate the TeX markup during Sweave (or knitr) 
processing. I just have not spent the time to consider how that would be done 
generically enough and still allow for some of the customizations that might be 
encountered.
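
To make the idea concrete, here is a very rough sketch of what "a list passed to a function that emits TeX" might look like. Everything in it is hypothetical: the disposition list, its counts, and the function name disposition_tex() are invented for the illustration, it only handles a single vertical chain of boxes, and the generated psmatrix markup has not been checked against an actual TeX run.

```r
# Hypothetical disposition data: one box per stage, top to bottom
disposition <- list(
  list(label = "Enrolled",   n = 120),
  list(label = "Randomized", n = 100),
  list(label = "Completed",  n = 92)
)

# Turn the list into psmatrix markup: one framed box per row,
# followed by arrows joining consecutive rows in column 1
disposition_tex <- function(boxes) {
  rows <- vapply(boxes, function(b) {
    sprintf("\\psframebox{%s (n = %d)}", b$label, b$n)
  }, character(1))
  from <- seq_len(length(boxes) - 1)
  arrows <- sprintf("\\ncline{->}{%d,1}{%d,1}", from, from + 1)
  c("\\begin{psmatrix}[rowsep=0.6cm]",
    paste(rows, collapse = " \\\\\n"),
    "\\end{psmatrix}",
    arrows)
}

out <- disposition_tex(disposition)
cat(out, sep = "\n")
```

In an .Rnw file one could call such a function from a chunk with results=tex, which is where the customization problem mentioned above would start to bite.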

Food for thought.

Best regards,

Marc


On Oct 6, 2012, at 8:14 AM, Frank Harrell f.harr...@vanderbilt.edu wrote:

 Hi Marc,
 
 It would be interesting to compare with tikz for ease of use.
 
 As an aside I've been wishing that someone would write an R function for
 creating clinical trial disposition charts using tikz or pstricks ...
 
 Best,
 Frank
 
 Marc Schwartz-3 wrote
 On Oct 5, 2012, at 3:32 PM, clangkamp <christian.langkamp@> wrote:
 
 Hi Everyone
 
 I am at the moment preparing my thesis and am looking at producing a few
 Organigrams / Flow charts (unrelated to the calculations in R) as well as
 a range of charts (barcharts, histograms, ...) based on calculations in R.
 
 For the Organigrams I am looking at an open-source package called GLE at
 SourceForge, which produces the text part in LaTeX figures, which is very
 neat and also in the same style as the thesis, which I wrote in LaTeX. It
 also offers a range of graphical features, and I am quite tempted.
 
 It also produces barcharts and histograms with the options of legends
 etc. I have done most of my graphs so far with R, but with Organigrams and
 flow charts I am at a loss (a pointer here would also be very welcome). For
 some charts I have used MS Visio, but it would be convenient to use just one
 program for graphing throughout the thesis (i.e. same colour coding etc.).
 
 Does anybody have any experience with GLE, ideally working with it with
 CSV tables generated within R? Or does there exist another way to generate
 'visually LaTeX consistent' graphics within R?
 
 Any takers?
 
 
 
 If you are comfortable in LaTeX, I would suggest that you look at
 PSTricks:
 
  http://tug.org/PSTricks/main.cgi
 
 I use that for creating subject disposition flow charts for clinical
 trials with Sweave. I can then use \Sexpr{}'s to fill in various
 annotations in the boxes, etc. so that all content is programmatically
 created in a reproducible fashion.
 
 There are some examples of flow charts and tree diagrams here:
 
 
 http://tug.org/PSTricks/main.cgi?file=pst-node/psmatrix/psmatrix#flowchart
 
 and there are various other online resources for using PSTricks.
 
 Keep in mind that since this is PostScript based, you need to use a latex
 + dvips + ps2pdf sequence, rather than just pdflatex.
 
 Regards,
 
 Marc Schwartz


Re: [R] Calculating the mean in one column with empty cells

2012-10-06 Thread William Dunlap

For nine numbers, R-helpers should recommend that people
show their data with dput(obj) instead of str(obj).
dput() shows everything in the object to full precision.  str() shows
a summary of the object and rounds numbers (to 3 significant digits by
default) -- it is good for an overview of the data, but when the question is
"why did I get a mean of .06 instead of .06547494 from my 9 numbers?"
str() is not useful.

Bill Dunlap
Spotfire, TIBCO Software
wdunlap tibco.com
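
A small illustration of the difference, using three of the values from the thread (x_rounded is a made-up name for a copy rounded to two decimals, standing in for data that lost digits on export):

```r
x <- c(0.2006160108532920, 0.1321167173880490, 0.0563941428921262)
x_rounded <- round(x, 2)  # stand-in for values that were rounded upstream

# str() gives a compact overview with few digits
str(x)
str(x_rounded)

# dput() prints the stored values with many more digits, so a helper can
# see at once whether the data really are 0.2 or 0.2006160108532920
dput(x)
dput(x_rounded)

# The means differ once the digits are gone
mean(x) - mean(x_rounded)
```

Pasting the dput() output into a post lets others recreate the object exactly as you have it.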




Re: [R] Multiple graphs boxplot

Does something like this make any sense?

library(reshape2)
library(ggplot2)
yy <- structure(list(A = c(23, 21, 21, 20, 19, 19), B = c(20, 18, 20,
19, 20, 18), C = c(15, 15, 15, 12, 13, 13)), .Names = c("A",
"B", "C"), class = "data.frame", row.names = c(NA, -6L))

y1 <- melt(yy)  # using reshape2; with no id variables, all columns are stacked

ggplot(y1, aes(variable, value)) + geom_boxplot()

# or, with one panel per variable

ggplot(y1, aes(variable, value)) + geom_boxplot() + facet_grid(variable ~ .)
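
If you would rather stay with base graphics and keep treatment on the x axis, a loop over the variable names with par(mfrow = ...) is another option. This is only a sketch: the data frame y below is simulated, with a treatment factor and three columns v1 to v3 standing in for your v1 to v20.

```r
# Simulated stand-in for the poster's data: a treatment factor and v1..v3
set.seed(1)
y <- data.frame(treatment = factor(rep(c("a", "b"), each = 10)),
                v1 = rnorm(20), v2 = rnorm(20), v3 = rnorm(20))

vars <- setdiff(names(y), "treatment")

# One panel per variable on a single page
op <- par(mfrow = c(1, length(vars)))
for (v in vars) {
  # reformulate() builds the formula v ~ treatment from the column name
  boxplot(reformulate("treatment", response = v), data = y,
          main = v, xlab = "treatment", ylab = v)
}
par(op)  # restore the previous graphics settings
```

With 20 variables something like par(mfrow = c(4, 5)) would put them all on one sheet, though the panels get small.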




John Kane
Kingston ON Canada


 -Original Message-
 From: dagr...@hotmail.com
 Sent: Fri, 5 Oct 2012 18:01:39 +0200
 To: r-help@r-project.org
 Subject: [R] Multiple graphs  boxplot
 
 
 
 Dear all
 
 I am trying to plot a dependent variable (treatment) against
 different independent variables (v1, v2, v3, ..., v20). I am using the
 following command:
 
 boxplot(v1~treatment, data=y, main="xx", xlab="xx", ylab="xx")
 
 However, it gives me only one graph, for v1~treatment. For the other
 comparisons, I have to repeat the same command with different
 parameters. My intention is to get the different plots on just one sheet
 using only one command. Is it possible to issue the same call for all the
 comparisons in only one command?
 
 Thanks
 David
 

Re: [R] help with making figures

2012-10-06 Thread Uwe Ligges




On 05.10.2012 21:59, megalops wrote:
 Bert,

 Can you help me understand your suggestion?

Megalops31,

which suggestion? You failed to quote former messages!

 I don't understand how I can include all 30 sites under the label called
 site in the xypot code example you provided.

What is an xypot example?
Please read the posting guide for this *mailing list*.

Uwe Ligges


[R] sample with equal probabilities

2012-10-06 Thread solafah bh

Hello
If I have the vector x = c(5, 1, 2, 9) and n = length(x), and I want to sample
one value from x so that each value has equal probability (1/n) of appearing,
are the following two calls equivalent?
sample(x, 1, replace = TRUE)   and   sample(x, 1, replace = TRUE, prob = rep(1/n, n))
 
Regards

Re: [R] sample with equal probabilities

Please don't double post.

And see my response to you here:
https://stat.ethz.ch/pipermail/r-help/2012-October/325470.html

Michael


Re: [R] vector is not assigned correctly in for loop