[R] Fwd: Difficulty in finding some R functions

2014-05-01 Thread prachi jain
Hey, I am trying to find some of the following functions in R packages:

MLEt

pt3

cormatrix2vector

ParameterEst

tCopula

riskBT

I have checked every package from this link:
http://cran.r-project.org/web/packages/available_packages_by_name.html but
is unable to find the above functions. These functions belong to a code for
calculating Value-at-risk and expected shortfall using copula function and
backtesting the model.

Could you please help me find these functions by naming the packages they
belong to. Its really urgent for me so please reply asap.

If required i can send the entire code for better understanding.

If this is not the right place to mail a query, please do let me know where
can I send my query, its really urgent for me.

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] error using polymars argument

2014-05-01 Thread varin sacha
Hi Jim,

Many thanks. Sorry for disturbing just for a parenthesis...
But I couldn't go ahead...

Le Jeudi 1 mai 2014 23h22, Jim Lemon  a écrit :
 
On 05/01/2014 09:47 PM, varin sacha wrote:
> Hi Dear all,
> I got an error when I try to do a MARS regression with the "polymars" 
> argument on the A matrix and the V vector here below.
> I don't understand the error, I have nrow=22 for the response variable and 
> for the predictor matrix.
> So if somebody could tell me what is going wrong, it would be very nice.
>
> A=matrix(c(74.711,64.614,83.514,78.187,71.327,86.197,49.986,69.012,79.591,83.158,65.437,56.225,54.485,90.531,93.357,79.15,63.611,73.133,53.497,56.328,89.585,56.289,10.92,8.08,11.92,10.75,12.08,10.58,9.33,14.17,13.25,14.33,6.92,12.42,7.17,13.75,16.83,13.67,5.25,10,7.08,10.83,14.58,11.75,nrow=22,ncol=2))
>
> V=c(44986,18288,56147,44488,41018,40631,27301,39025,45688,47172,12300,21558,16103,48874,67245,36119,10398,42630,12879,34058,84443,30639)
>
> polymars(V,A)
> Erreur dans polymars(V, A) :
>    The number of rows (cases) of the response and predictor matricies should 
>be the same
>
Hi varin,
You have a misplaced parenthesis in your definition of A. It should be:


A<-matrix(c(74.711,64.614,83.514,78.187,71.327,86.197,49.986,69.012,
  79.591,83.158,65.437,56.225,54.485,90.531,93.357,79.15,63.611,73.133,
  53.497,56.328,89.585,56.289,10.92,8.08,11.92,10.75,12.08,10.58,9.33,
  14.17,13.25,14.33,6.92,12.42,7.17,13.75,16.83,13.67,5.25,10,7.08,10.83,
  14.58,11.75),nrow=22,ncol=2)

Jim
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Simulative data production

2014-05-01 Thread Merve Şahin
Thanks for your answer and attention :) I will try to do it
Merve

*Arş.Gör.Merve ŞAHİN*
*Abant İzzet Baysal Üniversitesi*
*Eğitim Bilimleri Bölümü*
*Ölçme ve Değerlendirme A.B.D.*


2014-04-30 15:57 GMT+03:00 Lorenz, David :

> Merve,
>   I'm not 100 percent sure I understand everything that you want. But
> start with the simulated likert scale data. The code that you have is not
> very efficient and it has at least one typo. I do not know if columns or
> rows represent the persons, so I'll set up as NROW and NCOL.
>   An efficient way to generate multiple columns of the same distribution
> is to generate all of the the random number and just make them a matrix.
> Example code below.
>
> NROW <- 20
> NCOL <- 4
> MAT <- matrix(sample(1:5, NROW*NCOL), ncol=NCOL)
>
>   Random normal deviates are typically generated from the rnorm function.
> But you stated you wanted to generate the normal distribution from total
> scores. I'm a bit confused because you refer to 200 people but that does
> not correspond to any number in the data that you have generated.
>   Hope this helps.
> Dave
>
> Date: Tue, 29 Apr 2014 09:38:52 +0300
>> From: Merve ?ahin 
>> To: r-help@r-project.org
>> Subject: [R] Simulative data production
>> Message-ID:
>> > 3hhswa7tn6a...@mail.gmail.com>
>> Content-Type: text/plain
>>
>> Hello,
>> My name is Merve from Abant Ä°zzt Baysal University. I want to produce
>>
>> simulative data using R, but I couldn't do. I want to produce n=200, 5
>> likert type, 20 items and normally distributed data. The normal
>> distribution is provided by total scores of each of 200 person. I produce
>> this kind of data;
>> veri.seti> lace=T),sample(1:5,25,replace=T),sample(1:5,25,replace=T
>> V1 V2 V3 V4
>> 1   4  1  5  1 
>> 2   2  4  2  2 
>> 3   5  5  1  4 
>> 4   4  5  3  4 
>> 5   3  2  3  1 
>> 6   3  1  2  1 
>> 7   1  3  5  4 
>> 8   2  4  1  1 
>> 9   3  1  5  4 
>> 10  4  5  4  5 
>> 11  2  1  4  5 
>> 12  2  3  1  5 
>> 13  1  4  2  4 
>> 14  1  1  1  4 
>> 15  4  3  4  1 
>> 16  2  2  5  2 
>> 17  4  4  1  4 
>> 18  5  5  2  4 
>> 19  4  2  1  3 
>> 20  3  5  3  2 
>> 21  2  4  4  4 
>> 22  4  3  4  4 
>> 23  5  1  5  2 
>> 24  4  2  2  2 
>> 25  2  2  1  3 
>>
>> But, I cannot check or provide the normal distribution. Also, I want to
>> add
>> this data on SPSS 20.0, How can I do this. Can you help me, please?
>>
>> *Arş.Gör.Merve ŞAHİN*
>> *Abant İzzet Baysal Üniversitesi*
>> *Eğitim Bilimleri Bölümü*
>> *Ölçme ve Değerlendirme A.B.D.*
>>
>> [[alternative HTML version deleted]]
>>
>>
>>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Tobit model with panel data

2014-05-01 Thread phil
Hi Arne,

it's me once again. I just tried the same regression with less predictors
since so far I have used 16 for 52 observations. Having used less
predictors, R showed me some output for the command "summary()". Without
doubt, it was clear beforehand that this ratio between dependent and
independent variable(s) is not adequate. Haven't thought about that...

I think you will see it the same way.

Thanks in advance.

Regards,

Phil



--
View this message in context: 
http://r.789695.n4.nabble.com/Tobit-model-with-panel-data-tp4689760p4689847.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Tobit model with panel data

2014-05-01 Thread phil
Hi Arne,

thanks a lot for your reply, it was really helpful!

Now, after having managed to apply the censReg to my data, I get the
following error message when I enter the command "summary()":

Error in printCoefmat(coef(x, logSigma = logSigma), digits = digits) : 
  'x' must be coefficient matrix/data frame

What does that mean? Actually, I fear that maybe the number of observations
(total: 52; uncensored: 48) is too small. Is that possible?





--
View this message in context: 
http://r.789695.n4.nabble.com/Tobit-model-with-panel-data-tp4689760p4689846.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] help using extrafont package | R graphics

2014-05-01 Thread Taiyun Wei
The showtext package is really nice!


On Thu, May 1, 2014 at 9:33 AM, Yixuan Qiu  wrote:

> Hi Evan,
> If you just need one font, you may try the showtext package. Here is a
> piece of code that you can test:
>
> library(showtext)
> # If you have this font installed
> font.add("gara", "gara.ttf")
> # Or you can install a similar one from Google Font
> # font.add.google("EB Garamond", "gara")
>
> # Try some plots
> pdf("test.pdf")
> showtext.begin()
> par(family = "gara")
> plot(1, main = "Garamond", type = "n")
> text(1, 1, "Some Text", cex = 5)
> showtext.end()
> dev.off()
>
>
> A more detailed introduction of the showtext package is available at
> http://statr.me/2014/01/using-system-fonts-in-r-graphs/. Hope this would
> be
> helpful for you.
>
>
> Best,
> Yixuan
>
>
>
>
>
>
> 2014-04-26 16:54 GMT-04:00 Evan Cooch :
> > Greetings --
> >
> > Submitted this a little while ago -- for some reason, still being held
> > up by the moderator. Trying again...
> >
> >
> > For a host of reasons, I need to use/embed Garamond font with various R
> > graphics for a particular publication. I've figured out how to more or
> > less get there from here, using the following sequence:
> >
> >
> > library(extrafont)
> >
> > Then, I need to import the system fonts (Windoze box, R 3.1.0).
> >
> > So, I use
> >
> > font_import()
> >
> > But, this takes a *huge* amount of time, and often throws errors as it
> > chokes on various fonts (I have probably 250+ fonts installed). I only
> > want *one* font (Garamond). But, for the life of me, I can't figure out
> > how to get font_import to select only the single font I want. In theory
> >
> > font_import(paths = NULL, recursive = TRUE, prompt = TRUE, pattern =
> NULL)
> >
> > as defaults, where pattern is a regex that font filenames must match.
> >
> > The file name for Garamong is gara.ttf, so I tried
> >
> > font_import(pattern="gara")
> >
> > R responds with 'Importing fonts make take a few minutes, depending on
> > the...etc, etc'.
> > Continue? [y/n]
> >
> > Hit 'y', and am presented with
> >
> > Scanning ttf files in C:\Windows\Fonts ...
> > Extracting .afm files from .ttf files...
> > Error in data.frame(fontfile = ttfiles, FontName = "", stringsAsFactors
> > = FALSE) :
> >arguments imply differing number of rows: 0, 1
> >
> > I have no idea what to do with this.
> >
> > Suggestions/pointers to the obvious welcome. And I thought futzing with
> > fonts in LaTeX was fun! ;-)
> >
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
> --
> Yixuan Qiu 
> Department of Statistics,
> Purdue University
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Regards,
Taiyun
--
Taiyun Wei 
Homepage: http://blog.cos.name/taiyun/
Phone: +86-15201142716
Renmin University of China

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] R and GML data from the Canadian Goverment

2014-05-01 Thread jcrosbie
I'm trying to create a map of transmission lines in Alberta. In addition, I'm
very new to creating maps.

The data can be found at: http://geogratis.gc.ca/site/eng/download

http://ftp2.cits.rncan.gc.ca/pub/canvec/doc/CanVec_distribution_formats_en.pdf

Would someone be able to point me in the right direction? I haven't been
able to find an R package which is able to work with the GML data on the
webset.  Does anyone know of which package I should use and a good example
out there?

Thank you



--
View this message in context: 
http://r.789695.n4.nabble.com/R-and-GML-data-from-the-Canadian-Goverment-tp4689843.html
Sent from the R help mailing list archive at Nabble.com.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] speeding up applying hist() over rows of a matrix

2014-05-01 Thread Ortiz-Bobea, Ariel
Hello everyone,



I'm trying to construct bins for each row in a matrix. I'm using apply() in 
combination with hist() to do this. Performing this binning for a 10K-by-50 
matrix takes about 5 seconds, but only 0.5 seconds for a 1K-by-500 matrix. This 
suggests the bottleneck is accessing rows in apply() rather than the 
calculations going on inside hist().



My initial idea is to process as many columns (as make sense for the intended 
use) at once. However, I still have many many rows to process and I would 
appreciate any feedback on how to speed this up.



Any thoughts?



Thanks,



Ariel



Here is the illustration:



# create data

m1 <- matrix(10*rnorm(50*10^4), ncol=50)

m2 <- matrix(10*rnorm(50*10^4), ncol=500)



# compute bins

bins <- seq(-100,100,1)

system.time({ out1 <- t(apply(m1,1, function(x) hist(x,breaks=bins, 
plot=FALSE)$counts)) })

system.time({ out2 <- t(apply(m2,1, function(x) hist(x,breaks=bins, 
plot=FALSE)$counts)) })


---
Ariel Ortiz-Bobea
Fellow
Resources for the Future
1616 P Street, N.W.
Washington, DC 20036

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] How do I create multiple orthogonal designs with the Conjoint package?

2014-05-01 Thread David Moskowitz
Hi,
I am trying to create multiple orthogonal designs using the Conjoint
package.  I have 4 factors with 5 levels in each.

library(conjoint)
experiment<-expand.grid(
  price<-c("a1","a2","a3","a4","a5"),
  tag<-c("b1","b2","b3","b4","b5"),
  smell<-c("c1","c2","c3","c4","c5"),
  aroma<-c("f1","f2","f3","f4","f5"))
design<-caFactorialDesign(data=experiment,type="fractional")
print(design)
print(cor(caEncodedDesign(design)))

The resulting design from the above code has 22 test combinations.= (listed
below).  I would like to be able to create multiple designs instead of just
having one.  Is there a way to do that?

Var1 Var2 Var3 Var4
19a4   b4   c1   f1
40a5   b3   c2   f1
96a1   b5   c4   f1
107   a2   b2   c5   f1
148   a3   b5   c1   f2
157   a2   b2   c2   f2
176   a1   b1   c3   f2
195   a5   b4   c3   f2
239   a4   b3   c5   f2
252   a2   b1   c1   f3
299   a4   b5   c2   f3
313   a3   b3   c3   f3
335   a5   b2   c4   f3
366   a1   b4   c5   f3
381   a1   b2   c1   f4
447   a2   b5   c3   f4
469   a4   b4   c4   f4
478   a3   b1   c5   f4
543   a3   b4   c2   f5
559   a4   b2   c3   f5
587   a2   b3   c4   f5
625   a5   b5   c5   f5

[1] "R version 3.0.3 (2014-03-06)"

Thank you

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] how to extract a part of objects from list ?

2014-05-01 Thread Jorge I Velez
Hi Kristi,

Try

out1970$smoot

HTH,
Jorge.-


On Fri, May 2, 2014 at 10:00 AM, Kristi Glover wrote:

> Hi R User,
> I am wonedring how I can extract a part of objects from list.
>
> For example
>
> > str(out1970)
> List of 8
>  $ comm : num [1:16, 1:57] 1 1 1 1 1 1 1 1 1 1 ...
>   ..- attr(*, "dimnames")=List of 2
>   .. ..$ : chr [1:16] "H_s5" "H_s1" "R_s2" "H_s2" ...
>   .. ..$ : chr [1:57] "Pimephales.promelas" "Semotilus.atromaculatus"
> "Rhinichthys.atratulus" "Pimephales.notatus" ...
>  $ u: num [1:16, 1:57] 0 0 0 0 0 0 0 0 0 0 ...
>   ..- attr(*, "dimnames")=List of 2
>   .. ..$ : chr [1:16] "H_s5" "H_s1" "R_s2" "H_s2" ...
>   .. ..$ : chr [1:57] "Pimephales.promelas" "Semotilus.atromaculatus"
> "Rhinichthys.atratulus" "Pimephales.notatus" ...
>  $ r: Named num [1:16] 0.0312 0.0938 0.1562 0.2188 0.2812 ...
>   ..- attr(*, "names")= chr [1:16] "H_s5" "H_s1" "R_s2" "H_s2" ...
>  $ c: Named num [1:57] 0.00877 0.02632 0.04386 0.0614 0.07895 ...
>   ..- attr(*, "names")= chr [1:57] "Pimephales.promelas"
> "Semotilus.atromaculatus" "Rhinichthys.atratulus" "Pimephales.notatus" ...
>  $ p: num 1.59
>  $ fill : num 0.294
>  $ statistic: num 11.4
>  $ smooth   :List of 2
>   ..$ x: num [1:51] 0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 ...
>   ..$ y: num [1:51] 1 0.887 0.826 0.776 0.732 ...
>  - attr(*, "class")= chr "nestedtemp"
>
>
> I  wanted to extract the value $ x and $y from List of 2 (smooth), but I
> could not extarct it. I used following functions
>
> > sapply(out1970$smooth, "[[",1)
> x y
> 0 1
>
> > sapply(out1970$smooth, "[[",2)
> x y
> 0.020 0.8869147
> But when I tried to extarct all value using
>
> >sapply(out1970$smooth, "[[",1:51)
> Error in FUN(X[[1L]], ...) : attempt to select more than one element
>
> would you mind to give some suggestions in how I get the value for all
> (1:51) ?
> thanks
> KG
>
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] how to extract a part of objects from list ?

2014-05-01 Thread Kristi Glover
Hi R User, 
I am wonedring how I can extract a part of objects from list. 

For example 

> str(out1970)
List of 8
 $ comm : num [1:16, 1:57] 1 1 1 1 1 1 1 1 1 1 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:16] "H_s5" "H_s1" "R_s2" "H_s2" ...
  .. ..$ : chr [1:57] "Pimephales.promelas" "Semotilus.atromaculatus" 
"Rhinichthys.atratulus" "Pimephales.notatus" ...
 $ u: num [1:16, 1:57] 0 0 0 0 0 0 0 0 0 0 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : chr [1:16] "H_s5" "H_s1" "R_s2" "H_s2" ...
  .. ..$ : chr [1:57] "Pimephales.promelas" "Semotilus.atromaculatus" 
"Rhinichthys.atratulus" "Pimephales.notatus" ...
 $ r: Named num [1:16] 0.0312 0.0938 0.1562 0.2188 0.2812 ...
  ..- attr(*, "names")= chr [1:16] "H_s5" "H_s1" "R_s2" "H_s2" ...
 $ c: Named num [1:57] 0.00877 0.02632 0.04386 0.0614 0.07895 ...
  ..- attr(*, "names")= chr [1:57] "Pimephales.promelas" 
"Semotilus.atromaculatus" "Rhinichthys.atratulus" "Pimephales.notatus" ...
 $ p: num 1.59
 $ fill : num 0.294
 $ statistic: num 11.4
 $ smooth   :List of 2
  ..$ x: num [1:51] 0 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 ...
  ..$ y: num [1:51] 1 0.887 0.826 0.776 0.732 ...
 - attr(*, "class")= chr "nestedtemp"


I  wanted to extract the value $ x and $y from List of 2 (smooth), but I could 
not extarct it. I used following functions

> sapply(out1970$smooth, "[[",1)
x y 
0 1 

> sapply(out1970$smooth, "[[",2)
x y 
0.020 0.8869147 
But when I tried to extarct all value using 

>sapply(out1970$smooth, "[[",1:51)
Error in FUN(X[[1L]], ...) : attempt to select more than one element

would you mind to give some suggestions in how I get the value for all (1:51) ? 
thanks 
KG


  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] outer() problem

2014-05-01 Thread David Winsemius

On Apr 30, 2014, at 11:44 PM, Marc Girondot wrote:

> Dear list-members,
> 
> Can someone explains me why the last command gives an error. Thanks a lot:
> > outer(0:1, 0:1, FUN=function(x, y) {x+y})
> [,1] [,2]
> [1,]01
> [2,]12
> > outer(0:1, 0:1, FUN=function(x, y) {x})
> [,1] [,2]
> [1,]00
> [2,]11
> > outer(0:1, 0:1, FUN=function(x, y) {1})
> Erreur dans outer(0:1, 0:1, FUN = function(x, y) { :
>  dims [produit 4] ne correspond pas à la longueur de l'objet [1]
> 
> Of course I simplify a lot my problem.

Try using rep() to get a vector of the correct length:

> outer(0:1, 0:1, FUN=function(x, y) {rep(1, length(x))})
 [,1] [,2]
[1,]11
[2,]11



> 
> Thanks a lot
> 
> Marc
> 
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

David Winsemius
Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] error using polymars argument

2014-05-01 Thread Jim Lemon

On 05/01/2014 09:47 PM, varin sacha wrote:

Hi Dear all,
I got an error when I try to do a MARS regression with the "polymars" argument 
on the A matrix and the V vector here below.
I don't understand the error, I have nrow=22 for the response variable and for 
the predictor matrix.
So if somebody could tell me what is going wrong, it would be very nice.

A=matrix(c(74.711,64.614,83.514,78.187,71.327,86.197,49.986,69.012,79.591,83.158,65.437,56.225,54.485,90.531,93.357,79.15,63.611,73.133,53.497,56.328,89.585,56.289,10.92,8.08,11.92,10.75,12.08,10.58,9.33,14.17,13.25,14.33,6.92,12.42,7.17,13.75,16.83,13.67,5.25,10,7.08,10.83,14.58,11.75,nrow=22,ncol=2))

V=c(44986,18288,56147,44488,41018,40631,27301,39025,45688,47172,12300,21558,16103,48874,67245,36119,10398,42630,12879,34058,84443,30639)

polymars(V,A)
Erreur dans polymars(V, A) :
   The number of rows (cases) of the response and predictor matricies should be 
the same


Hi varin,
You have a misplaced parenthesis in your definition of A. It should be:

A<-matrix(c(74.711,64.614,83.514,78.187,71.327,86.197,49.986,69.012,
 79.591,83.158,65.437,56.225,54.485,90.531,93.357,79.15,63.611,73.133,
 53.497,56.328,89.585,56.289,10.92,8.08,11.92,10.75,12.08,10.58,9.33,
 14.17,13.25,14.33,6.92,12.42,7.17,13.75,16.83,13.67,5.25,10,7.08,10.83,
 14.58,11.75),nrow=22,ncol=2)

Jim

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A combinatorial assignment problem

2014-05-01 Thread Charles Berry
Ravi Varadhan  jhu.edu> writes:

> 
> Thanks, Bert.   
> I have written this simple code, which is crude, but seems to do a decent
job.  It works perfectly when M is a
> factor of R. Otherwise, it gives decent balance (of course, balance is not
guaranteed).  I guess it is
> possible to take the results, that are somewhat unbalanced and then
reshuffle it semi-randomly to get
> better balance.  Any improvements are appreciated! 
> 
> assign <- function(K, R, M, iseed=1234) {
>   assignment <- matrix(NA, K, M)
>   N <- R %/% M
>   Nrem <- R %% M
>   iseq <- seq(1,K,N)
>   for (i in iseq){
>   size <- ifelse(K-i >= N, R-Nrem, M*(K-i+1))
>   sel <- sample(1:R, size=size, replace=FALSE)
>   end <- min((i+N-1),K)
>   assignment[i:end, ] <- sel
>   }
> assignment
> }
> 
> sol <- assign(40,16,3)
> table(sol)
> 
> Thanks,
> Ravi
> 


crossdes::find.BIB() seems to do better wrt balance and 'mixing' than
assign().

If you consider the usage of pairs of reviewers in this case,
find.BIB() comes closer to equal usage.
 

> assgn <- t(apply(assign(40,16,3),1,sort))
> bib <- find.BIB(16,40,3)
> adf <- data.frame(r1=factor(assgn[,c(1,1,2)],1:16),
+r2 = factor(assgn[,c(2,3,3)],1:16))
> bdf <- data.frame(r1=factor(bib[,c(1,1,2)],1:16),
+ r2 = factor(bib[,c(2,3,3)],1:16))
> table(xtabs(~.,adf)[upper.tri(diag(16))])

 0  1  2  3  4 
44 42 26  6  2 
> table(xtabs(~.,bdf)[upper.tri(diag(16))])

  0   1   2 
  5 110   5 
>



In the assign(10,7,3) case, the balance is typically better with 
find.BIB(7,10,3):

 apply(replicate(10,table(assign(10,7,3))),2,range)
 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]322333233 3
[2,]555555555 5
> apply(replicate(10,table(find.BIB(7,10,3))),2,range)
 [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
[1,]444444444 4
[2,]555555555 5
> 


You will pay a price for find.BIB in CPU time, however.

HTH,

Chuck

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] [EXTERNAL] Re: Understanding power analysis in glm and binomial proportion test

2014-05-01 Thread Wall, Wade A ERDC-RDE-CERL-IL
Thanks for the information. I will pose the question on Stack Exchange.

Wade A. Wall
US Army ERDC-CERL
P.O. Box 9005
Champaign, IL  61826-9005
1-217-373-4420
wade.a.w...@usace.army.mil




-Original Message-
From: Bert Gunter [mailto:gunter.ber...@gene.com] 
Sent: Thursday, May 01, 2014 1:42 PM
To: Wall, Wade A ERDC-RDE-CERL-IL
Cc: r-h...@stat.math.ethz.ch
Subject: [EXTERNAL] Re: [R] Understanding power analysis in glm and binomial 
proportion test

While R is certainly used for statistical simulations as you showed, this list 
is really for questions about R programming, not statistics.
While they certainly overlap and someone may respond here, I suggest you post 
this to stats.stackexchange.com or other statistics site that is specifically 
for such issues.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge is 
certainly not wisdom."
H. Gilbert Welch




On Thu, May 1, 2014 at 7:00 AM, Wall, Wade A ERDC-RDE-CERL-IL 
 wrote:
> Hi all,
>
> I am trying to run a power analysis using simulated data to compare the power 
> of a glm versus a binomial proportion test to detect differences in 
> proportions. For example, suppose you have some proportion that decreases by 
> some amount over X number of time steps.
> .4,.39,.38,.37 . . . . .01 and simulate data over those time steps based on 
> the decrease. Does a glm approach (differences in slope) "outperform" an 
> approach whereby you simply look at proportional differences. I ran 
> simulations and basically summed up the number of times p<.05 divided by the 
> number of trials.  I was expecting that a glm approach would have more power 
> because it would be utilizing all the data across multiple time steps, 
> whereas a binomial proportion test is only comparing two populations ( the 
> beginning and the end points).
>
> However the results indicated that a binomial proportion test had more power 
> relative to a glm at fewer time steps and that, depending on the simulated 
> decrease in proportions, the relationship between glm power and binomial 
> proportion test power changed. Interestingly, a greater decreases, a binomial 
> proportion test seems have greater power compared to a glm, which to me is 
> counterintuitive since the slope should be greater.
>
> I am attaching the code below my questions in case anyone is interested.
>
> My questions are:
>
>
> (1)Am I interpreting the results and p-values correctly?
>
> (2)If I am interested in trends, does glm results really have lower power 
> and, if so, is there a way to combine the two tests?
>
> I know that I am ignoring a lot of issues such as autocorrelation, but I am 
> really just trying to understand the output.
>
> Any suggestions or insights would be appreciated.
>
> # Attempt at using glm model ##
> ltProb <- 0.4   ## longterm average of nest survival 
> probability
> change <- c(.01,.03,.05)
> yrs <- seq(1,20, by=1)## 
> Years of inquiry, probability of detecting a yearly change nest survival
> samplesize <- 50   ## Reasonable 
> sample size range
> reps <- 1000  
>  ## of simulations per
> SurvProb <- ltProb   
> ## initiating survival probablity for later use, nuisance
> power <- matrix(nrow=length(change)*length(yrs)*length(samplesize), 
> ncol=5) ## creating a matrix to hold data
>
> scenario <- 0 
>  ## initializing scenarios which will be a placeholder later on
> for (a in 1:length(change)){  
>## loop through yearly pop change scenarios
>   for (c in 1:length(samplesize)){
> ## loop through sample size scenarios
> for (b in 1:length(yrs)){ 
>  ## loop through years sceanarios
>   scenario=scenario+1
>   power[scenario,1] <- change[a]  
>   
>   ## filling 
> matrix with pop change used
>   power[scenario,2]<-ltProb*((1-change[a])**yrs[b])
>   power[scenario,3] <- yrs[b] 
>   
>## filling matrix with 
> number of years used
>   power[scenario,4] <- samplesize[c]  
>   
>  

Re: [R] Understanding power analysis in glm and binomial proportion test

2014-05-01 Thread Bert Gunter
While R is certainly used for statistical simulations as you showed,
this list is really for questions about R programming, not statistics.
While they certainly overlap and someone may respond here, I suggest
you post this to stats.stackexchange.com or other statistics site that
is specifically for such issues.

Cheers,
Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
H. Gilbert Welch




On Thu, May 1, 2014 at 7:00 AM, Wall, Wade A ERDC-RDE-CERL-IL
 wrote:
> Hi all,
>
> I am trying to run a power analysis using simulated data to compare the power 
> of a glm versus a binomial proportion test to detect differences in 
> proportions. For example, suppose you have some proportion that decreases by 
> some amount over X number of time steps.
> .4,.39,.38,.37 . . . . .01 and simulate data over those time steps based on 
> the decrease. Does a glm approach (differences in slope) "outperform" an 
> approach whereby you simply look at proportional differences. I ran 
> simulations and basically summed up the number of times p<.05 divided by the 
> number of trials.  I was expecting that a glm approach would have more power 
> because it would be utilizing all the data across multiple time steps, 
> whereas a binomial proportion test is only comparing two populations ( the 
> beginning and the end points).
>
> However the results indicated that a binomial proportion test had more power 
> relative to a glm at fewer time steps and that, depending on the simulated 
> decrease in proportions, the relationship between glm power and binomial 
> proportion test power changed. Interestingly, a greater decreases, a binomial 
> proportion test seems have greater power compared to a glm, which to me is 
> counterintuitive since the slope should be greater.
>
> I am attaching the code below my questions in case anyone is interested.
>
> My questions are:
>
>
> (1)Am I interpreting the results and p-values correctly?
>
> (2)If I am interested in trends, does glm results really have lower power 
> and, if so, is there a way to combine the two tests?
>
> I know that I am ignoring a lot of issues such as autocorrelation, but I am 
> really just trying to understand the output.
>
> Any suggestions or insights would be appreciated.
>
> # Attempt at using glm model ##
> ltProb <- 0.4   ## longterm average of nest survival 
> probability
> change <- c(.01,.03,.05)
> yrs <- seq(1,20, by=1)## 
> Years of inquiry, probability of detecting a yearly change nest survival
> samplesize <- 50   ## Reasonable 
> sample size range
> reps <- 1000  
>  ## of simulations per
> SurvProb <- ltProb   
> ## initiating survival probablity for later use, nuisance
> power <- matrix(nrow=length(change)*length(yrs)*length(samplesize), ncol=5) 
> ## creating a matrix to hold data
>
> scenario <- 0 
>  ## initializing scenarios which will be a placeholder later on
> for (a in 1:length(change)){  
>## loop through yearly pop change scenarios
>   for (c in 1:length(samplesize)){
> ## loop through sample size scenarios
> for (b in 1:length(yrs)){ 
>  ## loop through years sceanarios
>   scenario=scenario+1
>   power[scenario,1] <- change[a]  
>   
>   ## filling 
> matrix with pop change used
>   power[scenario,2]<-ltProb*((1-change[a])**yrs[b])
>   power[scenario,3] <- yrs[b] 
>   
>## filling matrix with 
> number of years used
>   power[scenario,4] <- samplesize[c]  
>   
>## filling 
> matrix with sample size used
> }
>   }
> }
> colnames(power) <- c("PopChange", "Proportion2","yrs", "Sample_Size", "Power")
> power=as.data.frame(power)
> power$Sample_Size=as.numeric(power$Sample_Size)
>
> scenario=levels(as.factor(power$PopChange)) ## To subset power
> ps = levels(as.factor(power$Sample_Size))
> results <- matrix(nrow=0, ncol=5) ## final matrix
> results=as.data.frame(results)
> rep

Re: [R] A combinatorial assignment problem

2014-05-01 Thread Ravi Varadhan
Thank you, Dan and Bert.


Bert - Your approach provides a solution.  However, it has the undesired 
property of referees lumping together (I apologize that I did not state this as 
a condition).  In other words, it does not "mix" the referees in some random 
fashion.

Dan - your approach attempts to have the desired properties, but is not 
guaranteed to work.  Here is a counterexample:

> set.seed(1234)
> a <- assignment(40,15,3)
> table(a)
a
1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
10  7 12  7  4 10  8  6  8 13  7  7 11  3  7

Notice that the difference between maximum and minimum candidates for referees 
is 13 - 3 = 10.  Of course, I have to increase the # iters to get a better 
solution, but for large  K and R this may not converge at all.

Best regards,
Ravi

From: Ravi Varadhan
Sent: Wednesday, April 30, 2014 1:49 PM
To: r-help@r-project.org
Subject: A combinatorial assignment problem

Hi,

I have this problem:  K candidates apply for a job.  There are R referees 
available to review their resumes and provide feedback.  Suppose that we would 
like M referees to review each candidate (M < R).  How would I assign 
candidates to referees (or, conversely, referees to candidates)?  There are two 
important cases:  (a) K > (R choose M) and (b) K < (R chooses M).

Case (a) actually reduces to case (b), so we only have to consider case (b).  
Without any other constraints, the problem is quite easy to solve.  Here is an 
example that shows this.

require(gtools)
set.seed(12345)
K <- 10  # number of candidates
R <- 7# number of referees
M <- 3   # overlap, number of referees reviewing  each candidate

allcombs <- combinations(R, M, set=TRUE, repeats.allowed=FALSE)
assignment <- allcombs[sample(1:nrow(allcombs), size=K, replace=FALSE), ]
assignment
> assignment
  [,1] [,2] [,3]
[1,]345
[2,]357
[3,]567
[4,]356
[5,]167
[6,]127
[7,]145
[8,]367
[9,]245
[10,]457
>

Here each row stands for a candidate and the column stands for the referees who 
review that candidate.

Of course, there are some constraints that make the assignment challenging.  We 
would like to have an even distribution of the number of candidates reviewed by 
each referee.  For example, the above code produces an assignment where referee 
#2 gets only 2 candidates and referee #5 gets 7 candidates.  We would like to 
avoid such uneven assignments.

> table(assignment)
assignment
1 2 3 4 5 6 7
3 2 4 4 7 4 6
>

Note that in this example there are 35 possible triplets of referees and 10 
candidates.  Therefore, a perfectly even assignment is not possible.

I tried some obvious, greedy search methods but they are not guaranteed to 
work.   Any hints or suggestions would be greatly appreciated.

Best,
Ravi


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Understanding power analysis in glm and binomial proportion test

2014-05-01 Thread Wall, Wade A ERDC-RDE-CERL-IL
Hi all,

I am trying to run a power analysis using simulated data to compare the power 
of a glm versus a binomial proportion test to detect differences in 
proportions. For example, suppose you have some proportion that decreases by 
some amount over X number of time steps.
.4,.39,.38,.37 . . . . .01 and simulate data over those time steps based on the 
decrease. Does a glm approach (differences in slope) "outperform" an approach 
whereby you simply look at proportional differences. I ran simulations and 
basically summed up the number of times p<.05 divided by the number of trials.  
I was expecting that a glm approach would have more power because it would be 
utilizing all the data across multiple time steps, whereas a binomial 
proportion test is only comparing two populations ( the beginning and the end 
points).

However the results indicated that a binomial proportion test had more power 
relative to a glm at fewer time steps and that, depending on the simulated 
decrease in proportions, the relationship between glm power and binomial 
proportion test power changed. Interestingly, a greater decreases, a binomial 
proportion test seems have greater power compared to a glm, which to me is 
counterintuitive since the slope should be greater.

I am attaching the code below my questions in case anyone is interested.

My questions are:


(1)Am I interpreting the results and p-values correctly?

(2)If I am interested in trends, does glm results really have lower power 
and, if so, is there a way to combine the two tests?

I know that I am ignoring a lot of issues such as autocorrelation, but I am 
really just trying to understand the output.

Any suggestions or insights would be appreciated.

# Attempt at using glm model ##
ltProb <- 0.4   ## longterm average of nest survival 
probability
change <- c(.01,.03,.05)
yrs <- seq(1,20, by=1)## 
Years of inquiry, probability of detecting a yearly change nest survival
samplesize <- 50   ## Reasonable 
sample size range
reps <- 1000
   ## of simulations per
SurvProb <- ltProb   ## 
initiating survival probablity for later use, nuisance
power <- matrix(nrow=length(change)*length(yrs)*length(samplesize), ncol=5) ## 
creating a matrix to hold data

scenario <- 0   
   ## initializing scenarios which will be a placeholder later on
for (a in 1:length(change)){
 ## loop through yearly pop change scenarios
  for (c in 1:length(samplesize)){  
  ## loop through sample size scenarios
for (b in 1:length(yrs)){   
   ## loop through years sceanarios
  scenario=scenario+1
  power[scenario,1] <- change[a]

  ## filling matrix 
with pop change used
  power[scenario,2]<-ltProb*((1-change[a])**yrs[b])
  power[scenario,3] <- yrs[b]   

   ## filling matrix with 
number of years used
  power[scenario,4] <- samplesize[c]

   ## filling matrix 
with sample size used
}
  }
}
colnames(power) <- c("PopChange", "Proportion2","yrs", "Sample_Size", "Power")
power=as.data.frame(power)
power$Sample_Size=as.numeric(power$Sample_Size)

scenario=levels(as.factor(power$PopChange)) ## To subset power
ps = levels(as.factor(power$Sample_Size))
results <- matrix(nrow=0, ncol=5) ## final matrix
results=as.data.frame(results)
reps=1000 ## set number of reps

for (k in 1:length(scenario)){
  for (m in 1:length(ps)){
sub=power[(power$PopChange==scenario[k]) & power$Sample_Size==ps[m],] ## 
Subset by scenario and sample size
probs = matrix(nrow=1000,ncol=16) ## this is matrix for probabilities.
dat=matrix(nrow=0,ncol=3,NA)
dat=as.data.frame(dat)
for (l in 1:reps){ ##This should be for the replicates
  dat=matrix(nrow=0,ncol=3,NA)
  dat=as.data.frame(dat)
  print(l)
  for (j in 1:nrow(sub)){
tmp=rbinom(n=sub$Sample_Size[j],size=1,prob=sub[j,2])

tmp.dat=cbind(tmp,rep(sub$yrs[j],sub$Sample_Size[j]),rep(sub$Proportion2[j],sub$Sample_Size[j]))
dat=rbind(dat,tmp.dat)
if (sub$yrs[j]>4){
 

[R] error using polymars argument

2014-05-01 Thread varin sacha
Hi Dear all,
I got an error when I try to do a MARS regression with the "polymars" argument 
on the A matrix and the V vector here below.
I don't understand the error, I have nrow=22 for the response variable and for 
the predictor matrix.
So if somebody could tell me what is going wrong, it would be very nice. 

A=matrix(c(74.711,64.614,83.514,78.187,71.327,86.197,49.986,69.012,79.591,83.158,65.437,56.225,54.485,90.531,93.357,79.15,63.611,73.133,53.497,56.328,89.585,56.289,10.92,8.08,11.92,10.75,12.08,10.58,9.33,14.17,13.25,14.33,6.92,12.42,7.17,13.75,16.83,13.67,5.25,10,7.08,10.83,14.58,11.75,nrow=22,ncol=2))

V=c(44986,18288,56147,44488,41018,40631,27301,39025,45688,47172,12300,21558,16103,48874,67245,36119,10398,42630,12879,34058,84443,30639)
 
polymars(V,A)
Erreur dans polymars(V, A) : 
  The number of rows (cases) of the response and predictor matricies should be 
the same

Best

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A combinatorial assignment problem

2014-05-01 Thread Ravi Varadhan
Thanks, Bert.   
I have written this simple code, which is crude, but seems to do a decent job.  
It works perfectly when M is a factor of R. Otherwise, it gives decent balance 
(of course, balance is not guaranteed).  I guess it is possible to take the 
results, that are somewhat unbalanced and then reshuffle it semi-randomly to 
get better balance.  Any improvements are appreciated! 

assign <- function(K, R, M, iseed=1234) {
  assignment <- matrix(NA, K, M)
  N <- R %/% M
  Nrem <- R %% M
  iseq <- seq(1,K,N)
  for (i in iseq){
size <- ifelse(K-i >= N, R-Nrem, M*(K-i+1))
sel <- sample(1:R, size=size, replace=FALSE)
end <- min((i+N-1),K)
assignment[i:end, ] <- sel
}
assignment
}

sol <- assign(40,16,3)
table(sol)

Thanks,
Ravi

-Original Message-
From: Bert Gunter [mailto:gunter.ber...@gene.com] 
Sent: Thursday, May 01, 2014 10:46 AM
To: Ravi Varadhan
Cc: r-help@r-project.org; djnordl...@frontier.com
Subject: Re: A combinatorial assignment problem

Ravi:

You cannot simultaneously have balance and guarantee random mixing.
That is, you would need to specify precisely what you mean by balance and 
random mixing in this context, as these terms are now subjective and undefined.

You could, of course, randomize the initial assignment of referees to positions 
and note also that some mixing of referee groups would occur if m does not 
divide r evenly. More explicitly, note that a very fast way to generate the 
groups I described is:

rmk <- function(nrefs,size,ncands){
  n <- ncands * size
  matrix(rep(seq_len(nrefs), n %/% nrefs +1)[seq_len(n)],nrow=
ncands,byrow=TRUE)
}
## corner case checks and adjustments need to be made to this, of course.

>  rmk(7,3,10)
  [,1] [,2] [,3]
 [1,]123
 [2,]456
 [3,]712
 [4,]345
 [5,]671
 [6,]234
 [7,]567
 [8,]123
 [9,]456
[10,]712

You could then modify this by randomly reordering the referees every time a 
complete cycle of the groupings occurred, i.e. after each
nrefs/gcd(nrefs,size) candidates = rows. This would give variable groups and 
even assignment. This algorithm could be further fiddled with by choosing nrefs 
and size to assure they are relatively prime and then adding and removing 
further refs randomly during the cycling.
etc.

Cheers,
Bert


Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge is 
certainly not wisdom."
H. Gilbert Welch




On Thu, May 1, 2014 at 6:09 AM, Ravi Varadhan  wrote:
> Thank you, Dan and Bert.
>
>
>
> Bert – Your approach provides a solution.  However, it has the 
> undesired property of referees lumping together (I apologize that I 
> did not state this as a condition).  In other words, it does not "mix" 
> the referees in some random fashion.
>
>
>
> Dan – your approach attempts to have the desired properties, but is 
> not guaranteed to work.  Here is a counterexample:
>
>
>
>> set.seed(1234)
>
>> a <- assignment(40,15,3)
>
>> table(a)
>
> a
>
> 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
>
> 10  7 12  7  4 10  8  6  8 13  7  7 11  3  7
>
>
>
> Notice that the difference between maximum and minimum candidates for 
> referees is 13 – 3 = 10.  Of course, I have to increase the # iters to 
> get a better solution, but for large  K and R this may not converge at all.
>
>
>
> Best regards,
>
> Ravi
>
>
>
> From: Ravi Varadhan
> Sent: Wednesday, April 30, 2014 1:49 PM
> To: r-help@r-project.org
> Subject: A combinatorial assignment problem
>
>
>
> Hi,
>
>
>
> I have this problem:  K candidates apply for a job.  There are R 
> referees available to review their resumes and provide feedback.  
> Suppose that we would like M referees to review each candidate (M < 
> R).  How would I assign candidates to referees (or, conversely, 
> referees to candidates)?  There are two important cases:  (a) K > (R choose 
> M) and (b) K < (R chooses M).
>
>
>
> Case (a) actually reduces to case (b), so we only have to consider case (b).
> Without any other constraints, the problem is quite easy to solve.  
> Here is an example that shows this.
>
>
>
> require(gtools)
>
> set.seed(12345)
>
> K <- 10  # number of candidates
>
> R <- 7# number of referees
>
> M <- 3   # overlap, number of referees reviewing  each candidate
>
>
>
> allcombs <- combinations(R, M, set=TRUE, repeats.allowed=FALSE)
>
> assignment <- allcombs[sample(1:nrow(allcombs), size=K, 
> replace=FALSE), ]
>
> assignment
>
>> assignment
>
>   [,1] [,2] [,3]
>
> [1,]345
>
> [2,]357
>
> [3,]567
>
> [4,]356
>
> [5,]167
>
> [6,]127
>
> [7,]145
>
> [8,]367
>
> [9,]245
>
> [10,]457
>
>>
>
>
>
> Here each row stands for a candidate and the column stands for the 
> referees who review that candidate.
>
>
>
> Of course, there a

Re: [R] help using extrafont package | R graphics

2014-05-01 Thread Evan Cooch
Thanks very much. I'll give it a try.

On 4/30/2014 9:33 PM, Yixuan Qiu wrote:
> Hi Evan,
> If you just need one font, you may try the showtext package. Here is a 
> piece of code that you can test:
>
> library(showtext)
> # If you have this font installed
> font.add("gara", "gara.ttf")
> # Or you can install a similar one from Google Font
> # font.add.google("EB Garamond", "gara")
>
> # Try some plots
> pdf("test.pdf")
> showtext.begin()
> par(family = "gara")
> plot(1, main = "Garamond", type = "n")
> text(1, 1, "Some Text", cex = 5)
> showtext.end()
> dev.off()
>
>
> A more detailed introduction of the showtext package is available at 
> http://statr.me/2014/01/using-system-fonts-in-r-graphs/. Hope this 
> would be helpful for you.
>
>
> Best,
> Yixuan
>
>
>
>
>
>
> 2014-04-26 16:54 GMT-04:00 Evan Cooch  >:
> > Greetings --
> >
> > Submitted this a little while ago -- for some reason, still being held
> > up by the moderator. Trying again...
> >
> >
> > For a host of reasons, I need to use/embed Garamond font with various R
> > graphics for a particular publication. I've figured out how to more or
> > less get there from here, using the following sequence:
> >
> >
> > library(extrafont)
> >
> > Then, I need to import the system fonts (Windoze box, R 3.1.0).
> >
> > So, I use
> >
> > font_import()
> >
> > But, this takes a *huge* amount of time, and often throws errors as it
> > chokes on various fonts (I have probably 250+ fonts installed). I only
> > want *one* font (Garamond). But, for the life of me, I can't figure out
> > how to get font_import to select only the single font I want. In theory
> >
> > font_import(paths = NULL, recursive = TRUE, prompt = TRUE, pattern = 
> NULL)
> >
> > as defaults, where pattern is a regex that font filenames must match.
> >
> > The file name for Garamong is gara.ttf, so I tried
> >
> > font_import(pattern="gara")
> >
> > R responds with 'Importing fonts make take a few minutes, depending on
> > the...etc, etc'.
> > Continue? [y/n]
> >
> > Hit 'y', and am presented with
> >
> > Scanning ttf files in C:\Windows\Fonts ...
> > Extracting .afm files from .ttf files...
> > Error in data.frame(fontfile = ttfiles, FontName = "", stringsAsFactors
> > = FALSE) :
> >arguments imply differing number of rows: 0, 1
> >
> > I have no idea what to do with this.
> >
> > Suggestions/pointers to the obvious welcome. And I thought futzing with
> > fonts in LaTeX was fun! ;-)
> >
> >
> > [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org  mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide 
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
> -- 
> Yixuan Qiu mailto:yixuan@cos.name>>
> Department of Statistics,
> Purdue University


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using apply with more than one matrix

2014-05-01 Thread William Dunlap
> Thank you, A.K.  I learned from both of your solutions.  I find the one that 
> uses alply easier to follow and understand intuitively.

Another approach is to only loop over the 3rd dimensional slices of
a1.  Your original code, converted to a function so it is easier to
test and think about is

f0 <- function(a1, m1, m2) {
stopifnot(length(dim(a1))==3, length(dim(m1))==2, length(dim(m2))==2,
  all(dim(m1)==dim(m2)), all(dim(m1)==dim(a1)[1:2]),
  dim(a1)[3] > 1)
condition1 <- !is.na(m1)& m1 > m2
ans <- array(NA_real_, dim(m1)) # initialize
for(i in seq_len(dim(m1)[1])) {
for(j in seq_len(dim(m1)[2])) {
ind.not.na <- which(!is.na(a1[i,j,]))
if(condition1[i,j] && length(ind.not.na) > 0) {
ans[i,j] <- a1[i,j,ind.not.na[1]] + m2[i,j]
}
}
}
ans
}

I believe the following is an equivalent function, but I haven't
checked it much.
Note that the only loop is over the third dimension of a1, which I assume is not
too big.  That updating loop could probably be replaced by a call to Reduce.

f1 <- function(a1, m1, m2) {
stopifnot(length(dim(a1))==3, length(dim(m1))==2, length(dim(m2))==2,
  all(dim(m1)==dim(m2)), all(dim(m1)==dim(a1)[1:2]),
  dim(a1)[3] > 1)

firstGoodInA1 <- array(NA_real_, dim(a1)[1:2])
for(k in seq_len(dim(a1)[3])) {
isStillBad <- is.na(firstGoodInA1)
firstGoodInA1[isStillBad] <- a1[, ,k][isStillBad]
}
condition1 <- !is.na(firstGoodInA1) & !is.na(m1) & (m1 > m2)
ans <- array(NA_real_, dim(m1))
ans[condition1] <- firstGoodInA1[condition1] + m2[condition1]
ans
}


Bill Dunlap
TIBCO Software
wdunlap tibco.com


On Thu, May 1, 2014 at 8:30 AM, Waichler, Scott R
 wrote:
> Thank you, A.K.  I learned from both of your solutions.  I find the one that 
> uses alply easier to follow and understand intuitively.  I guess I'll want to 
> learn more about what plyr can do.  I've been using R for years but hadn't 
> pushed vectorization far enough in my code.  Now my computing will be more 
> efficient.
>
> Thanks also to David and the others who responded to my question.  It all 
> helped.  --Scott Waichler
>
>> -Original Message-
>> From: arun [mailto:smartpink...@yahoo.com]
>> Sent: Thursday, May 01, 2014 6:24 AM
>> To: R. Help
>> Cc: Waichler, Scott R
>> Subject: Re: [R] Using apply with more than one matrix
>>
>> Hi,
>> Sorry, a typo in the previous function:
>> -
>> if (condition1[i] && !is.na(indx3)) {
>> arr[x1][indx3] + m2[i]  ## should be mat2[i]
>> } else NA
>>
>> ---
>>
>> Also, you can try:
>> library(plyr)
>> evaluateNew <- function(arr, mat1, mat2){ if (!all(dim(mat1) ==
>> dim(mat2))) {
>> stop("both matrices should have equal dimensions")
>> }
>>  indx1 <- unlist(alply(arr, c(1,2), function(x) {x[!is.na(x)][1]}))
>> condition1 <- !is.na(mat1) & mat1 > mat2 ifelse(condition1, indx1+mat2,
>> NA) } evaluateNew(a1, m1, m2)
>>
>> #[,1] [,2]
>> #[1,]   NA 1.66
>> #[2,] 3.14   NA
>> A.K.
>>
>>
>>
>>
>> On Thursday, May 1, 2014 5:49 AM, arun  wrote:
>>
>>
>> Hi,
>>
>> You may try:
>> evaluate <- function(arr, mat1, mat2) {
>> if (!all(dim(mat1) == dim(mat2))) {
>> stop("both matrices should have equal dimensions")
>> }
>> indx1 <- as.matrix(do.call(expand.grid, lapply(dim(arr), sequence)))
>> indx2 <- paste0(indx1[, 1], indx1[, 2])
>> condition1 <- !is.na(mat1) & mat1 > mat2
>> ans <- sapply(seq_along(unique(indx2)), function(i) {
>> x1 <- indx1[indx2 %in% unique(indx2)[i], ]
>> indx3 <- which(!is.na(arr[x1]))[1]
>> if (condition1[i] && !is.na(indx3)) {
>> arr[x1][indx3] + m2[i]
>> } else NA
>> })
>> dim(ans) <- dim(mat1)
>> ans
>> }
>> evaluate(a1,m1,m2)
>> # [,1] [,2]
>> #[1,]   NA 1.66
>> #[2,] 3.14   NA
>>
>> A.K.
>>
>>
>>
>> On Thursday, May 1, 2014 2:53 AM, "Waichler, Scott R"
>>  wrote:
>> > I would ask you to look at this loop-free approach and ask if this is
>> > not equally valid?
>> >
>> > ans <- matrix(NA, ncol=2, nrow=2)
>> > ind.not.na <- which(!is.na(a1))
>> > ans[] <- condition1*a1[,,ind.not.na[1]]+ m2  # two matrices of equal
>> >dimensions, one logical.
>> >  ans
>> >  [,1] [,2]
>> > [1,]   NA 1.66
>> > [2,] 2.74   NA
>>
>> Thanks, I am learning something.  I didn't know you could multiply a
>> logical object by a numerical one.  But notice the answer is not the same
>> as mine, because I am doing an operation on the vector of values a1[i,j,]
>> first.
>>
>> I tried a modification on sapply below, but it doesn't work  because I
>> haven't referenced the 3d array a1 properly.  So I guess I must try to get
>> a 2d result from a1 first, then use that in matrix arithmetic.
>>
>> Sapply or mapply may work, I haven't used these much and will try to learn
>> better how to use them.  Your use of sapply looks good; but I'm tryin

Re: [R] A combinatorial assignment problem

2014-05-01 Thread Charles Berry
Ravi Varadhan  jhu.edu> writes:

> 
> Hi,
> 
> I have this problem:  K candidates apply for a job.  There are R referees
available to review their resumes and
> provide feedback.  Suppose that we would like M referees to review each
candidate (M < R).  How would I assign
> candidates to referees (or, conversely, referees to candidates)?  There
are two important cases:  (a) K >
> (R choose M) and (b) K < (R chooses M).
> 
> Case (a) actually reduces to case (b), so we only have to consider case
(b).  Without any other constraints,
> the problem is quite easy to solve.  Here is an example that shows this.
> 
> require(gtools)
> set.seed(12345)
> K <- 10  # number of candidates
> R <- 7# number of referees
> M <- 3   # overlap, number of referees reviewing  each candidate
> 
> allcombs <- combinations(R, M, set=TRUE, repeats.allowed=FALSE)
> assignment <- allcombs[sample(1:nrow(allcombs), size=K, replace=FALSE), ]
> assignment
> > assignment
>   [,1] [,2] [,3]
> [1,]345
> [2,]357
> [3,]567
> [4,]356
> [5,]167
> [6,]127
> [7,]145
> [8,]367
> [9,]245
> [10,]457
> >
> 


Isn't this the problem of constructing a balanced incomplete block design?

The problem and an R package to handle it are described here:

http://www.r-bloggers.com/generating-balanced-incomplete-block-designs-bibd/

As noted there, you cannot always get balance.

A Google Scholar search on  "balanced incomplete block design" will
pop up many classical works on this problem.

Maybe try the ExperimentalDesigns Task view. 


HTH,

Chuck

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using apply with more than one matrix

2014-05-01 Thread Waichler, Scott R
Thank you, A.K.  I learned from both of your solutions.  I find the one that 
uses alply easier to follow and understand intuitively.  I guess I'll want to 
learn more about what plyr can do.  I've been using R for years but hadn't 
pushed vectorization far enough in my code.  Now my computing will be more 
efficient.  

Thanks also to David and the others who responded to my question.  It all 
helped.  --Scott Waichler

> -Original Message-
> From: arun [mailto:smartpink...@yahoo.com]
> Sent: Thursday, May 01, 2014 6:24 AM
> To: R. Help
> Cc: Waichler, Scott R
> Subject: Re: [R] Using apply with more than one matrix
> 
> Hi,
> Sorry, a typo in the previous function:
> -
> if (condition1[i] && !is.na(indx3)) {
>     arr[x1][indx3] + m2[i]  ## should be mat2[i]
>     } else NA
> 
> ---
> 
> Also, you can try:
> library(plyr)
> evaluateNew <- function(arr, mat1, mat2){ if (!all(dim(mat1) ==
> dim(mat2))) {
>     stop("both matrices should have equal dimensions")
>     }
>  indx1 <- unlist(alply(arr, c(1,2), function(x) {x[!is.na(x)][1]}))
> condition1 <- !is.na(mat1) & mat1 > mat2 ifelse(condition1, indx1+mat2,
> NA) } evaluateNew(a1, m1, m2)
> 
> #    [,1] [,2]
> #[1,]   NA 1.66
> #[2,] 3.14   NA
> A.K.
> 
> 
> 
> 
> On Thursday, May 1, 2014 5:49 AM, arun  wrote:
> 
> 
> Hi,
> 
> You may try:
> evaluate <- function(arr, mat1, mat2) {
>     if (!all(dim(mat1) == dim(mat2))) {
>     stop("both matrices should have equal dimensions")
>     }
>     indx1 <- as.matrix(do.call(expand.grid, lapply(dim(arr), sequence)))
>     indx2 <- paste0(indx1[, 1], indx1[, 2])
>     condition1 <- !is.na(mat1) & mat1 > mat2
>     ans <- sapply(seq_along(unique(indx2)), function(i) {
>     x1 <- indx1[indx2 %in% unique(indx2)[i], ]
>     indx3 <- which(!is.na(arr[x1]))[1]
>     if (condition1[i] && !is.na(indx3)) {
>     arr[x1][indx3] + m2[i]
>     } else NA
>     })
>     dim(ans) <- dim(mat1)
>     ans
> }
> evaluate(a1,m1,m2)
> # [,1] [,2]
> #[1,]   NA 1.66
> #[2,] 3.14   NA
> 
> A.K.
> 
> 
> 
> On Thursday, May 1, 2014 2:53 AM, "Waichler, Scott R"
>  wrote:
> > I would ask you to look at this loop-free approach and ask if this is
> > not equally valid?
> >
> > ans <- matrix(NA, ncol=2, nrow=2)
> > ind.not.na <- which(!is.na(a1))
> > ans[] <- condition1*a1[,,ind.not.na[1]]+ m2  # two matrices of equal
> >dimensions, one logical.
> >  ans
> >      [,1] [,2]
> > [1,]   NA 1.66
> > [2,] 2.74   NA
> 
> Thanks, I am learning something.  I didn't know you could multiply a
> logical object by a numerical one.  But notice the answer is not the same
> as mine, because I am doing an operation on the vector of values a1[i,j,]
> first.
> 
> I tried a modification on sapply below, but it doesn't work  because I
> haven't referenced the 3d array a1 properly.  So I guess I must try to get
> a 2d result from a1 first, then use that in matrix arithmetic.
> 
> Sapply or mapply may work, I haven't used these much and will try to learn
> better how to use them.  Your use of sapply looks good; but I'm trying to
> understand if and how I can bring in the operation on a1.  This doesn't
> work:
> 
> evaluate <- function(idx) {
>   ind.not.na <- which(!is.na(a1[idx,])) ]))  # doesn't work; improper
> indexing for a1
>   if(length(ind.not.na) > 0) {
>     return(condition1*(a1[idx,ind.not.na[1]] + m2[idx]))  # doesn't work;
> improper indexing for a1
>   }
> }
> vec <- sapply(seq(length(m2)), evaluate)
> 
> Scott Waichler
> 
> 
> > -Original Message-
> > From: David Winsemius [mailto:dwinsem...@comcast.net]
> > Sent: Wednesday, April 30, 2014 8:46 PM
> > To: Waichler, Scott R
> > Cc: Bert Gunter; r-help@r-project.org
> > Subject: Re: [R] Using apply with more than one matrix
> >
> >
> > On Apr 30, 2014, at 6:03 PM, Waichler, Scott R wrote:
> >
> > > Here is a working example with no random parts.  Thanks for your
> > patience and if I'm still off the mark with my presentation I'll drop
> > the matter.
> > >
> > > v <- c(NA, 1.5, NA, NA,
> > >       NA, 1.1, 0.5, NA,
> > >       NA, 1.3, 0.4, 0.9)
> > > a1 <- array(v, dim=c(2,2,3))
> > > m1 <- matrix(c(NA, 1.5, 2.1, NA), ncol=2, byrow=T)
> > > m2 <- matrix(c(1.56, 1.64, 1.16, 2.92), ncol=2)
> > > condition1 <- !is.na(m1)& m1 > m2
> > >
> > > ans <- matrix(NA, ncol=2, nrow=2) # initialize for(i in 1:2) {
> > >for(j  in 1:2) {
> > >    ind.not.na <- which(!is.na(a1[i,j,]))
> > >    if(condition1[i,j] && length(ind.not.na) > 0) ans[i,j] <-
> > >a1[i,j,ind.not.na[1]] + m2[i,j]  } } ans
> > >     [,1] [,2]
> > > [1,]   NA 1.66
> > > [2,] 3.14   NA
> >
> > I would ask you to look at this loop-free approach and ask if this is
> > not equally valid?
> >
> > ans <- matrix(NA, ncol=2, nrow=2)
> > ind.not.na <- which(!is.na(a1))
> > ans[] <- condition1*a1[,,ind.not.na[1]]+ m2  # two matrices of equal
> >dimensions, one logical.
> >  ans
> >      [,1] [,2]
> > [1,]   NA 1.66
> > [2,] 2.74   NA
> > >
> > > L

Re: [R] Using apply with more than one matrix

2014-05-01 Thread Waichler, Scott R
> > Sapply or mapply may work, I haven't used these much and will try to
> learn better how to use them.  Your use of sapply looks good; but I'm
> trying to understand if and how I can bring in the operation on a1.  This
> doesn't work:
> >
> > evaluate <- function(idx) {
> >  ind.not.na <- which(!is.na(a1[idx,])) ]))  # doesn't work; improper
> > indexing for a1
> >  if(length(ind.not.na) > 0) {
> >return(condition1*(a1[idx,ind.not.na[1]] + m2[idx]))  # doesn't
> > work; improper indexing for a1  } } vec <- sapply(seq(length(m2)),
> > evaluate)
> 
> Are we to assume that the length of `which(!is.na(a1[idx,])) ]))` is
> guaranteed to equal the length of the two other matrices? If not then what
> sort of relationships should be assumed?

ind.not.na is just a vector that could be any length.  It is a selection on the 
vector a1[i,j,].  I want to get the first element of that selection.  The key 
relationship is that the dimensions of the matrices and the first two 
dimensions of the 3d arrays are the same.  That is, i and j have the same range 
for all of them.  

Scott

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A combinatorial assignment problem

2014-05-01 Thread Bert Gunter
Ravi:

You cannot simultaneously have balance and guarantee random mixing.
That is, you would need to specify precisely what you mean by balance
and random mixing in this context, as these terms are now subjective
and undefined.

You could, of course, randomize the initial assignment of referees to
positions and note also that some mixing of referee groups would occur
if m does not divide r evenly. More explicitly, note that a very fast
way to generate the groups I described is:

rmk <- function(nrefs,size,ncands){
  n <- ncands * size
  matrix(rep(seq_len(nrefs), n %/% nrefs +1)[seq_len(n)],nrow=
ncands,byrow=TRUE)
}
## corner case checks and adjustments need to be made to this, of course.

>  rmk(7,3,10)
  [,1] [,2] [,3]
 [1,]123
 [2,]456
 [3,]712
 [4,]345
 [5,]671
 [6,]234
 [7,]567
 [8,]123
 [9,]456
[10,]712

You could then modify this by randomly reordering the referees every
time a complete cycle of the groupings occurred, i.e. after each
nrefs/gcd(nrefs,size) candidates = rows. This would give variable
groups and even assignment. This algorithm could be further fiddled
with by choosing nrefs and size to assure they are relatively prime
and then adding and removing further refs randomly during the cycling.
etc.

Cheers,
Bert


Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
H. Gilbert Welch




On Thu, May 1, 2014 at 6:09 AM, Ravi Varadhan  wrote:
> Thank you, Dan and Bert.
>
>
>
> Bert – Your approach provides a solution.  However, it has the undesired
> property of referees lumping together (I apologize that I did not state this
> as a condition).  In other words, it does not "mix" the referees in some
> random fashion.
>
>
>
> Dan – your approach attempts to have the desired properties, but is not
> guaranteed to work.  Here is a counterexample:
>
>
>
>> set.seed(1234)
>
>> a <- assignment(40,15,3)
>
>> table(a)
>
> a
>
> 1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
>
> 10  7 12  7  4 10  8  6  8 13  7  7 11  3  7
>
>
>
> Notice that the difference between maximum and minimum candidates for
> referees is 13 – 3 = 10.  Of course, I have to increase the # iters to get a
> better solution, but for large  K and R this may not converge at all.
>
>
>
> Best regards,
>
> Ravi
>
>
>
> From: Ravi Varadhan
> Sent: Wednesday, April 30, 2014 1:49 PM
> To: r-help@r-project.org
> Subject: A combinatorial assignment problem
>
>
>
> Hi,
>
>
>
> I have this problem:  K candidates apply for a job.  There are R referees
> available to review their resumes and provide feedback.  Suppose that we
> would like M referees to review each candidate (M < R).  How would I assign
> candidates to referees (or, conversely, referees to candidates)?  There are
> two important cases:  (a) K > (R choose M) and (b) K < (R chooses M).
>
>
>
> Case (a) actually reduces to case (b), so we only have to consider case (b).
> Without any other constraints, the problem is quite easy to solve.  Here is
> an example that shows this.
>
>
>
> require(gtools)
>
> set.seed(12345)
>
> K <- 10  # number of candidates
>
> R <- 7# number of referees
>
> M <- 3   # overlap, number of referees reviewing  each candidate
>
>
>
> allcombs <- combinations(R, M, set=TRUE, repeats.allowed=FALSE)
>
> assignment <- allcombs[sample(1:nrow(allcombs), size=K, replace=FALSE), ]
>
> assignment
>
>> assignment
>
>   [,1] [,2] [,3]
>
> [1,]345
>
> [2,]357
>
> [3,]567
>
> [4,]356
>
> [5,]167
>
> [6,]127
>
> [7,]145
>
> [8,]367
>
> [9,]245
>
> [10,]457
>
>>
>
>
>
> Here each row stands for a candidate and the column stands for the referees
> who review that candidate.
>
>
>
> Of course, there are some constraints that make the assignment challenging.
> We would like to have an even distribution of the number of candidates
> reviewed by each referee.  For example, the above code produces an
> assignment where referee #2 gets only 2 candidates and referee #5 gets 7
> candidates.  We would like to avoid such uneven assignments.
>
>
>
>> table(assignment)
>
> assignment
>
> 1 2 3 4 5 6 7
>
> 3 2 4 4 7 4 6
>
>>
>
>
>
> Note that in this example there are 35 possible triplets of referees and 10
> candidates.  Therefore, a perfectly even assignment is not possible.
>
>
>
> I tried some obvious, greedy search methods but they are not guaranteed to
> work.   Any hints or suggestions would be greatly appreciated.
>
>
>
> Best,
>
> Ravi
>
>

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] R 3.1 changes to type.convert causing strings where I used to get numeric

2014-05-01 Thread Bos, Roger
Duncan,

Yes, I admit I haven't updated since 2.15.2 so I was behind.  I am seeing the 
same problem with reading from RODBC (sql server) tables.  I write data to a 
table and then read it back.  In R 2.15 it would come in numeric.  Under R 3.1 
it comes in as a string.

Thanks,

Roger


-Original Message-
From: Duncan Murdoch [mailto:murdoch.dun...@gmail.com]
Sent: Wednesday, April 30, 2014 4:40 PM
To: Bos, Roger; r-help@r-project.org
Subject: Re: [R] R 3.1 changes to type.convert causing strings where I used to 
get numeric

On 30/04/2014, 4:20 PM, Bos, Roger wrote:
> Dear R-help,
>
> I recently upgraded to R 3.1 patched and code that ran fine previously and 
> now giving a lot of errors because the data is coming in as strings instead 
> of numeric.  I can fix my code to wrapping each item I want to use with 
> as.numeric(), but that seems very inefficient.
>
> I looked at the change list for R 3.1 and I see the first item is a change in 
> type.convert() that seems to be causing me grief.  The suggestion is to use 
> colClasses, but when I try to do so I get an error regarding the quotes 
> again...
>
>>ann1  <- read.table(driveletter %+% "/snap/ann/snap_fyr_ann1_"
>> %+% i %+% ".txt", header=TRUE, quote="", as.is=TRUE,
>> colClasses='numeric')
> Error in scan(file, what, nmax, sep, dec, quote, skip, nlines, na.strings,  :
>scan() expected 'a real', got '"1033.7"'
>
> Does anyone have any suggestions?

This sounds like two separate problems.  The first is the change to 
type.convert().  That's going to be dealt with (it's already in R-devel, will 
eventually be handled in R-patched and 3.1.1).  You deserve part of the blame 
for this for never testing the pre-release version.  It would have been easier 
to fix before release, but nobody who tested then bothered to report it.

The second problem is one I don't recall hearing reported before.  If you have 
a .csv file containing the lines

X
"1"

then it should be readable as a .csv, because the quotes should be stripped.  
In fact it is readable now if you *don't* specify that the column is numeric, 
and it will be converted to a numeric value.
However, if you do use colClasses="numeric" you'll get an error.  That looks 
wrong, though it is consistent with ?read.table.

 > x <- c("X", '"1"')
 > read.csv(textConnection(x))
   X
1 1
 > read.csv(textConnection(x), colClasses="numeric") Error in scan(file, what, 
 > nmax, sep, dec, quote, skip, nlines, na.strings,  :
   scan() expected 'a real', got '"1"'
 >


Duncan Murdoch

>
>
> CHANGES IN R 3.1.0:
> NEW FEATURES
>
>
> type.convert() (and hence by default read.table()) returns a character vector 
> or factor when representing a numeric input as a double would lose accuracy. 
> Similarly for complex inputs.
> If a file contains numeric data with unrepresentable numbers of decimal 
> places that are intended to be read as numeric, specify colClasses in 
> read.table() to be"numeric".
>
>
> ***
> This message is for the named person's use only. It may
> contain confidential, proprietary or legally privileged
> information. No right to confidential or privileged treatment
> of this message is waived or lost by an error in transmission.
> If you have received this message in error, please immediately
> notify the sender by e-mail, delete the message and all
> copies from your system and destroy any hard copies.  You must
> not, directly or indirectly, use, disclose, distribute,
> print or copy any part of this message if you are not
> the intended recipient.
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>


***
This message is for the named person's use only. It may
contain confidential, proprietary or legally privileged
information. No right to confidential or privileged treatment
of this message is waived or lost by an error in transmission.
If you have received this message in error, please immediately
notify the sender by e-mail, delete the message and all
copies from your system and destroy any hard copies.  You must
not, directly or indirectly, use, disclose, distribute,
print or copy any part of this message if you are not
the intended recipient.
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using apply with more than one matrix

2014-05-01 Thread arun
Hi,
Sorry, a typo in the previous function:
-
if (condition1[i] && !is.na(indx3)) {
    arr[x1][indx3] + m2[i]  ## should be mat2[i]
    } else NA
 
---

Also, you can try:
library(plyr)
evaluateNew <- function(arr, mat1, mat2){
if (!all(dim(mat1) == dim(mat2))) {
    stop("both matrices should have equal dimensions")
    }
 indx1 <- unlist(alply(arr, c(1,2), function(x) {x[!is.na(x)][1]}))
condition1 <- !is.na(mat1) & mat1 > mat2
ifelse(condition1, indx1+mat2, NA)
}
evaluateNew(a1, m1, m2)

#    [,1] [,2]
#[1,]   NA 1.66
#[2,] 3.14   NA
A.K.




On Thursday, May 1, 2014 5:49 AM, arun  wrote:


Hi,

You may try:
evaluate <- function(arr, mat1, mat2) {
    if (!all(dim(mat1) == dim(mat2))) {
    stop("both matrices should have equal dimensions")
    }
    indx1 <- as.matrix(do.call(expand.grid, lapply(dim(arr), sequence)))
    indx2 <- paste0(indx1[, 1], indx1[, 2])
    condition1 <- !is.na(mat1) & mat1 > mat2
    ans <- sapply(seq_along(unique(indx2)), function(i) {
    x1 <- indx1[indx2 %in% unique(indx2)[i], ]
    indx3 <- which(!is.na(arr[x1]))[1]
    if (condition1[i] && !is.na(indx3)) {
    arr[x1][indx3] + m2[i]
    } else NA
    })
    dim(ans) <- dim(mat1)
    ans
}
evaluate(a1,m1,m2)
# [,1] [,2]
#[1,]   NA 1.66
#[2,] 3.14   NA

A.K.



On Thursday, May 1, 2014 2:53 AM, "Waichler, Scott R"  
wrote:
> I would ask you to look at this loop-free approach and ask if this is not
> equally valid?
> 
> ans <- matrix(NA, ncol=2, nrow=2)
> ind.not.na <- which(!is.na(a1))
> ans[] <- condition1*a1[,,ind.not.na[1]]+ m2  # two matrices of equal
> dimensions, one logical.
>  ans
>      [,1] [,2]
> [1,]   NA 1.66
> [2,] 2.74   NA

Thanks, I am learning something.  I didn't know you could multiply a logical 
object by a numerical one.  But notice the answer is not the same as mine, 
because I am doing an operation on the vector of values a1[i,j,] first.  

I tried a modification on sapply below, but it doesn't work  because I haven't 
referenced the 3d array a1 properly.  So I guess I must try to get a 2d result 
from a1 first, then use that in matrix arithmetic.

Sapply or mapply may work, I haven't used these much and will try to learn 
better how to use them.  Your use of sapply looks good; but I'm trying to 
understand if and how I can bring in the operation on a1.  This doesn't work:

evaluate <- function(idx) {
  ind.not.na <- which(!is.na(a1[idx,])) ]))  # doesn't work; improper indexing 
for a1
  if(length(ind.not.na) > 0) {
    return(condition1*(a1[idx,ind.not.na[1]] + m2[idx]))  # doesn't work; 
improper indexing for a1
  }
}
vec <- sapply(seq(length(m2)), evaluate)

Scott Waichler


> -Original Message-
> From: David Winsemius [mailto:dwinsem...@comcast.net]
> Sent: Wednesday, April 30, 2014 8:46 PM
> To: Waichler, Scott R
> Cc: Bert Gunter; r-help@r-project.org
> Subject: Re: [R] Using apply with more than one matrix
> 
> 
> On Apr 30, 2014, at 6:03 PM, Waichler, Scott R wrote:
> 
> > Here is a working example with no random parts.  Thanks for your
> patience and if I'm still off the mark with my presentation I'll drop the
> matter.
> >
> > v <- c(NA, 1.5, NA, NA,
> >       NA, 1.1, 0.5, NA,
> >       NA, 1.3, 0.4, 0.9)
> > a1 <- array(v, dim=c(2,2,3))
> > m1 <- matrix(c(NA, 1.5, 2.1, NA), ncol=2, byrow=T)
> > m2 <- matrix(c(1.56, 1.64, 1.16, 2.92), ncol=2)
> > condition1 <- !is.na(m1)& m1 > m2
> >
> > ans <- matrix(NA, ncol=2, nrow=2) # initialize for(i in 1:2) {  for(j
> > in 1:2) {
> >    ind.not.na <- which(!is.na(a1[i,j,]))
> >    if(condition1[i,j] && length(ind.not.na) > 0) ans[i,j] <-
> > a1[i,j,ind.not.na[1]] + m2[i,j]  } } ans
> >     [,1] [,2]
> > [1,]   NA 1.66
> > [2,] 3.14   NA
> 
> I would ask you to look at this loop-free approach and ask if this is not
> equally valid?
> 
> ans <- matrix(NA, ncol=2, nrow=2)
> ind.not.na <- which(!is.na(a1))
> ans[] <- condition1*a1[,,ind.not.na[1]]+ m2  # two matrices of equal
> dimensions, one logical.
>  ans
>      [,1] [,2]
> [1,]   NA 1.66
> [2,] 2.74   NA
> >
> > Let me try asking again in words.  If I have multiple matrices or slices
> of 3d arrays that are the same dimension, is there a way to pass them all
> to apply, and have apply take care of looping through i,j?
> 
> I don't think `apply` is the correct function for this. Either `mapply` or
> basic matrix operation seem more likely to deliver correct results:
> 
> 
> > I understand that apply has just one input object x.  I want to work on
> more than one array object at once using a custom function that has this
> characteristic:  in order to compute the answer at i,j I need a result
> from higher order array at the same i,j.
> 
> If you want to iterate over matrix indices you can either use the vector
> version e.g. m2[3] or the matrix version, m2[2,1[.
> 
> vec <- sapply(seq(length(m2) , function(idx) m2[idx]*condition1[idx] )
> 
> 
> 
> > This is what I tried to demonstrate in my example 

[R] Tobit model with L1 and L2 penalties (elastic net)

2014-05-01 Thread Sicotte, Hugues, Ph.D.
Does anyone know of package that implements a tobit model with L1 and L2 (lasso 
and ridge) penalties aka elastic net.

I am interested in an elastic net implementation of tobit because my variables 
are SNPs genotypes/dosage in Linkage disequilibrium and elastic net type models 
to well for that.

The closest I found was
The R Brq package implement the Bayesian Tobit quantile regression model with 
adaptive lasso penalty and will limit itself to an L1 penalty.

p.s. I cannot do this by combining the penalized package and survreg as the 
penalized package does not yet allow for survreg models (as per Terry Therneau).



Hugues Sicotte, Ph.D.
Assistant Professor in Medical Informatics
Bioinformatics Core
Division of Biomedical Statistics and Informatics
Phone: 507-538-3866
Fax: 507-284-0360
Email: sicotte.hug...@mayo.edu

Mayo Clinic
200 First Street SW
Rochester, MN 55905
www.mayoclinic.org


[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] legendre quadrature

2014-05-01 Thread Enrico Schumann
On Thu, 01 May 2014, pari hesabi  writes:

> Hello everybody
> I need to approximate the amount of integral by using
>  legendre quadrature. I have written a program which doesn't give me a 
> logical answer; Can anybody help me and send the correct program? For 
> example  the approximated amount of integral of ( x ^2)  on (-1,1) based
>  on legendre quad rule. 
>   

One possibility:

  require("NMOF")
  xw <-xwGauss(10, "legendre")

  fun <- function(x)
  x^2

  sum(fun(xw$nodes) * xw$weights)

>
>
> integrand<-function(x) {x^2}
> rules <- legendre.quadrature.rules( 50 )

  Error: object 'legendre.quadrature.rules' not found

PLEASE "provide commented, minimal, self-contained, reproducible code".

> order.rule <- rules[[50]]
> chebyshev.c.quadrature(integrand, order.rule, lower = -1, upper = 1) 
>
> Thank you
> Diba  
>   

-- 
Enrico Schumann
Lucerne, Switzerland
http://enricoschumann.net

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] legendre quadrature

2014-05-01 Thread Prof. Dr. Matthias Kohl

you could use package distrEx:

library(distrEx)
GLIntegrate(function(x) x^2, lower = -1, upper = 1, order = 50)

hth
Matthias

On 01.05.2014 09:43, pari hesabi wrote:

Hello everybody
I need to approximate the amount of integral by using
  legendre quadrature. I have written a program which doesn't give me a
logical answer; Can anybody help me and send the correct program? For
example  the approximated amount of integral of ( x ^2)  on (-1,1) based
  on legendre quad rule.




integrand<-function(x) {x^2}
rules <- legendre.quadrature.rules( 50 )
order.rule <- rules[[50]]
chebyshev.c.quadrature(integrand, order.rule, lower = -1, upper = 1)

Thank you
Diba
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Prof. Dr. Matthias Kohl
www.stamats.de

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] legendre quadrature

2014-05-01 Thread pari hesabi
Hello everybody
I need to approximate the amount of integral by using
 legendre quadrature. I have written a program which doesn't give me a 
logical answer; Can anybody help me and send the correct program? For 
example  the approximated amount of integral of ( x ^2)  on (-1,1) based
 on legendre quad rule. 
  



integrand<-function(x) {x^2}
rules <- legendre.quadrature.rules( 50 )
order.rule <- rules[[50]]
chebyshev.c.quadrature(integrand, order.rule, lower = -1, upper = 1) 

Thank you
Diba
  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] Legendre quadrature

2014-05-01 Thread pari hesabi
Hello everybody
I need to approximate the amount of integral by using
 legendre quadrature. I have written a program which doesn't give me a 
logical answer; Can anybody help me and send the correct program? For 
example  the approximated amount of integral of ( x ^2)  on (-1,1) based
 on legendre quad rule. 
  



integrand<-function(x){x^2}
rules <- legendre.quadrature.rules( 50 )
order.rule <- rules[[50]]
chebyshev.c.quadrature(integrand, order.rule, lower = -1, upper = 1) 

Thank you
Diba  
[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Unbalanced manova

2014-05-01 Thread Sergio Fonda
Thank you John, I solved exploiting Anova call type 2 in your car package
:)
Il 25/apr/2014 14:57 "John Fox"  ha scritto:

> Dear Sergio,
>
> The Anova() function in the car package can perform MANOVA with a
> multivariate linear model fit to unbalanced data by lm() -- see the
> examples in ?Anova. I'm not sure what you mean by "avoiding NA values,"
> however. With the default na.action, which is na.omit, lm() will perform a
> complete-case analysis, omitting cases with NA for any variable in the
> model.
>
> I hope this helps,
>  John
>
> 
> John Fox, Professor
> McMaster University
> Hamilton, Ontario, Canada
> http://socserv.mcmaster.ca/jfox/
>
>
> On Fri, 25 Apr 2014 11:17:44 +0200
>  Sergio Fonda  wrote:
> > Hello list,
> > I would be very grateful if somebody could suggest me which kind of call
> > may be used to perform a manova with unbalanced data avoiding NA values.
> > Thanks a lot
> >
> >   [[alternative HTML version deleted]]
> >
> > __
> > R-help@r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
>
>
>
>

[[alternative HTML version deleted]]

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Using apply with more than one matrix

2014-05-01 Thread arun


Hi,

You may try:
evaluate <- function(arr, mat1, mat2) {
    if (!all(dim(mat1) == dim(mat2))) {
    stop("both matrices should have equal dimensions")
    }
    indx1 <- as.matrix(do.call(expand.grid, lapply(dim(arr), sequence)))
    indx2 <- paste0(indx1[, 1], indx1[, 2])
    condition1 <- !is.na(mat1) & mat1 > mat2
    ans <- sapply(seq_along(unique(indx2)), function(i) {
    x1 <- indx1[indx2 %in% unique(indx2)[i], ]
    indx3 <- which(!is.na(arr[x1]))[1]
    if (condition1[i] && !is.na(indx3)) {
    arr[x1][indx3] + m2[i]
    } else NA
    })
    dim(ans) <- dim(mat1)
    ans
}
evaluate(a1,m1,m2)
# [,1] [,2]
#[1,]   NA 1.66
#[2,] 3.14   NA

A.K.


On Thursday, May 1, 2014 2:53 AM, "Waichler, Scott R"  
wrote:
> I would ask you to look at this loop-free approach and ask if this is not
> equally valid?
> 
> ans <- matrix(NA, ncol=2, nrow=2)
> ind.not.na <- which(!is.na(a1))
> ans[] <- condition1*a1[,,ind.not.na[1]]+ m2  # two matrices of equal
> dimensions, one logical.
>  ans
>      [,1] [,2]
> [1,]   NA 1.66
> [2,] 2.74   NA

Thanks, I am learning something.  I didn't know you could multiply a logical 
object by a numerical one.  But notice the answer is not the same as mine, 
because I am doing an operation on the vector of values a1[i,j,] first.  

I tried a modification on sapply below, but it doesn't work  because I haven't 
referenced the 3d array a1 properly.  So I guess I must try to get a 2d result 
from a1 first, then use that in matrix arithmetic.

Sapply or mapply may work, I haven't used these much and will try to learn 
better how to use them.  Your use of sapply looks good; but I'm trying to 
understand if and how I can bring in the operation on a1.  This doesn't work:

evaluate <- function(idx) {
  ind.not.na <- which(!is.na(a1[idx,])) ]))  # doesn't work; improper indexing 
for a1
  if(length(ind.not.na) > 0) {
    return(condition1*(a1[idx,ind.not.na[1]] + m2[idx]))  # doesn't work; 
improper indexing for a1
  }
}
vec <- sapply(seq(length(m2)), evaluate)

Scott Waichler


> -Original Message-
> From: David Winsemius [mailto:dwinsem...@comcast.net]
> Sent: Wednesday, April 30, 2014 8:46 PM
> To: Waichler, Scott R
> Cc: Bert Gunter; r-help@r-project.org
> Subject: Re: [R] Using apply with more than one matrix
> 
> 
> On Apr 30, 2014, at 6:03 PM, Waichler, Scott R wrote:
> 
> > Here is a working example with no random parts.  Thanks for your
> patience and if I'm still off the mark with my presentation I'll drop the
> matter.
> >
> > v <- c(NA, 1.5, NA, NA,
> >       NA, 1.1, 0.5, NA,
> >       NA, 1.3, 0.4, 0.9)
> > a1 <- array(v, dim=c(2,2,3))
> > m1 <- matrix(c(NA, 1.5, 2.1, NA), ncol=2, byrow=T)
> > m2 <- matrix(c(1.56, 1.64, 1.16, 2.92), ncol=2)
> > condition1 <- !is.na(m1)& m1 > m2
> >
> > ans <- matrix(NA, ncol=2, nrow=2) # initialize for(i in 1:2) {  for(j
> > in 1:2) {
> >    ind.not.na <- which(!is.na(a1[i,j,]))
> >    if(condition1[i,j] && length(ind.not.na) > 0) ans[i,j] <-
> > a1[i,j,ind.not.na[1]] + m2[i,j]  } } ans
> >     [,1] [,2]
> > [1,]   NA 1.66
> > [2,] 3.14   NA
> 
> I would ask you to look at this loop-free approach and ask if this is not
> equally valid?
> 
> ans <- matrix(NA, ncol=2, nrow=2)
> ind.not.na <- which(!is.na(a1))
> ans[] <- condition1*a1[,,ind.not.na[1]]+ m2  # two matrices of equal
> dimensions, one logical.
>  ans
>      [,1] [,2]
> [1,]   NA 1.66
> [2,] 2.74   NA
> >
> > Let me try asking again in words.  If I have multiple matrices or slices
> of 3d arrays that are the same dimension, is there a way to pass them all
> to apply, and have apply take care of looping through i,j?
> 
> I don't think `apply` is the correct function for this. Either `mapply` or
> basic matrix operation seem more likely to deliver correct results:
> 
> 
> > I understand that apply has just one input object x.  I want to work on
> more than one array object at once using a custom function that has this
> characteristic:  in order to compute the answer at i,j I need a result
> from higher order array at the same i,j.
> 
> If you want to iterate over matrix indices you can either use the vector
> version e.g. m2[3] or the matrix version, m2[2,1[.
> 
> vec <- sapply(seq(length(m2) , function(idx) m2[idx]*condition1[idx] )
> 
> 
> 
> > This is what I tried to demonstrate in my example above.
> >
> > Thanks,
> > Scott
> 
> David Winsemius
> Alameda, CA, USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Tobit model with panel data

2014-05-01 Thread Arne Henningsen
Hi Phil

On 30 April 2014 23:06, phil  wrote:
> I just started to work with R a couple of weeks ago. Right now I would like
> to regress an independent variable on a couple of explanatory variables. The
> dependent variable is left censored in the sense that all negative values
> and zero are set equal to one.

That is most likely not a suitable data transformation.

> This is done because I want to take the logarithm which, as you all know,
> is only defined for values bigger than zero.
>
> However, what I don't know now is how to compute this on R. Does anybody
> know how to proceed? I already checked
> http://cran.r-project.org/web/packages/censReg/vignettes/censReg.pdf
>    but
> I am not able to apply the commands shown on page 8 correctly on my data
> set.
>
> Hence, I would really appreciate if somebody could give me a step-by-step
> instruction for my own dataset.

1st step: load the data set into R (see
http://cran.r-project.org/doc/manuals/r-release/R-data.html)

2nd step: use pdata.frame() or plm.data() to specify the panel
structure (see http://www.jstatsoft.org/v27/i02/)

3rd step: use censReg() to estimate the model (see
http://cran.r-project.org/web/packages/censReg/vignettes/censReg.pdf)

Please do read the posting guide http://www.R-project.org/posting-guide.html

Best regards,
Arne

-- 
Arne Henningsen
http://www.arne-henningsen.name

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A combinatorial assignment problem

2014-05-01 Thread Bert Gunter
I had  trouble with my email and it went before it should. Here's the
solution I meant to send:

Arrange the r referees in a circle.

start <- 0
Replicate k times{
end <- (start + m-1)%% r
output: c(start,end) +1
start <- (end+1)%% r
}


The start and end pairs give the subsets of referees around the
circle. The distributes the referees to the candidates as evenly as
possible. I leave it to you to translate this into code. It should be
very fast as no combinatorics are required.


-- Bert

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
H. Gilbert Welch


> On Wed, Apr 30, 2014 at 10:48 AM, Ravi Varadhan  wrote:
>> Hi,
>>
>> I have this problem:  K candidates apply for a job.  There are R referees 
>> available to review their resumes and provide feedback.  Suppose that we 
>> would like M referees to review each candidate (M < R).  How would I assign 
>> candidates to referees (or, conversely, referees to candidates)?  There are 
>> two important cases:  (a) K > (R choose M) and (b) K < (R chooses M).
>>
>> Case (a) actually reduces to case (b), so we only have to consider case (b). 
>>  Without any other constraints, the problem is quite easy to solve.  Here is 
>> an example that shows this.
>>
>> require(gtools)
>> set.seed(12345)
>> K <- 10  # number of candidates
>> R <- 7# number of referees
>> M <- 3   # overlap, number of referees reviewing  each candidate
>>
>> allcombs <- combinations(R, M, set=TRUE, repeats.allowed=FALSE)
>> assignment <- allcombs[sample(1:nrow(allcombs), size=K, replace=FALSE), ]
>> assignment
>>> assignment
>>   [,1] [,2] [,3]
>> [1,]345
>> [2,]357
>> [3,]567
>> [4,]356
>> [5,]167
>> [6,]127
>> [7,]145
>> [8,]367
>> [9,]245
>> [10,]457
>>>
>>
>> Here each row stands for a candidate and the column stands for the referees 
>> who review that candidate.
>>
>> Of course, there are some constraints that make the assignment challenging.  
>> We would like to have an even distribution of the number of candidates 
>> reviewed by each referee.  For example, the above code produces an 
>> assignment where referee #2 gets only 2 candidates and referee #5 gets 7 
>> candidates.  We would like to avoid such uneven assignments.
>>
>>> table(assignment)
>> assignment
>> 1 2 3 4 5 6 7
>> 3 2 4 4 7 4 6
>>>
>>
>> Note that in this example there are 35 possible triplets of referees and 10 
>> candidates.  Therefore, a perfectly even assignment is not possible.
>>
>> I tried some obvious, greedy search methods but they are not guaranteed to 
>> work.   Any hints or suggestions would be greatly appreciated.
>>
>> Best,
>> Ravi
>>
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A combinatorial assignment problem

2014-05-01 Thread Bert Gunter
This is not really a combinatorial problem, I'll use small letters
instead of caps.

Arrange the r referees in a circle.

start <- 1
Replicate k times{
end <- (start + m-1)%% r
output: c(start,end)
start <- (end+1)%% r
}

Bert Gunter
Genentech Nonclinical Biostatistics
(650) 467-7374

"Data is not information. Information is not knowledge. And knowledge
is certainly not wisdom."
H. Gilbert Welch




On Wed, Apr 30, 2014 at 10:48 AM, Ravi Varadhan  wrote:
> Hi,
>
> I have this problem:  K candidates apply for a job.  There are R referees 
> available to review their resumes and provide feedback.  Suppose that we 
> would like M referees to review each candidate (M < R).  How would I assign 
> candidates to referees (or, conversely, referees to candidates)?  There are 
> two important cases:  (a) K > (R choose M) and (b) K < (R chooses M).
>
> Case (a) actually reduces to case (b), so we only have to consider case (b).  
> Without any other constraints, the problem is quite easy to solve.  Here is 
> an example that shows this.
>
> require(gtools)
> set.seed(12345)
> K <- 10  # number of candidates
> R <- 7# number of referees
> M <- 3   # overlap, number of referees reviewing  each candidate
>
> allcombs <- combinations(R, M, set=TRUE, repeats.allowed=FALSE)
> assignment <- allcombs[sample(1:nrow(allcombs), size=K, replace=FALSE), ]
> assignment
>> assignment
>   [,1] [,2] [,3]
> [1,]345
> [2,]357
> [3,]567
> [4,]356
> [5,]167
> [6,]127
> [7,]145
> [8,]367
> [9,]245
> [10,]457
>>
>
> Here each row stands for a candidate and the column stands for the referees 
> who review that candidate.
>
> Of course, there are some constraints that make the assignment challenging.  
> We would like to have an even distribution of the number of candidates 
> reviewed by each referee.  For example, the above code produces an assignment 
> where referee #2 gets only 2 candidates and referee #5 gets 7 candidates.  We 
> would like to avoid such uneven assignments.
>
>> table(assignment)
> assignment
> 1 2 3 4 5 6 7
> 3 2 4 4 7 4 6
>>
>
> Note that in this example there are 35 possible triplets of referees and 10 
> candidates.  Therefore, a perfectly even assignment is not possible.
>
> I tried some obvious, greedy search methods but they are not guaranteed to 
> work.   Any hints or suggestions would be greatly appreciated.
>
> Best,
> Ravi
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A combinatorial assignment problem

2014-05-01 Thread Daniel Nordlund



> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Daniel Nordlund
> Sent: Thursday, May 01, 2014 1:10 AM
> To: r-help@r-project.org
> Subject: Re: [R] A combinatorial assignment problem
> 
> > -Original Message-
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> > On Behalf Of Ravi Varadhan
> > Sent: Wednesday, April 30, 2014 10:49 AM
> > To: r-help@r-project.org
> > Subject: [R] A combinatorial assignment problem
> >
> > Hi,
> >
> > I have this problem:  K candidates apply for a job.  There are R
> referees
> > available to review their resumes and provide feedback.  Suppose that we
> > would like M referees to review each candidate (M < R).  How would I
> > assign candidates to referees (or, conversely, referees to candidates)?
> > There are two important cases:  (a) K > (R choose M) and (b) K < (R
> > chooses M).
> >
> > Case (a) actually reduces to case (b), so we only have to consider case
> > (b).  Without any other constraints, the problem is quite easy to solve.
> > Here is an example that shows this.
> >
> > require(gtools)
> > set.seed(12345)
> > K <- 10  # number of candidates
> > R <- 7# number of referees
> > M <- 3   # overlap, number of referees reviewing  each candidate
> >
> > allcombs <- combinations(R, M, set=TRUE, repeats.allowed=FALSE)
> > assignment <- allcombs[sample(1:nrow(allcombs), size=K, replace=FALSE),
> ]
> > assignment
> > > assignment
> >   [,1] [,2] [,3]
> > [1,]345
> > [2,]357
> > [3,]567
> > [4,]356
> > [5,]167
> > [6,]127
> > [7,]145
> > [8,]367
> > [9,]245
> > [10,]457
> > >
> >
> > Here each row stands for a candidate and the column stands for the
> > referees who review that candidate.
> >
> > Of course, there are some constraints that make the assignment
> > challenging.  We would like to have an even distribution of the number
> of
> > candidates reviewed by each referee.  For example, the above code
> produces
> > an assignment where referee #2 gets only 2 candidates and referee #5
> gets
> > 7 candidates.  We would like to avoid such uneven assignments.
> >
> > > table(assignment)
> > assignment
> > 1 2 3 4 5 6 7
> > 3 2 4 4 7 4 6
> > >
> >
> > Note that in this example there are 35 possible triplets of referees and
> > 10 candidates.  Therefore, a perfectly even assignment is not possible.
> >
> > I tried some obvious, greedy search methods but they are not guaranteed
> to
> > work.   Any hints or suggestions would be greatly appreciated.
> >
> > Best,
> > Ravi
> >
> 
> Well, if you don't need clever, a brute force approach could work
> (depending on the values of k, r, and m).  Something like this
> 
> 

I apologize for the noise.  I made some last second edits and cut and pasted 
the wrong code in the first email.  Obviously, you would want to add some error 
checking to the code to trap incompatible parameter specification.


assignment <- function(k,r,m,max_iter=120) {
   n <- 0
   cmb <- combn(r,m)
   repeat {
 n <- n+1 
 tbl <- table(map<-cmb[,sample(1:choose(r,m),k)])
 if((max(tbl)-min(tbl)) <= 1) break
 if(n > max_iter) break
   }
   return(t(map))
}
a <- assignment(10,7,3)


Dan

Daniel Nordlund
Bothell, WA USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] A combinatorial assignment problem

2014-05-01 Thread Daniel Nordlund
> -Original Message-
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org]
> On Behalf Of Ravi Varadhan
> Sent: Wednesday, April 30, 2014 10:49 AM
> To: r-help@r-project.org
> Subject: [R] A combinatorial assignment problem
> 
> Hi,
> 
> I have this problem:  K candidates apply for a job.  There are R referees
> available to review their resumes and provide feedback.  Suppose that we
> would like M referees to review each candidate (M < R).  How would I
> assign candidates to referees (or, conversely, referees to candidates)?
> There are two important cases:  (a) K > (R choose M) and (b) K < (R
> chooses M).
> 
> Case (a) actually reduces to case (b), so we only have to consider case
> (b).  Without any other constraints, the problem is quite easy to solve.
> Here is an example that shows this.
> 
> require(gtools)
> set.seed(12345)
> K <- 10  # number of candidates
> R <- 7# number of referees
> M <- 3   # overlap, number of referees reviewing  each candidate
> 
> allcombs <- combinations(R, M, set=TRUE, repeats.allowed=FALSE)
> assignment <- allcombs[sample(1:nrow(allcombs), size=K, replace=FALSE), ]
> assignment
> > assignment
>   [,1] [,2] [,3]
> [1,]345
> [2,]357
> [3,]567
> [4,]356
> [5,]167
> [6,]127
> [7,]145
> [8,]367
> [9,]245
> [10,]457
> >
> 
> Here each row stands for a candidate and the column stands for the
> referees who review that candidate.
> 
> Of course, there are some constraints that make the assignment
> challenging.  We would like to have an even distribution of the number of
> candidates reviewed by each referee.  For example, the above code produces
> an assignment where referee #2 gets only 2 candidates and referee #5 gets
> 7 candidates.  We would like to avoid such uneven assignments.
> 
> > table(assignment)
> assignment
> 1 2 3 4 5 6 7
> 3 2 4 4 7 4 6
> >
> 
> Note that in this example there are 35 possible triplets of referees and
> 10 candidates.  Therefore, a perfectly even assignment is not possible.
> 
> I tried some obvious, greedy search methods but they are not guaranteed to
> work.   Any hints or suggestions would be greatly appreciated.
> 
> Best,
> Ravi
> 

Well, if you don't need clever, a brute force approach could work (depending on 
the values of k, r, and m).  Something like this 


assignment <- function(k,r,m,max_iter=120) {
   n <- 0
   cmb <- combn(r,m)
   repeat {
 n <- n+1 
 tbl <- table(map<-cmb[,sample(1:choose(r,m),k)])
 if(min(tbl) == max(tbl)-1) break
 if(n > max_iter) break
   }
   return(t(map))
}
a <- assignment(10,7,3)


Dan

Daniel Nordlund
Bothell, WA USA

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] outer() problem

2014-05-01 Thread Berend Hasselman

On 01-05-2014, at 08:44, Marc Girondot  wrote:

> Dear list-members,
> 
> Can someone explains me why the last command gives an error. Thanks a lot:
> > outer(0:1, 0:1, FUN=function(x, y) {x+y})
> [,1] [,2]
> [1,]01
> [2,]12
> > outer(0:1, 0:1, FUN=function(x, y) {x})
> [,1] [,2]
> [1,]00
> [2,]11
> > outer(0:1, 0:1, FUN=function(x, y) {1})
> Erreur dans outer(0:1, 0:1, FUN = function(x, y) { :
>  dims [produit 4] ne correspond pas à la longueur de l'objet [1]
> 
> Of course I simplify a lot my problem.

As the documentation for outer says the function arguments x and y are vectors 
not scalars.
Try this to see that

outer(0:1, 0:1, FUN=function(x, y) {print(x);x}) 
outer(0:1, 0:1, FUN=function(x, y) {print(x);print(y);x}) 

You are returning a scalar whereas outer expects the function to return a 
vector of the correct length.

A possible solution would be

outer(0:1, 0:1, FUN=function(x, y) {1+0*x}) 
outer(0:1, 0:1, FUN=function(x, y) {rep(1,length(x))}) 

Berend
__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] outer() problem

2014-05-01 Thread Ken Knoblauch
Marc Girondot  yahoo.fr> writes:
>  > outer(0:1, 0:1, FUN=function(x, y) {1})
> Erreur dans outer(0:1, 0:1, FUN = function(x, y) { :
>dims [produit 4] ne correspond pas à la longueur de l'objet [1]

Because whatever the dimensions of your 2 input vectors,
this function simply returns the value 1 and outer expects
to generate an output that has a value for each pair of x and
y values, not a single value for the whole  set of xy pairs.

Note the lines in the source where the error occurs

 robj <- FUN(X, Y, ...)
 dim(robj) <- c(dX, dY)

where dX and dY are the lengths of x and y, respectively
in your case;

> Thanks a lot
> 
> Marc
> 
-- 
Kenneth Knoblauch
Inserm U846
Stem-cell and Brain Research Institute
Department of Integrative Neurosciences
18 avenue du Doyen Lépine
69500 Bron
France
tel: +33 (0)4 72 91 34 77
fax: +33 (0)4 72 91 34 61
portable: +33 (0)6 84 10 64 10
http://www.sbri.fr/members/kenneth-knoblauch.html

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


[R] outer() problem

2014-05-01 Thread Marc Girondot

Dear list-members,

Can someone explains me why the last command gives an error. Thanks a lot:
> outer(0:1, 0:1, FUN=function(x, y) {x+y})
 [,1] [,2]
[1,]01
[2,]12
> outer(0:1, 0:1, FUN=function(x, y) {x})
 [,1] [,2]
[1,]00
[2,]11
> outer(0:1, 0:1, FUN=function(x, y) {1})
Erreur dans outer(0:1, 0:1, FUN = function(x, y) { :
  dims [produit 4] ne correspond pas à la longueur de l'objet [1]

Of course I simplify a lot my problem.

Thanks a lot

Marc

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.