Re: [R] r-markdown - keeping figures

2015-10-21 Thread Yihui Xie
Yes, setting the fig.path option will prevent rmarkdown from deleting
the figure files, and the more natural way to preserve these
intermediate files is to set the rmarkdown option keep_tex or keep_md
(depending on your output format) to yes, e.g.

---
output:
  pdf_document:
    keep_tex: yes
  html_document:
    keep_md: yes
---
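
For the fig.path route mentioned above, the chunk options can be set once in a setup chunk near the top of the .Rmd file. This is only a sketch: it assumes knitr is installed, and the directory name "figures/" is an illustrative choice, not something from this thread.

```r
# Code for a setup chunk near the top of the .Rmd file.
# fig.path is the prefix used for figure file names; "figures/" is a
# hypothetical directory name chosen for this sketch.
# dev = c("png", "pdf") asks knitr to write every plot with both devices;
# the first one listed is the one embedded in the rendered document.
knitr::opts_chunk$set(fig.path = "figures/", dev = c("png", "pdf"))
```

With these options each plot should also be left on disk as a separate PDF, much as with Sweave.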

Regards,
Yihui
--
Yihui Xie 
Web: http://yihui.name


On Wed, Oct 21, 2015 at 7:21 AM, Jeff Newmiller
 wrote:
> I think the default now is to not save them unless you set the fig.path chunk 
> option.
>
> http://kbroman.org/knitr_knutshell/pages/Rmarkdown.html
>
> On October 21, 2015 1:47:33 PM GMT+02:00, Bob O'Hara  
> wrote:
>>The figures should be saved somewhere. e.g. if you have x.Rmd, you
>>should have a X_files/ folder with subfolders for the figures (e.g.
>>X-html or X-latex). At least that's what I have.
>>
>>Bob
>>
>>On 20 October 2015 at 18:18, Witold E Wolski 
>>wrote:
>>> I am running r-markdown from r-studio and can't work out how to keep
>>> the figures.
>>> I mean I have a few figures in the document and would like to have
>>> them as separate pdf's too as I have been used to have them when
>>using
>>> Sweave.
>>>
>>>
>>>
>>> best regards
>>> Witold
>>>
>>>
>>> --
>>> Witold Eryk Wolski
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.


Re: [R] threshold and replace values in a matrix

2015-10-21 Thread PIKAL Petr
Hi

I wonder why you are asking that after quite a long use of R.

test[test < (-.5)] <- (-1)

Double loops seem to me a last resort in R, to be used only if every other
approach fails.
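
As a quick check that logical indexing does the same job as the double loop, here is a small sketch on simulated data. It uses <= as in the original question (the one-liner above uses <); the names vec and looped are made up for this illustration.

```r
set.seed(1)
test <- matrix(rnorm(100), ncol = 10)
threshold <- -0.5

# Logical indexing: test <= threshold is a logical matrix of the same shape,
# and the assignment only touches the positions that are TRUE.
vec <- test
vec[vec <= threshold] <- -1

# The double loop from the question, kept only for comparison.
looped <- test
for (i in seq_len(nrow(looped))) {
  for (j in seq_len(ncol(looped))) {
    if (looped[i, j] <= threshold) looped[i, j] <- -1
  }
}

# Both approaches should give exactly the same matrix.
identical(vec, looped)
```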

Cheers
Petr


> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Alaios
> via R-help
> Sent: Wednesday, October 21, 2015 3:35 PM
> To: R-help Mailing List
> Subject: [R] threshold and replace values in a matrix
>
> Dear all, I have a table like this:
> test <- matrix(data = rnorm(100), ncol = 10)
> and I want to find the values that are below my threshold
> threshold <- -0.5
> and replace them with -1 instead.
>
> I can of course write a doubly nested for loop to check the elements one
> by one
> if (test[i,j] <= threshold) test[i,j] <- -1
> but that would be rather inefficient since I have very large tables.
> Does R offer any "automation" for matrix data types?
> I would like to thank you in advance for your reply.
> Regards
> Alex


This e-mail and any documents attached to it may be confidential and are 
intended only for its intended recipients.
If you received this e-mail by mistake, please immediately inform its sender. 
Delete the contents of this e-mail with all attachments and its copies from 
your system.
If you are not the intended recipient of this e-mail, you are not authorized to 
use, disseminate, copy or disclose this e-mail in any manner.
The sender of this e-mail shall not be liable for any possible damage caused by 
modifications of the e-mail or by delay with transfer of the email.

In case that this e-mail forms part of business dealings:
- the sender reserves the right to end negotiations about entering into a 
contract in any time, for any reason, and without stating any reasoning.
- if the e-mail contains an offer, the recipient is entitled to immediately 
accept such offer; The sender of this e-mail (offer) excludes any acceptance of 
the offer on the part of the recipient containing any amendment or variation.
- the sender insists on that the respective contract is concluded only upon an 
express mutual agreement on all its aspects.
- the sender of this e-mail informs that he/she is not authorized to enter into 
any contracts on behalf of the company except for cases in which he/she is 
expressly authorized to do so in writing, and such authorization or power of 
attorney is submitted to the recipient or the person represented by the 
recipient, or the existence of such authorization is known to the recipient of 
the person represented by the recipient.

Re: [R] threshold and replace values in a matrix

2015-10-21 Thread Alaios via R-help
Thanks. replace also looks to work okay. 
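
For reference, a minimal sketch of the replace() variant on simulated data. replace() returns a modified copy and leaves the original matrix untouched; the name replaced is made up for this illustration.

```r
set.seed(1)
test <- matrix(rnorm(100), ncol = 10)
threshold <- -0.5

# replace(x, list, values) returns a copy of x with x[list] set to values;
# the dimensions of the matrix are preserved.
replaced <- replace(test, test <= threshold, -1)

dim(replaced)
```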



[R] threshold and replace values in a matrix

2015-10-21 Thread Alaios via R-help
Dear all, I have a table like this:
test <- matrix(data = rnorm(100), ncol = 10)
and I want to find the values that are below my threshold
threshold <- -0.5
and replace them with -1 instead.

I can of course write a doubly nested for loop to check the elements one by one
if (test[i,j] <= threshold) test[i,j] <- -1
but that would be rather inefficient since I have very large tables.
Does R offer any "automation" for matrix data types?
I would like to thank you in advance for your reply.
Regards
Alex


[R] Linear regression with a rounded response variable

2015-10-21 Thread Ravi Varadhan
Hi,
I am dealing with a regression problem where the response variable, time 
(in seconds) to walk 15 ft, is rounded to the nearest integer.  I do not care 
about the regression coefficients per se; my main interest is in getting the 
prediction equation for walking speed, given the predictors (age, height, sex, 
etc.), where the predictions will be real numbers, and not integers.  The hope 
is that these predictions will provide unbiased estimates of the "unrounded" 
walking speed. This sounds like a measurement error problem, where the 
measurement error is due to rounding and hence would be uniformly distributed 
on (-0.5, 0.5).

Are there any canonical approaches for handling this type of a problem? What is 
wrong with just doing the standard linear regression?

I googled and saw that this question was asked by someone else in a 
stackexchange post, but it was unanswered.  Any suggestions?

Thank you,
Ravi

Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
Associate Professor,  Department of Oncology
Division of Biostatistics & Bioinformatics
Sidney Kimmel Comprehensive Cancer Center
Johns Hopkins University
550 N. Broadway, Suite -E
Baltimore, MD 21205
410-502-2619




[R] Can't Get Contents and Centers of ggplot2 stat_hexbin Histograms

2015-10-21 Thread SirDoctorGentleman .
To find the bin centers, I've tried ggplot_build(), but the
center coordinates it provides are not what's plotted. In the example below
there are several misplaced "o" symbols and some bins without a symbol on them.

library(ggplot2)
dat <- data.frame(x = rnorm(10, 6, 2), y = rnorm(10, 6, 2))
hexHist = ggplot(dat, aes(x, y)) + stat_binhex(bins=15);
hexDat = ggplot_build(hexHist)$data[[1]]
hexHistFinal = hexHist + annotate("text", hexDat$x, hexDat$y, label="o")
hexHistFinal

I've also given hexbin::hcell2xy(hexbin(dat$x, dat$y, 15)) a shot, but
that's even further away from what's plotted.

I'm not sure how to go about figuring out what data is in what bin. (My
ultimate goal is to take the data in a given bin, take the averages of a
third and fourth column in said data, and use *annotate* to superimpose said
averages on a hexagonal density histogram). I've been working on this since
yesterday morning with no results. :(

Any ideas?



[R] Reordering of numerical vector

2015-10-21 Thread Marlin Keith Cox
I do not have a dataset to share, as I do not believe one is needed and I am
not sure I could easily reproduce the problem.
I have a column of numerical data (Days) and another of a measurement
(Resistance).  After subsetting, I do a linear regression of the two, and
it reorders the days (x axis) into some other order.

I am a fairly experienced R user and I cannot find the answer anywhere.

I am certain it is simple as these things usually are for me.

Thank you ahead of time.  Keith



M. Keith Cox, Ph.D.
Principal
MKConsulting
17105 Glacier Hwy
Juneau, AK 99801
U.S. 907.957.4606



Re: [R] Reordering of numerical vector

2015-10-21 Thread Marlin Keith Cox
Got it.  During a read.csv I used stringsAsFactors=FALSE
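
The post does not say which column was affected, so the following is only a guess at the mechanism. Before R 4.0.0, read.csv() defaulted to stringsAsFactors = TRUE, so a column containing any non-numeric entry became a factor whose levels are sorted as text. The CSV content below is invented for illustration.

```r
# Hypothetical file content: one stray non-numeric entry in Days.
csv_text <- "Days,Resistance\n1,510\n2,498\n10,455\nn/a,430"

as_factor <- read.csv(text = csv_text, stringsAsFactors = TRUE)
as_char   <- read.csv(text = csv_text, stringsAsFactors = FALSE)

# Factor levels are ordered as text, so "10" sorts before "2".
levels(as_factor$Days)

# With stringsAsFactors = FALSE the column stays character and can be
# converted explicitly; entries that are not numbers become NA (with a warning).
days_num <- suppressWarnings(as.numeric(as_char$Days))
days_num
```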

M. Keith Cox, Ph.D.
Principal
MKConsulting
17105 Glacier Hwy
Juneau, AK 99801
U.S. 907.957.4606



Re: [R] Linear regression with a rounded response variable

2015-10-21 Thread Charles C. Berry

On Wed, 21 Oct 2015, Ravi Varadhan wrote:

> Hi, I am dealing with a regression problem where the response variable, 
> time (in seconds) to walk 15 ft, is rounded to the nearest integer.  I do 
> not care about the regression coefficients per se; my main interest is 
> in getting the prediction equation for walking speed, given the 
> predictors (age, height, sex, etc.), where the predictions will be real 
> numbers, and not integers.  The hope is that these predictions will 
> provide unbiased estimates of the "unrounded" walking speed. This 
> sounds like a measurement error problem, where the measurement error is 
> due to rounding and hence would be uniformly distributed on (-0.5, 0.5).




Not the usual "measurement error model" problem, though, where the errors 
are in X and not independent of XB.


Look back at the proof of the unbiasedness of least squares under the 
Gauss-Markov setup. The errors in Y need to have expectation zero.


From your description (but see caveat below) this is true of walking 
*time*, but not exactly true of walking *speed* (modulo the usual 
assumptions if they apply to time). In fact if E(epsilon) = 0 were true of 
unrounded time, it would not be true of unrounded speed (and vice versa).
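
The time-versus-speed point can be checked with a small simulation. This is a sketch under stated assumptions: the rounding error is taken to be exactly uniform on (-0.5, 0.5) and independent of the recorded value (the very assumption the caveat below questions), the recorded times 4 to 7 seconds are invented, and only the 15 ft distance comes from the original post.

```r
set.seed(2)
n <- 1e6

obs_time  <- sample(4:7, n, replace = TRUE)   # recorded whole seconds (hypothetical)
round_err <- runif(n, -0.5, 0.5)              # assumed error: obs_time - true_time
true_time <- obs_time - round_err

time_err  <- obs_time - true_time             # mean is close to zero by construction
speed_err <- 15 / obs_time - 15 / true_time   # error on the speed scale (ft/s)

# Because 1/t is convex, a zero-mean error in time does not give a
# zero-mean error in speed.
c(time = mean(time_err), speed = mean(speed_err))
```

Under these assumptions the mean error in time is essentially zero while the mean error in speed is slightly negative.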




> Are there any canonical approaches for handling this type of a problem?


Work out the bias analytically? Parametric bootstrap? Data augmentation 
and friends?



> What is wrong with just doing the standard linear regression?



Well, what do the actual values look like?

If half the subjects have a value of 5 seconds and the rest are split 
between 4 and 6, your assertion that rounding induces an error of 
dunif(epsilon,-0.5,0.5) is surely wrong (more positive errors in the 6 
second group and more negative errors in the 4 second group under any 
plausible model).
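
A rough simulation of this caveat, assuming (purely for illustration) that true times are normal with mean 5 and a standard deviation chosen so that about half the rounded values equal 5. The error is defined here as recorded minus true time.

```r
set.seed(3)
true_time <- rnorm(1e5, mean = 5, sd = 0.74)  # hypothetical distribution
obs_time  <- round(true_time)
err       <- obs_time - true_time             # rounding error

# Mean rounding error within each recorded value: negative for 4,
# near zero for 5, positive for 6, so the error is not uniform
# once you condition on what was recorded.
keep <- obs_time %in% 4:6
err_by_obs <- tapply(err[keep], obs_time[keep], mean)
err_by_obs
```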



HTH,

Chuck



Re: [R-es] Create a variable with conditions

2015-10-21 Thread Javier Rubén Marcuzzi
Dear Jorge I Velez,

I have just arrived and am reading my mail while lying down to rest for a 
while, so I may not be reasoning this through properly. Going by how I solved 
a similar problem, though: what I did was create a new column and add it to 
the same data.frame. This new column is a copy of the original values with the 
first one removed and a value appended at the end (so that the lengths match; 
I think you would have to do the opposite). I then used a function because 
some processing was involved, but basically an if can decide whether to use 
one column or the other (TOENDREF or TOENDREF_modificada, with everything 
shifted down one row).

I would have to try it, but I am sure you will do it faster and more reliably 
(I should not write R when tired).
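
The shifted-column idea can be sketched as follows. This only illustrates the "shift down one row" step, not the full solution to the problem; the first five TOENDREF values are taken from the "datain" example in this thread and padding with NA is an assumption.

```r
toendref <- c(390, 270, 150, 480, 480)

# Shift everything down one row: drop the last value and pad the top with NA,
# so that row i of the new column holds the value from row i - 1.
shifted_down <- c(NA, head(toendref, -1))

df <- data.frame(TOENDREF = toendref, TOENDREF_modificada = shifted_down)
df
```

Each row can then compare its own value with the previous row's value without an explicit loop.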

Javier Rubén Marcuzzi
Dairy Industry Technician
Veterinarian



From: Jorge I Velez
Sent: Wednesday, 21 October 2015 10:26
To: Javier Rubén Marcuzzi
CC: R-help-es
Subject: Re: [R-es] Create a variable with conditions


Thank you very much, Javier, for your reply.

Yes. To obtain "dataout", previous rows are used according to the 
availability of the variable TOENDREF for each value of the variable REF.  For 
example, rows 3 and 4 of "datain" are

#REF TIMEREF TOENDREF
#3  999     360      150
#4 1099      30      480

In row 3, the value of TOENDREF is 150. This indicates that 150 units of that 
reference are available. Now, in row 4, TIMEREF is 30 for REF = 1099.  Since 
in this row TIMEREF is less than TOENDREF for the previous reference, the new 
variable NEWREF must be 999 and not 1099.  The new value of TOENDREF in this 
row will be 150 - 30 = 120.  This would be row 4 of "dataout":

REF TIMEREF TOENDREF NEWREF
#4  1099      30      120    999

For row 5 of "dataout", the available resources correspond to the _new_ 
value of TOENDREF in NEWREF (i.e., 120).  Following the same logic as before, 
we then obtain rows 5 to 12 of "dataout":

REF TIMEREF TOENDREF NEWREF
#5   731      30       90    999
#6   731      60       30    999
#7   731      90      420    731
#8   731     120      300    731
#9  1442      30      270    731
#10 1442      60      210    731
#11 1442      90      120    731
#12 1442     120        0    731

Note that in the last row all the TOENDREF resources for NEWREF = 731 were 
used up, so it was not necessary to use REF = 1442.

I hope things are a little clearer this time.

The data can be grouped by the variable REF, which basically refers to the 
reference of a product.  If I still have that product available (variable 
TOENDREF), then I use it and cancel the following reference.  The units 
requested of each product correspond to the variable TIMEREF.

Thanks in advance to everyone for your suggestions.

Regards,
Jorge Velez.-


2015-10-20 22:30 GMT-05:00 Javier Rubén Marcuzzi:
Dear Jorge I Velez,

A few years ago I had a similar problem, and I believe there were two possible 
solutions, one of which was contributed by you.

I cannot quite follow the example: when I work through it, some numbers do not 
match what you describe. I suggest the following: build the example again with 
one change, a single data.frame holding what are now datain and dataout (which 
I understand to be your starting point and the result you want), and when 
describing the REF 1099 example, give the coordinates in terms of the rows 
(there are 12) and the columns.

I still do not fully understand: do you use a row together with values from a 
previous row? Or did the data get misaligned on my end and I got lost?  In the 
similar case I had to solve, values from previous rows were used in the 
calculation, depending on certain values. I also need to know whether there is 
an identifier, like a database id, or whether these are numbers that cannot be 
grouped. I ask because in my case, which may be similar, I was able to use 
grouping and auxiliary columns (to process the conditions).

Javier Rubén Marcuzzi
Dairy Industry Technician
Veterinarian


From: Jorge I Velez
Sent: Tuesday, 20 October 2015 19:17
To: R-help-es
Subject: [R-es] Create a variable with conditions


Good afternoon, everyone,

I would like to create a variable according to certain conditions.  I would
like to get from "datain" to "dataout".

## input
datain <- structure(list(REF = c("999", "999", "999", "1099", "731", "731",
"731", "731", "1442", "1442", "1442", "1442"), TIMEREF = c(120,
240, 360, 30, 30, 60, 90, 120, 30, 60, 90, 120), TOENDREF = c(390,
270, 150, 480, 480, 450, 420, 390, 480, 450, 420, 390)), .Names = c("REF",
"TIMEREF", "TOENDREF"), row.names = c(NA, 12L), class = "data.frame")
datain
 
## output
dataout <- structure(list(REF = c(999L, 999L, 999L, 1099L, 731L, 731L,
731L,
731L, 1442L, 1442L, 1442L, 1442L), TIMEREF = c(120L, 240L, 360L,
30L, 30L, 60L, 90L, 120L, 30L, 60L, 90L, 120L), TOENDREF = c(390L,
270L, 150L, 

Re: [R] Linear regression with a rounded response variable

2015-10-21 Thread Victor Tian
Hi Ravi,

Thanks for this interesting question. My thoughts are given below.

If you believe the rounding error is indeed uniformly distributed, then the
problem is equivalent to adding a uniform random error on (-0.5, 0.5) to
every observation in addition to the usual normal error, so the new error
term is the sum of a normal and a uniform variable and is no longer normal.

Intuitively, the impact of this newly added term depends on the relative
scale of the original normal and the new uniform error terms. To see the
exact impact, you can simulate sets of new response variables by adding
uniform errors from (-0.5, 0.5) to the original response variables and see
the results.
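
A sketch of that simulation on invented data (the variable names, coefficients and error scale are all hypothetical): fit the model once, then refit many times after adding uniform noise on (-0.5, 0.5) to the response, and compare the slopes.

```r
set.seed(4)
n   <- 1000
age <- runif(n, 60, 90)                       # hypothetical predictor
y   <- 2 + 0.05 * age + rnorm(n, sd = 0.7)    # hypothetical response (seconds)

base_fit <- lm(y ~ age)

# Refit 200 times, each time perturbing the response with uniform noise.
slopes <- replicate(200, {
  y_new <- y + runif(n, -0.5, 0.5)
  coef(lm(y_new ~ age))[["age"]]
})

c(original       = coef(base_fit)[["age"]],
  mean_perturbed = mean(slopes),
  sd_perturbed   = sd(slopes))
```

If the spread of the perturbed slopes is small relative to the standard error of the original fit, the added uniform term matters little for this data set.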

I wish I could have more theoretical answers and hope this helps as well.

Best,
Xu

Xu Tian, Ph.D.
Senior Statistician
Validus Research
New York, NY 10005



Re: [R] Linear regression with a rounded response variable

2015-10-21 Thread peter salzman
here is one thought:

if you plug your numbers into any kind of regression you will get
predictions that are real numbers and not necessarily integers. it may be
that your predictions are good enough with this approximate value of Y. you
could test this by randomly perturbing your data by +- 0.5 and comparing the
results with the original result.

let me add another idea:

if data is not fully observed this falls under the umbrella of censored
data; in this case you have interval censoring. if you see 5 then the
observation is in the interval [4.5, 5.5].
i'm not familiar with the field but i'd search for 'regression with
interval censoring'
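
one possible way to try the interval-censoring idea in R is survreg() from the survival package with a Gaussian error distribution. this is only a sketch on simulated data; the variable names and coefficients are invented, and it is not the method anyone in this thread actually used.

```r
library(survival)

set.seed(42)
n         <- 500
age       <- runif(n, 60, 90)                      # hypothetical predictor
true_time <- 2 + 0.05 * age + rnorm(n, sd = 0.7)   # unobserved exact time
obs_time  <- round(true_time)                      # what gets recorded

# Each recorded value k is treated as the interval [k - 0.5, k + 0.5].
fit_ic <- survreg(
  Surv(obs_time - 0.5, obs_time + 0.5, type = "interval2") ~ age,
  dist = "gaussian"
)

# Ordinary least squares on the rounded values, for comparison.
fit_ols <- lm(obs_time ~ age)

rbind(interval_censored = coef(fit_ic), ols = coef(fit_ols))
```

on data like these the two sets of coefficients tend to be close, which is itself a useful check on whether the rounding matters.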


peter





-- 
Peter Salzman, PhD
Department of Biostatistics and Computational Biology
University of Rochester



[R] Update dataframe based on some conditions

2015-10-21 Thread Jorge I Velez
Dear R-help,

I am working on what seems to be a simple problem, but after several
hours trying to come up with a solution, unfortunately I have not been able
to.

I would like to go from "datain" to "dataout", that is, create the NEWREF
variable according to some restrictions, and update the values for the
remaining variables in the original data set (which is much bigger than
this example). The problem can be described as having products (coded as
REF) in stock. Here, the total number of units in stock is named TOENDREF
and the number required by the customer is given by TIMEREF. The idea is to
use as many units of the previous REF as possible before using a new REF.

## input
datain <- structure(list(REF = c("999", "999", "999", "1099", "731", "731",
"731", "731", "1442", "1442", "1442", "1442"), TIMEREF = c(120,
240, 360, 30, 30, 60, 90, 120, 30, 60, 90, 120), TOENDREF = c(390,
270, 150, 480, 480, 450, 420, 390, 480, 450, 420, 390)), .Names = c("REF",
"TIMEREF", "TOENDREF"), row.names = c(NA, 12L), class = "data.frame")
datain

## output
dataout <- structure(list(REF = c(999L, 999L, 999L, 1099L, 731L, 731L,
731L,
731L, 1442L, 1442L, 1442L, 1442L), TIMEREF = c(120L, 240L, 360L,
30L, 30L, 60L, 90L, 120L, 30L, 60L, 90L, 120L), TOENDREF = c(390L,
270L, 150L, 120L, 90L, 30L, 420L, 300L, 270L, 210L, 120L, 0L),
NEWREF = c(999L, 999L, 999L, 999L, 999L, 999L, 731L, 731L,
731L, 731L, 731L, 731L)), .Names = c("REF", "TIMEREF", "TOENDREF",
"NEWREF"), row.names = c(NA, 12L), class = "data.frame")
dataout


In what follows I will try to explain what I want to accomplish:

* Example 1
Take rows 3 and 4 of "datain"

#REF TIMEREF TOENDREF
#3   999 360  150
#4  1099  30  480

As 150 units of REF 999 are available, we could substitute the 30 units of
REF 1099 with them. Hence, the 4th row of the _updated_ "datain" becomes

#REF TIMEREF TOENDREF NEWREF
#3   999 360  150  999
#4  1099  30  120  999

* Example 2
Now, let's take rows 3 to 8 of the _updated_ "datain":

#REF TIMEREF TOENDREF
#3   999 360  150
#4   999  30  120
#5   731  30  480
#6   731  60  450
#7   731  90  420
#8   731 120  390

In row 4, there are 120 units available to be used. The number of units
required of REF 731 is 30, which can be easily covered by the remaining 120
units of REF 999. By doing so, the remaining units of REF 999 would then be
90.  Hence, the newly _updated_ "datain" becomes

#REF TIMEREF TOENDREF
#3   999 360  150
#4   999  30  120
#5   999  30   90
#6   999  60   30
#7   731  90  420
#8   731 120  300

Finally, the updated "datain" file after processing the remaining REF would
be

#REF TIMEREF TOENDREF
#9  731  30  270
#10 731  60  210
#11 731  90  120
#12 731 120    0

Hopefully I have explained well what I would like to end up with.  If this
is not the case, I will be more than happy to provide more information.

Any help would be very much appreciated.  Thanks in advance.

Best regards,
Jorge Velez.-
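
One possible reading of the rule described above, sketched in R. It is
reverse-engineered from the example and rests on assumptions rather than a
confirmed specification: the leading rows of the first REF are kept as
given; every later row takes its TIMEREF from the stock left in the previous
row when that stock is sufficient; otherwise the row switches to its own REF
with its original TOENDREF. The helper name reuse_stock() is hypothetical.

```r
# Input from the post, rebuilt with data.frame() for readability
datain <- data.frame(
  REF = c("999", "999", "999", "1099", "731", "731", "731", "731",
          "1442", "1442", "1442", "1442"),
  TIMEREF = c(120, 240, 360, 30, 30, 60, 90, 120, 30, 60, 90, 120),
  TOENDREF = c(390, 270, 150, 480, 480, 450, 420, 390, 480, 450, 420, 390),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one interpretation of the rule, not a general solution
reuse_stock <- function(d) {
  newref <- d$REF
  toend <- d$TOENDREF
  cur <- d$REF[1]          # REF whose stock is currently being consumed
  first_block <- TRUE      # assumption: leading rows of the first REF stay as given
  for (i in seq_len(nrow(d))[-1]) {
    if (first_block && d$REF[i] == cur) next
    first_block <- FALSE
    if (d$TIMEREF[i] <= toend[i - 1]) {
      # enough units left in the previous row: keep using the current REF
      toend[i] <- toend[i - 1] - d$TIMEREF[i]
      newref[i] <- cur
    } else {
      # not enough units: switch to this row's own REF and original stock
      cur <- d$REF[i]
    }
  }
  d$TOENDREF <- toend
  d$NEWREF <- newref
  d
}

res <- reuse_stock(datain)
res
```

On the example data this reproduces the TOENDREF and NEWREF columns of
"dataout" (REF stays character here, whereas "dataout" stores integers).
Whether the same rule is right for the full data set is for the poster to
confirm.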



Re: [R] Linear regression with a rounded response variable

2015-10-21 Thread Gabor Grothendieck
This could be modeled directly using Bayesian techniques. Consider the
Bayesian version of the following model where we only observe y and X.  y0
is not observed.

   y0 <- X b + error
   y <- round(y0)

The following code is based on modifying the code in the README of the CRAN
rcppbugs R package.


library(rcppbugs)
set.seed(123)

# set up the test data - y and X are observed but not y0
NR <- 1e2L
NC <- 2L
X <- cbind(1, rnorm(10))
y0 <- X %*% 1:2
y <- round(y0)

# for comparison run a normal linear model w/ lm.fit using X and y
lm.res <- lm.fit(X,y)
print(coef(lm.res))
##        x1        x2
## 0.9569366 1.9170808

# RCppBugs Model
b <- mcmc.normal(rnorm(NC),mu=0,tau=0.0001)
tau.y <- mcmc.gamma(sd(as.vector(y)),alpha=0.1,beta=0.1)
y.hat <- deterministic(function(X,b) { round(X %*% b) }, X, b)
y.lik <- mcmc.normal(y,mu=y.hat,tau=tau.y,observed=TRUE)
m <- create.model(b, tau.y, y.hat, y.lik)

# run the Bayesian model based on y and X
cat("running model...\n")
runtime <- system.time(ans <- run.model(m, iterations=1e5L, burn=1e4L,
adapt=1e3L, thin=10L))
print(apply(ans[["b"]],2,mean))
## [1] 0.9882485 2.0009989


On Wed, Oct 21, 2015 at 10:53 AM, Ravi Varadhan 
wrote:

> Hi,
> I am dealing with a regression problem where the response variable, time
> (second) to walk 15 ft, is rounded to the nearest integer.  I do not care
> for the regression coefficients per se, but my main interest is in getting
> the prediction equation for walking speed, given the predictors (age,
> height, sex, etc.), where the predictions will be real numbers, and not
> integers.  The hope is that these predictions should provide unbiased
> estimates of the "unrounded" walking speed. This sounds like a measurement
> error problem, where the measurement error is due to rounding and hence
> would be uniformly distributed (-0.5, 0.5).
>
> Are there any canonical approaches for handling this type of a problem?
> What is wrong with just doing the standard linear regression?
>
> I googled and saw that this question was asked by someone else in a
> stackexchange post, but it was unanswered.  Any suggestions?
>
> Thank you,
> Ravi
>
> Ravi Varadhan, Ph.D. (Biostatistics), Ph.D. (Environmental Engg)
> Associate Professor,  Department of Oncology
> Division of Biostatistics & Bionformatics
> Sidney Kimmel Comprehensive Cancer Center
> Johns Hopkins University
> 550 N. Broadway, Suite -E
> Baltimore, MD 21205
> 410-502-2619
>
>
> [[alternative HTML version deleted]]
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



-- 
Statistics & Software Consulting
GKX Group, GKX Associates Inc.
tel: 1-877-GKX-GROUP
email: ggrothendieck at gmail.com



Re: [R-es] Crear variable con condiciones

2015-10-21 Thread Javier Rubén Marcuzzi
Dear Jorge I Velez,

I am sending you some code. The first part is what you have and the second 
is what you want; I put both in a single data.frame (originales). I then add 
the idea of dropping one element in another column, and combine all of this 
in a new data.frame.

On this new data.frame (mis_datos) I run a function that counts; I think 
that in your case writing the function that performs the task you need 
should not be a problem.

Run the following code and I think you will understand my idea for your 
problem.

## input
datain <- structure(list(REF = c("999", "999", "999", "1099", "731", "731", 
"731", "731", "1442", "1442", "1442", "1442"),
 TIMEREF = c(120,240, 360, 30, 30, 60, 90, 120, 30, 60, 
90, 120),
 TOENDREF = c(390,270, 150, 480, 480, 450, 420, 390, 
480, 450, 420, 390)),
  .Names = c("REF","TIMEREF", "TOENDREF"),row.names = c(NA, 
12L), class = "data.frame")
datain

## output
dataout <- structure(list(REF = c(999L, 999L, 999L, 1099L, 731L, 731L, 731L, 
731L, 1442L, 1442L, 1442L, 1442L),
  TIMEREF = c(120L, 240L, 360L,30L, 30L, 60L, 90L, 
120L, 30L, 60L, 90L, 120L),
  TOENDREF = c(390L, 270L, 150L, 120L, 90L, 30L, 420L, 
300L, 270L, 210L, 120L, 0L),
  NEWREF = c(999L, 999L, 999L, 999L, 999L, 999L, 731L, 
731L, 731L, 731L, 731L, 731L)),
 .Names = c("REF", "TIMEREF", "TOENDREF", "NEWREF"), 
row.names = c(NA, 12L), class = "data.frame")
dataout

originales <- data.frame(datain, dataout)
originales

aux0 <- originales$TOENDREF
#
# first element is 0
# then all elements except the last
# the last one, at position length(aux0), is dropped
aux <- c(0,(aux0[-(length(aux0))]))

mis_datos <- data.frame(originales, aux)
mis_datos
# the end of the next call was cut off in the archive; the closing
# parentheses are an assumption
cuento<-do.call(rbind, by(mis_datos, mis_datos$REF, function(x) 
cbind(1:nrow(x))))

Basically, modifying cuento <- ... could solve it (I think).

Javier Rubén Marcuzzi
Técnico en Industrias Lácteas
Veterinario



From: Javier Rubén Marcuzzi
Sent: Wednesday, 21 October 2015 15:25
To: Jorge I Velez
CC: R-help-es
Subject: RE: [R-es] Crear variable con condiciones


Dear Jorge I Velez,

I have just got in and am reading my mail while lying down to rest for a 
while, so I may not have reasoned this through properly. Going by how I 
solved a similar problem, what I did was create a new column and add it to 
the same data.frame. This new column is a copy of the original values, but 
with the first one removed and a value appended at the end (so that the n 
are equal; I think you would have to do the opposite). I then used a 
function because some processing was involved, but basically an if can 
decide whether to use one column or the other (TOENDREF or 
TOENDREF_modificada, with everything shifted down one row).

I would have to test it, but I am sure you will do it faster and more 
reliably (I should not write R when tired).

Javier Rubén Marcuzzi
Técnico en Industrias Lácteas
Veterinario



From: Jorge I Velez
Sent: Wednesday, 21 October 2015 10:26
To: Javier Rubén Marcuzzi
CC: R-help-es
Subject: Re: [R-es] Crear variable con condiciones


Thank you very much for your reply, Javier.

Yes. To obtain "dataout", earlier rows are used according to the 
availability given by the variable TOENDREF for each value of the variable 
REF.  For example, rows 3 and 4 of "datain" are

#REF TIMEREF TOENDREF
#3  999     360      150
#4 1099      30      480

In row 3, the value of TOENDREF is 150. This means that 150 units of that 
reference are available. Now, in row 4, TIMEREF is 30 for REF = 1099.  Since 
TIMEREF in this row is smaller than TOENDREF for the previous reference, the 
new variable NEWREF must be 999 and not 1099.  The new value of TOENDREF in 
this row will be 150 - 30 = 120.  This would be row 4 of "dataout":

REF TIMEREF TOENDREF NEWREF
#4  1099      30      120    999

For row 5 of "dataout", the available resources correspond to the _new_ 
value of TOENDREF under NEWREF (i.e., 120).  Following the same logic as 
before, we obtain rows 5 to 12 of "dataout":

REF TIMEREF TOENDREF NEWREF
#5   731      30       90    999
#6   731      60       30    999
#7   731      90      420    731
#8   731     120      300    731
#9  1442      30      270    731
#10 1442      60      210    731
#11 1442      90      120    731
#12 1442     120        0    731

Note that in the last row all the TOENDREF resources for NEWREF = 731 were 
used up, so it was not necessary to use REF = 1442.

I hope things are a little clearer this time.

The data can be grouped by the variable REF, which basically refers to the 
reference of a product.  If I still have that product available (variable 
TOENDREF), I use it and cancel the next reference.  The units requested of 
each product correspond to the 

Re: [R-es] Crear variable con condiciones

2015-10-21 Thread Javier Rubén Marcuzzi

Jorge,

I think an email got lost in which I said a few things and sent the 
following (basically, modify what I called “cuento”, using a function 
specific to your case…).

## input
datain <- structure(list(REF = c("999", "999", "999", "1099", "731", "731", 
"731", "731", "1442", "1442", "1442", "1442"),
 TIMEREF = c(120,240, 360, 30, 30, 60, 90, 120, 30, 60, 
90, 120),
 TOENDREF = c(390,270, 150, 480, 480, 450, 420, 390, 
480, 450, 420, 390)),
  .Names = c("REF","TIMEREF", "TOENDREF"),row.names = c(NA, 
12L), class = "data.frame")
datain

## output
dataout <- structure(list(REF = c(999L, 999L, 999L, 1099L, 731L, 731L, 731L, 
731L, 1442L, 1442L, 1442L, 1442L),
  TIMEREF = c(120L, 240L, 360L,30L, 30L, 60L, 90L, 
120L, 30L, 60L, 90L, 120L),
  TOENDREF = c(390L, 270L, 150L, 120L, 90L, 30L, 420L, 
300L, 270L, 210L, 120L, 0L),
  NEWREF = c(999L, 999L, 999L, 999L, 999L, 999L, 731L, 
731L, 731L, 731L, 731L, 731L)),
 .Names = c("REF", "TIMEREF", "TOENDREF", "NEWREF"), 
row.names = c(NA, 12L), class = "data.frame")
dataout

originales <- data.frame(datain, dataout)
originales

aux0 <- originales$TOENDREF
#
# first element is 0
# then all elements except the last
# the last one, at position length(aux0), is dropped
aux <- c(0,(aux0[-(length(aux0))]))

mis_datos <- data.frame(originales, aux)
mis_datos
# the end of the next call was cut off in the archive; the closing
# parentheses are an assumption
cuento<-do.call(rbind, by(mis_datos, mis_datos$REF, function(x) 
cbind(1:nrow(x))))

 <- data.frame(mis_datos, cuento)


I would write more, but I am on my way out, and by the time I am back you 
will probably have solved the problem (you are very clever).



Javier Rubén Marcuzzi
Técnico en Industrias Lácteas
Veterinario



From: Javier Rubén Marcuzzi
Sent: Wednesday, 21 October 2015 15:25
To: Jorge I Velez
CC: R-help-es
Subject: RE: [R-es] Crear variable con condiciones


Dear Jorge I Velez,

I have just got in and am reading my mail while lying down to rest for a 
while, so I may not have reasoned this through properly. Going by how I 
solved a similar problem, what I did was create a new column and add it to 
the same data.frame. This new column is a copy of the original values, but 
with the first one removed and a value appended at the end (so that the n 
are equal; I think you would have to do the opposite). I then used a 
function because some processing was involved, but basically an if can 
decide whether to use one column or the other (TOENDREF or 
TOENDREF_modificada, with everything shifted down one row).

I would have to test it, but I am sure you will do it faster and more 
reliably (I should not write R when tired).

Javier Rubén Marcuzzi
Técnico en Industrias Lácteas
Veterinario



From: Jorge I Velez
Sent: Wednesday, 21 October 2015 10:26
To: Javier Rubén Marcuzzi
CC: R-help-es
Subject: Re: [R-es] Crear variable con condiciones


Thank you very much for your reply, Javier.

Yes. To obtain "dataout", earlier rows are used according to the 
availability given by the variable TOENDREF for each value of the variable 
REF.  For example, rows 3 and 4 of "datain" are

#REF TIMEREF TOENDREF
#3  999     360      150
#4 1099      30      480

In row 3, the value of TOENDREF is 150. This means that 150 units of that 
reference are available. Now, in row 4, TIMEREF is 30 for REF = 1099.  Since 
TIMEREF in this row is smaller than TOENDREF for the previous reference, the 
new variable NEWREF must be 999 and not 1099.  The new value of TOENDREF in 
this row will be 150 - 30 = 120.  This would be row 4 of "dataout":

REF TIMEREF TOENDREF NEWREF
#4  1099      30      120    999

For row 5 of "dataout", the available resources correspond to the _new_ 
value of TOENDREF under NEWREF (i.e., 120).  Following the same logic as 
before, we obtain rows 5 to 12 of "dataout":

REF TIMEREF TOENDREF NEWREF
#5   731      30       90    999
#6   731      60       30    999
#7   731      90      420    731
#8   731     120      300    731
#9  1442      30      270    731
#10 1442      60      210    731
#11 1442      90      120    731
#12 1442     120        0    731

Note that in the last row all the TOENDREF resources for NEWREF = 731 were 
used up, so it was not necessary to use REF = 1442.

I hope things are a little clearer this time.

The data can be grouped by the variable REF, which basically refers to the 
reference of a product.  If I still have that product available (variable 
TOENDREF), I use it and cancel the next reference.  The units requested of 
each product correspond to the variable TIMEREF.

Thanks in advance to everyone for your suggestions.

Regards,
Jorge Velez.-


2015-10-20 22:30 GMT-05:00 Javier Rubén Marcuzzi:
Dear Jorge I Velez,

Some years ago I had a problem 

[R-es] help Mapa de Calor con Google Maps de fondo

2015-10-21 Thread Javier Villacampa González
As Alex says, the post on my blog may be useful to you; it works with
images, but the logic is the same (
http://ncymat.blogspot.com.es/2015/10/plotear-datos-de-eye-tracker.html).
The last part may be of particular interest:
scale_fill_manual(values = Colors , breaks = Breaks, guide = F) + # We lost the guide.
If you want something more zone-based, I am currently looking into this:
stat_binhex(bins = 10, colour = "gray", alpha = 0.5)
coord_fixed()
http://thedatagame.com.au/2015/09/27/how-to-create-nba-shot-charts-in-r/

I am very new to this, so it may not be what you need.

Kind regards and good luck.



--

[[alternative HTML version deleted]]

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R] Linear regression with a rounded response variable

2015-10-21 Thread peter dalgaard

> On 21 Oct 2015, at 19:57 , Charles C. Berry  wrote:
> 
> On Wed, 21 Oct 2015, Ravi Varadhan wrote:
> 
>> [snippage]
> 
> If half the subjects have a value of 5 seconds and the rest are split between 
> 4 and 6, your assertion that rounding induces an error of 
> dunif(epsilon,-0.5,0.5) is surely wrong (more positive errors in the 6 second 
> group and more negative errors in the 4 second group under any plausible 
> model).

Yes, and I think that the suggestion in another post to look at censored 
regression is more in the right direction. 

In general, I'd expect the bias caused by rounding the response to be quite small, 
except at very high granularity. I did a few small experiments with the 
simplest possible linear model: estimating a mean based on highly rounded data,

> y <- round(rnorm(1e2,pi,.5))
> mean(y)
[1] 3.12
> table(y)
y
 2  3  4  5 
13 63 23  1 

Or, using a bigger sample:

> mean(round(rnorm(1e8,pi,.5)))
[1] 3.139843

in which there is a visible bias, but quite a small one: 

> pi - 3.139843
[1] 0.001749654

At lower granularity (sd=1 instead of .5), the bias has almost disappeared.

> mean(round(rnorm(1e8,pi,1)))
[1] 3.141577

If the granularity is increased sufficiently, you _will_ see a sizeable bias 
(because almost all observations will be round(pi)==3):

> mean(round(rnorm(1e8,pi,.1)))
[1] 3.00017


A full ML fit (with known sigma=0.5, as used to simulate y) is pretty easily done:

> library(stats4)
> mll <- function(mu)-sum(log(pnorm(y+.5,mu, .5)-pnorm(y-.5, mu, .5)))
> mle(mll,start=list(mu=3))

Call:
mle(minuslogl = mll, start = list(mu = 3))

Coefficients:
  mu 
3.122069 
> mean(y)
[1] 3.12

As you see, the difference is only 0.002. 

A small simulation (1000 repl.) gave (r[1,]==MLE ; r[2,]==mean)

> summary(r[1,]-r[2,])
     Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
-0.004155  0.000702  0.001495  0.001671  0.002554  0.006860 

so the corrections relative to the crude mean stay within one unit in the 2nd 
place. Notice  that the corrections are pretty darn close to cancelling out the 
bias.
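
The code for that simulation is not shown in the post. A minimal sketch of
what it might look like follows, under stated assumptions: n = 100
observations per replicate, true mean pi, known sigma = 0.5, and fewer
replicates than in the post to keep it quick. optimize() is swapped in for
mle() here because the problem is one-dimensional; one_rep() is a
hypothetical helper name.

```r
set.seed(1)

# One replicate: simulate rounded data, then return the interval-likelihood
# estimate of the mean next to the plain sample mean
one_rep <- function(n = 100, mu = pi, sigma = 0.5) {
  y <- round(rnorm(n, mu, sigma))
  # negative log-likelihood: each y stands for the interval (y - 0.5, y + 0.5)
  nll <- function(m) {
    -sum(log(pnorm(y + 0.5, m, sigma) - pnorm(y - 0.5, m, sigma)))
  }
  est <- optimize(nll, interval = mean(y) + c(-1, 1), tol = 1e-8)$minimum
  c(mle = est, mean = mean(y))
}

# Row 1 holds the ML estimates, row 2 the crude means, as in the post
r <- replicate(200, one_rep())
summary(r[1, ] - r[2, ])
```

The summary should be of the same order of magnitude as the one quoted
above, although the exact numbers depend on the seed and the number of
replicates.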

-pd

> 
> 
> HTH,
> 
> Chuck
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



Re: [R-es] Crear variable con condiciones

2015-10-21 Thread Jorge I Velez
Thank you very much for your reply, Javier.

Yes. To obtain "dataout", earlier rows are used according to the
availability given by the variable TOENDREF for each value of the variable
REF.  For example, rows 3 and 4 of "datain" are

#REF TIMEREF TOENDREF
#3  999     360      150
#4 1099      30      480

In row 3, the value of TOENDREF is 150. This means that 150 units of that
reference are available. Now, in row 4, TIMEREF is 30 for REF = 1099.  Since
TIMEREF in this row is smaller than TOENDREF for the previous reference, the
new variable NEWREF must be 999 and not 1099.  The new value of TOENDREF in
this row will be 150 - 30 = 120.  This would be row 4 of "dataout":

REF TIMEREF TOENDREF NEWREF
#4  1099      30      120    999

For row 5 of "dataout", the available resources correspond to the _new_
value of TOENDREF under NEWREF (i.e., 120).  Following the same logic as
before, we obtain rows 5 to 12 of "dataout":

REF TIMEREF TOENDREF NEWREF
#5   731      30       90    999
#6   731      60       30    999
#7   731      90      420    731
#8   731     120      300    731
#9  1442      30      270    731
#10 1442      60      210    731
#11 1442      90      120    731
#12 1442     120        0    731

Note that in the last row all the TOENDREF resources for NEWREF = 731 were
used up, so it was not necessary to use REF = 1442.

I hope things are a little clearer this time.

The data can be grouped by the variable REF, which basically refers to the
reference of a product.  If I still have that product available (variable
TOENDREF), I use it and cancel the next reference.  The units requested of
each product correspond to the variable TIMEREF.

Thanks in advance to everyone for your suggestions.

Regards,
Jorge Velez.-


2015-10-20 22:30 GMT-05:00 Javier Rubén Marcuzzi <
javier.ruben.marcu...@gmail.com>:

> Dear Jorge I Velez,
>
> Some years ago I had a similar problem, and I think there were two possible
> solutions, one of which was contributed by you.
>
> I cannot quite follow. When I run the example, some numbers do not match
> what you describe. I suggest the following: build the example again with
> one change, a data.frame containing what are now datain and dataout, which
> I understand to be what you have and what you want, and when describing the
> ref 1099 example, give the coordinates in terms of the rows (there are 12)
> and the columns.
>
> I do not fully understand: are you using a row together with values from an
> earlier row? Or did the data get misaligned and I got lost?  In the similar
> case I had to solve, depending on certain values I took those of earlier
> rows to do the calculation. I also need to know whether there is an
> identifier, like a database id, or whether these are numbers that cannot be
> grouped. I ask because in my case, which may be similar, I was able to use
> grouping and auxiliary columns (to process the conditions).
>
>
>
> Javier Rubén Marcuzzi
> Técnico en Industrias Lácteas
> Veterinario
>
>
>
>
>
>
> *From: *Jorge I Velez
> *Sent: *Tuesday, 20 October 2015 19:17
> *To: *R-help-es
> *Subject: *[R-es] Crear variable con condiciones
>
> Good afternoon everyone,
>
> I would like to create a variable according to certain conditions.  I would
> like to get from "datain" to "dataout".
>
> ## input
> datain <- structure(list(REF = c("999", "999", "999", "1099", "731", "731",
> "731", "731", "1442", "1442", "1442", "1442"), TIMEREF = c(120,
> 240, 360, 30, 30, 60, 90, 120, 30, 60, 90, 120), TOENDREF = c(390,
> 270, 150, 480, 480, 450, 420, 390, 480, 450, 420, 390)), .Names = c("REF",
> "TIMEREF", "TOENDREF"), row.names = c(NA, 12L), class = "data.frame")
> datain
>
> ## output
> dataout <- structure(list(REF = c(999L, 999L, 999L, 1099L, 731L, 731L,
> 731L,
> 731L, 1442L, 1442L, 1442L, 1442L), TIMEREF = c(120L, 240L, 360L,
> 30L, 30L, 60L, 90L, 120L, 30L, 60L, 90L, 120L), TOENDREF = c(390L,
> 270L, 150L, 120L, 90L, 30L, 420L, 300L, 270L, 210L, 120L, 0L),
> NEWREF = c(999L, 999L, 999L, 999L, 999L, 999L, 731L, 731L,
> 731L, 731L, 731L, 731L)), .Names = c("REF", "TIMEREF", "TOENDREF",
> "NEWREF"), row.names = c(NA, 12L), class = "data.frame")
> dataout
>
> Below I describe two specific cases to illustrate the conditions that must
> be satisfied:
>
> * Example 1
> For REF = '1099', TIMEREF is smaller than the TOENDREF value for REF = '999'
> (i.e., 30 < 150). Therefore, the new variable "NEWREF" must take the value
> '999'.  With this assignment, row 4 of "datain" becomes
>
> #REF TIMEREF TOENDREF NEWREF
> #4 1099      30      120    999
>
> Here, 120 is the value of TOENDREF for REF = '999'.
>
> * Example 2
> The process continues reassigning the REF until the resulting value of
> TOENDREF is smaller than (or 

Re: [R] ggplot2: discontinuous ribbon

2015-10-21 Thread sbihorel

Thanks!

On 10/21/2015 5:25 AM, Thierry Onkelinx wrote:

Dear Sebastien,

You are looking for geom_polygon().

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature 
and Forest

team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no 
more than asking him to perform a post-mortem examination: he may be 
able to say what the experiment died of. ~ Sir Ronald Aylmer Fisher

The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does 
not ensure that a reasonable answer can be extracted from a given body 
of data. ~ John Tukey


2015-10-21 11:20 GMT+02:00 sbihorel:


Hi

I would like to use ggplot2 to create a 2d plot showing a series
of shaded areas that are not continuous with respect to the x-axis
variable. The expected result is illustrated below using
lattice/grid functions.

-
pdata <- data.frame(
  x=c(1,2,2,1,NA,3,4,4,3,NA,5,6,6,5),
  y=c(3,3,2,2,NA,2,2,1,1,NA,2.5,3,2,2))

lattice::xyplot((1:6)~(1:6),panel=function(pdata=pdata){
  grid::grid.polygon(pdata$x,pdata$y,
   default.units='native',
   gp=grid::gpar(fill=1,col=NULL,lty=0))
},pdata=pdata)
-

Here is my attempt to reproduce this plot in ggplot.

-
library(ggplot2)
data <- data.frame(
  x=c(1,2,NA,3,4,NA,5,6),
  ymin=c(2,2,NA,1,1,NA,2,2),
  ymax=c(3,3,NA,2,2,NA,2.5,3)
)

ggplot(data,aes(x=x))+geom_ribbon(aes(ymin=ymin,ymax=ymax))
-

Obviously, either geom_ribbon expects continuity in the data or I
need to setup my data and/or call differently...

Thanks for your help

Sebastien

__
R-help@r-project.org  mailing list --
To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.






[R] quantreg package: residuals

2015-10-21 Thread T.Riedle

Greetings R Community,

I am running quantile regressions using quantreg in R. I also plot the 
residuals in a QQ plot, which indicates fat tails. I would like to try using 
a Student t distribution, but I do not know whether R allows that for the 
task at hand.

In my opinion it is very likely that there is a structural break which is 
not taken into account by the rq() function, leading to QQ plots that 
display nonlinearity. Hence, the model is slightly misspecified.
I was also wondering whether I can cope with the nonlinearity by using a 
sandwich estimate in the summary.rq() function, such as "ker".

How can I modify the model to improve the model specification and the 
standard errors? Can I modify the regression model, or do I have to change 
the method used to compute the standard errors in summary.rq()?

Thanks for your feedback.




Re: [R] ggplot2: discontinuous ribbon

2015-10-21 Thread Thierry Onkelinx
Dear Sebastien,

You are looking for geom_polygon().
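
A minimal sketch of how that suggestion might be applied to the data from
the question, assuming the NA rows mark the breaks between shaded areas. The
column name piece and the object poly are illustrative, not from the
original post; the plotting step only runs if ggplot2 is installed.

```r
# Ribbon-style input from the question: NA rows separate the pieces
data <- data.frame(
  x    = c(1, 2, NA, 3, 4, NA, 5, 6),
  ymin = c(2, 2, NA, 1, 1, NA, 2, 2),
  ymax = c(3, 3, NA, 2, 2, NA, 2.5, 3)
)

# Number the pieces: the counter goes up at every NA row
data$piece <- cumsum(is.na(data$x)) + 1
data <- data[!is.na(data$x), ]

# For each piece, walk along ymax from left to right, then back along ymin
poly <- do.call(rbind, lapply(split(data, data$piece), function(d) {
  data.frame(piece = d$piece[1],
             x = c(d$x, rev(d$x)),
             y = c(d$ymax, rev(d$ymin)))
}))

# One polygon per piece; 'group' keeps the pieces from being joined
if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(poly, aes(x = x, y = y, group = piece)) + geom_polygon()
  print(p)
}
```

The vertices built this way are the same as those in pdata from the
lattice/grid example, with the grouping column taking the place of the NA
separators.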

Best regards,

ir. Thierry Onkelinx
Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
Forest
team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
Kliniekstraat 25
1070 Anderlecht
Belgium

To call in the statistician after the experiment is done may be no more
than asking him to perform a post-mortem examination: he may be able to say
what the experiment died of. ~ Sir Ronald Aylmer Fisher
The plural of anecdote is not data. ~ Roger Brinner
The combination of some data and an aching desire for an answer does not
ensure that a reasonable answer can be extracted from a given body of data.
~ John Tukey

2015-10-21 11:20 GMT+02:00 sbihorel :

> Hi
>
> I would like to use ggplot2 to create a 2d plot showing a series of shaded
> areas that are not continuous with respect to the x-axis variable. The
> expected result is illustrated below using lattice/grid functions.
>
> -
> pdata <- data.frame(
>   x=c(1,2,2,1,NA,3,4,4,3,NA,5,6,6,5),
>   y=c(3,3,2,2,NA,2,2,1,1,NA,2.5,3,2,2))
>
> lattice::xyplot((1:6)~(1:6),panel=function(pdata=pdata){
>   grid::grid.polygon(pdata$x,pdata$y,
>default.units='native',
>gp=grid::gpar(fill=1,col=NULL,lty=0))
> },pdata=pdata)
> -
>
> Here is my attempt to reproduce this plot in ggplot.
>
> -
> library(ggplot2)
> data <- data.frame(
>   x=c(1,2,NA,3,4,NA,5,6),
>   ymin=c(2,2,NA,1,1,NA,2,2),
>   ymax=c(3,3,NA,2,2,NA,2.5,3)
> )
>
> ggplot(data,aes(x=x))+geom_ribbon(aes(ymin=ymin,ymax=ymax))
> -
>
> Obviously, either geom_ribbon expects continuity in the data or I need to
> setup my data and/or call differently...
>
> Thanks for your help
>
> Sebastien
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>



[R] ggplot2: discontinuous ribbon

2015-10-21 Thread sbihorel

Hi

I would like to use ggplot2 to create a 2d plot showing a series of 
shaded areas that are not continuous with respect to the x-axis 
variable. The expected result is illustrated below using lattice/grid 
functions.


-
pdata <- data.frame(
  x=c(1,2,2,1,NA,3,4,4,3,NA,5,6,6,5),
  y=c(3,3,2,2,NA,2,2,1,1,NA,2.5,3,2,2))

lattice::xyplot((1:6)~(1:6),panel=function(pdata=pdata){
  grid::grid.polygon(pdata$x,pdata$y,
   default.units='native',
   gp=grid::gpar(fill=1,col=NULL,lty=0))
},pdata=pdata)
-

Here is my attempt to reproduce this plot in ggplot.

-
library(ggplot2)
data <- data.frame(
  x=c(1,2,NA,3,4,NA,5,6),
  ymin=c(2,2,NA,1,1,NA,2,2),
  ymax=c(3,3,NA,2,2,NA,2.5,3)
)

ggplot(data,aes(x=x))+geom_ribbon(aes(ymin=ymin,ymax=ymax))
-

Obviously, either geom_ribbon expects continuity in the data or I need 
to setup my data and/or call differently...


Thanks for your help

Sebastien



Re: [R] Transfer a 3-dimensional array to a matrix in R

2015-10-21 Thread Chunyu Dong
Thank you very much! Both methods work well for my data.


Best wishes,


Chunyu
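
For reference, the two methods quoted below can be run side by side to check
that they agree on the example array. This is only a small sketch; the names
m1 and m2 are illustrative.

```r
x <- array(1:18, dim = c(3, 2, 3))

# Method 1: transpose every 3 x 2 slice; apply() stacks each result as a column
m1 <- apply(x, 3, t)

# Method 2: swap the first two dimensions, then reshape to 6 x 3
m2 <- array(aperm(x, c(2, 1, 3)), c(6, 3))

# Same shape and same entries
dim(m1)
all(m1 == m2)
```

For an array of another size, the target shape in the second method would be
c(prod(dim(x)[1:2]), dim(x)[3]) rather than the hard-coded c(6, 3).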








At 2015-10-20 19:48:18, "William Dunlap"  wrote:
>Or use aperm() (array index permutation):
>  > array(aperm(x, c(2,1,3)), c(6,3))
>       [,1] [,2] [,3]
>  [1,]    1    7   13
>  [2,]    4   10   16
>  [3,]    2    8   14
>  [4,]    5   11   17
>  [5,]    3    9   15
>  [6,]    6   12   18
>
>Bill Dunlap
>TIBCO Software
>wdunlap tibco.com
>
>
>On Tue, Oct 20, 2015 at 11:31 AM, John Laing  wrote:
>>> x <- array(1:18, dim=c(3, 2, 3))
>>> x
>> , , 1
>>
>>      [,1] [,2]
>> [1,]    1    4
>> [2,]    2    5
>> [3,]    3    6
>>
>> , , 2
>>
>>      [,1] [,2]
>> [1,]    7   10
>> [2,]    8   11
>> [3,]    9   12
>>
>> , , 3
>>
>>      [,1] [,2]
>> [1,]   13   16
>> [2,]   14   17
>> [3,]   15   18
>>
>>> apply(x, 3, t)
>>      [,1] [,2] [,3]
>> [1,]    1    7   13
>> [2,]    4   10   16
>> [3,]    2    8   14
>> [4,]    5   11   17
>> [5,]    3    9   15
>> [6,]    6   12   18
>>
>>
>> On Tue, Oct 20, 2015 at 12:39 PM, Chunyu Dong 
>> wrote:
>>
>>> Hello!
>>>
>>>
>>> Recently I have been trying to transfer a large 3-dimensional array to a
>>> matrix. For example, an array like:
>>> , , 1
>>>      [,1] [,2]
>>> [1,]    1    4
>>> [2,]    2    5
>>> [3,]    3    6
>>> , , 2
>>>      [,1] [,2]
>>> [1,]    7   10
>>> [2,]    8   11
>>> [3,]    9   12
>>> , , 3
>>>      [,1] [,2]
>>> [1,]   13   16
>>> [2,]   14   17
>>> [3,]   15   18
>>>
>>>
>>> I would like to transfer it to a matrix like:
>>>  1    7   13
>>>  4   10   16
>>>  2    8   14
>>>  5   11   17
>>>  3    9   15
>>>  6   12   18
>>>
>>>
>>> Could you tell me how to do it in R ? Thank you very much!
>>>
>>>
>>> Best regards,
>>> Chunyu
>>>
>>>
>>>
>>>
>>>
>>> [[alternative HTML version deleted]]
>>>
>>> __
>>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide
>>> http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>>
>>
>> [[alternative HTML version deleted]]
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.



Re: [R] r-markdown - keeping figures

2015-10-21 Thread John Kane
It may not be elegant but you can just embed a png() command in the knitr code. 
 Code from RStudio example with png() command added.


```{r, echo=FALSE}
plot(cars)
png("~/Rjunk/pnd.png")
plot(cars)
dev.off()
```
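
Since the original question asked for separate PDF files, the same idea should also work with pdf() in place of png(). A minimal sketch, assuming a temporary directory as the output location (out_file is a hypothetical name, not something from the thread):

```r
# Assumption: write into the session's temporary directory; replace this
# with any existing folder where the figures should be kept.
out_file <- file.path(tempdir(), "cars.pdf")

# Open a PDF device, draw the plot into it, then close the device so the
# file is flushed to disk.
pdf(out_file, width = 7, height = 5)
plot(cars)
dev.off()

print(file.exists(out_file))
```

Inside a knitr chunk the plot still has to be drawn a second time outside the pdf()/dev.off() pair if it should also appear in the rendered document, as in the chunk above.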

John Kane
Kingston ON Canada


> -Original Message-
> From: wewol...@gmail.com
> Sent: Tue, 20 Oct 2015 18:18:04 +0200
> To: r-help@r-project.org
> Subject: [R] r-markdown - keeping figures
> 
> I am running r-markdown from r-studio and can't work out how to keep
> the figures.
> I mean I have a few figures in the document and would like to have
> them as separate pdf's too as I have been used to have them when using
> Sweave.
> 
> 
> 
> best regards
> Witold
> 
> 
> --
> Witold Eryk Wolski
> 
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.




[R] apply function across dataframe columns for non-exclusive groups

2015-10-21 Thread Alexander Shenkin

Hello all,

I've been banging my head over what must be a simple solution.  I would 
like to apply a function across columns of a dataframe for rows grouped 
across different columns.  These groups are not exclusive.  See below 
for an example.  Happy to use dplyr, data.table, or whatever.  Any 
guidance appreciated!


Thanks,
Allie


desired algorithm: calculate a/(a+b) for each TRUE and FALSE grouping of 
columns grp1 and grp2.


this_df = data.frame(a = c(1,2,3,4,5), b = c(7,8,9,10,11), grp1 = 
c(T,T,F,F,F), grp2 = c(F,T,F,T,F))


desired output (doesn't have to be exactly this format, but something 
along these lines):


grp1 T 0.166
grp1 F 0.286
grp2 T 0.25
grp2 F 0.25



Re: [R] r-markdown - keeping figures

2015-10-21 Thread Bob O'Hara
The figures should be saved somewhere. e.g. if you have x.Rmd, you
should have a X_files/ folder with subfolders for the figures (e.g.
X-html or X-latex). At least that's what I have.

Bob

On 20 October 2015 at 18:18, Witold E Wolski  wrote:
> I am running r-markdown from r-studio and can't work out how to keep
> the figures.
> I mean I have a few figures in the document and would like to have
> them as separate pdf's too as I have been used to have them when using
> Sweave.
>
>
>
> best regards
> Witold
>
>
> --
> Witold Eryk Wolski
>
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.



-- 
Bob O'Hara

Biodiversity and Climate Research Centre
Senckenberganlage 25
D-60325 Frankfurt am Main,
Germany

Tel: +49 69 798 40226
Mobile: +49 1515 888 5440
WWW:   http://www.bik-f.de/root/index.php?page_id=219
Blog: http://occamstypewriter.org/boboh/
Journal of Negative Results - EEB: www.jnr-eeb.org



Re: [R] r-markdown - keeping figures

2015-10-21 Thread Jeff Newmiller
I think the default now is to not save them unless you set the fig.path chunk 
option.

http://kbroman.org/knitr_knutshell/pages/Rmarkdown.html
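
As a rough illustration of the fig.path suggestion, the option can be set once for the whole document in a setup chunk near the top of the .Rmd file. This is a sketch rather than a tested recipe: the folder name "figures/" is an assumption, and listing two devices in dev is meant to write a PDF copy of each figure next to the PNG that the output uses.

```r
# Intended for a setup chunk at the top of the .Rmd file.
# Assumption: "figures/" is an arbitrary folder name chosen for this sketch.
knitr::opts_chunk$set(
  fig.path = "figures/",    # prefix (here a folder) for generated figure files
  dev = c("png", "pdf")     # first device is used in the output, second is an extra copy
)
```

The same options can also be given on individual chunks if only some figures need to be kept.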
---
Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

On October 21, 2015 1:47:33 PM GMT+02:00, Bob O'Hara  wrote:
>The figures should be saved somewhere. e.g. if you have x.Rmd, you
>should have a X_files/ folder with subfolders for the figures (e.g.
>X-html or X-latex). At least that's what I have.
>
>Bob
>
>On 20 October 2015 at 18:18, Witold E Wolski 
>wrote:
>> I am running r-markdown from r-studio and can't work out how to keep
>> the figures.
>> I mean I have a few figures in the document and would like to have
>> them as separate pdf's too as I have been used to have them when
>using
>> Sweave.
>>
>>
>>
>> best regards
>> Witold
>>
>>
>> --
>> Witold Eryk Wolski
>>
>> __
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.



Re: [R] apply function across dataframe columns for non-exclusive groups

2015-10-21 Thread Jeff Newmiller
The calculation appears to be sum(a)/(sum(a)+sum(b)).

library(dplyr)
library(tidyr)
result <- (   this_df
  %>% gather( group, truth, -c(a,b) )
  %>% group_by( group, truth )
  %>% summarise( calc = sum(a)/(sum(a)+sum(b)) )
  %>% as.data.frame
  )
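
For anyone who prefers to avoid extra packages, roughly the same table can be built in base R. This is a sketch under the same reading of the calculation, sum(a)/(sum(a)+sum(b)); the helper ratio() and the names group_cols, pieces and result are illustrative only.

```r
this_df <- data.frame(
  a = c(1, 2, 3, 4, 5),
  b = c(7, 8, 9, 10, 11),
  grp1 = c(TRUE, TRUE, FALSE, FALSE, FALSE),
  grp2 = c(FALSE, TRUE, FALSE, TRUE, FALSE)
)

# Ratio of summed a to summed (a + b) within a subset of rows.
ratio <- function(d) sum(d$a) / (sum(d$a) + sum(d$b))

group_cols <- c("grp1", "grp2")

# For each grouping column, split the rows by its TRUE/FALSE value and
# compute the ratio within each part. The groups overlap across columns,
# so every column is handled separately.
pieces <- lapply(group_cols, function(g) {
  calc <- vapply(split(this_df, this_df[[g]]), ratio, numeric(1))
  data.frame(group = g, truth = names(calc), calc = unname(calc))
})

result <- do.call(rbind, pieces)
print(result)
```

The rows come out in FALSE, TRUE order within each group because split() orders the logical values that way.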

---
Jeff Newmiller
Research Engineer (Solar/Batteries/Software/Embedded Controllers)
---
Sent from my phone. Please excuse my brevity.

On October 21, 2015 1:30:46 PM GMT+02:00, Alexander Shenkin  
wrote:
>Hello all,
>
>I've been banging my head over what must be a simple solution.  I would
>
>like to apply a function across columns of a dataframe for rows grouped
>
>across different columns.  These groups are not exclusive.  See below 
>for an example.  Happy to use dplyr, data.table, or whatever.  Any 
>guidance appreciated!
>
>Thanks,
>Allie
>
>
>desired algorithm: calculate a/(a+b) for each TRUE and FALSE grouping
>of 
>columns grp1 and grp2.
>
>this_df = data.frame(a = c(1,2,3,4,5), b = c(7,8,9,10,11), grp1 = 
>c(T,T,F,F,F), grp2 = c(F,T,F,T,F))
>
>desired output (doesn't have to be exactly this format, but something 
>along these lines):
>
>grp1 T 0.166
>grp1 F 0.286
>grp2 T 0.25
>grp2 F 0.25
>
>__
>R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>https://stat.ethz.ch/mailman/listinfo/r-help
>PLEASE do read the posting guide
>http://www.R-project.org/posting-guide.html
>and provide commented, minimal, self-contained, reproducible code.



Re: [R-es] R-help-es Digest, Vol 80, Issue 28

2015-10-21 Thread Gilsanz, Jose Luis
> >> [...] but however hard I look, I cannot see how to put the superf object
> >> "inside" a map.
> >>
> >>
> >>
> >> Many thanks
> >>
> >>
> >>
> >>
> >>
> >> TASACIONES HIPOTECARIAS S.A.
> >> Registration number: A-28/806222.
> >> Registered Office: Pº de la Castellana, 79 - 1ª ; 28046 Madrid
> >>
> >> This e-mail is for the use of the intended recipient(s) only. If you
> >> have received this e-mail in error, please notify the sender
> >> immediately and then delete it. If you are not the intended
> >> recipient, you must not use, disclose or distribute this e-mail without the
> author's prior permission.
> >> We have taken precautions to minimise the risk of transmitting
> >> software viruses, but we advise you to carry out your own virus
> >> checks on any attachment to this message. We cannot accept liability
> >> for any loss or damage caused by software viruses. If you are the
> >> intended recipient and you do not wish to receive similar electronic
> >> messages from us in future then please respond to the sender to this
> >> effect
> >>
> >> ___
> >> R-help-es mailing list
> >> R-help-es@r-project.org
> >> https://stat.ethz.ch/mailman/listinfo/r-help-es
> >>
> >
> >
> > ___
> > R-help-es mailing list
> > R-help-es@r-project.org
> > https://stat.ethz.ch/mailman/listinfo/r-help-es
> >

___
R-help-es mailing list
R-help-es@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-help-es


Re: [R] Error in rep.int() invalid 'times' value

2015-10-21 Thread PIKAL Petr
Hi

Several options:

Code it in C++ or a similar language (you seem to be more familiar with them).

Use debug() to step through your function and see how it behaves.

Run the function with smaller m and n under Rprof to see where the time is spent.

I believe that coding in R as if it were C++ is the way to hell. There is nothing 
wrong with loops, but if you compute something that can easily be computed with 
a vectorised approach, you lose efficiency.

e.g.

Suppose you compute this:

> m <- 5
> b <- numeric(0)
> for (i in 1:m-1)   # 1:m-1 is (1:m)-1, i.e. 0:4; the i = 0 step does nothing
+ {
+   b[i] <- (m-i)
+ }
> b
[1] 4 3 2 1

but you can achieve it by

b <-  ((m-1):1)

Time comparison:

> m<-1e5
> system.time(for ( i in 1:m-1) {b[i]<- (m-i)})
   user  system elapsed
  11.61    0.00   12.11
> system.time(bb <- ((m-1):1))
   user  system elapsed
  0   0   0
> all.equal(b,bb)
[1] TRUE

So the time gain is huge even in this small computation.
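
One detail in the loop above is easy to miss: in R the colon operator binds more tightly than subtraction, so 1:m-1 is read as (1:m)-1. A small sketch of the difference, together with a loop that uses an explicit index range (the names idx_as_written, idx_intended, b_loop and b_vec are illustrative only):

```r
m <- 5

# ":" has higher precedence than "-", so these two are not the same.
idx_as_written <- 1:m - 1      # 0 1 2 3 4
idx_intended   <- 1:(m - 1)    # 1 2 3 4

# Loop version with a preallocated result and an unambiguous index range.
b_loop <- numeric(m - 1)
for (i in seq_len(m - 1)) {
  b_loop[i] <- m - i
}

# Vectorised version, as in the message above.
b_vec <- (m - 1):1

print(idx_as_written)
print(idx_intended)
print(all(b_loop == b_vec))
```

The loop as originally written still produces the right values because assigning to index 0 does nothing, but seq_len(m - 1) or 1:(m - 1) states the intent more clearly.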

Cheers
Petr

> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Maram
> SAlem
> Sent: Tuesday, October 20, 2015 6:33 PM
> To: William Dunlap
> Cc: r-help@r-project.org
> Subject: Re: [R] Error in rep.int() invalid 'times' value
>
> Yes Indeed William. f1() works perfectly well and took only 30 secs to
> execute f1(25,15), but I wonder if there is anyway to speed up the
> execution of the rest of my code (almost seven hours now) ?
>
> Thanks for helping.
>
> Maram Salem
>
> On 20 October 2015 at 18:11, William Dunlap  wrote:
>
> > f0 is essentially your original code put into a function, so expect
> it
> > to fail in the same way your original code did.
> > f1 should give the same answer as f0, but it should use less memory
> > and time.
> > Bill Dunlap
> > TIBCO Software
> > wdunlap tibco.com
> >
> >
> > On Tue, Oct 20, 2015 at 2:05 AM, Maram SAlem
> > 
> > wrote:
> > > Thanks William. I've tried the first code, ( with f0() ), but still
> > > for n=25, m=15 , I got this:
> > >
> > >> s<-f0(25,15)
> > > Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep)
> :
> > >   invalid 'times' value
> > > In addition: Warning message:
> > > In rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) :
> > >   NAs introduced by coercion to integer range
> > >
> > >
> > > I don't know if this is related to the memory limits of my laptop,
> > > or it doesn't have to do with the memory.
> > >
> > > Any help on how to fix this error will be greatly appreciated.
> > >
> > > Thanks All.
> > >
> > > Maram Salem
> > >
> > > On 15 October 2015 at 17:52, William Dunlap 
> wrote:
> > >>
> > >> Doing enumerative combinatorics with rejection methods rarely
> works
> > >> well. Try mapping your problem to the problem of choosing
> > >> m-1 items from n-1.  E.g., your code was
> > >>
> > >> f0 <- function(n, m) {
> > >>    stopifnot(n > m)
> > >>    D<-matrix(0,nrow=n-m+1,ncol=m-1)
> > >>    for (i in 1:m-1){
> > >>       D[,i]<-seq(0,n-m,1)
> > >>    }
> > >>    ED <- do.call(`expand.grid`,as.data.frame(D))
> > >>    ED<-unname(as.matrix(ED))
> > >>    lk<-which(rowSums(ED)<=(n-m))
> > >>    ED[lk,]
> > >> }
> > >>
> > >> and I think the following does the same thing in much less space by
> > >> transforming the output of combn().
> > >>
> > >> f1 <- function(n, m) {
> > >>    stopifnot(n > m)
> > >>    r0 <- t(diff(combn(n-1, m-1)) - 1L)
> > >>    r1 <- rep(seq(from=0, len=n-m+1),
> > >>              choose(seq(to=m-2, by=-1, len=n-m+1), m-2))
> > >>    cbind(r0[, ncol(r0):1, drop=FALSE], r1, deparse.level=0)
> > >> }
> > >>
> > >> The code for adding the last column is a bit clumsy and could
> > >> probably
> > be
> > >> improved.  Both f0 and f1 could also be cleaned up to work for
> m<=2.
> > >>
> > >> See Feller vol. 1 or Benjamin's "Proofs that (really) count" for
> > >> more on this sort of thing.
> > >>
> > >>
> > >>
> > >> Bill Dunlap
> > >> TIBCO Software
> > >> wdunlap tibco.com
> > >>
> > >> On Thu, Oct 15, 2015 at 7:45 AM, Maram SAlem
> > >>  > >
> > >> wrote:
> > >>>
> > >>> Dear All,
> > >>>
> > >>> I'm trying to do a simple task (which is in fact a tiny part of a
> > larger
> > >>> code).
> > >>>
> > >>> I want to create a matrix, D, each of its columns is a sequence
> > >>> from 0
> > to
> > >>> (n-m), by 1. Then, using D, I want to create another matrix ED,
> > >>> whose rows represent all the possible combinations of the
> elements
> > >>> of the columns
> > of
> > >>> D. Then from ED, I'll select only the rows whose sum is less than
> > >>> or equal to (n-m), which will be called the matrix s. I used the
> > >>> following code:
> > >>>
> > >>> > n=5
> > >>> > m=3
> > >>> > D<-matrix(0,nrow=n-m+1,ncol=m-1)
> > >>> > for (i in 1:m-1)
> > >>> +  {
> > >>> + D[,i]<-seq(0,n-m,1)
> > >>> +  }
> > >>> > ED <- do.call(`expand.grid`,as.data.frame(D))
> > >>> > ED<-as.matrix(ED)
> > >>>
> > >>> > lk<-which(rowSums(ED)<=(n-m))
> > >>>
> > >>> > s<-ED[lk,]
> > >>>
> > >>>
> > >>> This works perfectly well. But for rather larger values of n 

Re: [R] Error in rep.int() invalid 'times' value

2015-10-21 Thread marammagdysalem
Thanks a lot, Petr, for your reply and advice. I hope I'll be able to minimize 
the time as much as possible.

Regards,
Maram Salem

Sent from my iPhone
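
For readers following the quoted thread below, the difference between f0() and f1() can be made concrete with a little arithmetic. The sketch below counts the rows that expand.grid() would have to build for n=25, m=15 and compares that with the number of rows that survive the filter. It then checks on small inputs that the two functions return the same rows. The function bodies follow the quoted code, except that the loop in f0() uses seq_len(m - 1), which is an adaptation for this sketch; grid_rows and kept_rows are illustrative names.

```r
n <- 25
m <- 15

# expand.grid() builds every combination before any filtering happens:
# (n - m + 1) values in each of (m - 1) columns.
grid_rows <- (n - m + 1)^(m - 1)

# Rows with row sum <= n - m: a stars-and-bars count, choose(n - 1, m - 1).
kept_rows <- choose(n - 1, m - 1)

print(grid_rows > .Machine$integer.max)  # far beyond the integer range
print(kept_rows)

# f0: the rejection approach (build everything, then filter).
f0 <- function(n, m) {
  stopifnot(n > m)
  D <- matrix(0, nrow = n - m + 1, ncol = m - 1)
  for (i in seq_len(m - 1)) {
    D[, i] <- seq(0, n - m, 1)
  }
  ED <- do.call(expand.grid, as.data.frame(D))
  ED <- unname(as.matrix(ED))
  ED[which(rowSums(ED) <= (n - m)), ]
}

# f1: the combn() approach from the quoted message.
f1 <- function(n, m) {
  stopifnot(n > m)
  r0 <- t(diff(combn(n - 1, m - 1)) - 1L)
  r1 <- rep(seq(from = 0, length.out = n - m + 1),
            choose(seq(to = m - 2, by = -1, length.out = n - m + 1), m - 2))
  cbind(r0[, ncol(r0):1, drop = FALSE], r1, deparse.level = 0)
}

print(all(f0(5, 3) == f1(5, 3)))
print(all(f0(6, 3) == f1(6, 3)))
```

The row count of the full grid is on the order of 10^14, which would appear to explain the "invalid 'times' value" error and the integer-range warning, while the filtered result has only about two million rows.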

> On Oct 21, 2015, at 9:28 AM, PIKAL Petr  wrote:
> 
> Hi
> 
> Several options
> 
> Coding in C+ or other similar language (you seem to be more familiar with 
> them)
> 
> Using debug and find out how your function behaves in steps
> 
> Using function with smaller m, n with Rprof to see where the time is spent.
> 
> I believe that coding in R like it was C+ is the way to hell. There is 
> nothing wrong with cycles however if you compute something which can be 
> computed easily by vectorised approach you loose efficiency.
> 
> e.g.
> 
> you compute this
> m<-5
>> for ( i in 1:m-1)
> +{
> +b[i]<- (m-i)
> +}
>> b
> [1] 4 3 2 1
> 
> but you can achieve it by
> 
> b <-  ((m-1):1)
> 
> Time comparison:
> 
>> m<-1e5
>> system.time(for ( i in 1:m-1) {b[i]<- (m-i)})
>    user  system elapsed
>   11.61    0.00   12.11
>> system.time(bb <- ((m-1):1))
>   user  system elapsed
>  0   0   0
>> all.equal(b,bb)
> [1] TRUE
> 
> So the time gain is huge only in this small computation.
> 
> Cheers
> Petr
> 
>> -Original Message-
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Maram
>> SAlem
>> Sent: Tuesday, October 20, 2015 6:33 PM
>> To: William Dunlap
>> Cc: r-help@r-project.org
>> Subject: Re: [R] Error in rep.int() invalid 'times' value
>> 
>> Yes Indeed William. f1() works perfectly well and took only 30 secs to
>> execute f1(25,15), but I wonder if there is anyway to speed up the
>> execution of the rest of my code (almost seven hours now) ?
>> 
>> Thanks for helping.
>> 
>> Maram Salem
>> 
>>> On 20 October 2015 at 18:11, William Dunlap  wrote:
>>> 
>>> f0 is essentially your original code put into a function, so expect
>> it
>>> to fail in the same way your original code did.
>>> f1 should give the same answer as f0, but it should use less memory
>>> and time.
>>> Bill Dunlap
>>> TIBCO Software
>>> wdunlap tibco.com
>>> 
>>> 
>>> On Tue, Oct 20, 2015 at 2:05 AM, Maram SAlem
>>> 
>>> wrote:
>>>> Thanks William. I've tried the first code, ( with f0() ), but still
>>>> for n=25, m=15 , I got this:
>>>>
>>>>> s<-f0(25,15)
>>>> Error in rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) :
>>>>   invalid 'times' value
>>>> In addition: Warning message:
>>>> In rep.int(rep.int(seq_len(nx), rep.int(rep.fac, nx)), orep) :
>>>>   NAs introduced by coercion to integer range
>>>>
>>>>
>>>> I don't know if this is related to the memory limits of my laptop,
>>>> or it doesn't have to do with the memory.
>>>>
>>>> Any help on how to fix this error will be greatly appreciated.
>>>>
>>>> Thanks All.
>>>>
>>>> Maram Salem
>>>>
>>>> On 15 October 2015 at 17:52, William Dunlap wrote:
> 
> Doing enumerative combinatorics with rejection methods rarely
>> works
> well. Try mapping your problem to the problem of choosing
> m-1 items from n-1.  E.g., your code was
> 
> f0 <- function(n, m) {
>   stopifnot(n > m)
>   D<-matrix(0,nrow=n-m+1,ncol=m-1)
>   for (i in 1:m-1){
>  D[,i]<-seq(0,n-m,1)
>   }
>   ED <- do.call(`expand.grid`,as.data.frame(D))
>   ED<-unname(as.matrix(ED))
>   lk<-which(rowSums(ED)<=(n-m))
>   ED[lk,]
> }
> 
> and I think the following does the same thing in much less space
>> by
> transforming the output of combn().
> 
> f1 <- function(n, m) {
>   stopifnot(n > m)
>   r0 <- t(diff(combn(n-1, m-1)) - 1L)
>   r1 <- rep(seq(from=0, len=n-m+1), choose( seq(to=m-2, by=-1,
> len=n-m+1), m-2))
>   cbind(r0[, ncol(r0):1, drop=FALSE], r1, deparse.level=0) }
> 
> The code for adding the last column is a bit clumsy and could
> probably
>>> be
> improved.  Both f0 and f1 could also be cleaned up to work for
>> m<=2.
> 
> See Feller vol. 1 or Benjamin's "Proofs that (really) count" for
> more on this sort of thing.
> 
> 
> 
> Bill Dunlap
> TIBCO Software
> wdunlap tibco.com
> 
> On Thu, Oct 15, 2015 at 7:45 AM, Maram SAlem
>  wrote:
>> 
>> Dear All,
>> 
>> I'm trying to do a simple task (which is in fact a tiny part of a
>>> larger
>> code).
>> 
>> I want to create a matrix, D, each of its columns is a sequence
>> from 0
>>> to
>> (n-m), by 1. Then, using D, I want to create another matrix ED,
>> whose rows represent all the possible combinations of the
>> elements
>> of the columns
>>> of
>> D. Then from ED, I'll select only the rows whose sum is less than
>> or equal to (n-m), which will be called the matrix s. I used the
>> following code:
>> 
>>> n=5
>>> m=3
>>> D<-matrix(0,nrow=n-m+1,ncol=m-1)
>>> for (i in 1:m-1)
>> +  {
>> + D[,i]<-seq(0,n-m,1)