Re: [R] Unexpected behavior when giving a value to a new variable based on the value of another variable

2014-09-02 Thread Angel Rodriguez
Thank you for the explanation, Peter.

Angel


__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


Re: [R] Unexpected behavior when giving a value to a new variable based on the value of another variable

2014-09-01 Thread Angel Rodriguez
Thank you John, Jim, Jeff and both Davids for your answers.

After trying different combinations of values for the variable samplem, it 
looks as if, when age is 65 or greater, R applies the correct code 1 whatever 
the value of samplem, but when age is less than 65, it just copies the values 
of samplem to sample. I do not understand why it does so.

In any case, Jim's syntax works very well, although I do not understand why 
either.

In answer to Jim, I just wanted a variable that could identify individuals with 
some characteristics (not only age, as in this example, which has been 
oversimplified).
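
For a flag that depends on more than one characteristic, one possible approach is to build the whole column in a single logical expression. This is only a sketch: the column name `eligible` and the rule (aged 65 or over and samplem equal to 1) are hypothetical, chosen for illustration.

```r
# Data frame from the original post.
N <- data.frame(age     = c(67, 62, 74, 61, 60, 55, 60, 59, 58),
                samplem = c(NA, 1, 1, 1, 1, 1, 1, 1, NA))

# `eligible` is a hypothetical column name; the rule below is an illustration.
# %in% treats NA as "no match", so rows with a missing samplem become FALSE.
N[["eligible"]] <- N$age >= 65 & N$samplem %in% 1

print(N)
```

Because the right-hand side is a complete vector and the column is created with `[[`, nothing is read back from a column that does not exist yet.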

Best regards,

Angel Rodriguez-Laso




Re: [R] Unexpected behavior when giving a value to a new variable based on the value of another variable

2014-09-01 Thread peter dalgaard

On 01 Sep 2014, at 13:08, Angel Rodriguez <angel.rodrig...@matiainstituto.net> 
wrote:

> Thank you John, Jim, Jeff and both Davids for your answers.
> 
> After trying different combinations of values for the variable samplem, it 
> looks like if age is greater than 65, R applies the correct code 1 whatever 
> the value of samplem, but if age is less than 65, it just copies the values 
> of samplem to sample. I do not understand why it does so.
> 

It's because indexed assignment is really (white lie alert: it's actually worse)

N$sample <- `[<-`(`$`(N, `sample`), index, value)

and since N$sample isn't there from the outset, partial matching kicks in for 
the `$` bit and makes the right hand side equivalent to the same thing with 
`samplem`. The result still gets assigned to N$sample, but the value is the 
same that N$samplem would get from

N$samplem[N$age >= 65] <- 1

Notice the difference if you do

> N$sample <- NA
> N$sample[N$age >= 65] <- 1
> N
  age samplem sample
1  67      NA      1
2  62       1     NA
3  74       1      1
4  61       1     NA
5  60       1     NA
6  55       1     NA
7  60       1     NA
8  59       1     NA
9  58      NA     NA

-pd
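
The simplified expansion described above can be written out by hand and compared with the one-line form. This is a sketch only: the names `value` and `N2` are made up for the illustration, and it follows the "white lie" version of the expansion, not what R does internally.

```r
# Data frame from the original post.
N <- data.frame(age     = c(67, 62, 74, 61, 60, 55, 60, 59, 58),
                samplem = c(NA, 1, 1, 1, 1, 1, 1, 1, NA))

# No column is called "sample", so `$` falls back to partial matching
# and returns the samplem column.
print(identical(N$sample, N$samplem))

# Simplified expansion of N$sample[N$age >= 65] <- 1, written out by hand.
value <- `[<-`(N$sample, N$age >= 65, 1)  # starts from a copy of samplem
N2 <- N
N2$sample <- value                         # `$<-` creates the new column

# The one-line form produces the same column.
N$sample[N$age >= 65] <- 1
print(identical(N$sample, N2$sample))
print(N)
```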

-- 
Peter Dalgaard, Professor,
Center for Statistics, Copenhagen Business School
Solbjerg Plads 3, 2000 Frederiksberg, Denmark
Phone: (+45)38153501
Email: pd@cbs.dk  Priv: pda...@gmail.com



[R] Unexpected behavior when giving a value to a new variable based on the value of another variable

2014-08-29 Thread Angel Rodriguez

Dear subscribers,

I've found that if there is a variable in the dataframe with a name very 
similar to a new variable, R does not give the correct values to this latter 
variable based on the values of a third variable:


> M <- structure(list(V1 = c(67, 62, 74, 61, 60, 55, 60, 59, 58)),
+                .Names = c("age"), row.names = c(NA, -9L),
+                class = "data.frame")
> M$sample[M$age >= 65] <- 1
> M
  age sample
1  67      1
2  62     NA
3  74      1
4  61     NA
5  60     NA
6  55     NA
7  60     NA
8  59     NA
9  58     NA
> N <- structure(list(V1 = c(67, 62, 74, 61, 60, 55, 60, 59, 58),
+                     V2 = c(NA, 1, 1, 1, 1, 1, 1, 1, NA)),
+                .Names = c("age", "samplem"), row.names = c(NA, -9L),
+                class = "data.frame")
> N$sample[N$age >= 65] <- 1
> N
  age samplem sample
1  67      NA      1
2  62       1      1
3  74       1      1
4  61       1      1
5  60       1      1
6  55       1      1
7  60       1      1
8  59       1      1
9  58      NA     NA



Any clue for this behavior?



My specifications:

R version 3.1.1 (2014-07-10)
Platform: x86_64-w64-mingw32/x64 (64-bit)

locale:
[1] LC_COLLATE=Spanish_Spain.1252  LC_CTYPE=Spanish_Spain.1252    LC_MONETARY=Spanish_Spain.1252
[4] LC_NUMERIC=C                   LC_TIME=Spanish_Spain.1252

attached base packages:
[1] stats graphics  grDevices utils datasets  methods   base 

other attached packages:
[1] foreign_0.8-61

loaded via a namespace (and not attached):
[1] tools_3.1.1




Thank you very much.

Angel Rodriguez-Laso
Research project manager
Matia Instituto Gerontologico




Re: [R] Unexpected behavior when giving a value to a new variable based on the value of another variable

2014-08-29 Thread jim holtman
You are being bitten by the partial matching of the $ operator
(see ?"$" for a better explanation). Here is a solution that works:


**original**
> N <- structure(list(V1 = c(67, 62, 74, 61, 60, 55, 60, 59, 58),
+                     V2 = c(NA, 1, 1, 1, 1, 1, 1, 1, NA)),
+                .Names = c("age", "samplem"), row.names = c(NA, -9L),
+                class = "data.frame")
> N$sample[N$age >= 65] <- 1
> N
  age samplem sample
1  67      NA      1
2  62       1      1
3  74       1      1
4  61       1      1
5  60       1      1
6  55       1      1
7  60       1      1
8  59       1      1
9  58      NA     NA


> N <- structure(list(V1 = c(67, 62, 74, 61, 60, 55, 60, 59, 58),
+                     V2 = c(NA, 1, 1, 1, 1, 1, 1, 1, NA)),
+                .Names = c("age", "samplem"), row.names = c(NA, -9L),
+                class = "data.frame")
> N[["sample"]][N$age >= 65] <- 1  # use the '[[' operation for complete matching
> N
  age samplem sample
1  67      NA      1
2  62       1     NA
3  74       1      1
4  61       1     NA
5  60       1     NA
6  55       1     NA
7  60       1     NA
8  59       1     NA
9  58      NA     NA
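
As a complement to the `[[` form above, here is a short sketch of why it is safe, plus one alternative that sidesteps the question by building the whole column at once. The variable name `missing_before` is made up for the illustration, and `ifelse()` is a swapped-in alternative, not something proposed in this thread.

```r
# Data frame from the original post.
N <- data.frame(age     = c(67, 62, 74, 61, 60, 55, 60, 59, 58),
                samplem = c(NA, 1, 1, 1, 1, 1, 1, 1, NA))

# `[[` matches names exactly by default, so the not-yet-existing column
# is NULL and is not confused with samplem.
missing_before <- is.null(N[["sample"]])
print(missing_before)

# Alternative: compute the complete column in one step, so the old
# (nonexistent) column is never read.
N$sample <- ifelse(N$age >= 65, 1, NA)
print(N)
```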

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.




Re: [R] Unexpected behavior when giving a value to a new variable based on the value of another variable

2014-08-29 Thread John McKown
On Fri, Aug 29, 2014 at 3:53 AM, Angel Rodriguez
<angel.rodrig...@matiainstituto.net> wrote:

> Dear subscribers,
>
> I've found that if there is a variable in the dataframe with a name very 
> similar to a new variable, R does not give the correct values to this latter 
> variable based on the values of a third variable:

<snip>

> Any clue for this behavior?

<snip>

> Thank you very much.
>
> Angel Rodriguez-Laso
> Research project manager
> Matia Instituto Gerontologico

That is unusual, but appears to be documented in a section from

?`[`

<quote>
Character indices

Character indices can in some circumstances be partially matched (see
pmatch) to the names or dimnames of the object being subsetted (but
never for subassignment). Unlike S (Becker et al p. 358), R never
uses partial matching when extracting by [, and partial matching is
not by default used by [[ (see argument exact).

Thus the default behaviour is to use partial matching only when
extracting from recursive objects (except environments) by $. Even in
that case, warnings can be switched on by
options(warnPartialMatchDollar = TRUE).

Neither empty ("") nor NA indices match any names, not even empty nor
missing names. If any object has no names or appropriate dimnames,
they are taken as all "" and so match nothing.
</quote>

Note the comment about partial matching in the middle paragraph in
the quote above.
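
The option mentioned in the quoted help text can be tried on a cut-down version of the data. This is a minimal sketch; `old` and `msg` are names made up for the illustration, and the exact wording of the warning is not assumed here.

```r
# A shortened version of the data frame from the original post.
N <- data.frame(age = c(67, 62, 74), samplem = c(NA, 1, 1))

# Switch the warning on, remembering the previous setting.
old <- options(warnPartialMatchDollar = TRUE)

# With the option on, the otherwise silent partial match signals a warning.
msg <- tryCatch(
  {
    N$sample
    "no warning"
  },
  warning = function(w) conditionMessage(w)
)
print(msg)

options(old)  # restore the previous setting
```

Setting the option at the top of a script (or in a startup file) is one way to catch this kind of mistake early.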

-- 
There is nothing more pleasant than traveling and meeting new people!
Genghis Khan

Maranatha! 
John McKown



Re: [R] Unexpected behavior when giving a value to a new variable based on the value of another variable

2014-08-29 Thread Jeff Newmiller
One clue is the help file for $...

?"$"

In particular, see the discussion there of character indices and the exact 
argument.

You can also find this discussed in the Introduction to R document that comes 
with the software.
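
To make the role of the exact argument concrete, here is a small sketch on a plain list. The names `x`, `r_dollar`, `r_exact`, `r_loose` and `r_na` are made up for the illustration; the behaviour of exact = NA is as I read it in the help page, so treat that line as an assumption to check.

```r
x <- list(samplem = 1:3)

r_dollar <- x$sample                      # `$` partially matches "samplem"
r_exact  <- x[["sample"]]                 # exact = TRUE by default: no match
r_loose  <- x[["sample", exact = FALSE]]  # partial matching allowed

# According to the help page, exact = NA allows the partial match but
# signals a warning; the handler here only records that it happened.
r_na <- tryCatch(x[["sample", exact = NA]],
                 warning = function(w) "warned")

print(r_dollar)
print(r_exact)
print(r_loose)
print(r_na)
```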
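---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k
---------------------------------------------------------------------------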
Sent from my phone. Please excuse my brevity.
