Re: [R] Cleaning data

2017-09-26 Thread Jim Lemon
Hi Bayan,
Your question seems to imply that the "age" column contains floating
point numbers, e.g.

df
height  weight  age
170  72 21.5
...

If this is so, you will only find an integer in diff(age) if two
adjacent numbers happen to have the same decimal fraction _and_ the
subtraction does not leave a very small remainder because one or both
numbers cannot be represented exactly in binary, as Eric pointed out.
This seems an unusual criterion for discarding values. Perhaps if you
explain why an integer result is undesirable it would help. It can be
done:

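d<-diff(df$age)
# is.integer() tests the storage type, not the values, so compare each
# difference with its rounded value (the 1e-8 tolerance is an assumption)
badrows<-which(abs(d-round(d))<1e-8)
df<-df[setdiff(seq_len(nrow(df)),badrows),]

OR

df<-df[setdiff(seq_len(nrow(df)),badrows+1),]

if you want to delete the second rather than the first age.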
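
Here is a minimal sketch of the two choices, assuming a tolerance-based test for "integer" differences. The data frame, its values and the 1e-8 tolerance are invented for illustration and are not from your data:

```r
# Hypothetical data: heights, weights and ages are made up.
df <- data.frame(height = c(170, 165, 180, 175, 160),
                 weight = c(72, 60, 85, 78, 55),
                 age    = c(21.5, 22.5, 23.1, 24.4, 25.4))

d <- diff(df$age)                      # differences between adjacent ages

# is.integer(d) would only report the storage type of d (double here),
# so test each difference against its rounded value with a tolerance.
is_whole <- abs(d - round(d)) < 1e-8   # assumed tolerance
badrows  <- which(is_whole)            # index of the first age in each pair

# Drop the first age of each offending pair ...
drop_first  <- df[setdiff(seq_len(nrow(df)), badrows), ]
# ... or the second age of each offending pair.
drop_second <- df[setdiff(seq_len(nrow(df)), badrows + 1), ]
```

Using setdiff() on the row numbers rather than a negative index avoids the case where badrows is empty, since df[-integer(0),] selects no rows at all.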

Jim

On Tue, Sep 26, 2017 at 7:50 PM, bayan sardini  wrote:
> Hi
>
> I want to clean my data frame based on the age column: I want to delete
> the rows for which the difference between consecutive elements,
> age[i+1] - age[i], is an integer. I used
>
> a <- diff(df$age)
> for(i in a){if(is.integer(a) == true){df <- df[-a,]
> }}
>
> but it doesn’t work. Any ideas?
>
> Thanks in advance
> Bayan
> __
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.


Re: [R] Cleaning data

2017-09-26 Thread Eric Berger
Hi Bayan,
In your code, 'a' is a vector and is.integer(a) is a logical of length 1.
It tests the storage type of the whole vector rather than the value of each
element, so it will be FALSE if df$age is stored as double, whatever the
differences are. (R coerces all the elements of a to the same type.)
You need to decide whether something "close enough" to an integer is to be
considered an integer - e.g. within a distance of 0.000001 = 1e-6.

a <- diff(df$age)
df <- df[ c( TRUE, abs( a - round(a,0) ) > 1e-6 ), ]

I added the 'TRUE' at the beginning to always keep the first row of df. If
you prefer to always keep the last row then move the TRUE to the end.
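
To illustrate the points above, here is a small sketch. The helper name is_whole, the sample ages and the default tolerance are hypothetical choices for the example, not part of base R or of your data:

```r
# Hypothetical helper: TRUE where a value is within tol of a whole number.
is_whole <- function(x, tol = 1e-6) abs(x - round(x)) < tol

v <- c(1, 2, 3)
is.integer(v)      # FALSE: v is stored as double even though its values are whole
is.integer(1:3)    # TRUE: 1:3 is stored as integer
is_whole(v)        # element-wise test on the values themselves

age <- c(30.2, 31.2, 31.9, 33.0)       # invented ages
a   <- diff(age)

# Always keep the first row: put TRUE in front of the mask.
keep_first <- c(TRUE, !is_whole(a))
# Always keep the last row: put TRUE at the end instead.
keep_last  <- c(!is_whole(a), TRUE)
```

Either mask has one element per row, so it can be used directly as df[keep_first, ] or df[keep_last, ].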

HTH,

Eric




On Tue, Sep 26, 2017 at 12:50 PM, bayan sardini 
wrote:

> Hi
>
> I want to clean my data frame based on the age column: I want to delete
> the rows for which the difference between consecutive elements,
> age[i+1] - age[i], is an integer. I used
>
> a <- diff(df$age)
> for(i in a){if(is.integer(a) == true){df <- df[-a,]
> }}
>
> but it doesn’t work. Any ideas?
>
> Thanks in advance
> Bayan


[R] Cleaning data

2017-09-26 Thread bayan sardini
Hi 

I want to clean my data frame based on the age column: I want to delete
the rows for which the difference between consecutive elements,
age[i+1] - age[i], is an integer. I used

a <- diff(df$age)
for(i in a){if(is.integer(a) == true){df <- df[-a,]
}}

but it doesn’t work. Any ideas?

Thanks in advance
Bayan