Hi Bayan,
Your question seems to imply that the "age" column contains floating
point numbers, e.g.
df
height weight age
170 72 21.5
...
If this is so, you will only find an integer in diff(age) if two
adjacent numbers happen to have the same decimal fraction _and_ the
subtraction
Hi Bayan,
In your code, 'a' is a vector and is.integer(a) is a logical of length 1 -
most likely FALSE if even one element of a is not an integer. (Since R will
coerce all the elements of a to the same type.)
You need to decide whether something "close enough" to an integer is to be
considered an
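
A minimal sketch of the two points above, using made-up data (the column names follow the example in this thread; the tolerance `tol` and the choice to drop the later row of each pair are assumptions, not something the original poster specified):

```r
# Hypothetical data frame; only the "age" column matters here.
df <- data.frame(height = c(170, 165, 180, 175),
                 weight = c(72, 60, 85, 70),
                 age    = c(21.5, 22.5, 24.0, 24.7))

a <- diff(df$age)   # differences between adjacent ages
is.integer(a)       # a single FALSE: this tests the storage type of the whole vector

# Treat a difference as "integer" when it is within tol of a whole number.
tol <- 1e-8         # assumed tolerance for "close enough"
whole <- abs(a - round(a)) < tol

# Drop the later row of each adjacent pair whose difference is whole
# (one possible reading of the request).
drop <- which(whole) + 1
cleaned <- if (length(drop)) df[-drop, ] else df
```

Guarding on `length(drop)` matters because `df[-integer(0), ]` returns zero rows rather than the full data frame.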
Hi
I want to clean my data frame based on the age column: I want to
delete the rows where the difference between adjacent elements, (i+1) - i, is an integer. I
used
a <- diff(df$age)
for(i in a){if(is.integer(a) == true){df <- df[-a,]
}}
but, it doesn’t work, any ideas
Thanks in advance
Bayan
Sarah,
Thank you very much. For the other variables
I was trying to do the same job in different way because it is easier to
list it
Example
test < which(dat$var1 !="BAA" | dat$var1 !="FAG" )
{
dat <- dat[-test,]} and I did not get the right result. What am I
missing here?
On
Please keep replies on the list so others may participate in the conversation.
If you have a character vector containing the potential values, you
might look at %in% for one approach to subsetting your data.
Var1 %in% myvalues
Sarah
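
A small sketch of the %in% approach, with hypothetical values modelled on the frequency table posted in this thread (`dat`, `myvalues` and `kept` are illustrative names). It also shows why a condition built from two `!=` tests joined by `|` selects every row:

```r
# Hypothetical data with two valid codes and some garbage values.
dat <- data.frame(Var1 = c("MSN", ":", "YYZ", "]", "MSN"),
                  stringsAsFactors = FALSE)

myvalues <- c("MSN", "YYZ")                      # values to keep
kept <- dat[dat$Var1 %in% myvalues, , drop = FALSE]

# A value cannot equal both "BAA" and "FAG", so this is TRUE for every
# non-missing element, and dat[-which(always), ] would drop every row.
always <- dat$Var1 != "BAA" | dat$Var1 != "FAG"
```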
On Wed, Nov 11, 2015 at 7:10 PM, Ashta
If what you posted here is what you typed, your syntax is wrong.
I strongly advise you to consult the two links here:
http://adv-r.had.co.nz/Reproducibility.html
http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
... and please read the posting guide and don't
On Wed, Nov 11, 2015 at 8:44 PM, Ashta wrote:
> Hi Sarah,
>
> I used the following to clean my data; the program crashed several times.
>
> test <- dat[dat$Var1 == "YYZ" | dat$Var1 =="MSN" ,]
>
> What is the difference between these two
>
> test <- dat[dat$Var1 %in% "YYZ" |
Hi all,
I have a data frame with a huge number of rows and columns.
When I looked at the data, I found several garbage values that need to be
cleaned. As a sample, I am showing you the frequency distribution
of one variable
  Var1 Freq
1 :       3
2 ]       6
3 MSN  1040
4 YYZ   300
5 \\      4
6 +
Hi,
On Wed, Nov 11, 2015 at 6:51 PM, Ashta wrote:
> Hi all,
>
> I have a data frame with a huge number of rows and columns.
>
> When I looked at the data, I found several garbage values that need to be
> cleaned. As a sample, I am showing you the frequency distribution
> of one variable
>
Hi Sarah,
I used the following to clean my data; the program crashed several times.
test <- dat[dat$Var1 == "YYZ" | dat$Var1 == "MSN", ]
What is the difference between these two?
test <- dat[dat$Var1 %in% "YYZ" | dat$Var1 %in% "MSN", ]
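
One difference between the two forms that is easy to check is how they treat missing values. A sketch with hypothetical data (`dat`, `by_eq` and `by_in` are illustrative names):

```r
# Hypothetical data with one missing value in Var1.
dat <- data.frame(Var1 = c("YYZ", NA, "MSN", "+"),
                  n    = 1:4,
                  stringsAsFactors = FALSE)

# == returns NA for the missing element, and an NA row index
# produces a row filled with NA in the result.
by_eq <- dat[dat$Var1 == "YYZ" | dat$Var1 == "MSN", ]

# %in% returns FALSE for the missing element, so that row is dropped.
by_in <- dat[dat$Var1 %in% c("YYZ", "MSN"), ]

nrow(by_eq)   # counts the all-NA row as well
nrow(by_in)
```

On data without missing values the two give the same rows; whether that explains the crash reported here cannot be told from the post.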
On Wed, Nov 11, 2015 at 6:38 PM, Sarah Goslee
In order to have a clean workspace at the start of each chapter of a
book I'm knitting I've written a little script as follows:
# chapclean.R
# This cleans up the R workspace
ilist <- c(".GlobalEnv", "package:stats", "package:graphics",
"package:grDevices",
"package:utils", "package:datasets",
This has been reported before on the bug list
(https://bugs.r-project.org/bugzilla3/show_bug.cgi?id=15481). The
message is coming from the methods package, but I don't know if it's a
bug or ignorable.
Duncan Murdoch
On 16/10/2013 11:03 AM, Prof J C Nash (U30A) wrote:
In order to have a
holtman jholt...@gmail.com
To: Greg Snow 538...@gmail.com
Cc: r-help r-help@r-project.org
Subject: Re: [R] Cleaning up messy Excel data
Unfortunately they only know
Sometimes we adapt to our environment, sometimes we adapt our
environment to us. I like fortune(108).
I actually was suggesting that you add a tool to your toolbox, not limit it.
In my experience (and I don't expect everyone else's to match) data
manipulation that seems easier in Excel than R is
Seconded
John Kane
Kingston ON Canada
-----Original Message-----
From: rolf.tur...@xtra.co.nz
Sent: Sat, 03 Mar 2012 13:46:42 +1300
To: 538...@gmail.com
Subject: Re: [R] Cleaning up messy Excel data
On 03/03/12 12:41, Greg Snow wrote:
SNIP
It is possible to do the right thing
Try sending your clients a data set (data frame, table, etc) as an MS
Access data table instead. They can still view the data as a table,
but will have to go to much more effort to mess up the data, more
likely they will do proper edits without messing anything up (mixing
characters in with
Unfortunately, a lot of people who use MS Office don't have or know how
to use MS Access. Where I work now (as in the past) I have to tie
someone to their chair, give them a few pokes with the cattle prod and
then show them that a CSV file will load straight into Excel before I
can convince
On 03/03/12 12:41, Greg Snow wrote:
SNIP
It is possible to do the right thing in
Excel, but Excel does not encourage (let alone force) you to do the
right thing, but makes it easy to do the wrong thing.
SNIP
Fortune!
cheers,
Rolf Turner
Unfortunately they only know how to use Excel and Word. They are not
folks who use a computer every day. Many of them run factories or
warehouses and asking them to use something like Access would not
happen in my lifetime (I have retired twice already).
I don't have any problems with them
But there are some important reasons to use Excel. In my work there
are a lot of people that I have to send the equivalent of a data.frame
to who want to look at the data and possibly slice/dice the data
differently and then send back to me updates. These folks do not know
how to use R, but do
From: noahsilver...@ucla.edu
Sent: Tue, 28 Feb 2012 13:27:13 -0800
To: r-help@r-project.org
Subject: [R] Cleaning up messy Excel data
Unfortunately, some data I need to work with was delivered in a rather
messy Excel file. I want to import into R and clean up some things so
that I can do my
On 01/03/12 04:43, John Kane wrote:
(mydata <- as.factor(c(1, 2, 3, 2, 5, 2)))
str(mydata)
newdata <- as.character(mydata)
newdata[newdata == 2] <- 0
newdata <- as.numeric(newdata)
str(newdata)
We really need to keep Excel (and other spreadsheets) out of people's hands.
Amen, bro'!!!
cheers,
Unfortunately, some data I need to work with was delivered in a rather messy
Excel file. I want to import into R and clean up some things so that I can do
my analysis. Pulling in a CSV from Excel is the easy part.
My current challenge is dealing with some text mixed in the values.
i.e.
First of all when reading in the CSV file, use 'as.is = TRUE' to
prevent the changing to factors.
Now that things are character in that column, you can use some pattern
expressions (gsub, regex, ...) to search for and change your data.
E.g.,
sub(.*, 0, yourCol)
should do it for you.
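
A sketch of the approach described above, under stated assumptions: the CSV text, the column names and the regular expression are all hypothetical, since the actual messy values are not shown in this thread. Anything in the column that does not look like a plain number is replaced with zero before conversion:

```r
# Hypothetical CSV content standing in for the exported Excel sheet.
csv_text <- "id,value\n1,12\n2,n/a\n3,7.5\n4,missing"

# as.is = TRUE keeps the text column as character rather than factor.
raw <- read.csv(text = csv_text, as.is = TRUE)

# Assumed rule: keep entries made only of digits and dots, zero the rest.
is_num <- grepl("^[0-9.]+$", raw$value)
raw$value[!is_num] <- "0"
raw$value <- as.numeric(raw$value)
```

Whether zero is the right replacement (rather than NA) depends on the analysis.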
On Tue,
-----Original Message-----
From: Noah Silverman
Sent: Tuesday, February 28, 2012 3:27 PM
To: r-help
Subject: [R] Cleaning up messy Excel data
Unfortunately, some data I need to work with was delivered in a rather messy
Excel file. I want to import into R and clean up some things so that I can
That's exactly what I need.
Thank You!!
--
Noah Silverman
UCLA Department of Statistics
8117 Math Sciences Building
Los Angeles, CA 90095
On Feb 28, 2012, at 1:42 PM, jim holtman wrote:
First of all when reading in the CSV file, use 'as.is = TRUE' to
prevent the changing to factors.
Now
Just replace that value with zero. If you provide some reproducible
code I could probably give you a solution.
?dput
good luck,
Stephen
On 02/28/2012 03:27 PM, Noah Silverman wrote:
Unfortunately, some data I need to work with was delivered in a rather messy
Excel file. I want to import
Dear Bill,
Thanks very much for the reply and for the code. I have amended my personal
details for future posts. I was wondering if there were any good books or
tutorials for writing code similar to what you have provided above?
Best wishes,
Natalie Van Zuydam
-
Natalie Van Zuydam
PhD
Hi Everyone,
I have the following problem:
data <- structure(list(prochi = c("IND1", "IND1", "IND1",
"IND2", "IND2", "IND2", "IND2", "IND3",
"IND4", "IND5"), date_admission = structure(c(6468,
6470, 7063, 9981, 9983, 14186, 14372, 5129, 9767, 11168), class = "Date")),
.Names = c("prochi",
"date_admission"), row.names
Subject: [R] Cleaning date columns
Hi Everyone,
I have the following problem:
data <- structure(list(prochi = c("IND1", "IND1", "IND1",
"IND2", "IND2", "IND2", "IND2", "IND3",
"IND4", "IND5"), date_admission = structure(c(6468,
6470, 7063, 9981, 9983, 14186, 14372, 5129, 9767, 11168), class = "Date")),
.Names = c
I calculated a large vector. Unfortunately, I have some measurement error
in my data and some of the values in the vector are erroneous. I ended up
with some Infs and NaNs in the vector. I would like to filter out the Inf
and NaN values and only keep the values in my vector that range from 1 to
Try this:
x[is.finite(x)]
On Fri, Oct 1, 2010 at 2:51 PM, mlar...@rsmas.miami.edu wrote:
I calculated a large vector. Unfortunately, I have some measurement error
in my data and some of the values in the vector are erroneous. I ended up
with some Infs and NaNs in the vector. I would like
Mike,
Small, reproducible examples are always useful for the rest of us.
x <- c(0, NA, NaN, 1, 10, 20, 21, Inf)
x[!is.na(x) & x >= 1 & x <= 20]
Is that what you're looking for?
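
For comparison, a short sketch that combines is.finite() with the range test. The upper bound of 20 is taken from the example above, and the variable names are illustrative:

```r
x <- c(0, NA, NaN, 1, 10, 20, 21, Inf, -Inf)

# is.finite() is FALSE for NA, NaN, Inf and -Inf alike.
finite_only <- x[is.finite(x)]

# Adding the range test keeps only finite values between 1 and 20.
in_range <- x[is.finite(x) & x >= 1 & x <= 20]
```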
mlar...@rsmas.miami.edu wrote:
I calculated a large vector. Unfortunately, I have some measurement error
in my data
On Fri, Oct 1, 2010 at 10:51 AM, mlar...@rsmas.miami.edu wrote:
I calculated a large vector. Unfortunately, I have some measurement error
in my data and some of the values in the vector are erroneous. I ended up
with some Infs and NaNs in the vector. I would like to filter out the Inf
and
On Oct 1, 2010, at 12:51 PM, mlar...@rsmas.miami.edu wrote:
I calculated a large vector. Unfortunately, I have some measurement error
in my data and some of the values in the vector are erroneous. I ended up
with some Infs and NaNs in the vector. I would like to filter out the Inf
and NaN
Complementing:
findInterval(x[is.finite(x)], 1:20)
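
A quick illustration of what that call returns, on a made-up vector. findInterval() reports, for each value, how many of the break points 1:20 are less than or equal to it, so values below 1 map to 0 and values above 20 map to 20:

```r
x <- c(0, 1, 10.5, 20, 21, NaN, Inf)

# Non-finite values are removed first, then each value is binned.
bins <- findInterval(x[is.finite(x)], 1:20)
bins
```

Note that this bins the values rather than filtering them, so out-of-range values still need a separate test.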
On Fri, Oct 1, 2010 at 2:55 PM, Henrique Dallazuanna www...@gmail.com wrote:
Try this:
x[is.finite(x)]
On Fri, Oct 1, 2010 at 2:51 PM, mlar...@rsmas.miami.edu wrote:
I calculated a large vector. Unfortunately, I have some measurement
Dear R Users,
Was wondering if anyone can give me pointers to functionality in R that
can help clean a time series ? For example, some kind of
package/functionality which identifies potential errors and takes some
action, such as replacement by some suitable value (carry-forward, average
of
The zoo package has six na.* routines for carrying values
forward, etc.
library(zoo)
?zoo
describes them. Also see the vignettes.
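
A minimal sketch of two of those routines on a plain numeric vector with made-up values, assuming the zoo package is installed:

```r
library(zoo)

x <- c(1, NA, NA, 4, NA, 6)

carried <- na.locf(x)    # carry the last observation forward
interp  <- na.approx(x)  # fill gaps by linear interpolation
```

Both also work on zoo series, which is the more typical use for time series data.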
On Fri, May 23, 2008 at 6:55 AM, [EMAIL PROTECTED] wrote:
Dear R Users,
Was wondering if anyone can give me pointers to functionality in R that
can help clean
I'm trying to work on a large dataset and after each segment of run, I need
a command to flush the memory. I tried gc() and rm(list=ls()) but they don't
seem to help. gc() does not do anything besides showing the memory usage.
I'm using the package BSgenome from BioC.
Thanks a bunch
--
Regards,
On 5/14/2008 3:59 PM, Anh Tran wrote:
I'm trying to work on a large dataset and after each segment of run, I need
a command to flush the memory. I tried gc() and rm(list=ls()) but they don't
seem to help. gc() does not do anything besides showing the memory usage.
How do you know it does
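
One way to check what rm() and gc() are doing, sketched with a throwaway object (`big` is hypothetical and unrelated to BSgenome):

```r
big <- numeric(5e6)                               # roughly 40 MB of doubles
size_mb <- as.numeric(object.size(big)) / 1024^2  # size while it still exists

rm(big)      # removes the name; the memory becomes eligible for collection
mem <- gc()  # runs a collection and returns a matrix of memory usage
mem
```

Comparing the "used" column of gc() before and after the rm() call shows whether R has released the object. Memory may still appear allocated at the operating-system level, and objects referenced elsewhere (for example by a package cache) will not be freed.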
Dear R users,
I have a huge database and I need to adjust it somewhat.
Here is a very little cut out from database:
CODE NAME DATE DATA1
4813 ADVANCED TELECOM 1987 0.013
3845 ADVANCED THERAPEUTIC SYS
Here is how to whittle it down for the first two parts of your
question. I am not exactly sure what you are after in the third part. Is
it that you want specific DATEs or do you want the ratio of the
DATE[max]/DATE[min]?
x <- read.table(textConnection("CODENAME