Great question. What do I want? I want my co-workers to stop using Excel
spreadsheets for data entry, storage, and sharing! I want them to
understand the value of data discipline. But alas . . . .
I work in a county health department in the US. Between dplyr, stringr,
grep, grepl, and the base R
I don't think my view is of interest to many, so offlist.
I reject this:
" I would consider data analysis work to be three stages: data preparation,
statistical analysis, and producing the report."
For example, there is no such thing as "outliers" -- data to be removed as
part of
R has a very wide audience, clinical research, astronomy, psychology, and
so on and so on.
I would consider data analysis work to be three stages: data preparation,
statistical analysis, and producing the report.
This regards the process of getting the data ready for analysis and
reporting,
You are using terms and concepts that apply to spreadsheets, but do not apply
to R or CSV files. Please conform to the Posting Guide and make a reproducible
example [1][2][3] using R code to demonstrate your problem. I suspect you will
find that your problem begins in your spreadsheet and not
Hello. This relates to trying to upload CSV files to R. Essentially I have
some very large CSV files, but in the column where the dates are, every line
shows the entry "00:00.0". In the formula bar, however, a date appears as well,
for example "01/04/09 00:00.0", and this never appears in
If this is a bioconductor package, why do you not post on the bioconductor
list?
-- Bert
Bert Gunter
"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Mon, Sep 18, 2017 at
Hello,
I would like to do a partial least square discriminant analysis (PLSDA) in R
using the package "ropls"
Which is in R available via the R command :
source("https://bioconductor.org/biocLite.R")
I try to do a PLSDA to illustrate the impact of two genders (AP,C) on 5
compounds measured in
Hello,
I would like to do a partial least square discriminant analysis (PLSDA) in R
using the package "ropls"
Which is in R available via the R command :
source("https://bioconductor.org/biocLite.R")
When I try to do a PLSDA using my own data.
The impact of two genders (AP,C) on 5 compounds
2017, 21:26 AbouEl-Makarim Aboueissa <
> abouelmakarim1...@gmail.com> wrote:
>
>> Dear All:
>>
>>
>> It was saved, but there was a space somewhere. So it works for me now.
>>
>> I do have another similar problem.
>>
>> I saved an R data file
where. So it works for me now.
>
> I do have another similar problem.
>
> I saved an R data file
>
>
> save(datahs0csv,file="
> F:\Fall_2017\5-STA574\2-Notes\1-R\1-R_new\chapter4-Entering_Data/
> datahs0csv2.rda")
>
> *The new R data file "*datahs0csv
Dear All:
It was saved, but there was a space somewhere. So it works for me now.
I do have another similar problem.
I saved an R data file
save(datahs0csv,file="
F:\Fall_2017\5-STA574\2-Notes\1-R\1-R_new\chapter4-Entering_Data/datahs0csv2
.rda")
*The new R data file "*
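The snippet is cut off before the actual message, so the cause here is not known. One thing worth ruling out, offered as an assumption rather than a diagnosis: inside an R string literal a single backslash starts an escape sequence, so Windows paths are usually written with forward slashes (as in the setwd() call elsewhere in this thread) or built with file.path(). A minimal round-trip sketch, using a temporary directory and made-up data in place of the real folder and file:

```r
# Made-up stand-in for the real data set (contents are hypothetical).
datahs0csv <- data.frame(id = 1:3, score = c(10, 20, 30))

# file.path() joins the pieces with forward slashes on every platform.
rda_path <- file.path(tempdir(), "datahs0csv2.rda")

save(datahs0csv, file = rda_path)
rm(datahs0csv)

# load() restores the object under its original name and returns that name.
loaded_names <- load(rda_path)
print(loaded_names)
str(datahs0csv)
```

Note that load() brings back an object called datahs0csv, not datahs0csv2: the file name and the object name are independent.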
tando AbouEl-Makarim Aboueissa <abouelmakarim1...@gmail.com>:
Dear All:
I am trying to load an R data set, but I got the following message. Please
see below. The file is there.
setwd("F:/Fall_2017/5-STA574/2-Notes/1-R/1-R_new/chapter4-Entering_Data")
datahs0csv <- read.tab
Dear All:
>
> I am trying to load an R data set, but I got the following message. Please
> see below. The file is there.
>
> setwd("F:/Fall_2017/5-STA574/2-Notes/1-R/1-R_new/chapter4-Entering_Data")
>
> datahs0csv <- read.table("hs0.csv", header=T, sep=
Dear All:
I am trying to load an R data set, but I got the following message. Please
see below. The file is there.
setwd("F:/Fall_2017/5-STA574/2-Notes/1-R/1-R_new/chapter4-Entering_Data")
datahs0csv <- read.table("hs0.csv", header=T, sep=",")
attach(datahs0cs
I know there are ways around the 'can't allocate a vector of size x GB' errors,
but I'm stumped.
So my raw data has >7 million rows and eight columns. That's not a problem
itself.
Using the confreq package (for configural frequency analysis), I take my data
and run it through the package's
thank you both... assumption is in fact that a and b are always the same
length... these work for me well...
much appreciate it...
Andras
On Sunday, August 6, 2017 12:14 PM, Ulrik Stervbo
wrote:
Hi Andreas,
assuming that the increment is always indicated by the
Hi Andreas,
assuming that the increment is always indicated by the same value (in your
example 0), this could work:
df$a <- cumsum(seq_along(df$b) %in% which(df$b == 0))
df
HTH,
Ulrik
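For readers who want to run Ulrik's suggestion as is, here is a self-contained sketch using the data frame from the original question. It assumes, as the reply does, that every 0 in column "b" marks the start of a new group.

```r
# Data frame from the original question.
df <- data.frame(a = c(1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 6, 7, 8),
                 b = c(0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 5, 6, 7))

# TRUE at each position where b is 0; the running sum of those TRUEs
# numbers the groups 1, 2, ...
df$a <- cumsum(seq_along(df$b) %in% which(df$b == 0))
print(df)
```

Going through which() means a missing value in "b" counts as "no increment" instead of turning the rest of the running sum into NA, which is what a plain cumsum(df$b == 0) would do.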
On Sun, 6 Aug 2017 at 18:06 Bert Gunter wrote:
> Your specification is a bit unclear
Your specification is a bit unclear to me, so I'm not sure the below
is really what you want. For example, your example seems to imply that
a and b must be of the same length, but I do not see that your
description requires this. So the following may not be what you want
exactly, but one way to do
Dear All,
wonder if you have thoughts on the following:
let us say we have:
df<-data.frame(a=c(1,2,3,4,5,1,2,3,4,5,6,7,8),b=c(0,1,2,3,4,0,1,2,3,4,5,6,7))
I would like to rewrite values in column name "a" based on values in column
name "b", where based on a certain value of column "b" the
Re-importing the data with read.table's strip.white=TRUE argument may be an
easier way to deal with the problem (if the problem is leading or trailing
whitespace).
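A small sketch of what strip.white changes on import. The column name "Behav" comes from the original question; the values and the "Count" column are made up for illustration. Note that strip.white only has an effect when sep is given explicitly.

```r
# Inline stand-in for the real file; two entries carry stray spaces.
raw <- "Behav,Count
person,3
 person,2
dog ,5
"

as_is    <- read.table(text = raw, header = TRUE, sep = ",",
                       stringsAsFactors = TRUE)
stripped <- read.table(text = raw, header = TRUE, sep = ",",
                       stringsAsFactors = TRUE, strip.white = TRUE)

# Without stripping, "person" and " person" are separate factor levels.
print(levels(as_is$Behav))
print(levels(stripped$Behav))
```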
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Thu, Jun 1, 2017 at 9:17 AM, David Winsemius
wrote:
>
> >
On Thu, 1 Jun 2017, Rui Barradas wrote:
Hello,
In order for us to help we need to know how you've imported your data. What
was the file type? What instructions have you used to import it? Did you use
base R or a package?
Give us a minimal but complete code example that can reproduce your
> On Jun 1, 2017, at 8:57 AM, William Dunlap via R-help
> wrote:
>
> Check for leading or trailing spaces in the strings in your data.
> dput(dataset) would show them.
This function would strip any leading or trailing spaces from a column:
trim <-
function (s)
: Thursday, June 1, 2017 11:07 AM
To: Ulrik Stervbo <ulrik.ster...@gmail.com>; Rui Barradas
<ruipbarra...@sapo.pt>; Tara Adcock <taraadco...@hotmail.com>;
r-help@r-project.org
Cc: William Dunlap via R-help <r-help@r-project.org>
Subject: Re: [R] Data import R: so
lto:r-help-boun...@r-project.org] On Behalf Of Ulrik Stervbo
Sent: Thursday, June 1, 2017 10:50 AM
To: Rui Barradas <ruipbarra...@sapo.pt>; Tara Adcock <taraadco...@hotmail.com>;
r-help@r-project.org
Subject: Re: [R] Data import R: some explanatory variables not showing up
correctly in summ
Check for leading or trailing spaces in the strings in your data.
dput(dataset) would show them.
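A sketch of that check on made-up data (the data frame below is hypothetical; only the column name "Behav" is taken from the question). dput() prints strings with their quotes, which makes stray spaces visible, and base R's trimws() is one way to remove them after import.

```r
# Hypothetical data with one leading and one trailing space.
dataset <- data.frame(Behav = c("person", " person", "dog "),
                      stringsAsFactors = FALSE)

# The quoted output shows exactly where the extra spaces are.
dput(dataset$Behav)

# Remove leading and trailing whitespace, then count again.
dataset$Behav <- trimws(dataset$Behav)
print(table(dataset$Behav))
```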
Bill Dunlap
TIBCO Software
wdunlap tibco.com
On Thu, Jun 1, 2017 at 8:49 AM, Ulrik Stervbo
wrote:
> Hi Tara,
>
> It seems that you categorise and count for each category.
Hi Tara,
It seems that you categorise and count for each category. Could it be that
the method you use puts everything that doesn't match the predefined
categories in Other?
I'm only guessing because without a minimal reproducible example it's
difficult to do anything else.
Best wishes
Ulrik
Hello,
In order for us to help we need to know how you've imported your data.
What was the file type? What instructions have you used to import it?
Did you use base R or a package?
Give us a minimal but complete code example that can reproduce your
situation.
Hope this helps,
Rui Barradas
Hi,
I have a question regarding data importing into R.
When I import my data into R and review the summary, some of my explanatory
variables are being reported as if instead of being one variable, they are two
with the same name. See below for an example;
Behav person Behav dog
-project.org; Shawn Way <s...@meco.com>; Enrico Schumann
<e...@enricoschumann.net>
Cc: r-help@r-project.org
Subject: Re: [R] Data and Variables from Tables
He offered two solutions, and I want to second the vote against the first one.
I often put large numbers of configuration variab
umber of values for programming and their
>documentation significantly easier.
>
>Thank you
>
>Shawn Way, PE
>
>-Original Message-
>From: Enrico Schumann [mailto:e...@enricoschumann.net]
>Sent: Tuesday, March 21, 2017 4:40 PM
>To: Shawn Way <s...@meco.com>
>Cc: r-h
<s...@meco.com>
Cc: r-help@r-project.org
Subject: Re: [R] Data and Variables from Tables
On Tue, 21 Mar 2017, Shawn Way writes:
> I have an org-mode table with the following structure that I am
> pulling into an R data.frame, using the sfsmisc package and using
> xtable to pr
On Tue, 21 Mar 2017, Shawn Way writes:
> I have an org-mode table with the following structure
> that I am pulling into an R data.frame, using the
> sfsmisc package and using xtable to print in org-mode
>
> | Symbol | Value | Units |
> |--+---+---|
> | A
I have an org-mode table with the following structure that I am pulling into an
R data.frame, using the sfsmisc package and using xtable to print in org-mode
| Symbol | Value | Units |
|--+---+---|
| A | 1 | kg/hr|
"data.frame", row.names = c(NA, -396L
))
Regards
Duncan
-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Monday, 19 December 2016 13:47
To: Duncan Mackay
Cc: R
Subject: Re: [R] data manipulation
> On Dec 18, 2016, at 5:39 PM, Duncan Mackay <
an
>>> Duncan Mackay
>>> Department of Agronomy and Soil Science
>>> University of New England
>>> Armidale NSW 2351
>>> Email: home: mac...@northnet.com.au
>>>
>>> -Original Message-
>>> From: R-help [mailto:r-help-boun
Original Message-
> From: David Winsemius [mailto:dwinsem...@comcast.net]
> Sent: Monday, 19 December 2016 05:36
> To: Duncan Mackay
> Cc: R
> Subject: Re: [R] data manipulation
>
>
>> On Dec 17, 2016, at 7:57 PM, Duncan Mackay <dulca...@bigpond.com> wrote:
>
Hi David
Thanks for the info.
As a test I am attaching it anyway
Regards
Duncan
-Original Message-
From: David Winsemius [mailto:dwinsem...@comcast.net]
Sent: Monday, 19 December 2016 05:36
To: Duncan Mackay
Cc: R
Subject: Re: [R] data manipulation
> On Dec 17, 2016, at 7:57
New England
>> Armidale NSW 2351
>> Email: home: mac...@northnet.com.au
>>
>> -----Original Message-
>> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Rui Barradas
>> Sent: Thursday, 15 December 2016 01:19
>> To: Farshad Fathian; r-help
> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Rui Barradas
> Sent: Thursday, 15 December 2016 01:19
> To: Farshad Fathian; r-help
> Subject: Re: [R] data manipulation
>
> Hello,
>
> Please cc your mails to the list.
ect.org] On Behalf Of Rui Barradas
> Sent: Thursday, 15 December 2016 01:19
> To: Farshad Fathian; r-help
> Subject: Re: [R] data manipulation
>
> Hello,
>
> Please cc your mails to the list.
> As for your data, your url is wrong, you need to contact Massey or maybe
> the
of Agronomy and Soil Science
University of New England
Armidale NSW 2351
Email: home: mac...@northnet.com.au
-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Rui Barradas
Sent: Thursday, 15 December 2016 01:19
To: Farshad Fathian; r-help
Subject: Re: [R] data
That should read "now" instead of "not"
On Thursday, December 15, 2016 6:49 PM, John Kane
wrote:
It downloaded a file for me earlier but I am not getting the 404 error and I
did not bother to save the download. Shrug.
On Wednesday, December 14, 2016 6:57
It downloaded a file for me earlier but I am not getting the 404 error and I
did not bother to save the download. Shrug.
On Wednesday, December 14, 2016 6:57 AM, John Kane via R-help
wrote:
xx <- read.csv("http://massey.ac.nz/~pscoperwait/ts/cbe.dat")
gives me
On Behalf Of Rui Barradas
Sent: Wednesday, December 14, 2016 6:12 AM
To: John Kane; Farshad Fathian; r-h...@stat.math.ethz.ch
Subject: Re: [R] data manipulation
Hello,
What do you mean by "gives me something"?
xx <- read.csv("http://massey.ac.nz/~pscoperwait/ts/cbe.dat")
Erro
ge Station, TX 77840-4352
>
>
>
> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Rui Barradas
> Sent: Wednesday, December 14, 2016 6:12 AM
> To: John Kane; Farshad Fathian; r-h...@stat.math.ethz.ch
> Subject: Re: [R] data manipulati
...@stat.math.ethz.ch
Subject: Re: [R] data manipulation
Hello,
What do you mean by "gives me something"?
xx <- read.csv("http://massey.ac.nz/~pscoperwait/ts/cbe.dat")
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
Hello,
Please cc your mails to the list.
As for your data, your url is wrong, you need to contact Massey or maybe
the source of your information and get a valid internet address.
Without one there's not much we can do.
Rui Barradas
Em 14-12-2016 12:16, Farshad Fathian escreveu:
Hello,
Hello,
What do you mean by "gives me something"?
xx <- read.csv("http://massey.ac.nz/~pscoperwait/ts/cbe.dat")
Error in file(file, "rt") : cannot open the connection
In addition: Warning message:
In file(file, "rt") :
cannot open URL 'http://massey.ac.nz/~pscoperwait/ts/cbe.dat': HTTP
status
xx <- read.csv("http://massey.ac.nz/~pscoperwait/ts/cbe.dat")
gives me something. Since we have no idea of what you are doing I don't know if
the data has downloaded correctly
On Tuesday, December 13, 2016 1:38 PM, Farshad Fathian
wrote:
Hi,
I couldn't
On Tue, Dec 13, 2016 at 3:23 AM, Farshad Fathian
wrote:
> Hi,
>
> I couldn't access to data file about PSCoperwait by
> http://massey.ac.nz/~pscoperwait/ts/cbe.dat.
>
First off, this post is nearly useless. You don't tell us what you tried
to do. And you didn't tell
Hello,
And what has your question to do with R?
Please read the posting guide before posting and when you do, post a
question where at least the link is correct.
Rui Barradas
Em 13-12-2016 09:23, Farshad Fathian escreveu:
Hi,
I couldn't access to data file about PSCoperwait by
Hi,
I couldn't access to data file about PSCoperwait by
http://massey.ac.nz/~pscoperwait/ts/cbe.dat.
Looking forward to hearing from you,
[[alternative HTML version deleted]]
__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and
This should be reasonably efficient with 'dplyr':
> library(dplyr)
> input <- read.csv(text = "state,city,x
+ 1,12,100
+ 1,12,100
+ 1,12,200
+ 1,13,200
+ 1,13,100
+ 1,13,100
+ 1,14,200
+ 2,21,200
+ 2,21,200
+ 2,21,100
+ 2,23,100
+ 2,23,200
+ 2,34,200
+ 2,34,100
+ 2,35,100")
>
> result <- input
Hi all,
I am trying to read and summarize a big data frame (>10M records).
Here is the sample of my data
state,city,x
1,12,100
1,12,100
1,12,200
1,13,200
1,13,100
1,13,100
1,14,200
2,21,200
2,21,200
2,21,100
2,23,100
2,23,200
2,34,200
2,34,100
2,35,100
I want to get the total count by state, and
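The request is cut off here, so only its first part (the count of records per state) can be sketched. This is a base R version on the sample data above; whatever else was asked for is not covered.

```r
# Sample data from the question, read from inline text.
input <- read.csv(text = "state,city,x
1,12,100
1,12,100
1,12,200
1,13,200
1,13,100
1,13,100
1,14,200
2,21,200
2,21,200
2,21,100
2,23,100
2,23,200
2,34,200
2,34,100
2,35,100")

# Number of records per state.
state_counts <- table(input$state)
print(state_counts)
```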
Frank S.
De: Bert Gunter <bgunter.4...@gmail.com>
Enviat el: dilluns, 26 de setembre de 2016 23:18:52
Per a: Ista Zahn
A/c: Frank S.; r-help@r-project.org
Tema: Re: [R] Using lapply in R data table
... and just for fun, here's an alternative in which mapply() is used
to
fini group exposure
>>>>> 1 2 2005-04-20 A1
>>>>> 2 2 2005-04-20 A1
>>>>> 3 2 2005-04-20 A1
>>>>> 4 5 2006-02-19 B 0.87
>>>>> 5 5 2006-02-19 B 0.87
>>>>> 6
7 2006-10-08 A 0.5
>>>>
>>>>
>>>> (but note that exposure is a factor, not numeric)
>>>>
>>>>
>>>> Cheers,
>>>> Bert
>>>>
>>>> Bert Gunter
>>>>
>>>> "
athed in his "Bloom County" comic strip )
>>>
>>>
>>> On Mon, Sep 26, 2016 at 10:05 AM, Ista Zahn <istaz...@gmail.com> wrote:
>>>> Hi Frank,
>>>>
>>>> lapply(DT) iterates over each column. That doesn't seem to be what you
each column. That doesn't seem to be what you want.
>>>
>>> There are probably better ways, but here is one approach.
>>>
>>> DT[, exposure := vector(mode = "numeric", length = .N)]
>>> DT[fini < as.Date("2006-01-01"), exposure := 1]
what you want.
>>
>> There are probably better ways, but here is one approach.
>>
>> DT[, exposure := vector(mode = "numeric", length = .N)]
>> DT[fini < as.Date("2006-01-01"), exposure := 1]
>> DT[fini >= as.Date("2006-01-01") & f
ic", length = .N)]
> DT[fini < as.Date("2006-01-01"), exposure := 1]
> DT[fini >= as.Date("2006-01-01") & fini <= as.Date("2006-06-30"),
> exposure := difftime(as.Date("2007-01-01"), fini, units="days")/365.25]
>
.Date("2006-01-01") & fini <= as.Date("2006-06-30"),
exposure := difftime(as.Date("2007-01-01"), fini, units="days")/365.25]
DT[fini >= as.Date("2006-07-01"), exposure := 0.5]
Best,
Ista
On Mon, Sep 26, 2016 at 11:28 AM, Frank S
Dear all,
I have a R data table like this:
DT <- data.table(
id = rep(c(2, 5, 7), c(3, 2, 2)),
fini = rep(as.Date(c('2005-04-20', '2006-02-19', '2006-10-08')), c(3, 2, 2)),
group = rep(c("A", "B", "A"), c(3, 2, 2)) )
I want to construct a new variable
Hi,
SolutionMetrics is presenting Data Visualisation and Data Science/Predictive
Modelling courses in Sydney, Melbourne, Canberra and Adelaide.
Data Visualisation (1 Day)
Introduction to Data Analysis and Graphics - Histograms, Box Plots, Bar Charts,
Scatter Plots; Changing symbols, colours,
Hello,
Try the following.
dat <- read.csv(text = "
Regime, Industry, Cost
10, 01, 370
11, 01, 400
10, 02, 200
10, 01, 500
11, 02, 60
10, 02, 30
")
dat
res <- aggregate(Cost ~ Industry + Regime, data = dat, sum)
res <- res[order(res$Industry), ]
res
And see the help page ?aggregate
Hope
: Paolo Letizia <paolo.leti...@gmail.com>
> Cc: R Help <r-help@r-project.org>
> Subject: Re: [R] Data aggregation
>
> ?tapply
>
> You should have encountered this already in most basic R tutorials.
> Have you gone through any? If not, you should. In particular,you need to
>
?tapply
You should have encountered this already in most basic R tutorials.
Have you gone through any? If not, you should. In particular, you need
to learn about R's basic data structures (e.g. data frames).
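As a sketch of the ?tapply pointer, using the numbers from the question entered directly as a data frame: giving tapply() two grouping factors returns a matrix of sums with one row per industry and one column per regime.

```r
# Data from the question.
dat <- data.frame(
  Regime   = c(10, 11, 10, 10, 11, 10),
  Industry = c(1, 1, 2, 1, 2, 2),
  Cost     = c(370, 400, 200, 500, 60, 30)
)

# Sum Cost within each Industry x Regime combination.
totals <- with(dat, tapply(Cost, list(Industry = Industry, Regime = Regime), sum))
print(totals)
```

The aggregate() call elsewhere in this thread gives the same sums in long format (one row per combination), which is usually easier to merge back into other data.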
Alternatively, the dplyr package has many elegant tools for this sort
of thing. You might
Dear All:
I have a data frame with 3 columns: "Regime", "Industry", and "Cost".
I want to sum the value of "Cost" for each industry and "Regime".
Example:
The data frame is:
Regime, Industry, Cost
10, 01, 370
11, 01, 400
10, 02, 200
10, 01, 500
11, 02, 60
10, 02, 30
I want the following output:
Hi,
I have a dataset obtained as:
mydata <- read.csv("data.csv", header = TRUE) which contains the variable
'y' (y is binary 0 or 1) and also another variable 'weight' (weight is a
numerical variable - taking fractional values between 0 and 1).
1>
I want to first apply ctree() on mydata, but
You could use transform() instead of [[<- to add columns to your data.frame
so the new columns get transformed they way they do when given to the
data.frame function itself. E.g.,
> dd <- data.frame(X=1:5, Y=11:15)
> str(transform(dd, Z=matrix(X+Y,ncol=1,dimnames=list(NULL, "NewZ"
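The example above is truncated, so here is a separate minimal sketch of the same contrast. The helper name z and the two result names are made up for illustration. Assigning with [[<- stores the one-column matrix as given, while transform() passes the new column through data.frame(), which converts it to an ordinary column.

```r
dd <- data.frame(X = 1:5, Y = 11:15)

# A one-column matrix with its own column name, as in the example above.
z <- matrix(dd$X + dd$Y, ncol = 1, dimnames = list(NULL, "NewZ"))

# [[<- keeps the matrix (and its dimnames) inside the data frame.
via_assign <- dd
via_assign[["Z"]] <- z

# transform() hands the new column to data.frame() for conversion.
via_transform <- transform(dd, Z = z)

str(via_assign)
str(via_transform)
```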
> On Apr 23, 2016, at 8:59 AM, thomas mann wrote:
>
> I am attempting to add a calculated column to a data frame. Basically,
> adding a column called "newcol2" which are the stock closing prices from 1
> day to the next.
>
> The one little hang up is the name of the
I am attempting to add a calculated column to a data frame. Basically,
adding a column called "newcol2" which are the stock closing prices from 1
day to the next.
The one little hang up is the name of the column. There seems to be an
additional data column name included in the attributes
In R, square brackets [] are called "extraction operators" as they are
interpreted so as to "extract" the parts of an object specified by the
information within them. Your message contained only part of the line
below:
AltB<-svdatstr[row,indicesA][svdatstr[row,indicesA]
Hi Jim,
Thanks for your time. But somehow this code did not help me to achieve my
expected output.
Problems: 1) x, y are coming as logical rather than values as I mentioned
in my post
2) The values that I get for Max A and Max B not correct
3) It looks like a pretty
Hi sri,
I think that I see what you mean. Your statements:
x = Count_A_less_than_max of (Count type B)
y = Count_A_higher_than_max of (Count type B).
I took to mean that you wanted a logical value for x and y. Looking
more closely at your initial message, I see that you wanted _all_
values of A
Hi sri,
As your problem involves a few logical steps, I found it easier to
approach it in a stepwise way. Perhaps there are more elegant ways to
accomplish this.
svdat<-read.table(text="Count id name type
117 335 sally A
19 335 sally A
167 335 sally B
18 340 susan A
56 340 susan A
22 340 susan B
Dear All,
I am trying to reshape the data with some conditions. A small part of the
data looks like below. Like this there will be more data with repeating ID.
Count id name type
117 335 sally A
19 335 sally A
167 335 sally B
18 340 susan A
56 340 susan A
22 340 susan B
53 340 susan B
135 351
>
> I'm new to R and wants to read XML file as R data frame. Is there any
> package that could be used for this purpose.
>
>
> I will really appreciate your response.
>
>
> Many Thanks and
>
>
> Kind Regards
>
> --
> Muhammad Bilal
> Resear
Hi All,
I'm new to R and want to read an XML file as an R data frame. Is there any package
that could be used for this purpose?
I will really appreciate your response.
Many Thanks and
Kind Regards
--
Muhammad Bilal
Research Assistant and PhD Student,
Bristol Enterprise, Research
Hi,
I found the error.
Thanks in advance
On Wed, Feb 17, 2016 at 4:01 PM, Shane Carey wrote:
> Hi,
>
> Im trying to append rows to a data frame using smartbind
>
> I have 3 dataframes:
>
> > dim(DATA_WH)[1] 235 24> dim(DATA_GW)[1] 3037 41> dim(DATA_NFGWS)[1] 2485
> >
Hi,
I'm trying to append rows to a data frame using smartbind.
I have 3 dataframes:
> dim(DATA_WH)
[1] 235  24
> dim(DATA_GW)
[1] 3037   41
> dim(DATA_NFGWS)
[1] 2485   62
B<-smartbind(DATA_NFGWS,DATA_WH)
However I get the following error:
Error in `[.data.frame`(block, , col) : undefined
The tables and vectors storing the data will be used for accessing the data
(sequentially is also fine) to do calculations as needed.
Regards
Alex
On Monday, February 15, 2016 7:17 PM, Bert Gunter
wrote:
I would say that it depends on what you want to do with
I would say that it depends on what you want to do with the data.
Bert
On Monday, February 15, 2016, Alaios via R-help
wrote:
> Dear all,I am using R to emulate radio propagation dynamics.
> I have 90 antennas in a region and each of these 90 antennas hold
> information
Dear all, I am using R to emulate radio propagation dynamics.
I have 90 antennas in a region and each of these 90 antennas hold information
about 36 points (these are all exactly the same and there is no need to
differentiate them further)
Each of these antennas now should keep information about
for compatibility with most R data visualization and
interactive visualization packages, such as ggplot2 and rCharts.
rZeppelin is available here: https://github.com/elbamos/Zeppelin-With-R
Hi
I need a data set containing both numerical and categorical variables and a
two-class outcome to be used in examples of my R package (for variable
selection). Do you know any well-known one? I prefer it to be related to
healthcare and to have at least about 15 variables.
Regards
Farideh
Hi Peter and Jeff!
Thanks very much for your code! Both worked perfectly in my data set!!
All best,
Raoni
2015-10-10 21:40 GMT-03:00 peter dalgaard :
>
>> On 11 Oct 2015, at 02:12 , Jeff Newmiller wrote:
>>
>> Sorry I missed the boat the first time,
Sorry, looked like there were a different number of rows in the results because
the rownames were different. I also see that the OP was interested in any
Groups, not just the two in the example, so your solution probably meets the
requirements better than mine
Hello R-Helpers!
I have a data-frame as below (dput in the end of mail) and need to
select just the first sequence of occurrence of each "Group" in each
"ID".
For example, for ID "1" I have two sequential occurrences of T2 and
two sequential occurrences of T3:
> test [test$ID == 1, ]
ID
?aggregate
in base R. Make a short function that returns the first element of a vector and
give that to aggregate.
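A sketch of that suggestion. The column names ID, Group and Var follow the dplyr version in this thread; the data and the helper name first_elem are made up for illustration. Note that this keeps one value per ID and Group combination, which may be less than the "first sequence" the original poster is after.

```r
# Hypothetical data in the same shape as the thread's example.
test <- data.frame(ID    = c(1, 1, 1, 2, 2),
                   Group = c("T2", "T2", "T3", "T2", "T2"),
                   Var   = c(5, 7, 9, 4, 6))

# Short function returning the first element of a vector.
first_elem <- function(v) v[1]

# First Var within each ID/Group combination.
res <- aggregate(Var ~ ID + Group, data = test, FUN = first_elem)
print(res)
```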
Or...
library(dplyr)
( test %>% group_by( ID, Group ) %>% summarise( Var=first( Var ) ) %>%
as.data.frame )
These situations where the desired results depend on the order of observations
in a dataset do tend to get a little tricky (this is one kind of problem that
is easier to handle in a SAS DATA step with its sequential processing
paradigm). I think this will do it:
keep <- function(d)
with(d,
Sorry I missed the boat the first time, and while it looks like Peter is
getting closer I suspect that is not quite there either due to the T2
being considered separate from T3 requirement.
Here is another stab at it:
library(dplyr)
# first approach is broken apart to show the progression of
Hello Jeff!
Thanks very much for your prompt reply, but this is not exactly what I
need. I need the first sequence of records. In example that I send, I
need the first seven lines of group "T2" in ID "1" (lines 3 to 9) and
others six lines of group "T3" in ID "1" (lines 10 to 15). I have to
> On 11 Oct 2015, at 02:12 , Jeff Newmiller wrote:
>
> Sorry I missed the boat the first time, and while it looks like Peter is
> getting closer I suspect that is not quite there either due to the T2 being
> considered separate from T3 requirement.
Er, what do you
I have a tab delimited table in the data directory of a package.
I would like that when loading this data with
data(tablename)
in the example section the strings are not coerced to factors.
How can I achieve it? Or should I move this tables to the inst/extdata
directory and load them with
This is the kind of problem the package tidyR has been designed for.
On 19 Aug 2015, at 16:29, minikg min...@cmfri.org.in wrote:
Hi,
I have a dataset consisting of landmarks of each sample's coordinates as
given below.
landmark X Y X Y X Y
P1
Hi,
I have a dataset consisting of landmarks of each sample's coordinates as
given below.
landmarkX Y X Y X Y
P1 534 7 26 7 32
P2 46 45 48 42 44 48
P3 73 45 72 44
Subject: [R] data format
Hi,
I have a dataset consisting of landmarks of each sample's coordinates as
given below.
landmarkX Y X Y X Y
P1 534 7 26 7 32
P2 46 45 48 42 44 48
P3 73 45
Hello all,
I would like to take a data frame such as the following one:
df <-
data.frame(id=c("A","A","B","B"),first=c("BX",NA,NA,"LF"),second=c(NA,"TD","BZ",NA),third=c(NA,NA,"RB","BT"),fourth=c("LG","QR",NA,NA))
df
  id first second third fourth
1  A    BX     NA    NA     LG
2  A    NA     TD    NA     QR
3 B NA
Here's one way in base R:
df <- data.frame(id=c("A","A","B","B"),
first=c("BX",NA,NA,"LF"),
second=c(NA,"TD","BZ",NA),
third=c(NA,NA,"RB","BT"),
fourth=c("LG","QR",NA,NA))
new_df <- data.frame(do.call(rbind, by(df, df$id, function(x) {
sapply(x[,-1],