Attachments have been stripped by the mailing list. Read the Posting Guide.
Also, English can help, but R code can be ever so much more clear in indicating
what you have to work with and even what you want out of the broken/missing
part of your code.
https://cran.r-project.org/web/packages/repr
I really think you need to create a simple reprex to show us what you want
to do. In doing so, you may figure out how to get what you want. I suspect
you may also need to spend some more time learning R -- following rote
examples can be a fool's errand if you don't know the basics.
Bert Gunter
Hi,
I am trying to merge columns from four different .csv files into one
dataframe. I am trying to do something like this
https://statisticsglobe.com/merge-csv-files-in-r . I am taking long format
.csv files, 1 being the base file (testing-long.csv) which I change to wide
format first and the thre
r-help mailing list
> Subject: Re: [R] merging multiple .csv files
Did you work the examples in help("merge")? Also, have you looked at
the "dplyr" package? It has 9 different vignettes. The lead author is
Hadley Wickham, who won the 2019 COPSS Presidents' Award for work like
this.
Alternatively, you could manually read all 10 files, then figure out
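That manual approach can be sketched like this. The "Pos" key comes from the thread; the file names and contents below are stand-ins written to temp files so the example is self-contained:

```r
# write three small stand-in .csv files (the thread has 10 real ones)
files <- replicate(3, tempfile(fileext = ".csv"))
for (i in seq_along(files)) {
  df <- data.frame(Pos = c(100, 200, 300 + i), val = rnorm(3))
  names(df)[2] <- paste0("val", i)
  write.csv(df, files[i], row.names = FALSE)
}

# read them all, then merge on the common 'Pos' column;
# all = TRUE keeps rows that appear in only some of the files
dfs <- lapply(files, read.csv)
merged <- Reduce(function(x, y) merge(x, y, by = "Pos", all = TRUE), dfs)
merged
```

With real data you would build `files` with list.files(pattern = "\\.csv$") instead.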
I know that but I do not want to merge them sequentially because I may lose
some rows which are present in one file while the other doesn't have. I
googled and found something called multmerge but the code is not working
for me. I used the following:
path <-"P:/Documents/Puja Desktop items/Documen
?read.csv to read your csv files into data frames
?merge to merge them (sequentially).
Bert Gunter
"The trouble with having an open mind is that people keep coming along and
sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
On Tue, Dec 15, 2020 at 1:36
Hi All,
I have 10 .csv files containing 12 to 15 columns but have some columns in
common. I need to join all of my .csv files into one using one common
column ‘Pos’. The header portion of my .csv files looks as shown below:
Chrom Pos Avg Stdev A15_3509.C A31_3799.C A32_3800.C A35_3804.C Gene ID
Sorry... I missed that the mailing list had been removed from this email.
Tina... always keep the mailing list included when you ask questions.
On Thu, 13 Jun 2019, Jeff Newmiller wrote:
I am sorry I did not read more closely earlier... I agree with Bert... you do
need to spend some time learn
Jeff:
Your solution is not quite what she asked for (she wanted a data frame, not
a list).
Moreover, most of the time it is done automatically as the first step of a
tapply()/filter()-type operation or is inherent in modeling and
trellis-type plots. I *still* suspect it is unnecessary, but of cou
I do it regularly.
Base R:
result <- split( DF[ , 4, drop=FALSE ], DF[ , -4 ] )
Tidyverse:
library(tidyr)
result <- nest( DF, time )
filter( result, "a2"==a & "b1"==b & "c1"==c )[[ "data" ]]
On Thu, 13 Jun 2019, Bert Gunter wrote:
Why? I suspect that there is no reason that you need to do this.
Cheers,
Bert
Bert Gunter
On Thu, Jun 13, 2019 at 1:22 PM Tina
How about just
df$time[match(paste(df$a, df$b, df$c), c(
"co mb o1",
..
"co mb oN"))]
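The spaces inside those combo strings matter because paste() separates with a space by default. Using the data frame from the thread, a lookup looks like:

```r
a <- c("a1","a2","a2","a1","a1","a1")
b <- c("b1","b1","b1","b1","b1","b2")
c <- c("c1","c1","c1","c1","c1","c2")
df <- data.frame(a, b, c, time = runif(6))

key <- paste(df$a, df$b, df$c)   # "a1 b1 c1", "a2 b1 c1", ...
df$time[match("a2 b1 c1", key)]  # time of the first matching row
```

Note match() returns only the first match; use which(key == "a2 b1 c1") for all of them.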
On Fri, 14 Jun 2019 at 08:22, Tina Chatterjee
wrote:
Hello everyone!
I have the following dataframe(df).
a<-c("a1","a2","a2","a1","a1","a1")
b<-c("b1","b1","b1","b1","b1","b2")
c<-c("c1","c1","c1","c1","c1","c2")
time <- c(runif(6,0,1))
df<-data.frame(a,b,c,time)
df
a b c time
1 a1 b1 c1 0.28781082
2 a2 b1 c1 0.02102591
3 a2 b1 c1 0.724
Have a look at anti_join() from the dplyr package. It does exactly
what you want. Here is an example based on the code of Robin
Table_A <- as.data.frame(Table_A, stringsAsFactors = FALSE)
Table_B <- as.data.frame(Table_B, stringsAsFactors = FALSE)
library(dplyr)
anti_join(Table_A, Table_B,
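A self-contained version of that idea; the join column "Email" is an assumption, since the call above is truncated, and the table contents are stand-ins:

```r
library(dplyr)

Table_A <- data.frame(Email = c("a@gmail.com", "b@yahoo.com"),
                      Name  = c("John Chan", "Tim Ma"),
                      stringsAsFactors = FALSE)
Table_B <- data.frame(Email = "a@gmail.com", Name = "John Chan", Sex = "M",
                      stringsAsFactors = FALSE)

# rows of Table_A whose Email has no match in Table_B
anti_join(Table_A, Table_B, by = "Email")
```

Unlike merge(), anti_join() does not require the two tables to have the same columns, which is exactly the unequal-width situation described in the thread.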
Hi,
I've coded your example into R code:
Table_A <- c('a...@gmail.com', 'John Chan', '0909')
Table_A <- rbind(Table_A, c('b...@yahoo.com', 'Tim Ma', '89089'))
colnames(Table_A) <- c('Email', 'Name', 'Phone')
Table_A
Table_B <- c('a...@gmail.com', 'John Chan', 'M', '0909')
Table_B <- rbind(Table_
Thanks - Peter, Eivind, Rui
Sorry, I perhaps could not explain it properly in the first go.
Trying to simplify it here with an example - Say I have two dataframes as
below that are NOT equally-sized data frames (i.e., number of columns are
different in each table):
Table_A:
Email
Subject: Re: [R] Merging dataframes
To: Rui Barradas <ruipbarra...@sapo.pt>
Cc: Chintanu <chint...@gmail.com>, R help <r-help@r-project.org>
I'd expect more like
setdiff(A$key, B$key)
and
On Tue, 1 May 2018, Chintanu wrote:
Hi,
May I please ask how I do the following in R. Sorry - this may be trivial,
but I am struggling here for this.
For two dataframes (A and B), I wish to identify (based on a primary
key-column present in both A & B) -
1. Which records (rows) of A did no
I'd expect more like
setdiff(A$key, B$key)
and vice versa. Or, if you want the actual rows
A[!(A$key %in% B$key),]
or for the row numbers
which(!(A$key %in% B$key))
-pd
> On 1 May 2018, at 12:48 , Rui Barradas wrote:
Hello,
Is it something like this that you want?
x <- data.frame(a = c(1:3, 5, 5:10), b = c(1:7, 7, 9:10))
y <- data.frame(a = 1:10, b = 1:10)
which(x != y, arr.ind = TRUE)
Hope this helps,
Rui Barradas
On 5/1/2018 11:35 AM, Chintanu wrote:
Hi,
May I please ask how I do the following in R
Hi,
May I please ask how I do the following in R. Sorry - this may be trivial,
but I am struggling here for this.
For two dataframes (A and B), I wish to identify (based on a primary
key-column present in both A & B) -
1. Which records (rows) of A did not match with B, and
2. Which records
To expand on what Bert suggests. Use:
loadToEnv <- function(file, ..., envir = new.env()) {
  base::load(file = file, envir = envir, ...)
  envir  # return the environment; load() itself returns only the object names
}
envA <- loadToEnv("a.RData")
envB <- loadToEnv("b.RData")
and then access the objects in environments envA and envB using
environment access methods, e.g.
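For example (self-contained: a small demo object is saved first so there is a file to load; note the function returns `envir`, since load() itself returns only the names of the loaded objects):

```r
loadToEnv <- function(file, ..., envir = new.env()) {
  base::load(file = file, envir = envir, ...)
  envir                         # return the environment itself
}

tmp <- tempfile(fileext = ".RData")
results <- data.frame(me = 0.64, se = 0.219)
save(results, file = tmp)

envA <- loadToEnv(tmp)
ls(envA)                        # objects that were loaded: "results"
get("results", envir = envA)    # or equivalently envA$results
```

Loading each .RData file into its own environment keeps identically named objects from the two projects from overwriting each other.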
?load
Read this carefully. Pay attention to its instructions re: overwriting
existing objects.
Cheers,
Bert
Bert Gunter
On T
BTW, what do you mean exactly by "combine/consolidate"?
And finally, post your questions in plain text not html, otherwise
they can be mangled.
Cheers
Petr
-Original Message-
From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
Steven Yen
Sent: Tue
On 16/01/2018 3:43 AM, Steven Yen wrote:
I ran two separate hours-long projects. Results of each were saved to
two separate .RData files.
Content of each includes, among others, the following:
me se t p sig
pc21.age 0.640 0.219 2.918 0.004 ***
pc21.agesq 0.000 0.000 NaN NaN
pc21.inc
Hi Bailey,
I may be misunderstanding what you are doing as I can't work out how
you get unequal column lengths, but this may help:
myval<-matrix(sample(1:365,740,TRUE),ncol=74)
mydata<-as.data.frame(cbind(1950:1959,myval))
lakenames<-paste(rep(LETTERS[1:26],length.out=74),
rev(rep(letters[1:25],l
I frequently work with mismatched-length data, but I think I would rarely want
this behaviour because there is no compelling reason to believe that all of the
NA values should wind up at the end of the data as you suggest. Normally there
is a second column that controls where things should line
You should review "The Recycling Rule in R" before attempting to
perform functions on 2 or more vectors of unequal lengths:
https://cran.r-project.org/doc/manuals/R-intro.html#The-recycling-rule
Most often, the "Recycling Rule" does exactly what the researcher
intends (automatically). And in many
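A quick illustration of the rule:

```r
x <- 1:6
y <- 1:3
x + y          # y is recycled twice: 2 4 6 5 7 9

# recycling a length that does not divide evenly still computes,
# but R emits a "longer object length" warning
z <- 1:4
x + z
```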
You can't do that. You can either make a different data frame, or you can stack
the data in additional rows. If you make your example reproducible, we may be
able to give more specific help.
Also, post in plain text to avoid HTML code corruption.
http://stackoverflow.com/questions/5963269/ho
I did not look at the code, but note the following.
By definition,
1. You cannot highlight code in plain text, which is the format accepted by
r-help.
2. You cannot have columns of different lengths in a dataframe.
R. Mark Sharp, Ph.D.
msh...@txbiomed.org
> On Dec 12, 2016, at 5:41 PM, Bail
you also don't need to do a merge if you use a base `geom_map()`
layer with the polygons and another using the fill (or points, lines,
etc).
On Fri, Jun 17, 2016 at 5:08 PM, MacQueen, Don wrote:
> And you can check what David and Jeff suggested like this:
>
> intersect( df$COUNTRY, world_map$reg
Don't use HTML on sending email- messes up the data.
What do you mean that you get lots of duplicates? If you have duplicated
entries in df2 this will lead to dups because of the way merge works (here
is the help file):
If there is more than one match, all possible matches contribute
one r
Hi all,
I have two data sets similar like below and wanted to merge them with variable
"deps". As this is a sample data with small sample size, I don't have any
problem using command merge. However, the actual data set has ~60,000
observations with a lot of repeated measures. For example, for a
And you can check what David and Jeff suggested like this:
intersect( df$COUNTRY, world_map$region )
If they have any values in common, that command will show them. (Note that
I said values in common, not countries in common.)
WARNING:
It appears that you have each country appearing more than on
> On Jun 17, 2016, at 1:06 PM, ch.elahe via R-help wrote:
>
> Hi all,
> I want to use world map in ggplot2 and show my data on world map. my df is:
>
>
>$ COUNTRY : chr "DE" "DE" "FR" "FR" ..
>
>$ ContrastColor : int 9 9 9 9 13 9 9 9 9 ..
>
>$ quant :
You should look at your own data before you post. The information in COUNTRY is
not the same as the information in region.
Also, dput is better than str for posting questions.
--
Sent from my phone. Please excuse my brevity.
On June 17, 2016 1:06:29 PM PDT, "ch.elahe via R-help"
wrote:
>Hi a
Hi all,
I want to use world map in ggplot2 and show my data on world map. my df is:
$ COUNTRY : chr "DE" "DE" "FR" "FR" ..
$ ContrastColor : int 9 9 9 9 13 9 9 9 9 ..
$ quant : Factor w/ 4 levels "FAST","SLOW",..
I need to merge my
df with world_map data
Hi
see in line
> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Abraham
> Mathew
> Sent: Friday, June 10, 2016 6:15 PM
> To: r-help@r-project.org
> Subject: [R] Merging two data frame with different lengths
>
> So I have
So I have two data frames.
The first one is a reccomendation data frame and the second is a melted
list with a pairing of OpportunityId's and ProductId's. There are multiple
product id's per an opportunty id. What I want to do is merge based on
ProductId so that I can add the OpportunityId to the
I would probably do it this way,
tmp <- list(data.frame(name="sample1", red=20),
data.frame(name="sample1", green=15),
data.frame(name="sample2", red=10),
data.frame(name="sample2", green=30))
fun1 <- function(df) data.frame(name=df$name, color=names(df)[2],
va
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of Michael
> Dewey
> Sent: Monday, June 6, 2016 3:46 PM
> To: g.maub...@weinwolf.de; r-help@r-project.org
> Subject: Re: [R] Merging variables
>
Dear Georg
I find it a bit surprising that you end up with customer.x and
customer.y. Can you share with us a toy example of two data.frames which
exhibit this behaviour?
On 06/06/2016 13:29, g.maub...@weinwolf.de wrote:
Hi All,
I merged
> (is.na(ds_test[,2])+2*is.na(ds_test[,1]))+temp-1
[1] 1 1 2 -1 4
> -Original Message-
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of
> g.maub...@weinwolf.de
> Sent: Monday, June 6, 2016 2:30 P
You loop through each row but during each iteration you assign a value to the
entire "mismatch" column. The last value assigned was 1.
Sent from my iPhone
> On Jun 6, 2016, at 8:29 AM, g.maub...@weinwolf.de wrote:
>
> Hi All,
>
> I merged two datasets:
>
> ds_merge1 <- merge(x = ds_bw_custome
Hi All,
I merged two datasets:
ds_merge1 <- merge(x = ds_bw_customer_4_match, y =
ds_zww_customer_4_match,
by.x = "customer", by.y = "customer",
all.x = TRUE, all.y = FALSE)
R created a new dataset with the variables customer.x and customer.y. I
would like to merge these two variable back
Here is how you can to it with tidyr:
> x <- list(data.frame(name="sample1", red=20)
+ , data.frame(name="sample1", green=15)
+ , data.frame(name="sample2", red=10)
+ , data.frame(name="sample2", green=30)
+ )
> library(dplyr)
> library(tidyr)
>
> # convert to 'name, type, value';
Thanks, ldply got me a data frame straight away. But it filled empty
spaces with NA and merge no longer works.
> ldply(mylist)
name red green
1 sample1  20    NA
2 sample1  NA    15
3 sample2  10    NA
4 sample2  NA    30
> mydf <- ldply(mylist)
> merge(mydf[1,],mydf[2,])
[1] name red gre
Hello,
Sorry, forget my first answer, I misunderstood what you wanted.
Let's try again.
First of all you have a typo in your second sample2, you wrote 'sample
2' with a space.
Now try this.
fun2 <- function(n){
merge(lst[[n]], lst[[n + 1]])
}
N <- which(seq_along(lst) %% 2 == 1)
lst2 <- l
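The last line above is truncated; presumably it applies fun2 to the odd positions, something like:

```r
lst <- list(data.frame(name = "sample1", red   = 20),
            data.frame(name = "sample1", green = 15),
            data.frame(name = "sample2", red   = 10),
            data.frame(name = "sample2", green = 30))

fun2 <- function(n) merge(lst[[n]], lst[[n + 1]])
N <- which(seq_along(lst) %% 2 == 1)  # 1, 3: the first element of each pair
lst2 <- lapply(N, fun2)
do.call(rbind, lst2)                  # one row per sample: name, red, green
```

This relies on the red/green pairs for each sample being adjacent in the list.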
You can use ldply in the plyr package to bind all the data.frames together
(a regular loop will also work). Afterwards you can summarise using ddply
Hope this helps
Ulrik
Ed Siefker schrieb am Fr., 3. Juni 2016 21:10:
> aggregate isn't really what I want. Maybe tapply? I still can't get
> it
Hello,
Maybe something like the following.
lst <-
list(data.frame(name="sample1", red=20), data.frame(name="sample1",
green=15), data.frame(name="sample2", red=10), data.frame(name="sample
2", green=30))
fun <- function(DF){
data.frame(name = DF[, 1], color = colnames(DF)[2], colnum = DF
aggregate isn't really what I want. Maybe tapply? I still can't get
it to work.
> length(mylist)
[1] 4
> length(names)
[1] 4
> tapply(mylist, names, merge)
Error in tapply(mylist, names, merge) : arguments must have same length
I guess because a list isn't an atomic data type. What function wi
I manually constructed the list of sample names and tried the
aggregate call I mentioned.
Merge works when called manually, but not when using aggregate.
> mylist <- list(data.frame(name="sample1", red=20), data.frame(name="sample1",
> green=15), data.frame(name="sample2", red=10), data.frame(na
I have a list of data as follows.
> list(data.frame(name="sample1", red=20), data.frame(name="sample1",
> green=15), data.frame(name="sample2", red=10), data.frame(name="sample 2",
> green=30))
[[1]]
name red
1 sample1 20
[[2]]
name green
1 sample1 15
[[3]]
name red
1 sample
You have by now seen some other responses on the list. Keeping the list
included will ensure you continue to get multiple eyes looking at the
problem and will benefit others trying to use the answers.
Two comments:
1) Your first format includes a specification for seconds. If that is
nonzero
> On May 22, 2016, at 9:40 AM, Bhaskar Mitra wrote:
>
> Hello, I am trying to merge two text files by using the timestamp
> header for both the files:
>
> The first file has the following format for the timestamp:"2012-01-01
> 23:30:00 UTC"
>
> Timestamp for the second file : 2012-01-01 2330.
Is this the format of a column within the two different files? If they are
columns, here is a way of converting to a common format for merging:
> # convert to POSIXct
> date1 <- as.POSIXct("27-Dec-12 23H 30M 0S", format = "%d-%b-%y %HH %MM %SS")
> date2 <- as.POSIXct('2012-12-27 2330', format =
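A self-contained sketch of converting both timestamp styles to a common POSIXct before merging (the time zone "UTC" is taken from the first file's format; the second call's format string is my reconstruction, since the line above is truncated):

```r
date1 <- as.POSIXct("2012-01-01 23:30:00", tz = "UTC")
date2 <- as.POSIXct("2012-01-01 2330", format = "%Y-%m-%d %H%M", tz = "UTC")
date1 == date2   # identical once both are parsed into POSIXct
```

Once both columns are POSIXct in the same time zone, an ordinary merge() on them works.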
What time zone are these data in? Does daylight savings adjustment apply?
--
Sent from my phone. Please excuse my brevity.
On May 22, 2016 9:48:08 AM PDT, Bhaskar Mitra wrote:
>Hello,
>
>My apologies for the earlier posting. There was an error with regard to
>my
>query :
>
>
>I am trying to mer
Hello, I am trying to merge two text files by using the timestamp
header for both the files:
The first file has the following format for the timestamp:"2012-01-01
23:30:00 UTC"
Timestamp for the second file : 2012-01-01 2330.
I am having problems by converting from one timestamp format to anothe
Hello,
My apologies for the earlier posting. There was an error with regard to my
query :
I am trying to merge two text files by using the timestamp
header for both the files:
The first file has the following format for the timestamp:"27-Dec-12 23H
30M 0S"
Timestamp for the second file : 2012-
Kunden <- Kunden_2011
Kunden <- merge(Kunden, Kunden_2012,
by = "Debitor", all = TRUE)
etc.
See ?merge for details.
Best,
Ista
On Wed, Apr 20, 2016 at 2:23 AM, wrote:
> Hi All,
>
> I would like to match some datasets. Both deliver variables AND cases
> which might or might not
> On Apr 19, 2016, at 11:23 PM, g.maub...@weinwolf.de wrote:
>
> Hi All,
>
> I would like to match some datasets. Both deliver variables AND cases
> which might or might not be present in all datasets:
>
> This sequence
>
> Kunden <- Kunden_2011
> Kunden <- merge(Kunden, Kunden_2012,
>
Hi All,
I would like to match some datasets. Both deliver variables AND cases
which might or might not be present in all datasets:
This sequence
Kunden <- Kunden_2011
Kunden <- merge(Kunden, Kunden_2012,
by.x = "Debitor", by.y = "Debitor")
Kunden <- merge(Kunden, Kunden_2013,
And more!
:-)
On Mar 7, 2016, at 8:15 AM, Michael Dewey wrote:
> Inline
>
> On 07/03/2016 12:21, hoda rahmati via R-help wrote:
>> Hi all,I have a data frame which have a column named COUNTRY, I want to
>> merge my data frame with world map from ggmap to plot it on world map, but
>> when I mer
Inline
On 07/03/2016 12:21, hoda rahmati via R-help wrote:
Hi all, I have a data frame which has a column named COUNTRY. I want to merge
my data frame with the world map from ggmap to plot it on the world map, but
when I merge my data frame with the world map I get 0 observations! Here is my
main data frame (mydata):
'data.frame': 269265 obs. of 470 variabl
You did not show the structure of your datasets (with, e.g.,
dump(c("datafile1","datafile2"),file=stdout())) nor what your call to
merge() was. However, it may be that you did not use the by.x and by.y
arguments to merge() to specify which columns to match.
txt1 <- "date1 xva
Since the date columns have different names, you need to specify the
by.x and by.y arguments to merge().
Other than that, it should work.
If you need more help, please use dput() to provide some of your data,
and include both the code you used and the error message or incorrect
result you got (th
Hello there
Pardon my ignorance but, I have two data files with different series of dates
and x and y values.
There are common dates in both files. For example
>datafile1
date1 xval
31/12/1982 20
1/01/1983 30
2/01/1983 40
3/01/1983 50
4/
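Using data like the sample above, the earlier advice to set by.x and by.y looks like this (the values and the second file's column name are stand-ins from the excerpt):

```r
datafile1 <- data.frame(date1 = c("31/12/1982", "1/01/1983", "2/01/1983"),
                        xval  = c(20, 30, 40))
datafile2 <- data.frame(date2 = c("1/01/1983", "2/01/1983", "5/01/1983"),
                        yval  = c(5, 6, 7))

# merge on the differently named date columns; only the common dates survive
merge(datafile1, datafile2, by.x = "date1", by.y = "date2")
```

Adding all = TRUE would instead keep every date from both files, with NA where a file has no value.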
Thanks for your comments. Actually only the last group has a single element.
The first group is always "full" of members and as that it works fine. Some
constant spacing between the groups would be good as well and thus I will check
quantiles.
Thanks for the great support and time invested on th
Whatever approach is "best" to define subsets depends completely on the
semantics of the data. Your approach (a fixed number of equally spaced breaks)
is the right one if the absolute ranges of the data is important. It should be
obvious that either the top or the bottom group could contain only
The breaks are just the min() and max() in your groups. Something like
sprintf("[%5.2f,%5.2f]", min(dBin[groups==2]), max(dBin[groups==2]))
... should achieve what you need.
B.
On Nov 4, 2015, at 8:45 AM, Alaios wrote:
> you are right.
> by labels I mean the "categories", "breaks" that m
you are right. By labels I mean the "categories", "breaks" that my data fall
in. To be part of group 2 for example you have to be in the range of [110,223).
I need to keep those for my plots.
Did I describe it more precisely now?
Alex
On Wednesday, November 4, 2015 2:09 PM, Boris Steipe
wr
I don't understand:
- where does the "label" come from? (It's not an element of your data that I
see.)
- what do you want to do with this "label" i.e. how does it need to be
associated with the data?
B.
On Nov 4, 2015, at 7:57 AM, Alaios wrote:
> Thanks it works great and gives me group
Thanks, it works great and gives me group numbers as integers, and thus I can
check which elements are in which group as needed (which(groups == 2)).
The question though is how to also keep the labels for each group. For example,
that my first group is [13,206).
Regards,
Alex
On Wednesday, November 4, 20
I would transform the original numbers into integers which you can use as group
labels. The row numbers of the group labels are the indexes of your values.
Example: assume your input vector is dBin
nGroups <- 5 # number of groups
groups <- (dBin - min(dBin)) / (max(dBin) - min(dBin)) # rescale
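The rescaling line above is truncated; one plausible completion is below, together with cut(), which also keeps the interval labels asked about later in the thread (the data are stand-ins for the thread's distance vector):

```r
set.seed(42)
dBin <- runif(20, 13, 343)            # stand-in distances
nGroups <- 5

# integer group labels by rescaling to 1..nGroups
groups <- ceiling(nGroups * (dBin - min(dBin)) / (max(dBin) - min(dBin)))
groups[groups == 0] <- 1              # the minimum lands in group 1

# cut() does the same job and keeps interval labels such as "(13,79]"
bins <- cut(dBin, breaks = nGroups)
which(groups == 2)                    # indexes of the elements in group 2
```

levels(bins) gives the break labels for every group in one call.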
Thanks for the answer. split() does not give me the indexes though, only which
group they fall in. I also need the index of the group: is it the first, the
second .. group?
Alex
On Tuesday, November 3, 2015 5:05 PM, Ista Zahn wrote:
Probably
split(binDistance, test).
Best,
Ista
On Tue, Nov 3, 2015 at 10:47 AM, Alaios via R-help wrote:
> Dear all,I am not exactly sure on what is the proper name of what I am trying
> to do.
> I have a vector that looks like
> binDistance
>[,1]
> [1,] 238.95162
> [2,] 143.0859
Dear all, I am not exactly sure on what is the proper name of what I am trying
to do.
I have a vector that looks like
binDistance
[,1]
[1,] 238.95162
[2,] 143.08590
[3,] 88.50923
[4,] 177.67884
[5,] 277.54116
[6,] 342.94689
[7,] 241.60905
[8,] 177.81969
[9,] 211.25559
[10,] 27
Replacing na.omit() with !is.na() appears to improve performance with time.
rm(list=ls())
test1 <- (rbind(c(0.1,0.2),0.3,0.1))
rownames(test1)=c('y1','y2','y3')
colnames(test1) = c('x1','x2');
test2 <- (rbind(c(0.8,0.9,0.5),c(0.5,0.1,0.6)))
rownames(test2) = c('y2','y5')
colnames(te
I reworked Frank Schwidom's solution to make it shorter than its original
version.
test1 <- (rbind(c(0.1,0.2),0.3,0.1))
rownames(test1)=c('y1','y2','y3')
colnames(test1) = c('x1','x2');
test2 <- (rbind(c(0.8,0.9,0.5),c(0.5,0.1,0.6)))
rownames(test2) = c('y2','y5')
colnames(test2) = c(
Another approach:
test1 <- data.frame(rbind(c(0.1,0.2),0.3,0.1))
rownames(test1) = c('y1','y2','y3')
colnames(test1) = c('x1','x2');
test2 <- data.frame(rbind(c(0.8,0.9,0.5),c(0.5,0.1,0.6)))
rownames(test2) = c('y2','y5')
colnames(test2) = c('x1','x3','x2')
> test1
x1 x2
y1 0.1 0.2
y2 0.3 0.
test1 <- (rbind(c(0.1,0.2),0.3,0.1))
rownames(test1)=c('y1','y2','y3')
colnames(test1) = c('x1','x2');
test2 <- (rbind(c(0.8,0.9,0.5),c(0.5,0.1,0.6)))
rownames(test2) = c('y2','y5')
colnames(test2) = c('x1','x3','x2')
lTest12 <- list( test1, test2)
namesRow <- unique( unlist( lapply( lTest12, ro
Dear R users,
I am trying to merge tables based on both their row names and column names.
My ultimate goal is to build a distribution table of values for each
combination of row and column names.
I have more test tables, more x's and y's than in the toy example below.
Thanks in advance for your
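One way to complete the truncated solutions above is to take the union of the row and column names and fill a template matrix; where tables overlap, later ones overwrite earlier ones:

```r
test1 <- rbind(c(0.1, 0.2), 0.3, 0.1)
rownames(test1) <- c("y1", "y2", "y3"); colnames(test1) <- c("x1", "x2")
test2 <- rbind(c(0.8, 0.9, 0.5), c(0.5, 0.1, 0.6))
rownames(test2) <- c("y2", "y5"); colnames(test2) <- c("x1", "x3", "x2")

lTest <- list(test1, test2)
rows <- unique(unlist(lapply(lTest, rownames)))
cols <- unique(unlist(lapply(lTest, colnames)))
out  <- matrix(NA_real_, length(rows), length(cols),
               dimnames = list(rows, cols))
for (m in lTest) out[rownames(m), colnames(m)] <- m
out
```

Character indexing on the dimnames does the alignment, so the tables' rows and columns can appear in any order.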
I have two tables that I would like to join together in a way equivalent to
the following SQL. Note that I'm using a "greater than" statement in my
join, rather than checking for equality.
require(sqldf)
require(data.table)
dt <- data.table(num=c(1, 2, 3, 4, 5, 6), char=c('A', 'A', 'A', 'B', 'B
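As a base-R baseline for that "greater than" join: cross-join within groups, then filter. The second table and its cutoff values are hypothetical, since the post is truncated; sqldf or data.table's non-equi joins scale better on large data.

```r
dt  <- data.frame(num  = c(1, 2, 3, 4, 5, 6),
                  char = c("A", "A", "A", "B", "B", "B"))
lim <- data.frame(char = c("A", "B"), cutoff = c(2, 4))  # hypothetical limits

xj <- merge(dt, lim, by = "char")   # pair every row with its group's cutoff
xj[xj$num > xj$cutoff, ]            # keep rows where num > cutoff
```

This materialises every within-group pair before filtering, which is fine for small tables but quadratic in the worst case.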
package dplyr's full_join, left_join, right_join, inner_join are also
comparable in speed to data table. The syntax is also more like merge's.
On Thu, Jan 15, 2015 at 2:17 PM, Mike Miller wrote:
> Thanks, Jeff. You really know the packages. I search and I guess I
> didn't use the right terms.
Thanks, Jeff. You really know the packages. I searched and I guess I
didn't use the right terms. That package seems to do exactly what I
wanted.
Mike
On Tue, 13 Jan 2015, Jeff Newmiller wrote:
On Tue, 13 Jan 2015, Mike Miller wrote:
I have many pairs of data frames each with about 15 million records each and
about 10 million records in common. They are sorted by two of their fields
and will be merged by those same fields.
The fact that the data are sorted could be used to greatly
I have many pairs of data frames each with about 15 million records each
and about 10 million records in common. They are sorted by two of their
fields and will be merged by those same fields.
The fact that the data are sorted could be used to greatly speed up a
merge, but I have the impressi
Have you looked at the merge() function?
Here is an example. I don't know if it resembles your problem.
> M1 <- data.frame(V1=letters[1:3], V2=LETTERS[26:24], N1=101:103)
> M2 <- data.frame(V1=letters[c(3,1,2,3,2)],
V2=LETTERS[c(23,26,22,24,24)], N2=c(1003,1001,1002,1003,1002))
> merge(M1, M2)
Below...
On Mon, 8 Dec 2014, David Lambert wrote:
I have 2 data frames, M1[n,20] and M2[m,30].
What does this mean? It might be intended to convey matrix dimensions, but
these are not matrices and that is not R syntax.
If V1 and V2 are the same in both M1 and M2, then append V3-V30 from M
I have 2 data frames, M1[n,20] and M2[m,30].
If V1 and V2 are the same in both M1 and M2, then append V3-V30 from M2 onto M1.
Otherwise, continue searching for a match.
M1 is complete for all V1 and V2. M2 is missing observations for V1 or V2, or
both.
I can't figure this one out, except
On Fri, 1 Aug 2014 07:25:05 AM barbara tornimbene wrote:
> HI.
> I have a set of disease outbreak data. Each observation have a
> location (spatial coordinates) and a start date. Outbreaks that occur in
> the same location within a two week periods have to be merged. Basically I
> need to delete
HI.
I have a set of disease outbreak data. Each observation has a location
(spatial coordinates) and a start date.
Outbreaks that occur in the same location within a two-week period have to be
merged.
Basically I need to delete duplicates that have the same spatial coordinates and
start dates co