The traditional (SQL) way to attack this problem is to make the data structure simpler so that faster comparisons can be utilized:

################
A <- data.frame(z=c("a*b", "c*d", "d*e", "e*f"), t =c(1, 2, 3, 4))
B <- data.frame(z=c("a*b::x*y", "c", "", "g*h"), t =c(1, 2, 3, 4))

library(dplyr)
#>
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#>
#>     filter, lag
#> The following objects are masked from 'package:base':
#>
#>     intersect, setdiff, setequal, union
library(tidyr)
Bx <- (   B
      %>% mutate( z_B = as.character( z ) )
      %>% rename( t_B = t )
      %>% separate_rows( z, sep="::" )
      )
Bx
#>     z t_B      z_B
#> 1 a*b   1 a*b::x*y
#> 2 x*y   1 a*b::x*y
#> 3   c   2        c
#> 4       3
#> 5 g*h   4      g*h
result <- (   A
          %>% mutate( z = as.character( z ) )
          %>% rename( t_A = t )
          %>% inner_join( Bx, by="z" )
          )
result
#>     z t_A t_B      z_B
#> 1 a*b   1   1 a*b::x*y

#' Created on 2018-07-18 by the [reprex package](http://reprex.tidyverse.org) 
(v0.2.0).
################

Note that this is preferable if you can avoid ever creating the complex data z in B, but Bx is much more flexible and less error prone than B. (Especially if you don't have to create B$z_B at all, but have some other unique identifier(s) for the groupings represented by each row in B.)

On Wed, 18 Jul 2018, Bogdan Tanasa wrote:

Thanks a lot ! It looks that I am getting the same results with :

B %>% regex_left_join(A, by = c(z = 'z'))

On Wed, Jul 18, 2018 at 3:57 PM, Riley Finn <rileyfi...@gmail.com> wrote:

please may I ask for a piece of advise regarding merging two dataframes :
A <- data.frame(z=c("a*b", "c*d", "d*e", "e*f"), t =c(1, 2, 3, 4))
B <- data.frame(z=c("a*b::x*y", "c", "", "g*h"), t =c(1, 2, 3, 4))
function of the criteria :
if "the elements in the 1st column of A could be found among the elements
of the 1st column of B" i.e.
for the example above, we shall combine in the results only the row with
"a*b" of A with the row with "a*b::x*y" of B.


This may be what you are looking for:

library(fuzzyjoin)

The inner join returns just the one row where the string matches.
B %>%
  regex_inner_join(A, by = c(z = 'z'))

While the full join returns NA's where the string does not match.
B %>%
  regex_full_join(A, by = c(z = 'z'))

On Wed, Jul 18, 2018 at 5:20 PM Bogdan Tanasa <tan...@gmail.com> wrote:

Dear all,

please may I ask for a piece of advise regarding merging two dataframes :

A <- data.frame(z=c("a*b", "c*d", "d*e", "e*f"), t =c(1, 2, 3, 4))

B <- data.frame(z=c("a*b::x*y", "c", "", "g*h"), t =c(1, 2, 3, 4))

function of the criteria :

if "the elements in the 1st column of A could be found among the elements
of the 1st column of B" i.e.

for the example above, we shall combine in the results only the row with
"a*b" of A with the row with "a*b::x*y" of B.

thank you,

bogdan

        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/
posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



        [[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.


---------------------------------------------------------------------------
Jeff Newmiller                        The     .....       .....  Go Live...
DCN:<jdnew...@dcn.davis.ca.us>        Basics: ##.#.       ##.#.  Live Go...
                                      Live:   OO#.. Dead: OO#..  Playing
Research Engineer (Solar/Batteries            O.O#.       #.O#.  with
/Software/Embedded Controllers)               .OO#.       .OO#.  rocks...1k

______________________________________________
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Reply via email to