Hi Georg, it is difficult to figure out what happens between your expectation and the outcome if we cannot see a minimal dataset.
Based on your description I did this library(tidyr) library(dplyr) test_df <- data_frame(channel = LETTERS[1:5], unit = letters[1:5], custID = c(1:5), dummy = 1) test_df %>% spread(channel, dummy) %>% mutate(dummy = 1) %>% spread(unit, dummy) which seems to be working fine as I get wide data. If a combination is missing in the long form it will also be missing in the wide form. Maybe you are looking for something like this: channel_wide <- test_df %>% select(channel, custID) %>% spread(channel, custID) unit_wide <- test_df %>% select(unit, custID) %>% spread(unit, custID) bind_cols(channel_wide, unit_wide) Apologies for the HTML - it's gmail Best wishes, Ulrik On Thu, 28 Jul 2016 at 13:54 <g.maub...@weinwolf.de> wrote: > Hi Ulrik, > > I have included a reproducable example. I ran the code and it did exactly > what I wanted to show you. > > You are right: the solution shall merge cases in the end cause the values > on the variables are either missing or the same. > > Example 1: Values are the same > If you look at 6 and 7 and variable 70 the value is 1 in both cases. This > is in this context the same information and cases 6 and 7 with custID can > be merged to 1 for variable 70. > > Example 2: Values are missing and not missing > If you look at cases 8 and 9 the value for case 8 at variable 40, 50 and > 2000 is missing whereas the variables 40, 50 and 2000 have all 1 for case > 9. Case 8 and 9 could be merged together cause the missing values are > overwritten what is correct in this case. > > The solution I am looking for is to transform the data from long into wide > form and keep all but missing value information. > > Did I explain my problem in a comprehensible way? Are there any further > questions? > > Kind regards > > Georg > > > > > > Von: Ulrik Stervbo <ulrik.ster...@gmail.com> > An: g.maub...@weinwolf.de, r-help@r-project.org, > Datum: 28.07.2016 12:59 > Betreff: Re: [R] Spread data.frame on 2 variables > > > > Hi Georg, > > it's hard to tell without a reproducible example. > > Should spread really merge elements? Does spread know anything about > CustID? Maybe you need to make a useful key of the CustIDs first and > spread on that? > > Maybe I'm all off, because I'm really just guessing. > > Best, > Ulrik > > On Thu, 28 Jul 2016 at 12:36 <g.maub...@weinwolf.de> wrote: > Hi All, > > I need to spread a data.frame on 2 variables, e. g. "channel" and "unit". > > If I do it in two steps spreads keeps all cases that does not look like > the one before although it contains the same values for a specific case. > > Here is what I have right now: > > -- cut -- > > test1$dummy <- 1 > test2 <- spread(data = test1, key = 'channel', value = "dummy") > test2 > cat("First spread is OK!") > > test2$dummy <- 1 > test3 <- spread(data = test2, key = 'unit', value = 'dummy') > > test1 > # test2 > test3 > warning(paste0("Second spread is not OK cause spread does not merge > cases\n", > "with CustID 700 and 800 into one case,\n", > "cause they have values on different variables,\n", > "although the corresponding values of the cases with", > "custID 700 and 800 are missing.")) > > cat("What I would like to have is:\n") > target4 <- structure(list(custID = c(100, 200, 300, 500, 600, 700, 800, > 900), > `10` = c(1, NA, NA, NA, NA, NA, NA, NA), > `20` = c(1, NA, NA, NA, NA, NA, NA, NA), > `30` = c(NA, NA, NA, NA, NA, NA, 1, 1), > `40` = c(NA, NA, NA, NA, 1, NA, 1, 1), > `50` = c(NA, NA, 1, NA, NA, NA, 1, 1), > `60` = c(NA, NA, NA, NA, NA, 1, NA, NA), > `70` = c(NA, NA, NA, NA, NA, 1, NA, NA), > `99` = c(NA, 1, NA, 1, NA, NA, NA, NA), > `1000` = c(1, NA, NA, NA, NA, NA, 1, 1), > `2000` = c(NA, NA, NA, NA, 1, 1, 1, NA), > `3000` = c(NA, NA, 1, NA, NA, 1, NA, NA), > `4000` = c(NA, NA, 1, NA, NA, NA, NA, NA), > `6000` = c(NA, NA, NA, NA, 1, NA, NA, NA), > `9999` = c(NA, 1, NA, 1, NA, NA, NA, NA)), > .Names = c("custID", > "10", "20", "30", "40", "50", "60", "70", "99", > "1000", "2000", "3000", "4000", "6000", "9999"), > row.names = c(NA, 8L), class = "data.frame") > > target4 > > cat("What would be a proper way to create target4 from test1?") > > -- cut -- > > What would be the proper way to create target4 from test1? > > Kind regards > > Georg > > ______________________________________________ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > > [[alternative HTML version deleted]] ______________________________________________ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.