You are absolutely right, Ista - it's not haven's fault, my bad.
Of course, it's the attr function and exact = TRUE.
Thank you so much!
Dimitri
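
For anyone finding this thread later, here is a minimal sketch of the partial-matching behaviour and of the exact = TRUE fix. It does not use haven or an SPSS file. The two mock columns, the `cols` list, and the use of vapply are assumptions made for illustration only; vapply is swapped in for the sapply call used in the thread so that the result is always a character vector.

```r
# Mock column with value labels only and no variable label.
# (Hypothetical data that mimics the import described in the thread.)
aware <- c(1, 2, 1)
attr(aware, "labels") <- c("VERY/SOMEWHAT FAMILIAR" = 1, "NOT AT ALL FAMILIAR" = 2)

# Mock column with both a variable label and value labels.
myvar <- c(1, 4, 2)
attr(myvar, "label") <- "LONG LABEL"
attr(myvar, "labels") <- c("DEFINITELY CONSIDER" = 1, "DEFINITELY NOT CONSIDER" = 4)

# Default lookup: there is no attribute named "label" on `aware`, so attr()
# falls back to partial matching and returns the "labels" attribute.
partial <- attr(aware, "label")

# Exact lookup: only an attribute literally named "label" counts, so this is NULL.
exact <- attr(aware, "label", exact = TRUE)

# The corrected helper from the thread.
fix_labels <- function(x, TextIfMissing) {
  val <- attr(x, "label", exact = TRUE)
  if (is.null(val)) TextIfMissing else val
}

# A data frame is a list of columns, so a plain list stands in for it here.
cols <- list(AWARE2 = aware, MYVAR = myvar)

# vapply() insists on one string per column and fails loudly otherwise,
# instead of silently returning a list the way sapply() can.
longlabels <- vapply(cols, fix_labels, character(1),
                     TextIfMissing = "NO LABEL IN SPSS")
print(longlabels)
```

With the exact lookup, AWARE2 gets the fallback text and MYVAR gets its long label, so the result stays a named character vector.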
On Fri, Nov 13, 2015 at 10:00 AM, Ista Zahn <istaz...@gmail.com> wrote:
> Why do you think this is a bug in haven? To the contrary, I don't think
> this has anything to do with haven at all. The problem seems to be
> that attr does partial matching by default. Check it out:
>
> > attr(x, "labels") <- c("foo", "bar", "baz")
> > attr(x, "label")
> [1] "foo" "bar" "baz"
>
> and see ?attr for details.
>
> The answer I think is
>
> fix_labels <- function(x, TextIfMissing) {
>   val <- attr(x, "label", exact = TRUE)
>   if (is.null(val)) TextIfMissing else val
> }
>
> Finally, note that the development version of rio
> (https://github.com/leeper/rio) has a (non-exported) function for
> cleaning up meta data from haven imports. See
> https://github.com/leeper/rio/blob/master/R/utils.R#L86
>
> Best,
> Ista
>
> On Thu, Nov 12, 2015 at 8:37 PM, Dimitri Liakhovitski
> <dimitri.liakhovit...@gmail.com> wrote:
>> I have to rephrase my question again - it's clearly a small bug in
>> haven. Here is what it is about:
>>
>> If I have a column in SPSS that has BOTH a long label and value
>> labels, then everything works fine - I access one with 'label' and
>> another with 'labels':
>>
>> attr(spss1$MYVAR, "label")
>> [1] "LONG LABEL"
>> attr(spss1$MYVAR, "labels")
>>     DEFINITELY CONSIDER       PROBABLY CONSIDER   PROBABLY NOT CONSIDER DEFINITELY NOT CONSIDER
>>                       1                       2                       3                       4
>>
>> However, if I have a column that has no long label and ONLY value
>> labels, then it's not working properly:
>>
>> > attr(spss1$MYVAR, "label")
>> VERY/SOMEWHAT FAMILIAR    NOT AT ALL FAMILIAR
>>                      1                      2
>> > attr(spss1$MYVAR, "labels")
>> VERY/SOMEWHAT FAMILIAR    NOT AT ALL FAMILIAR
>>                      1                      2
>>
>> And I actually need to be able to identify if label is empty.
>> Thank you for looking into it!
>>
>> Dimitri
>>
>>
>> On Thu, Nov 12, 2015 at 5:55 PM, Dimitri Liakhovitski
>> <dimitri.liakhovit...@gmail.com> wrote:
>>> Looks like a little bug in 'haven':
>>>
>>> When I actually look at the attributes of one variable that has no
>>> long label in SPSS but has Value Labels, I am getting:
>>> attr(spss1$WAVE, "label")
>>> NULL
>>>
>>> But when I sapply my function longlabels to my data frame and ask it
>>> to print the long labels for each column, for the same column "WAVE" I
>>> am getting - instead of NULL:
>>> NULL
>>> VERY/SOMEWHAT FAMILIAR    NOT AT ALL FAMILIAR
>>>                      1                      2
>>>
>>> This is, of course, incorrect, because it grabs the next attribute
>>> (which one? And replaces NULL with it).
>>> Any suggestions?
>>> Thanks!
>>>
>>>
>>> On Thu, Nov 12, 2015 at 11:56 AM, Dimitri Liakhovitski
>>> <dimitri.liakhovit...@gmail.com> wrote:
>>>> Hello!
>>>>
>>>> I don't have an example file, but I think my question should be clear
>>>> without it.
>>>> I have an SPSS file. I read it in using 'haven':
>>>>
>>>> library(haven)
>>>> spss1 <- read_spss("SPSS_Example.sav")
>>>>
>>>> I created a function that extracts the long labels (in SPSS - "Label"):
>>>>
>>>> fix_labels <- function(x, TextIfMissing) {
>>>>   val <- attr(x, "label")
>>>>   if (is.null(val)) TextIfMissing else val
>>>> }
>>>> longlabels <- sapply(spss1, fix_labels, TextIfMissing = "NO LABEL IN SPSS")
>>>>
>>>> This function is supposed to create a vector of long labels and
>>>> usually it does, e.g.:
>>>>
>>>> str(longlabels)
>>>> Named chr [1:64] "Serial number" ...
>>>> - attr(*, "names")= chr [1:64] "Respondent_Serial" "weight" "r7_1" "r7_2"
>>>> ...
>>>>
>>>> However, I just got an SPSS file with 92 columns and ran exactly the
>>>> same function on it.
>>>> Now, I am getting not a vector, but a list:
>>>>
>>>> str(longlabels)
>>>> List of 92
>>>> $ VEHRATED : chr "VEHICLE RATED"
>>>> $ RESPID : chr "RESPONDENT ID"
>>>> $ RESPID8 : chr "8 DIGIT RESPONDENT NUMBER"
>>>>
>>>> An observation about the structure of longlabels here: those columns
>>>> that do NOT have a long label in SPSS but DO have Values (value
>>>> labels) - for them my function grabs their value labels, so that now
>>>> my long label is recorded as a numeric vector with names, e.g.:
>>>>
>>>> $ AWARE2 : Named num [1:2] 1 2
>>>> ..- attr(*, "names")= chr [1:2] "VERY/SOMEWHAT FAMILIAR" "NOT AT ALL FAMILIAR"
>>>>
>>>> Question: How could I avoid the extraction of the Value Labels for the
>>>> columns that have no long labels?
>>>>
>>>> Thank you very much!
>>>> --
>>>> Dimitri Liakhovitski
>>>
>>>
>>> --
>>> Dimitri Liakhovitski
>>
>>
>> --
>> Dimitri Liakhovitski
>>
>> ______________________________________________
>> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> https://stat.ethz.ch/mailman/listinfo/r-help
>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>> and provide commented, minimal, self-contained, reproducible code.

--
Dimitri Liakhovitski