Hey all,

So, I'm attempting to decode some URL-encoded user agents (and I don't know why anyone encoded them in the first place). Running URLdecode() over them generates the error:

    "Error in rawToChar(out) : embedded nul in string"

Okay, so there's an embedded nul - fair enough. Presumably decoding the URL is exposing it in a format R doesn't like. Except when I try to dig down and work out what an encoded nul looks like, in order to simply remove them with something like gsub(), I end up with several different strings, all of which apparently resolve to an embedded nul:

    > URLdecode("0;%20@%gIL")
    Error in rawToChar(out) : embedded nul in string: '0; @\0L'
    In addition: Warning message:
    In URLdecode("0;%20@%gIL") :
      out-of-range values treated as 0 in coercion to raw

    > URLdecode("%20%use")
    Error in rawToChar(out) : embedded nul in string: ' \0e'
    In addition: Warning message:
    In URLdecode("%20%use") :
      out-of-range values treated as 0 in coercion to raw

I'm a relative newb to encodings, so maybe the fault is simply in my understanding of how this should work, but - why are both strings being read as including nuls, despite having different values? And how would I go about removing said nuls?

--
Oliver Keyes
Research Analyst
Wikimedia Foundation

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.