Hi Oliver,

I think you're being misled by the default behaviour of warnings: they are deferred and all get displayed at once, just before control returns to the console. If you make them immediate (options(warn = 1)), you get a slightly more informative error:

> URLdecode("0;%20@%gIL")
Warning in URLdecode("0;%20@%gIL") :
  out-of-range values treated as 0 in coercion to raw
Error in rawToChar(out) : embedded nul in string: '0; @\0L'

So the out-of-range value (%g...) is getting converted to a raw(0), aka a nul. Then rawToChar() chokes. That is also why both of your strings fail the same way: "%gI" and "%us" are not valid pairs of hex digits, so each one ends up as a nul.

The code for URLdecode is simple enough that I'd recommend rewriting it yourself to better handle bad inputs.

Hadley

On Mon, Sep 1, 2014 at 11:02 AM, Oliver Keyes <oke...@wikimedia.org> wrote:
> Hey all,
>
> So, I'm attempting to decode some (and I don't know why anyone did this)
> URL-encoded user agents. Running URLdecode over them generates the error:
>
> "Error in rawToChar(out) : embedded nul in string"
>
> Okay, so there's an embedded nul - fair enough. Presumably decoding the URL
> is exposing it in a format R doesn't like. Except when I try to dig down
> and work out what an encoded nul looks like, in order to simply remove them
> with something like gsub(), I end up with several different strings, all of
> which apparently resolve to an embedded nul:
>
>> URLdecode("0;%20@%gIL")
> Error in rawToChar(out) : embedded nul in string: '0; @\0L'
> In addition: Warning message:
> In URLdecode("0;%20@%gIL") :
>   out-of-range values treated as 0 in coercion to raw
>> URLdecode("%20%use")
> Error in rawToChar(out) : embedded nul in string: ' \0e'
> In addition: Warning message:
> In URLdecode("%20%use") :
>   out-of-range values treated as 0 in coercion to raw
>
> I'm a relative newb to encodings, so maybe the fault is simply in my
> understanding of how this should work, but - why are both strings being
> read as including nuls, despite having different values? And how would I go
> about removing said nuls?
>
> --
> Oliver Keyes
> Research Analyst
> Wikimedia Foundation
>
>         [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
http://had.co.nz/
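
P.S. For what it's worth, here is a minimal sketch of the kind of rewrite suggested above. It is not the code of URLdecode itself; safe_url_decode is a hypothetical name, and the policy it applies (leave malformed escapes such as "%gI" as literal text, silently drop an encoded nul "%00") is just one assumption about what "handling bad inputs" should mean for user agents.

```r
# Hypothetical helper, not part of base R: decode percent-escapes in a
# single string, leaving malformed escapes untouched and dropping "%00".
safe_url_decode <- function(url) {
  stopifnot(is.character(url), length(url) == 1L)

  bytes <- charToRaw(url)
  n <- length(bytes)
  pct <- charToRaw("%")
  hex_digits <- charToRaw("0123456789abcdefABCDEF")

  out <- raw(0)
  i <- 1L
  while (i <= n) {
    # A valid escape is "%" followed by exactly two hex digits.
    if (bytes[i] == pct && i + 2L <= n &&
        all(bytes[(i + 1L):(i + 2L)] %in% hex_digits)) {
      value <- strtoi(rawToChar(bytes[(i + 1L):(i + 2L)]), base = 16L)
      # Assumption: an encoded nul is dropped, since rawToChar() rejects it.
      if (value != 0L) out <- c(out, as.raw(value))
      i <- i + 3L
    } else {
      # Anything else, including a malformed escape, is copied as-is.
      out <- c(out, bytes[i])
      i <- i + 1L
    }
  }
  rawToChar(out)
}

safe_url_decode("0;%20@%gIL")
safe_url_decode("%20%use")
```

With this policy the two problem strings decode to "0; @%gIL" and " %use" without a warning or an error. To run it over a whole vector of user agents, something like vapply(x, safe_url_decode, character(1), USE.NAMES = FALSE) should do.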