Yes - it's taken from the wild (an HTML page on the internet). Then turned
into XML, then a table extracted etc - so it looks to me like non-utf8 stuff
has got in there somewhere.
That's why I was wondering if there was a way to filter out arbitrary text
and make it utf8-safe. You know, urlencode for
You really need to know the encoding you are working with. Check if the page
has a charset attribute first and if it does re-encode to utf8 first. Then try
it in mergJSON. If it chokes then the best you can do is replace any char
whose charToNum is greater than 127 with "?". Other than that I think
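Something like this might do for that last-resort replacement - a rough, untested sketch in LiveCode script (the handler name asciiSafe is mine, not anything from mergJSON):

```
-- Strip any character above ASCII 127 so the result is trivially
-- utf8-safe before handing it to mergJSONEncode. Lossy by design.
function asciiSafe pText
   local tResult
   repeat for each char tChar in pText
      if charToNum(tChar) > 127 then
         put "?" after tResult
      else
         put tChar after tResult
      end if
   end repeat
   return tResult
end asciiSafe
```

You lose any legitimate accented characters this way, which is why re-encoding from the page's declared charset first is the better option when it's available.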
Any tricks to ensure that text I receive from an internet (HTML) source -
destined to be placed into a nice pretty JSON wrapper is safe to go? At the
moment it is bugging out somewhere.
On 24 Jul 2015, at 7:22 am, David Bovill david@viral.academy wrote:
I'm placing the text into an array and then using Monte's mergJsonEncode
function to encode it. Usually works fine - but in this case it looks like
the content needs some tidying before I put it into the array.
mergJSON