What is the recommended way to get the UTF-8 content of a spam message
when:
1) the spam message lacks a charset declaration (common for TW spam)
2) the TextCat plugin detects the language *and charset*

For one specific spam message:
* TextCat detects zh.big5
* $status->get_content_preview()
    returns garbled characters ("bushes") and (US-ASCII) http links
* Encode::decode('big5',$status->get_content_preview())
    returns something that auto-translators can turn into sensible
    English, but the http links are missing
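
To make the intended conversion concrete, here is a minimal sketch of
turning raw bytes into UTF-8 using a TextCat-style tag such as zh.big5.
The helper name textcat_tag_to_utf8 and the sample bytes are hypothetical,
and the sketch assumes the charset is the part of the tag after the dot;
it does not call SpamAssassin at all and does not explain the missing links.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not part of SpamAssassin): take raw message bytes
# and a TextCat-style tag like "zh.big5", return UTF-8 encoded bytes.
sub textcat_tag_to_utf8 {
    my ($raw_bytes, $tag) = @_;

    # Assumption: the charset is whatever follows the first dot in the tag.
    my (undef, $charset) = split /\./, $tag, 2;

    # No charset in the tag, or one that Encode does not know:
    # hand the bytes back unchanged.
    return $raw_bytes
        unless defined $charset && Encode::find_encoding($charset);

    # Bytes in the detected charset -> Perl characters -> UTF-8 bytes.
    # With the default check mode, malformed input becomes U+FFFD.
    my $chars = decode($charset, $raw_bytes);
    return encode('UTF-8', $chars);
}

# Sample input: two Big5-encoded characters followed by an ASCII link.
my $raw  = "\xA4\xA4\xA4\xE5 http://example.com/";
my $utf8 = textcat_tag_to_utf8($raw, 'zh.big5');
print $utf8, "\n";
```

In this standalone case the ASCII link survives the round trip, since
ASCII bytes map to themselves in Big5.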

-- 
[pl>en: Andrew] Andrzej Adam Filip : a...@onet.eu
Adam and Eve had many advantages, but the principal one was,
that they escaped teething.
  -- Mark Twain, "Pudd'nhead Wilson's Calendar"
