On 15 Sep 2013, at 22:52, Stephan Stiller wrote:

> On 9/15/2013 1:04 PM, Doug Ewell wrote:
>> André Schappo wrote:
>>> U+2026 is useful for microblogs when one is looking to save characters
>> Not if the microblog is in UTF-8, as almost all are.
>
> That's an astute observation, but André was talking about input limits
> https://dev.twitter.com/docs/counting-characters ,
> not backend/database space.
>
> Stephan

Thank you for that clarification, Stephan. Yes, I was referring to input limits in microblogs. The limit is presented to the User as a Counter which starts at 140. So however the characters are stored or transformed in the backend is of little interest to the User. The User is interested in the Counter.

So U+2026 decrements the Counter by 1, whereas U+002E U+002E U+002E decrements the Counter by 3.

There are (and in some cases have been) unexpected variations on this simple User-oriented Counter mechanism for microblogs:

① Twitter - Until recently, characters outside the BMP resulted in a Counter decrement of 2 and BMP characters gave a decrement of 1. I am not sure when the change happened, but now both BMP & non-BMP characters result in a decrement of 1.

② Sina Weibo - The Weibo Counter has 3 possible decrement values: 0.5, 1 & 2.
  • Characters from Unicode range U+0000➜U+00FF have a count of 0.5
  • Characters from Unicode range U+0100➜U+FFFF have a count of 1
  • Characters from Unicode range ≥ U+010000 have a count of 2

About a year ago I blogged about it: http://schappo.blogspot.co.uk/2012/10/weibo-character-count.html

André
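
For anyone who wants to experiment with the Counter rules described above, here is a minimal Python sketch. The function names are made up for illustration and are not taken from any Twitter or Weibo API. The "old Twitter" function assumes the former 2-per-non-BMP behaviour is equivalent to counting UTF-16 code units, which matches the decrements reported above but is an assumption about the mechanism. The sketch also ignores any normalisation, URL shortening, or other special handling the services may apply.

```python
def weibo_cost(ch):
    """Counter decrement for one character under the Weibo rules listed above."""
    cp = ord(ch)
    if cp <= 0xFF:        # U+0000 to U+00FF
        return 0.5
    if cp <= 0xFFFF:      # U+0100 to U+FFFF
        return 1.0
    return 2.0            # U+010000 and above (outside the BMP)


def weibo_count(text):
    """Total Weibo Counter decrement for a string."""
    return sum(weibo_cost(ch) for ch in text)


def code_point_count(text):
    """One decrement per code point (the current Twitter behaviour as described)."""
    return len(text)


def utf16_unit_count(text):
    """One decrement per UTF-16 code unit (assumed model of the older behaviour)."""
    return len(text.encode("utf-16-le")) // 2


if __name__ == "__main__":
    samples = ["...", "\u2026", "\U0001F600"]
    for s in samples:
        print(ascii(s), code_point_count(s), utf16_unit_count(s), weibo_count(s))
```

Under these rules, U+002E U+002E U+002E costs 3 on a code point counter and 1.5 on the Weibo Counter, while U+2026 costs 1 on both, so the ellipsis still saves space on Weibo, only less of it.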