[whatwg] Unicode - ASCII copy/paste fallback
Hello,

I have a page with

  a <span class="rarr"><span>-&gt;</span></span> b

and style

  .rarr span { overflow: hidden; height: 0; width: 0; display: inline-block; }
  .rarr::after { content: "→"; }

(That's RIGHTWARDS ARROW, U+2192.)

In Firefox 36, this copies and pastes like "a -> b", which is the desired behavior. In Chrome 40, this copies and pastes like "a b".

Is my desired behavior (to show Unicode but copy an ASCII representation) generally possible? Are there specs somewhere about copy/paste behavior? I looked in https://html.spec.whatwg.org/ but found nothing relevant.

Is this the right venue for this question? Should I take it somewhere else?

Thanks,

David Sheets
Re: [whatwg] Unicode - ASCII copy/paste fallback
David Sheets kosmo...@gmail.com writes:

> On Fri, Feb 13, 2015 at 12:18 PM, James M. Greene james.m.gre...@gmail.com wrote:
>> In this case, you can use Unicode escape values by preceding them with a backslash:
>>
>>   .rarr:after { content: "\2192"; }
>>
>> This is specified in the CSS 2.1 spec: http://www.w3.org/TR/CSS2/syndata.html#characters
>>
>> Personally, I probably would've just started on StackOverflow with this question (e.g. [1]) but no harm done.
>
> Hi James!
>
> Sorry, I wasn't clear. The issue is not with putting Unicode values into CSS. The issue is that I would like Unicode values to be copied and pasted as a specific ASCII fallback value.
>
> That is, I would like the equivalent of "a &rarr; b" to appear on a page but, upon copying, "a -> b" to show up in the clipboard.
>
> I have a solution that works in Firefox 36 (described in original mail). Chrome 40 does not behave similarly. I can see some arguments for Chrome's behavior along security lines. I certainly can understand the utility of Firefox's behavior because I am writing a documentation generation tool for a programming language with right arrows represented as -> but would like to render them as →.

I would suggest using OpenType ligatures for that. You could reasonably create a ligature font that renders any occurrence of “->” as “→”.

-- 
Nils Dagsson Moskopp // erlehmann
http://dieweltistgarnichtso.net
Re: [whatwg] Unicode - ASCII copy/paste fallback
On Fri, Feb 13, 2015 at 5:45 AM, David Sheets kosmo...@gmail.com wrote:

> Hello,
>
> I have a page with
>
>   a <span class="rarr"><span>-&gt;</span></span> b
>
> and style
>
>   .rarr span { overflow: hidden; height: 0; width: 0; display: inline-block; }
>   .rarr::after { content: "→"; }
>
> (That's RIGHTWARDS ARROW, U+2192.)
>
> In Firefox 36, this copies and pastes like "a -> b", which is the desired behavior. In Chrome 40, this copies and pastes like "a b".
>
> Is my desired behavior (to show Unicode but copy an ASCII representation) generally possible? Are there specs somewhere about copy/paste behavior? I looked in https://html.spec.whatwg.org/ but found nothing relevant.

Copying ASCII isn't desirable. It should copy the Unicode string "a → b". After all, that's what gets copied if you had done <span>a → b</span> in the first place.

(Chrome's issue isn't related to Unicode. It just doesn't know how to select text that's inside CSS content, so it isn't included in the copy.)

-- 
Glenn Maynard
Re: [whatwg] Unicode - ASCII copy/paste fallback
David Sheets kosmo...@gmail.com writes:

> On Fri, Feb 13, 2015 at 12:23 PM, Mathias Bynens mathi...@opera.com wrote:
>> On Fri, Feb 13, 2015 at 1:18 PM, James M. Greene james.m.gre...@gmail.com wrote:
>>> In this case, you can use Unicode escape values by preceding them with a backslash:
>>
>> OP’s question wasn’t about how to escape non-ASCII characters, but rather about what the copy/paste behavior should be in browsers.
>>
>> @David, I don’t think it’s reasonable to expect non-ASCII characters to be transliterated to ASCII characters when copying them. That said, it would be nice to standardize on the behavior here: should generated content be included when copying or not?
>
> Hi Mathias,
>
> Do you mean that it's not reasonable for transliteration to happen automatically? I agree.

What do you mean by “not reasonable”? I once noticed that where elinks shows “ベルントとウンターアルターバッハの謎”, links shows “BeRuN6TotoU6N6Ta-6A6RuTa-6BaTUHano***”. Now, while I cannot read any Japanese, I wonder if this transliteration seems unreasonable to you.

-- 
Nils Dagsson Moskopp // erlehmann
http://dieweltistgarnichtso.net
Re: [whatwg] Unicode - ASCII copy/paste fallback
On Fri, Feb 13, 2015 at 1:08 PM, Nils Dagsson Moskopp n...@dieweltistgarnichtso.net wrote:

> David Sheets kosmo...@gmail.com writes:
>
>> On Fri, Feb 13, 2015 at 12:18 PM, James M. Greene james.m.gre...@gmail.com wrote:
>>> In this case, you can use Unicode escape values by preceding them with a backslash:
>>>
>>>   .rarr:after { content: "\2192"; }
>>>
>>> This is specified in the CSS 2.1 spec: http://www.w3.org/TR/CSS2/syndata.html#characters
>>>
>>> Personally, I probably would've just started on StackOverflow with this question (e.g. [1]) but no harm done.
>>
>> Hi James!
>>
>> Sorry, I wasn't clear. The issue is not with putting Unicode values into CSS. The issue is that I would like Unicode values to be copied and pasted as a specific ASCII fallback value.
>>
>> That is, I would like the equivalent of "a &rarr; b" to appear on a page but, upon copying, "a -> b" to show up in the clipboard.
>>
>> I have a solution that works in Firefox 36 (described in original mail). Chrome 40 does not behave similarly. I can see some arguments for Chrome's behavior along security lines. I certainly can understand the utility of Firefox's behavior because I am writing a documentation generation tool for a programming language with right arrows represented as -> but would like to render them as →.
>
> I would suggest using OpenType ligatures for that. You could reasonably create a ligature font that renders any occurrence of “->” as “→”.

This is a really brilliant solution that satisfies my use case perfectly. I created the following (horrible) font that works as expected.

data:application/font-woff;base64,d09GRk9UVE8AAAXQAA0ACAwAAQBDRkYgAAABMY4AAAGdFwHC50ZGVE0AAALAHBxuRRqXR0RFRgAAAtwqNABJADhHUE9TAAADCCAgRHZMdUdTVUIAAAMoQFCxBbQIT1MvMgAAA2gAAABHYFYhYsBjbWFwAAADsEMAAAFKAmIC1WhlYWQAAAP0MDYFFMPmaGhlYQAABCQdJAZsBDtobXR4AAAERBQUEegCzm1heHRYBgYABVAAbmFtZQAABGFZAAACi0q47Qlwb3N0AAAFvBMg/4MAM3icRY9BSxtRFIXvSzINhuGlCZlGwjSORGgL6cRKcSFmY0gQYtMoVupKDHnJhNYoydMacOdCcewu0EV2pf8gUOlm6EY63ZdCV8WV/gG9z7xAjF2kZ3XOtzjwEfD5gBASeFPnNf6elV8A8QCBJREE8ZiIuEdMeEXUF7/wfBmo3vjy54HqiwdgbG8tZdujovqRfu+N3ZqKDn+DOsBDHa5COoR1YodBuX8MQhTWoQTvrNaOxeob1Qbb5KxR2Nxir5g5k/k3y0apZZSbFmO8mUwmjQ81bhm57TrPbTeqzJgxp42nFuc7c6lUZUgr99RsVsw6489GBv9VhqEwQY7IMSiETM2+3u30Q7Yjig5xHHQdr/NImOJr33zg9A81UUS3X/RT/IYJgpNiUZOJ2wNMDMm56GqYk0wW5fJlCn/hmezgb4ViFylBKp5rtqRi/GQh7WYxg5nTH3IVm9kCKjKs0Bt0MUbwpShoH2VMuHb6yc8szuOK/Ue+xf18/lpGFNprR3ohzVYDmAgg/dRW1Ttgr6BKAAABAMw9os8A0QO3ygDRA7/9eJwlibENACAMwxzamSsY+f8+XKHI9hACbDlcvSiiY8u2i8wD+TwTFwCCAQoAHAAeAAFERkxUAAgABAD//wAAAHicY2BkYGDgYpBj0GFgdHHzCWHgYGABijD8/88AkmHMyUxPBIoxQHhAORYwzQHEQlCahYGZgQkIGUEQACkoBXB4nGNgYWFg/MLAysDANJPpDAMDQz+EZnzNYMzICRRlYGNmgAFGBiQQkOaawnCAQZfBjtn4vzFDDNOs/6dR1CgAIRMAc0gMkAB4nGNgYGBmgGAZBkYGEHAB8hjBfBYGDSDNBqQZGZgYdBns/v8H8sH0/yv/j0DVAwEjGwOCQwzAUMxEiu5BCQAKpAk1AHicY2BkYGAA4hCza8vj+W2+MnCzMIDAReYDvHC6isGUuZRpFpDLwcAEEgUABpUIx3icY2BkYGCaxWDKEMPCAALMpQyMDKiAFQAsSwGxAgQAAI0EAAEnBAAAegPoAKAAAFUAAHichZDNSsNAFIXP9A8KItInmI1QIU0nKd1kaaGI4NLuWzJpAjUpyZTSrYgrn8VXcO3atWufwJ0LT6ZjQQSbYe797uHOmTsBcIpnCOy/Szw6Fuji3XEDHXw6buJcXDluoSvuHbdxJn58OtTf2ClaXVYP9lTNAj28Om7gBB+Om7jGl+MWeiJ33IYUT4471F8wQQmNOQxjDIkFdowxKqRUNPUKnl0SW2SsU9IUBXJynUss2ScRwodi7rPDcK0RYciVuN7k0OvTM2HMrf8FMCn13OhYLnYyrlKtTeV5ntxmJpXTIjfTolxqGfpK9lNj1tFwmFBNatWvEj/Xhh639pJ6wJV9SkApN5lZ6Zh4Y7UMG9yx0HG2Yf7vFRH3X8u9HmCEATvrrViNafVrzEgeriYHo0E4CFUwPjbkjFrJf5PZuSS9a3ff5nomzHRZZUUulQp8pZQ8YvgNzG5wnHicY2BmAIP/DQzGDFgAACgUAbYA The browser inconsistency in the original case still stands, though. Is there a spec covering copy and paste? 
David

> -- 
> Nils Dagsson Moskopp // erlehmann
> http://dieweltistgarnichtso.net
Re: [whatwg] Unicode - ASCII copy/paste fallback
On Fri, Feb 13, 2015 at 3:02 PM, Glenn Maynard gl...@zewt.org wrote:

> On Fri, Feb 13, 2015 at 5:45 AM, David Sheets kosmo...@gmail.com wrote:
>> Hello,
>>
>> I have a page with
>>
>>   a <span class="rarr"><span>-&gt;</span></span> b
>>
>> and style
>>
>>   .rarr span { overflow: hidden; height: 0; width: 0; display: inline-block; }
>>   .rarr::after { content: "→"; }
>>
>> (That's RIGHTWARDS ARROW, U+2192.)
>>
>> In Firefox 36, this copies and pastes like "a -> b", which is the desired behavior. In Chrome 40, this copies and pastes like "a b".
>>
>> Is my desired behavior (to show Unicode but copy an ASCII representation) generally possible? Are there specs somewhere about copy/paste behavior? I looked in https://html.spec.whatwg.org/ but found nothing relevant.
>
> Copying ASCII isn't desirable. It should copy the Unicode string "a → b". After all, that's what gets copied if you had done <span>a → b</span> in the first place.
>
> (Chrome's issue isn't related to Unicode. It just doesn't know how to select text that's inside CSS content, so it isn't included in the copy.)

The only relation this issue has to Unicode is a use case for alternate copy/paste behavior.

Judging from the replies to my original inquiry, either Firefox or Chrome is doing something unexpected or both are behaving unexpectedly (and should put the Unicode arrow on the clipboard).

I'm not sure if all use cases for my original trick can be covered by using OpenType ligatures (thanks, Nils!) or if there are other 'alternative clipboard behavior' applications.

Certainly, the most consistent behavior would be for both Chrome and Firefox (and other browsers that I haven't/don't care to test) to put the CSS content on the clipboard and ignore hidden content. I suppose currently Chrome is preventing copying hidden content but Firefox is not and neither picks up the CSS content.

David
Re: [whatwg] Unicode - ASCII copy/paste fallback
On 2/13/15 10:15 AM, David Sheets wrote:

> I suppose currently Chrome is preventing copying hidden content but Firefox is not and neither picks up the CSS content.

Both prevent copying hidden content, but may not have identical definitions of hidden.

Neither picks up CSS generated content, because both represent selections in terms of DOM ranges, and DOM ranges can't represent CSS generated content...

-Boris
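
The distinction drawn above, between what is painted and what a DOM-range-based selection can reach, can be illustrated with a toy model. This is only a sketch: `ToyNode`, `renderedText`, `copiedText` and the `zeroSizedCountsAsHidden` flag are hypothetical names invented for illustration, and treating a zero-sized span as "hidden" in one engine but not the other is an assumption about where the two definitions might differ, not confirmed browser behavior.

```typescript
// Toy model of inline content. These types are hypothetical and do not
// correspond to any browser's real data structures.
interface ToyNode {
  text: string;            // text that exists in the DOM
  zeroSized: boolean;      // e.g. width: 0; height: 0; overflow: hidden
  generatedAfter?: string; // CSS ::after content: painted, but not in the DOM
}

// What is painted: zero-sized text is invisible, generated content is shown.
function renderedText(nodes: ToyNode[]): string {
  return nodes
    .map((n) => (n.zeroSized ? "" : n.text) + (n.generatedAfter || ""))
    .join("");
}

// What a DOM-range-based copy can see: DOM text only, never generated content.
// `zeroSizedCountsAsHidden` stands in for an engine's definition of "hidden".
function copiedText(nodes: ToyNode[], zeroSizedCountsAsHidden: boolean): string {
  return nodes
    .filter((n) => !(zeroSizedCountsAsHidden && n.zeroSized))
    .map((n) => n.text)
    .join("");
}

// The markup from the original message, flattened into the toy model.
const arrowExample: ToyNode[] = [
  { text: "a ", zeroSized: false },
  { text: "->", zeroSized: true, generatedAfter: "\u2192" },
  { text: " b", zeroSized: false },
];

console.log(renderedText(arrowExample));      // painted text, with the arrow
console.log(copiedText(arrowExample, false)); // zero-sized span still copied
console.log(copiedText(arrowExample, true));  // zero-sized span skipped
```

In this model the generated arrow never reaches the clipboard under either setting, which matches the point that DOM ranges cannot represent generated content; only the handling of the zero-sized span changes the copied text.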
Re: [whatwg] Unicode - ASCII copy/paste fallback
Why is it desirable to copy ASCII versions of Unicode text? Doesn't most software now support Unicode so the user can copy and paste what they see, rather than some ASCII-art equivalent?

On 13 February 2015 at 15:45, Boris Zbarsky bzbar...@mit.edu wrote:

> On 2/13/15 10:15 AM, David Sheets wrote:
>> I suppose currently Chrome is preventing copying hidden content but Firefox is not and neither picks up the CSS content.
>
> Both prevent copying hidden content, but may not have identical definitions of hidden.
>
> Neither picks up CSS generated content, because both represent selections in terms of DOM ranges, and DOM ranges can't represent CSS generated content...
>
> -Boris
Re: [whatwg] Unicode - ASCII copy/paste fallback
On Fri, Feb 13, 2015 at 9:02 AM, Glenn Maynard gl...@zewt.org wrote:

> Copying ASCII isn't desirable. It should copy the Unicode string "a → b". After all, that's what gets copied if you had done <span>a → b</span> in the first place.

(Oh, I missed the obvious--the -> from Firefox is coming from the HTML, of course.)

I guess what you're after is being able to have separate text for display vs. copy. I'm sure you don't actually want to use a hacky custom font. What's the actual use case?

In general I think browsers should always copy just what the user selected, and not let pages cause something other than that to be copied, since things like that are generally abused (e.g. inserting linkback ads into copied text).

-- 
Glenn Maynard
Re: [whatwg] Unicode - ASCII copy/paste fallback
In this case, you can use Unicode escape values by preceding them with a backslash:

  .rarr:after { content: "\2192"; }

This is specified in the CSS 2.1 spec: http://www.w3.org/TR/CSS2/syndata.html#characters

Personally, I probably would've just started on StackOverflow with this question (e.g. [1]) but no harm done.

[1]: http://stackoverflow.com/questions/10393462/placing-unicode-character-in-css-content-value

Sincerely,
James Greene

On Fri, Feb 13, 2015 at 5:45 AM, David Sheets kosmo...@gmail.com wrote:

> Hello,
>
> I have a page with
>
>   a <span class="rarr"><span>-&gt;</span></span> b
>
> and style
>
>   .rarr span { overflow: hidden; height: 0; width: 0; display: inline-block; }
>   .rarr::after { content: "→"; }
>
> (That's RIGHTWARDS ARROW, U+2192.)
>
> In Firefox 36, this copies and pastes like "a -> b", which is the desired behavior. In Chrome 40, this copies and pastes like "a b".
>
> Is my desired behavior (to show Unicode but copy an ASCII representation) generally possible? Are there specs somewhere about copy/paste behavior? I looked in https://html.spec.whatwg.org/ but found nothing relevant.
>
> Is this the right venue for this question? Should I take it somewhere else?
>
> Thanks,
>
> David Sheets
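
For reference, the escape form mentioned in this message (a backslash followed by the character's hexadecimal code point) can be computed mechanically. The TypeScript below is a minimal sketch; `cssHexEscape` is a hypothetical helper name, not an existing API, and it handles only the first character of its argument.

```typescript
// Hypothetical helper: build the CSS hexadecimal escape for the first
// character of `ch`, suitable for use inside a CSS string.
function cssHexEscape(ch: string): string {
  if (ch.length === 0) {
    throw new Error("expected a non-empty string");
  }
  let code = ch.charCodeAt(0);
  // Combine a UTF-16 surrogate pair into a single code point.
  if (code >= 0xd800 && code <= 0xdbff && ch.length > 1) {
    const low = ch.charCodeAt(1);
    if (low >= 0xdc00 && low <= 0xdfff) {
      code = (code - 0xd800) * 0x400 + (low - 0xdc00) + 0x10000;
    }
  }
  // The trailing space terminates the escape, so that a hex digit that
  // happens to follow is not read as part of it.
  return "\\" + code.toString(16).toUpperCase() + " ";
}

// RIGHTWARDS ARROW: prints a backslash, 2192, and one trailing space.
console.log(cssHexEscape("\u2192"));
```

CSS 2.1 ignores a single whitespace character after a hexadecimal escape, so the trailing space does not appear in the rendered content.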
Re: [whatwg] Unicode - ASCII copy/paste fallback
On Fri, Feb 13, 2015 at 1:18 PM, James M. Greene james.m.gre...@gmail.com wrote:

> In this case, you can use Unicode escape values by preceding them with a backslash:

OP’s question wasn’t about how to escape non-ASCII characters, but rather about what the copy/paste behavior should be in browsers.

@David, I don’t think it’s reasonable to expect non-ASCII characters to be transliterated to ASCII characters when copying them. That said, it would be nice to standardize on the behavior here: should generated content be included when copying or not?
Re: [whatwg] Unicode - ASCII copy/paste fallback
On Fri, Feb 13, 2015 at 12:18 PM, James M. Greene james.m.gre...@gmail.com wrote:

> In this case, you can use Unicode escape values by preceding them with a backslash:
>
>   .rarr:after { content: "\2192"; }
>
> This is specified in the CSS 2.1 spec: http://www.w3.org/TR/CSS2/syndata.html#characters
>
> Personally, I probably would've just started on StackOverflow with this question (e.g. [1]) but no harm done.
>
> [1]: http://stackoverflow.com/questions/10393462/placing-unicode-character-in-css-content-value

Hi James!

Sorry, I wasn't clear. The issue is not with putting Unicode values into CSS. The issue is that I would like Unicode values to be copied and pasted as a specific ASCII fallback value.

That is, I would like the equivalent of "a &rarr; b" to appear on a page but, upon copying, "a -> b" to show up in the clipboard.

I have a solution that works in Firefox 36 (described in original mail). Chrome 40 does not behave similarly. I can see some arguments for Chrome's behavior along security lines. I certainly can understand the utility of Firefox's behavior because I am writing a documentation generation tool for a programming language with right arrows represented as -> but would like to render them as →.

This seems like a pretty straightforward document feature but I can't seem to get interoperable behavior (or even find where such behavior might be specified).

Thanks,

David
Re: [whatwg] Unicode - ASCII copy/paste fallback
On Fri, Feb 13, 2015 at 12:23 PM, Mathias Bynens mathi...@opera.com wrote:

> On Fri, Feb 13, 2015 at 1:18 PM, James M. Greene james.m.gre...@gmail.com wrote:
>> In this case, you can use Unicode escape values by preceding them with a backslash:
>
> OP’s question wasn’t about how to escape non-ASCII characters, but rather about what the copy/paste behavior should be in browsers.
>
> @David, I don’t think it’s reasonable to expect non-ASCII characters to be transliterated to ASCII characters when copying them. That said, it would be nice to standardize on the behavior here: should generated content be included when copying or not?

Hi Mathias,

Do you mean that it's not reasonable for transliteration to happen automatically? I agree.

Do you mean that it's not reasonable to support specific replacements during copying? Firefox seems to support this currently (and perfectly). There are user trickery concerns here but, at least in my case, I think codepoint -> 2 byte replacement is probably safe...

Thanks,

David
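
The "specific replacements during copying" idea can be sketched independently of how either browser builds its selection. The TypeScript below is a minimal sketch rather than anything proposed in this thread: `ASCII_FALLBACKS` and `toAsciiFallback` are hypothetical names, and the commented-out wiring assumes a page that opts in through the standard `copy` event and `clipboardData.setData`.

```typescript
// Hypothetical table of display characters and the ASCII text to copy instead.
// Only the arrow comes from this thread; a real tool would list its own symbols.
const ASCII_FALLBACKS: { [display: string]: string } = {
  "\u2192": "->", // RIGHTWARDS ARROW
};

// Replace every listed display character with its ASCII fallback.
// Characters not in the table are left untouched (no automatic transliteration).
function toAsciiFallback(text: string): string {
  let result = text;
  for (const display of Object.keys(ASCII_FALLBACKS)) {
    result = result.split(display).join(ASCII_FALLBACKS[display]);
  }
  return result;
}

// Assumed browser wiring (not run here): rewrite the selection on copy.
// document.addEventListener("copy", (event) => {
//   const selected = window.getSelection()?.toString() ?? "";
//   event.clipboardData?.setData("text/plain", toAsciiFallback(selected));
//   event.preventDefault(); // keep the browser from overwriting our text
// });

console.log(toAsciiFallback("a \u2192 b")); // a -> b
```

This only helps when the arrow is real text in the DOM. If it is CSS generated content, it never appears in the selection string, so there would be nothing for the handler to replace.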
Re: [whatwg] Unicode - ASCII copy/paste fallback
Sorry, David and Mathias. Hasty 6:00am reply here before my brain and eyes fully woke up!

Interesting question. Personally, I would expect and desire the CSS-generated content to be copied.

Sincerely,
James M. Greene

On Feb 13, 2015 6:24 AM, David Sheets kosmo...@gmail.com wrote:

> Hi James!
>
> Sorry, I wasn't clear. The issue is not with putting Unicode values into CSS. The issue is that I would like Unicode values to be copied and pasted as a specific ASCII fallback value.
>
> That is, I would like the equivalent of "a &rarr; b" to appear on a page but, upon copying, "a -> b" to show up in the clipboard.
>
> I have a solution that works in Firefox 36 (described in original mail). Chrome 40 does not behave similarly. I can see some arguments for Chrome's behavior along security lines. I certainly can understand the utility of Firefox's behavior because I am writing a documentation generation tool for a programming language with right arrows represented as -> but would like to render them as →.
>
> This seems like a pretty straightforward document feature but I can't seem to get interoperable behavior (or even find where such behavior might be specified).
>
> Thanks,
>
> David
Re: [whatwg] Unicode - ASCII copy/paste fallback
To expand on my own comment:

> Personally, I would expect and desire the CSS-generated content to be copied.

...because THAT is what the user sees, per the browser rendering. I'm surprised that neither Firefox nor Chrome exhibits that behavior.

Sincerely,
James M. Greene

On Feb 13, 2015 6:30 AM, James M. Greene james.m.gre...@gmail.com wrote:

> Sorry, David and Mathias. Hasty 6:00am reply here before my brain and eyes fully woke up!
>
> Interesting question. Personally, I would expect and desire the CSS-generated content to be copied.
>
> Sincerely,
> James M. Greene