fonts for U7.0 scripts
I'm looking for freely downloadable TTF fonts for any of the following. I'd appreciate links to sites for any of these:

1. Bassa_Vah
2. Duployan
3. Grantha
4. Khojki
5. Khudawadi
6. Mahajani
7. Mende_Kikakui
8. Modi
9. Mro
10. Nabataean
11. Old_Permic
12. Palmyrene
13. Pau_Cin_Hau
14. Tirhuta
15. Warang_Citi

Coverage doesn't need to be complete, and the font doesn't need to support shaping (these are just for charts / illustrations).

Mark
https://google.com/+MarkDavis
___
Unicode mailing list
Unicode@unicode.org
http://unicode.org/mailman/listinfo/unicode
Re: fonts for U7.0 scripts
On 22.10.2014 at 09:27, Mark Davis ☕️ wrote:

> Bassa_Vah Duployan Grantha Khojki Khudawadi Mahajani Mende_Kikakui Modi Mro Nabataean Old_Permic Palmyrene Pau_Cin_Hau Tirhuta Warang_Citi

You’re asking for quite a lot – for nothing.

Best,
Andreas Stötzner.
(font producer)
___
Andreas Stötzner Gestaltung Signographie Fontentwicklung
Haus des Buches
Gerichtsweg 28, Raum 434
04103 Leipzig
0176-86823396
http://stoetzner-gestaltung.prosite.com
Re: fonts for U7.0 scripts
On 22 October 2014 08:27, Mark Davis ☕️ m...@macchiato.com wrote:

> I'm looking for freely downloadable TTF fonts for any of the following. I'd appreciate links to sites for any of these: Bassa_Vah Duployan Grantha Khojki Khudawadi Mahajani Mende_Kikakui Modi Mro Nabataean Old_Permic Palmyrene Pau_Cin_Hau Tirhuta Warang_Citi

Was the encoding of any of these scripts funded by the Script Encoding Initiative? According to the SEI (http://www.linguistics.berkeley.edu/sei/help.html):

> Funding is used primarily for the creation of proposals on a per-project basis and for fonts. Fonts will be made available over the Unicode website and will be available for free distribution but cannot be bundled with commercial products.

Although I have to say that I cannot see anywhere on the Unicode website that provides fonts for SEI-funded scripts.

Andrew
Re: fonts for U7.0 scripts
ScriptSource has links to fonts, and you may find some there. For instance, I immediately found three Bassa_Vah fonts: two appear to be free, and one costs only $19. There's also a freeware font for Grantha. I didn't search further.

(Fwiw, you can find the right ScriptSource pages quickly by going to http://rishida.net/scriptlinks and selecting the script. Look near the bottom of the list that appears for the direct link.)

ri
Re: fonts for U7.0 scripts
The Grantha link is broken. The site no longer exists. I have contacted the original author. Will post here once he replies.

--
Shriramana Sharma ஶ்ரீரமணஶர்மா श्रीरमणशर्मा
RE: fonts for U7.0 scripts
Dear Andrew,

Most of the scripts listed below did come via Script Encoding Initiative (SEI), you are correct. The intent of SEI was to work on proposals and provide fonts but, to date, the focus of the work has been almost exclusively on getting scripts into Unicode and not on the creation of distributable fonts. I will modify the wording on the webpage accordingly.

Ideally, I would like to have free fonts made available via SEI, but it hasn't been possible due to funding constraints. In the future, I plan to work closely with ScriptSource (and other projects that make free fonts available), and will encourage the creation and submission of free fonts to such projects, though at this point SEI doesn't have the funding itself to pay for such work, unfortunately.

Debbie Anderson
Limits in UBA
Hi,

I have two questions related to the Unicode Bidirectional Algorithm, both regarding limits on certain aspects of the UBA.

First, I'd like to ask about the 127 entries of the directional status stack; it had 63 entries in the version of the UBA before Unicode 6.3. Where and why are such deep embeddings/isolates needed? Does anyone know of practical examples of text that requires such a depth? I have personally never seen a situation where one or two embeddings/overrides were not enough. This is a far cry from the UAX#9 numbers. Implementing such a deep stack requires memory-management solutions that are non-trivial and that add complexity to an already complex algorithm, but if I implement only a small fraction of that depth, I cannot claim conformance to the UBA. So I wonder if there's a practical justification for such a deep UBA stack.

The second question is about the stack required for implementing the BPA resolution of brackets, as described in BD16 and N0. The UBA doesn't place any limits on the depth of that stack. This means that text with a large enough number of opening bracket characters and no closing brackets could exhaust the entire memory space of an application. What is the implementation supposed to do in this situation? Crashing or exiting with a fatal error code is clearly inappropriate in some applications. Is it even reasonable not to have any limits for this stack?

Thanks in advance for any insights.
Re: Limits in UBA
> From: Andrew Glass (WINDOWS) andrew.gl...@microsoft.com
> Date: Wed, 22 Oct 2014 17:57:52 +

Thanks for responding.

> Embeddings are common in generated text. The guiding principle, is seemingly, when in doubt wrap the string in an embedding. At the UTC, we heard, that this can lead to very deep stacks - but I've never actually seen one with more than 63 levels - but that is not my topic here.

I'd appreciate some pointers to such texts, if they are publicly accessible. I'd be very interested to see why such deep embeddings are necessary.

In Emacs, we do use embeddings and overrides in a few places in text we generate, for example, to make sure information about a character displayed by a specialized command doesn't get jumbled due to that character's bidi class. But we never needed more than one, at most two, levels. Most of the cases can be resolved by using LRM or RLM.

> The BPA is not as subject to the extremes of generated text, and therefore brackets should follow a natural limit such that it is possible for a human to parse and track the bracketed levels. As such, the max depth is going to be quite low in normal text. Most cases of the BPA involve one pair. Nested pairs beyond three become quite artificial - except in languages such as LISP. However, supporting correct display of Bidi LISP code is not a goal of the BPA. I'm not sure what the maximum depth used by the test data is - I think that is the best current guide unless we introduce something.

The test data doesn't have more than 3 nested levels, I think.

For Emacs, I limited the BPA stack to 1024 levels, which is probably way too much, but it was cheap, so I saw no reason to force an arbitrary lower limit.

As for Lisp and similar languages, since the BPA in otherwise all-L2R text is equivalent to normal resolution of neutrals per N1 and N2, I simply bypass the BPA in that case -- because N1/N2 processing is much cheaper in the Emacs case. So Lisp is not the case that worries me.

But I do wonder why there's absolutely no guidance in the UBA regarding this issue, which in practice every implementor will probably bump into.

Thanks.
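The bypass described above for all-left-to-right text could be sketched roughly as follows. This is not the Emacs implementation: the function name `can_skip_bpa` and the set of bidi classes are assumptions chosen for illustration, and the check presumes a left-to-right paragraph direction. It is deliberately conservative, in that any right-to-left, Arabic-number, or explicit formatting character sends the text down the full BPA path.

```python
import unicodedata

# Bidi classes that rule out the shortcut in this sketch (an assumption,
# chosen conservatively): strong RTL types, Arabic numbers, and every
# explicit embedding/override/isolate control.
BLOCKING_CLASSES = {
    "R", "AL", "AN",
    "LRE", "RLE", "LRO", "RLO", "PDF",
    "LRI", "RLI", "FSI", "PDI",
}

def can_skip_bpa(text: str) -> bool:
    """Return True if `text` has no character that could make bracket
    resolution differ from plain N1/N2 neutral resolution, assuming the
    paragraph direction is left-to-right."""
    return not any(
        unicodedata.bidirectional(ch) in BLOCKING_CLASSES for ch in text
    )

if __name__ == "__main__":
    print(can_skip_bpa("(defun f (x) (+ x 1))"))   # Lisp-like, all LTR
    print(can_skip_bpa("(abc \u05D0)"))            # contains HEBREW LETTER ALEF
```

A caller would run the cheaper N1/N2 path when the function returns True and fall back to full bracket-pair processing otherwise.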
RE: Limits in UBA
Eli,

>> Embeddings are common in generated text. The guiding principle, is seemingly, when in doubt wrap the string in an embedding. At the UTC, we heard, that this can lead to very deep stacks - but I've never actually seen one with more than 63 levels - but that is not my topic here.

> I'd appreciate some pointers to such texts, if they are publicly accessible. I'd be very interested to see why such deep embeddings are necessary.

They aren't necessary for human-generated text. There is no normal human text reading case for them.

But as Andrew indicated, the problem arises from the potential for automated injection of text wrapped in an embedding. There is no expectation that any of that would actually be readable text in most cases. But on the other hand, the generated text could end up in logs or other text stores which, in turn, could end up processed by some text rendering for display in a window somewhere. You don't then want an arbitrarily low limit for handling embeddings in the UBA to suddenly crap out the display: that just leads to bug reports and a lot of confused thrashing up and down the customer support chain.

An example I could think of off the top of my head might involve some complicated database application working with Arabic data. If the mechanism generating some automated queries was automatically encapsulating string literals in the where db103.tbl246.col27='blah' qualifiers *and* the query was encapsulating each full select xxx statement *and* the query was using nested subqueries, then if the generation of the query ended up nesting 32 subqueries (which can occur, although it might not be good practice), then you would already have bumped over the prior 63 level embedding limit for UBA.

With *big* database applications, where installations may have thousands of tables, with thousands of partitions, and multiple terabytes of data, automated generation of very large and complicated SQL queries is common. And while the database itself doesn't care about UBA or display order when parsing and compiling such queries, the SQL text can be and *is* routinely logged. And the worry by the UTC is that when such logged generated text might include encapsulated embedded chunks, you don't want UBA per se to be introducing limits that cause failures when there might be a use case to display such text for diagnostics, for example.

I don't happen to *know* of a particular example of such text to point you to, but that kind of thing is the relevant use scenario.

--Ken
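The arithmetic behind the 32-subquery example can be checked with a small sketch. Everything about the generator here is hypothetical: the table and column names, the choice of LRE/RLE, and in particular the assumption that each select statement and each parenthesized subquery operand gets its own embedding, which is one way to arrive at two embeddings per nesting step. The depth counter simply counts nested embedding initiators, which is a simplification of the level computation in UAX#9.

```python
LRE, RLE, PDF = "\u202A", "\u202B", "\u202C"

def nested_query(subqueries: int) -> str:
    """Build a made-up generated query with `subqueries` nested selects.

    Assumed wrapping policy (hypothetical): the string literal, every full
    select statement, and every parenthesized subquery are each wrapped in
    an embedding terminated by PDF.
    """
    operand = RLE + "'blah'" + PDF          # innermost: the string literal
    query = ""
    for i in range(subqueries):
        query = LRE + f"select x from t{i} where c=" + operand + PDF
        operand = LRE + "(" + query + ")" + PDF
    return query

def max_embedding_depth(text: str) -> int:
    """Count the deepest nesting of LRE/RLE ... PDF in `text`."""
    depth = deepest = 0
    for ch in text:
        if ch in (LRE, RLE):
            depth += 1
            deepest = max(deepest, depth)
        elif ch == PDF and depth > 0:
            depth -= 1
    return deepest

if __name__ == "__main__":
    depth = max_embedding_depth(nested_query(32))
    # The 63 and 127 figures are the ones quoted in this thread.
    print(depth, depth > 63, depth <= 127)
```

Under these assumptions, 32 nested subqueries give 64 nested embeddings, which is over the older figure of 63 and well within 127.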
Re: fonts for U7.0 scripts
Debbie,

Thanks for the explanation. I just wonder: in order to get a script accepted for encoding, the proposer has to provide a font for the Unicode/10646 code charts. Creating a font (one that is at least good enough for the code charts, even if it does not have full shaping behaviour) is therefore an essential part of the proposal process. So if the SEI is funding someone to research and write a proposal, is the funding provided by SEI not at least indirectly funding the creation of a font, and if so, should not the font be made freely available at the end of the project?

Andrew
RE: Limits in UBA
Eli,

I think you are correct that the BidiCharacterTest.txt data currently does not go beyond 3 nesting levels for testing the BPA part of UBA. I agree with Andrew that that is a reasonable guide to the normal limit of meaningful bracket embeddings one might find in text.

However, I don't think it is safe to assume that 3 is the deepest that the conformance test data would ever have in it. Unlike the bidi format control embeddings, which are hard to visualize and involve special input or programming, it is *easy* for people to generate strings with deeply embedded bracket pairs:

((99)

So it might make sense to add test cases with data like that to BidiCharacterTest. In such cases, fallback behavior when hitting the implementation limit is presumably OK, but it is advisable to check implementations to ensure that they don't actually fall over if they *do* hit their limit.

In the C BidiRef reference implementation I wrote, the limit I picked was simply half the maximum string length it would process, on the assumption that the worst case it would have to deal with would be a string consisting of *nothing but* bracket pairs. If supporting 1024 bracket pair levels is cheap for Emacs, that seems like a defensible limit choice to me.

--Ken
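A bounded bracket stack with a non-fatal fallback, as discussed above, might look something like the following sketch. It follows the general shape of the BD16 pairing procedure, but it is not the BidiRef or Emacs code. The three-entry bracket table is a tiny stand-in for the real paired-bracket data, the default limit of 1024 is the figure mentioned in this thread, and the fallback chosen here (stop pairing, keep the pairs found so far, and report the overflow) is an assumption, since the UBA as discussed in this thread gives no guidance on that point.

```python
# Small illustrative subset of paired brackets (an assumption; real
# implementations would use the full Unicode paired-bracket data).
PAIRS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(PAIRS.values())

def find_bracket_pairs(text, max_depth=1024):
    """Return (pairs, overflowed).

    `pairs` is a sorted list of (open_pos, close_pos) tuples. If the stack
    would exceed `max_depth`, pairing stops and `overflowed` is True, so the
    caller can fall back to treating remaining brackets as plain neutrals
    instead of crashing or allocating without bound.
    """
    stack = []   # entries: (expected closing bracket, position of opener)
    pairs = []
    for pos, ch in enumerate(text):
        if ch in PAIRS:
            if len(stack) >= max_depth:
                return sorted(pairs), True      # hypothetical fallback
            stack.append((PAIRS[ch], pos))
        elif ch in CLOSERS:
            # Search from the top of the stack for the matching opener.
            for i in range(len(stack) - 1, -1, -1):
                if stack[i][0] == ch:
                    pairs.append((stack[i][1], pos))
                    del stack[i:]               # drop it and anything above
                    break
    return sorted(pairs), False

if __name__ == "__main__":
    print(find_bracket_pairs("a(b[c]d)e"))
    print(find_bracket_pairs("(" * 5000))
```

A limit derived from the input, such as `max_depth=len(text) // 2` in the spirit of the reference implementation's choice, would be a drop-in alternative to the fixed default.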
RE: fonts for U7.0 scripts
Dear Andrew,

It is true that proposals require a font to create the code charts, but I was careful in my comments to say SEI doesn't currently fund creation of distributable fonts.

Fonts for proposals are usually very basic, and often partly auto-generated by font editing software, usually with an ASCII cmap. They appear marginally OK in the code chart, and while they are acceptable for talking about charts or to use as examples in papers about the script, they are typically not acceptable for most purposes that contain running text, including publication in printed form, e.g., in books.

Just because someone develops a basic set of outlines for a script proposal doesn't necessarily mean (a) that they have done any work to make their font useful for anything else or (b) that they have licensed, or will license, their font for public use. (They don't sign up for that automatically when doing a proposal, and it has not really been budgeted into any proposals so far.)

At the moment, SEI is severely budget-constrained, and proposal authors are not earning much doing proposal work. The more work put in for purposes beyond the proposal itself, the lower their hourly income. And as John Hudson or Ken Lunde can probably attest, good font development is labor intensive. In sum, it would take additional resources for a developer to do work on a font to make it acceptable for distribution.

However, like Andrew Glass, I commend the work on Noto fonts, which is a way to help make free working fonts available.

With best wishes,
Debbie
Re: fonts for U7.0 scripts
On 10/22/2014 12:29 PM, Andrew West wrote:

> should not the font be made freely available at the end of the project?

The policy requires that a license is given to produce the charts and related documents. No more, no less.

This allows people and organizations to donate a free license for use by the editors, but otherwise seek commercial distribution of their work. In other words, they retain all the rights to their intellectual property that are not strictly required for the encoding process.

Nothing prevents people from putting their fonts in the public domain, if they so desire, but that can't be a requirement of the character encoding process.

Debbie might approach people who have provided chart fonts with a query as to whether they might like to issue a broader license, or to list their fonts with sites that distribute free fonts. In some cases they might be motivated to do so, as this might increase the chance that their script is used and implemented. But we need to be very clear that this would be entirely voluntary.

A./
Re: Limits in UBA
> From: Whistler, Ken ken.whist...@sap.com
> Date: Wed, 22 Oct 2014 19:18:38 +

>> I'd appreciate some pointers to such texts, if they are publicly accessible. I'd be very interested to see why such deep embeddings are necessary.

> They aren't necessary for human-generated text. There is no normal human text reading case for them.

But if humans aren't going to read that text, the embeddings aren't necessary at all, because programs read and process text in logical order anyway. Bidi reordering is a display-time feature, meant for human consumption.

> An example I could think of off the top of my head might involve some complicated database application working with Arabic data.

Again, if the query is to be submitted to a program, there should not be a need for embeddings at all.

> And while the database itself doesn't care about UBA or display order when parsing and compiling such queries, the SQL text can be and *is* routinely logged. And the worry by the UTC is that when such logged generated text might include encapsulated embedded chunks, you don't want UBA per se to be introducing limits that cause failures when there might be a use case to display such text for diagnostics, for example. I don't happen to *know* of a particular example of such text to point you to, but that kind of thing is the relevant use scenario.

Still, the number 63 or 127 sounds arbitrary, and unnecessarily large to me.