To handle the UCD XML file, a streaming parser like Expat is necessary.
For codepoints.net I use that data to stuff everything into a MySQL
database. If anyone is interested, the code for that is open source:
https://github.com/Codepoints/unicode2mysql/
The example for handling the large XML file
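A minimal sketch of the streaming approach with Python's built-in Expat bindings, so the large flat UCD file never has to be loaded as a whole DOM. The element and attribute names (`char`, `cp`, `na`) follow the flat-XML format of UAX #42; the inline sample below stands in for the real file.

```python
# Stream-parse UCD-style XML with Expat instead of building a full tree.
import xml.parsers.expat

def collect_names(xml_bytes):
    """Collect a {code point hex: name} dict from char elements."""
    names = {}
    def start(tag, attrs):
        # UAX #42 flat format: <char cp="0041" na="LATIN CAPITAL LETTER A" .../>
        if tag == "char" and "cp" in attrs:
            names[attrs["cp"]] = attrs.get("na", "")
    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.Parse(xml_bytes, True)
    return names

# Tiny stand-in for ucd.all.flat.xml:
sample = b'<ucd><char cp="0041" na="LATIN CAPITAL LETTER A"/></ucd>'
print(collect_names(sample))  # {'0041': 'LATIN CAPITAL LETTER A'}
```

With a real file one would feed the parser chunk by chunk via `ParseFile`, which is the point of using Expat here: memory use stays flat regardless of file size.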
The XML files in these folders:
https://unicode.org/repos/cldr/tags/latest/common/
But I agree. I spent an extreme amount of time getting somewhat used to
cldr.unicode.org and the data repo, and I still have no clue where to
find a concrete piece of information without digging into the site.
>> but are there any plans to integrate the data in the ucdxml [2]
>> (possibly as separate files) ?
>
> No. Not unless and until they become formally part of the UCD.
In this context: Would it be possible for the maintainers of the TR #51
data files to add a symlink "latest" under
The rising standard in the world of web development (and elsewhere) is called
»Semantic Versioning« [1], which many projects adhere to, or else must
actively explain why they don't.
The structure of a »semantic version« string is a set of three integers,
MAJOR.MINOR.PATCH, where the »semantics«
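The MAJOR.MINOR.PATCH structure can be sketched in a few lines; this is a deliberately minimal version that ignores the pre-release and build-metadata parts of the full Semantic Versioning spec.

```python
# Parse a MAJOR.MINOR.PATCH string into a tuple that compares correctly.
def semver_key(version):
    major, minor, patch = version.split(".")
    return (int(major), int(minor), int(patch))

# Numeric comparison, not lexicographic: "2.10.1" is newer than "2.9.3",
# even though the strings would sort the other way around.
print(semver_key("2.10.1") > semver_key("2.9.3"))  # True
```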
Maybe I'm missing context, but what is the specific problem of those lists
differing?
The EU and Europe _are_ two different things. The United States of America
similarly does not include the whole of America, despite the name.
And Norway and Switzerland and some others (incl. soon England) might
Hi,
for my work on codepoints.net and Emojipedia I repeatedly found myself
in a place where I needed some tool like hexdump to inspect the content
of a string. However, instead of raw bytes I am more interested in the
code points that the string is composed of. So I wrote this tool.
I reasoned,
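The core of such a code-point-level "hexdump" boils down to very little in Python; this sketch is not the announced tool itself, just an illustration of the idea using the standard library.

```python
# List each code point of a string with its U+ notation and formal name.
import unicodedata

def inspect(s):
    return [
        (f"U+{ord(c):04X}", unicodedata.name(c, "<unnamed>"))
        for c in s
    ]

print(inspect("Aé"))
# [('U+0041', 'LATIN CAPITAL LETTER A'),
#  ('U+00E9', 'LATIN SMALL LETTER E WITH ACUTE')]
```

Note that this iterates over code points, not bytes and not grapheme clusters, which is exactly the middle ground the paragraph above asks for.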
Not technically a school, but I gave a Batman-themed high-level overview
of Unicode at Munich's local JavaScript user group two years ago:
http://www.manuel-strehl.de/publications/holy-batman/presentation
It was well received, especially for its lighter tone on a subject often
perceived as dry, and
Hi,
please let me start by saying that I think the adoption of characters
is a very good idea to provide funding for the development of Unicode.
To promote this idea, I thought it could be worthwhile to place an
"adopt this codepoint" button on the description pages of code points on
Thank you! Yes, that's an implicit part of the "I'd like feedback from the
people involved" :-)
In fact, if such a GET parameter existed, I could remove the dialog and
replace it with a simple link. This sounds like a good idea in principle.
(It would also fix Leo Broukhis' issue.)
Does anyone
Thanks for the comment!
> As far as I'm concerned, the pop-up contents should end with the link
> "Read more about codepoint adoption." In your brief description, you
> might want to add a proviso about the temporary nature of character
> adoption.
Good catch! The 12-month period is important to
Thank you, Doug and Rick!
> If you can find the most recent version of the Symbola font updated
> for Unicode 8.0, it contains a huge number of symbols and b/w
> emoticons, etc.
Yes, that's kind of the go-to font for pan-emoji support. It's a pity
that George won't continue development (although
Hello,
I am wondering if there is a list of which font / work was used to render
which of the black & white emoji (and other symbols) in the code chart
PDFs. Neither
http://www.unicode.org/charts/fonts.html
nor
http://unicode.org/emoji/images.html
nor the PDF itself has a sufficiently detailed
Interesting! Out of curiosity: How come this was recognized in Unicode 7?
Is that documented anywhere?
2015-05-28 17:03 GMT+02:00 Doug Ewell d...@ewellic.org:
Chris idou747 at gmail dot com wrote:
Unicode has the arrow dingbats ⬅⬆⬇⬈⬉⬊⬋
in the range 2b05 with names like “LEFTWARDS BLACK
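The names mentioned above can be checked against the UCD directly; Python's `unicodedata` module exposes them:

```python
# Look up the formal UCD names of the first few arrows in the 2B05 range.
import unicodedata

for cp in (0x2B05, 0x2B06, 0x2B07):
    print(f"U+{cp:04X} {unicodedata.name(chr(cp))}")
# U+2B05 LEFTWARDS BLACK ARROW
# U+2B06 UPWARDS BLACK ARROW
# U+2B07 DOWNWARDS BLACK ARROW
```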
Yes, they have the huge advantage over my http://codepoints.net that they
already have a team providing so many translations. I envy them a bit for
that. But competition is good for business. :-)
Cheers,
Manuel
2014/1/23 Leo Broukhis l...@mailcom.com
I find http://unicode-table.com/ of which
Hello,
under http://codepoints.net/api/v1/ I published a REST API to access
information about Unicode (codepoints, blocks, planes, sample glyphs).
There are also tools to transform or filter input. The API documentation is
published on Github:
https://github.com/Boldewyn/Codepoints.net/wiki/API
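A hedged sketch of what consuming such an API looks like. Only the base URL `/api/v1/` comes from the announcement above; the endpoint path and the response fields in this sample are assumptions for illustration, not taken from the linked wiki documentation.

```python
# Illustrative only: parse a hypothetical JSON response for one code point.
# Real field names may differ -- check the wiki documentation linked above.
import json

sample_response = (
    '{"cp": 65, "name": "LATIN CAPITAL LETTER A", "block": "Basic Latin"}'
)

data = json.loads(sample_response)
print(f"U+{data['cp']:04X} {data['name']} ({data['block']})")
# U+0041 LATIN CAPITAL LETTER A (Basic Latin)
```

In a real client one would issue `GET` requests against https://codepoints.net/api/v1/ and decode the body the same way.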
Out of curiosity, has it happened before that a glyph was updated (i.e.,
substantially changed) in the standard?
Cheers,
2013/5/29 Asmus Freytag asm...@ix.netcom.com
On 5/29/2013 8:39 AM, Leo Broukhis wrote:
I'd like to ask: what is supposed to be the trigger condition for the UTC
to
That's great news. I'm really looking forward to seeing Decode Unicode
with v6.0 displayed.
(By the way: The initial idea to display scripts and hence codepoints
geographically came from Decode Unicode's solution.)
Manuel
2012/8/22 Johannes Bergerhausen johan...@bergerhausen.com:
Am 22.08.2012
First of all, thanks for all the answers.
It's quite interesting for me to learn that the data here is so
fragmented. As a calligrapher who was once delighted to learn about the
Latf script tag in the context of RFC 5646 et al., I guess I was way
too naive when starting the scripts part of
, not an effective or
particularly error-free approach.
Cheers,
Manuel
2012/8/20 Asmus Freytag asm...@ix.netcom.com:
On 8/19/2012 4:05 PM, Manuel Strehl wrote:
Hello,
I'm looking for a data source that maps countries to the scripts used in
them. The target application is a visualization in the context
This might not work too well, since the ISO 15924 code elements you're
thinking of are Hira and Kana.
This awkward moment... I'm trying to figure out what I was thinking
of with Hana.
Of course, the mapping must be sensible, that is, it must explain
how the mapping is done. I'd be fine, I
Hello,
I'm looking for a data source that maps countries to the scripts used in
them. The target application is a visualization in the context of my
codepoints.net site, namely http://codepoints.net/scripts.
At the moment I've extracted the preferred scripts from CLDR (e.g., Cyrl
for Russia, Latn
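One way such country-to-script pairs can be derived is from CLDR's likelySubtags data (`common/supplemental/likelySubtags.xml`). The two entries below are hand-copied examples standing in for the full file, and the quick-and-dirty regex is an illustration, not robust XML handling.

```python
# Extract country -> script pairs from likelySubtags-style entries,
# e.g. <likelySubtag from="und_RU" to="ru_Cyrl_RU"/> maps RU to Cyrl.
import re

sample = '''
<likelySubtag from="und_RU" to="ru_Cyrl_RU"/>
<likelySubtag from="und_DE" to="de_Latn_DE"/>
'''

country_script = {}
for country, script in re.findall(
    r'from="und_([A-Z]{2})" to="\w+?_(\w{4})_', sample
):
    country_script[country] = script

print(country_script)  # {'RU': 'Cyrl', 'DE': 'Latn'}
```

A proper extraction would of course use a real XML parser over the full supplemental data file.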