One detail to add: Some archive file formats also add redundancy for
error correction.
I agree that error correction is in principle best left to storage media
and transmission protocols (for a clean separation of functionality),
but the idea of having error correction tailored to (i.e., optimized
On 01/28/2013 07:30 AM, William_J_G Overington wrote:
A document saved as UTF-64 may well take four times as many bytes as such a
Unicode Text Document, yet there would be the error checking and correction
facilities at a character level.
It seems to me that the character-encoding level is the
William_J_G Overington wrote:
> The idea is that there would be an additional UTF format, perhaps UTF-64,
> so that each character would be expressed in UTF-64 notation using 64 bits,
> thus providing error checking and correction facilities at a character level.
Error detection and correction a
> UTF-256 allows each hex digit of UTF-32 to be expressed as an ASCII hex digit
> (characters 0-9 and A-F encoded as bytes 0x30-0x39 and 0x41-0x46).
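As a rough illustration of the hex-digit expansion described in the quote above, here is a minimal Python sketch. It assumes each code point is written as exactly eight uppercase hex digits (its UTF-32 value); the function names to_hex_digits and from_hex_digits are made up for this example and are not part of any proposal or standard.

```python
def to_hex_digits(text: str) -> bytes:
    """Write each code point as 8 ASCII hex digits (assumed layout)."""
    # ord() gives the code point; 08X pads it to eight uppercase hex digits.
    return "".join(f"{ord(ch):08X}" for ch in text).encode("ascii")


def from_hex_digits(data: bytes) -> str:
    """Inverse of to_hex_digits: read 8 hex digits per character."""
    digits = data.decode("ascii")
    return "".join(
        chr(int(digits[i:i + 8], 16)) for i in range(0, len(digits), 8)
    )


encoded = to_hex_digits("A")
print(encoded)        # b'00000041'
print(len(encoded))   # 8 bytes for one character under this assumption
```

Under this reading every output byte falls in 0x30-0x39 or 0x41-0x46, so a damaged byte outside those ranges is detectable, but nothing here corrects it.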
In my experience, I lose an entire block of a disk, or track, or drive, so
redundancy at the character level isn’t likely to be very helpful, you’d
Using UTF-64 with 48 bits of Reed-Solomon error correction (RSEC) on a
single UTF-16 data codeword would allow you to recover 24 data or EC bits.
Remember that the EC bits, being in the same codeword, are just as likely
to be damaged as the data.
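The 24-bit figure follows from the usual Reed-Solomon bound: with p parity symbols, up to p/2 damaged symbols at unknown positions can be corrected. A small sketch of that arithmetic, assuming 8-bit symbols (the symbol size is not stated in the message, and the function name rs_correctable_bits is hypothetical):

```python
def rs_correctable_bits(parity_bits: int, symbol_bits: int) -> int:
    """Best-case correctable bits for a Reed-Solomon code.

    Assumes errors at unknown positions, so only half of the parity
    symbols' worth of damaged symbols can be repaired.
    """
    parity_symbols = parity_bits // symbol_bits
    correctable_symbols = parity_symbols // 2
    return correctable_symbols * symbol_bits


# 48 parity bits as six 8-bit symbols: three symbols, i.e. 24 bits.
print(rs_correctable_bits(48, 8))
```

This is a best case: the 24 bits have to fall inside three symbols. The same number of flipped bits scattered over more symbols would exceed what the code can repair.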
Otto's comment is more practical. You have 11 unused
Hello,
on 28.01.2013, William_J_G Overington wrote:
The idea is that there would be an additional UTF format, perhaps UTF-64,
so that each character would be expressed in UTF-64 notation using 64 bits,
thus providing error checking and correction facilities at a character level.
We have alrea
> "WJGO" == William J G Overington writes:
WJGO> I was thinking about the problems of the long-term archiving of
WJGO> electronic text documents and thought of an idea. I wonder if I
WJGO> may please mention the idea here in the hope of there being a
WJGO> discussion so that an assessment of
On 1/28/2013 4:30 AM, William_J_G Overington wrote:
The idea is that there would be an additional UTF format, perhaps UTF-64, so
that each character would be expressed in UTF-64 notation using 64 bits, thus
providing error checking and correction facilities at a character level.
I think this
I would love to have such a facility because it is too much hassle to
write bilingual/trilingual documents, which is often the case, at least
in the Indian environment.
On Jan 28, 2013 6:17 PM, "William_J_G Overington"
wrote:
> I was thinking about the problems of the long-term archiving of electro
On 1/28/2013 5:12 AM, Martinho Fernandes wrote:
Similarly, there could be a type of pdf document where the text within the pdf
document were stored in UTF-64 format.
FWIW, there is already a PDF variant designed for long-term archiving
known as PDF/A. You may want to look into that.
Goo
On 1/28/2013 5:12 AM, Martinho Fernandes wrote:
Similarly, there could be a type of pdf document where the text within the pdf
document were stored in UTF-64 format.
FWIW, there is already a PDF variant designed for long-term archiving
known as PDF/A. You may want to look into that.
Good po
The MUFI 3.0 specification states that codepoint U+F1C3 COMBINING
ABBREVIATION MARK SUPERSCRIPT UR TILDE FORM has been assigned to U+1DD1
COMBINING UR ABOVE, and the box has been shaded yellow which indicates that
the codepoint has been decommissioned.
However the glyph for U+1DD1 in the Unicode C
> Similarly, there could be a type of pdf document where the text within the
> pdf document were stored in UTF-64 format.
FWIW, there is already a PDF variant designed for long-term archiving
known as PDF/A. You may want to look into that.
Kind regards,
Martinho
I was thinking about the problems of the long-term archiving of electronic text
documents and thought of an idea.
I wonder if I may please mention the idea here in the hope of there being a
discussion so that an assessment of whether the idea is worth developing can be
made.
The idea is that t
14 matches