Just a quick note:
The correct URL for ONIX for Serials is
http://www.editeur.org/17/ONIX-for-Serials/ - note that this is a family of
standards, so it covers a very wide range of data types and content. The code
lists Tom mentioned are available there in human-readable form.
Also: it sounded
RESPONSIBILITIES:
The Rutgers University Libraries seek an experienced, innovative, and
service-oriented librarian to fill the position of Archivist in the Institute
of Jazz Studies, John Cotton Dana Library on the Newark
Campus of Rutgers, The State University of New Jersey.
Reporting to the
**Description and Duties:**
Within the framework of established policies, regulations and procedures, in
consultation with the Systems Coordinator, the Division Head of Discovery
Systems and other Discovery Systems staff, the incumbent provides technical
expertise and support for the
I know how char encodings work in MARC ISO binary -- the encoding can
legally be either Marc8 or UTF8 (nothing else). The encoding of a
record is specified in its header. In the wild, specified encodings are
frequently wrong, or data includes weird mixed encodings. Okay!
But what's going on
There are probably a couple of answers to that.
XML rules define what character set is used. The encoding attribute on
the <?xml?> header is where you find out what character set is being
used.
I've always gone under the assumption that if an encoding wasn't
specified, then UTF-8 is in effect and
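A rough sketch of the "look at the declaration, otherwise assume UTF-8" approach described above. The function name `declared_xml_encoding` is made up for illustration, and this deliberately ignores UTF-16 and other non-ASCII-compatible encodings, so treat it as a starting point rather than a complete detector:

```python
import re

# Matches the encoding pseudo-attribute inside a leading XML declaration.
# Assumes the document starts in an ASCII-compatible encoding.
_DECL = re.compile(
    rb'<\?xml\s[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def declared_xml_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return the encoding named in the XML declaration, else the default."""
    # Skip a UTF-8 byte order mark if one is present.
    if data.startswith(b"\xef\xbb\xbf"):
        data = data[3:]
    match = _DECL.match(data)
    if match is None:
        # No declaration, or a declaration without an encoding attribute.
        return default
    return match.group(1).decode("ascii").lower()
```

This only reports what the document claims; as noted elsewhere in the thread, the claim itself may be wrong.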
What's the legal thing to do? What's actually found 'in the wild' with
MarcXML?
In some cases, invalid XML.
In an ideal world, the encoding should be included in the declaration. But
I wouldn't trust it.
kyle
--
Kyle Banerjee
So what if the <?xml?> declaration says one charset encoding, but the
MARC header included in the MarcXML says a different encoding... which
one is the 'legal' one to believe?
Is it legal to have MarcXML that is not UTF-8 _or_ Marc8, that is an
entirely different charset that is legal in XML?
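To make the conflict concrete, here is a minimal sketch that pulls both claims out of a MarcXML record so they can be compared. The function name `encoding_claims` and the sample record are hypothetical; the sketch assumes leader position 09 uses 'a' for UCS/Unicode and blank for MARC-8, as in MARC 21, and it does not decide which claim is the 'legal' one:

```python
import re
import xml.etree.ElementTree as ET

def encoding_claims(marcxml: bytes):
    """Return (XML declaration claim, MARC leader claim) for one record."""
    # Claim 1: the encoding attribute of the XML declaration, if any.
    m = re.match(rb'<\?xml\s[^>]*?encoding\s*=\s*["\']([\w.-]+)["\']', marcxml)
    xml_claim = m.group(1).decode("ascii").lower() if m else "utf-8"

    # Claim 2: position 09 of the first <leader> element, with or
    # without the MARCXML namespace on the tag.
    root = ET.fromstring(marcxml)
    leader = next(
        (el.text or "" for el in root.iter() if el.tag.endswith("leader")), ""
    )
    leader_claim = {"a": "unicode", " ": "marc-8"}.get(leader[9:10], "unknown")
    return xml_claim, leader_claim

# Hypothetical record whose two claims disagree.
sample = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b'<record xmlns="http://www.loc.gov/MARC21/slim">'
    b'<leader>00000nam  2200000 a 4500</leader></record>'
)
print(encoding_claims(sample))
```

A tool could at least log records where the two values disagree instead of silently trusting either one.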
On 4/17/2012 1:57 PM, Kyle Banerjee wrote:
In some cases, invalid XML. In an ideal world, the encoding should be
included in the declaration. But I wouldn't trust it. kyle
So would you use the Marc header payload instead?
Or you're just saying you wouldn't trust _any_ encoding declarations
Okay, maybe here's another way to approach the question.
If I want to have a MarcXML document encoded in Marc8 -- what should it
look like? What should be in the XML declaration? What should be in the
MARC header embedded in the XML? Or is it not in fact legal at all?
If I want to have a
**Director of Library Information Technology Production Services**
Academic Professional Position
University of Illinois at Urbana-Champaign
**Position Available**: This position is available July, 2012. This is a
100%-time, twelve-month appointment Academic Professional position.
If I want to have a MarcXML document encoded in Marc8 -- what should it
look like? What should be in the XML declaration? What should be in the
MARC header embedded in the XML? Or is it not in fact legal at all?
I'm going out on a limb here, but I don't think it is legal. There is no
The University of Oregon Libraries and Oregon State University Libraries invite
you to code4lib west, Monday, July 30, 2012, at the UO Knight Library. There is
no registration fee for this conference. Registration is limited to 50
participants. All participants are expected to deliver a
Hi Ralph,
But, ignoring the encoding, the original MarcXML rules were the same as
the MARC-21 rules for character repertoire and you were supposed to
restrict yourself to characters that could be mapped back into MARC-8.
I don't know if that rule is still in force, but everyone ignores it.
Thanks, this is helpful feedback at least.
I think it's completely irrelevant, when determining what is legal under
standards, to talk about what certain Java tools happen to do, though. I
don't care too much what some tool you happen to use does.
In this case, I'm _writing_ the tools. I want
Re: But do others agree that there is in fact no legal way to have Marc8 in
MarcXML?
No -- it is perfectly legal -- but you MUST declare the encoding to BE Marc8 in
the XML prolog, and you will want to be aware that XML processors are only
REQUIRED to process UTF-8 and UTF-16 -- in practice
Jonathan Rochkind
Sent: Tuesday, April 17, 2012 14:18
Subject: Re: [CODE4LIB] MarcXML and char encodings
Okay, maybe here's another way to approach the question.
If I want to have a MarcXML document encoded in Marc8 -- what should it
look like? What should be in the XML declaration?
Michigan Technological University's Van Pelt and Opie Library seeks an
energetic, user-focused and collegial Web developer who enjoys working on a
variety of projects with library and IT staff, faculty, and students that
support library services, instruction and research.
Michigan
So would you use the Marc header payload instead?
Or you're just saying you wouldn't trust _any_ encoding declarations you
find anywhere?
This.
The short version is that too many vendors and systems just supply some
value without making sure that's what they're spitting out. I haven't had
The discussions at the MARC standards group relating to Unicode all had
to do with using Unicode *within* ISO2709. I can't find any evidence
that MARCXML ever went through the standards process. (This may not be a
bad thing.) So none of what we know about the MARBI discussions and
resulting
Karen Coyle
Sent: Tuesday, April 17, 2012 15:41
Subject: Re: [CODE4LIB] MarcXML and char encodings
The discussions at the MARC standards group relating to Unicode all had
to do with using Unicode *within* ISO2709. I can't find any evidence
that MARCXML ever went through the standards
Let me make some recommendations. These are what I would consider best
practices for interoperability.
1) Never put marc8 in xml. Just don't do it. No one expects it. Few will be
willing to bother with it.
2) Always prefer utf8 for marcxml. You can use any standard charset if you need
to,
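As an illustration of recommendation 2, a minimal sketch of writing a MarcXML record as UTF-8 with Python's standard library. The function name `marcxml_utf8` and the leader and field values are placeholders invented for the example, and `xml_declaration=True` on `ET.tostring` assumes Python 3.8 or later:

```python
import xml.etree.ElementTree as ET

MARC_NS = "http://www.loc.gov/MARC21/slim"

def marcxml_utf8(title: str) -> bytes:
    """Serialize a tiny placeholder record as UTF-8 MarcXML bytes."""
    record = ET.Element(f"{{{MARC_NS}}}record")

    # Leader position 09 is 'a' so the leader agrees with the UTF-8 output.
    leader = ET.SubElement(record, f"{{{MARC_NS}}}leader")
    leader.text = "00000nam a2200000 a 4500"

    # Attributes go in a dict because "tag" clashes with SubElement's
    # own parameter name.
    field = ET.SubElement(
        record, f"{{{MARC_NS}}}datafield", {"tag": "245", "ind1": "0", "ind2": "0"}
    )
    sub = ET.SubElement(field, f"{{{MARC_NS}}}subfield", {"code": "a"})
    sub.text = title

    # The declaration and the actual bytes both say UTF-8.
    return ET.tostring(record, encoding="utf-8", xml_declaration=True)

print(marcxml_utf8("Dvořák").decode("utf-8"))
```

Letting the serializer write both the declaration and the bytes avoids the mismatch between declared and actual encoding that the thread complains about.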
On 4/17/2012 3:01 PM, Sheila M. Morrissey wrote:
No -- it is perfectly legal -- but you MUST declare the encoding to BE Marc8 in
the XML prolog,
Wait, how can you declare a Marc8 encoding in an XML
declaration/prolog/whatever it's called?
The things that appear there need to be from a
No -- it is perfectly legal -- but you MUST declare the encoding to
BE Marc8 in the XML prolog,
Wait, how can you declare a Marc8 encoding in an XML
declaration/prolog/whatever it's called?
Nope, you can't do that. There is no approved name for the MARC-8
encoding. As Andy said, the closest
In XML standard:
It is RECOMMENDED that character encodings registered (as charsets)
with the Internet Assigned Numbers Authority [IANA-CHARSETS], other than those
just listed, be referred to using their registered names; other encodings
SHOULD use names starting with an x- prefix.
MARC-8. Cool in its time. Dumb now. Typical. --ELM
I think this is a case of being in violent agreement -- see some earlier
replies in this thread --
Pragmatically, if you are going to hew to marc-8 encoding transported in XML --
you are losing the usefulness of standard tools for xml --
smm
-Original Message-
From: Code for Libraries
Okay, forget XML for a moment, let's just look at marc 'binary'.
First, for Anglophone-centric MARC21.
The LC docs don't actually say quite what I thought about leader byte
09, used to advertise encoding:
a - UCS/Unicode
Character coding in the record makes use of characters from the
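For reference, a small sketch of reading that leader byte from a binary (ISO 2709) MARC21 record. The function name `leader_encoding` is made up for illustration; the sketch assumes the two values defined for leader position 09 are 'a' (UCS/Unicode) and blank (MARC-8), and it only reports what the record advertises, not what the bytes actually are:

```python
def leader_encoding(record: bytes) -> str:
    """Report the character coding scheme advertised in leader byte 09."""
    if len(record) < 24:
        raise ValueError("a MARC leader is 24 bytes long")
    flag = record[9:10]  # leader positions are counted from zero
    if flag == b"a":
        return "UCS/Unicode"
    if flag == b" ":
        return "MARC-8"
    # Anything else is not a defined value for this position.
    return "undefined value %r" % flag

# Placeholder leaders, not taken from real records.
print(leader_encoding(b"00000nam a2200000 a 4500"))
print(leader_encoding(b"00000nam  2200000 a 4500"))
```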
On Tue, Apr 17, 2012 at 7:55 PM, Jonathan Rochkind rochk...@jhu.edu wrote:
Okay, forget XML for a moment, let's just look at marc 'binary'.
First, for Anglophone-centric MARC21.
Actually Anglo and Francophone centric. And the USMARC style 245 was a poor
replacement for the UKMARC approach
On Tue, Apr 17, 2012 at 8:46 PM, Simon Spero sesunc...@gmail.com wrote:
Actually Anglo and Francophone centric. And the USMARC style 245 was a poor
replacement for the UKMARC approach (someone at the British Library hosted
Linked Data meeting wondered why there were punctuation characters