Hi Jan,

Yes, I'm somewhat familiar with SolrMarc: I know a couple of the
developers, and I know it is used internally by Blacklight and VuFind.
It is definitely a speedy solution for getting MARC into Solr, but that
part wasn't where I was having problems. The challenge was more in
getting usable MARC out of Invenio.

--jay

On Tue, Jun 15, 2010 at 11:14 AM, Jan Iwaszkiewicz
<[email protected]> wrote:
> Hi Jay,
>
> Let me add my 2 cents.
> Have you come across the solrmarc project (
> http://code.google.com/p/solrmarc/) ?
> It is supposed to be a solution for the problem of uploading MARC data to
> Solr.
>
> --Jan
>
> Jay Luker wrote:
>>
>> On Thu, Jun 10, 2010 at 5:02 PM, Samuele Kaplun <[email protected]>
>> wrote:
>>
>>
>>>
>>> Currently, when you use the xmlmarc2textmarc utility to export to
>>> Aleph, a dummy leader is generated, which has proved to be enough for
>>> Aleph.
>>>
>>
>> Ahh, I wasn't aware of that existing utility. I showed the marc output
>> from bibformat to some librarian coder friends and they said, "huh,
>> looks like aleph sequential" :)
>>
>> So Aleph doesn't care about the leader values?
>>
>>
>>>
>>> I am currently not an expert on the leader subject, but to me it seems
>>> that the leader makes sense mostly in the MARC21 binary format, and when
>>> dealing with plain library records. And it exists in MARCXML just as a
>>> conversion consequence. Is this correct? The proof is that Invenio can
>>> do powerful and extremely flexible things without any need for the
>>> leader.
>>>
>>
>> Not an expert either. I think for Invenio it really just boils down to
>> interoperability. I mean, saying, Software X can do incredible,
>> amazing things with internal, non-standard format Y, isn't really a
>> remarkable statement.
>>
>>
>>>
>>> In particular, if Invenio has to support the leader in MARCXML, how can
>>> we map its workflows onto the rigid schema of the leader?
>>>
>>
>> I think sensible defaults for some values, combined with a minimum of
>> conditional logic, should suffice. The first part of that may be the
>> trickier one, as I'm still trying to figure out defaults myself.
>>
>>
>>>
>>> Also what is the meaning of certain bytes of the leader in MARCXML:
>>> (from <http://www.loc.gov/marc/bibliographic/bdleader.html>):
>>>
>>> [...]
>>> Character Positions
>>> 00-04 - Record length
>>> [...]
>>> 12-16 - Base address of data
>>> [...]
>>>
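
For context on the two positions quoted above: in binary MARC21 (ISO 2709)
they are byte counts derived from the serialized record, which is why they
carry no real information in MARCXML. A minimal sketch of that arithmetic
follows; the function name and the "list of encoded field bodies" input are
assumptions made for illustration, not anything from Invenio.

```python
LEADER_LENGTH = 24            # the leader is always 24 bytes
DIRECTORY_ENTRY_LENGTH = 12   # tag (3) + field length (4) + start position (5)
FIELD_TERMINATOR = b"\x1e"    # ends the directory and every field
RECORD_TERMINATOR = b"\x1d"   # ends the record


def leader_lengths(field_bodies):
    """Return (record_length, base_address) for a binary MARC21 record.

    field_bodies is a hypothetical input: one bytes object per field,
    already encoded, without its trailing field terminator.
    """
    # Base address of data = leader + directory + the directory's terminator.
    base_address = (
        LEADER_LENGTH
        + DIRECTORY_ENTRY_LENGTH * len(field_bodies)
        + len(FIELD_TERMINATOR)
    )
    # Each field is followed by one field terminator.
    data_length = sum(len(body) + len(FIELD_TERMINATOR) for body in field_bodies)
    record_length = base_address + data_length + len(RECORD_TERMINATOR)
    return record_length, base_address
```

Both numbers change whenever a field is added or edited, so they only make
sense when computed at serialization time.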
>>
>> leader/05 = 'n' - the term "new" in this context is confusing, but
>> I've been told "don't overthink it"
>> leader/06 = 'a' - "...electronic resources that are basically textual in
>> nature"
>> leader/07 is where I'm less confident, but I *think* the logic is
>> simply 'b' for articles, 'a' for things that are part of a collection
>> or proceedings, and 'm' for everything else. For ADS we are currently
>> storing our internal item-type description in the 690a (which may be
>> incorrect), and this is how I'm determining the leader/07
>> leader/08 = '#' (i.e. blank) - not sure about this one
>> leader/09 = 'a' - assuming Unicode
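
The defaults listed above can be sketched as a small function that builds a
24-character leader. This is only an illustration under stated assumptions:
the item-type strings are hypothetical (the actual values stored in 690a are
not given in this thread), positions 17-19 are left blank as placeholders,
and the length fields are zero-filled because they only matter in binary
MARC21.

```python
# Hypothetical item-type labels; the real 690a values are not known here.
ARTICLE_TYPES = {"article"}
COMPONENT_TYPES = {"inproceedings", "incollection"}


def bibliographic_level(item_type):
    """Pick leader/07 using the logic described above."""
    if item_type in ARTICLE_TYPES:
        return "b"   # articles
    if item_type in COMPONENT_TYPES:
        return "a"   # part of a collection or proceedings
    return "m"       # everything else


def build_leader(item_type, record_length=0, base_address=0):
    """Assemble a 24-character MARC21 leader from the defaults above."""
    leader = (
        f"{record_length:05d}"             # 00-04 record length
        "n"                                # 05 record status: new
        "a"                                # 06 type of record: language material
        f"{bibliographic_level(item_type)}"  # 07 bibliographic level
        " "                                # 08 type of control: blank ('#')
        "a"                                # 09 character coding: Unicode
        "22"                               # 10-11 indicator and subfield code counts
        f"{base_address:05d}"              # 12-16 base address of data
        "   "                              # 17-19 left blank (assumption)
        "4500"                             # 20-23 entry map, fixed in MARC21
    )
    if len(leader) != 24:
        raise ValueError("leader must be exactly 24 characters")
    return leader
```

For example, `build_leader("article")` yields `00000nab a2200000   4500`.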
>>
>> I've been looking at http://www.itsmarc.com/crs/bib1465.htm for some
>> guidance, and when I have a dumb question about something I ask in
>> the #code4lib IRC channel.
>>
>> Apparently the 008 control field
>> (http://www.loc.gov/marc/bibliographic/bd008.html) is also important
>> to many applications, but I haven't really explored it or determined
>> the level of importance.
>>
>>
>>>
>>> In the end, probably the best thing is still to put a fake leader like
>>> xmlmarc2textmarc currently does, with the most neutral values.
>>>
>>
>> Yeah, I guess I agree, although I'm not sure what a "neutral" value
>> would be for something like the leader/07. Also, it's important to get
>> the leader/09 correct, as tools like pymarc need to know how to decode
>> the record.
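
To illustrate why leader/09 matters to a reader of the record, here is a
minimal sketch of the check a decoder has to make. It is not pymarc's API;
the function name is made up for this example, and the returned labels are
just names for the two MARC21 coding schemes.

```python
def text_encoding_for(leader):
    """Return the character coding scheme signalled by leader/09."""
    if len(leader) != 24:
        raise ValueError("a MARC21 leader is exactly 24 characters")
    coding = leader[9]
    if coding == "a":
        return "utf-8"    # UCS/Unicode
    if coding == " ":
        # MARC-8 is not a codec shipped with Python; the label only
        # tells the caller that a MARC-8 converter is needed.
        return "marc-8"
    raise ValueError(f"unexpected leader/09 value: {coding!r}")
```

A record whose data is UTF-8 but whose leader/09 is blank would be treated
as MARC-8 by a check like this, which is how a wrong value leads to garbled
text.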
>>
>>
>>>
>>> It is true that, on the other hand, when Invenio records have been
>>> imported from original MARC21 or from MARCXML with a leader, Invenio
>>> should not throw away such information.
>>>
>>
>> Agreed.
>>
>> --jay
>>
>> ******************************************************
>> Jay Luker               Astrophysics Data System (ADS)
>> [email protected]  Center for Astrophysics
>> 617-495-4588            60 Garden Street  MS 67
>> 617-495-7356 fax        Cambridge, MA  02138
>> ******************************************************
>>
>
>



