Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All

2009-05-08 Thread Jonathan Rochkind
I don't understand from your description how Topic Maps solve the 
"identifying multiple versions of a standard" problem, which was the 
original question, right?  Or have I gotten confused? I didn't think the 
original question was even about topic vocabularies, but about how to 
best provide an identifier for (eg) Marc 2.1 and another for Marc 2.2, 
while still allowing machines to ignore versions if they like and just 
request and/or identify generic marc.  And you said that Topic Maps 
had a solution to this?


I am genuinely curious -- not necessarily because I'm ever going to use 
Topic Maps (sorry!), but because if they have a well-thought-out, tested 
solution to this, it could serve as a model in other contexts.


Jonathan

Alexander Johannesen wrote:

On Wed, May 6, 2009 at 18:44, Mike Taylor m...@indexdata.com wrote:
  

Can't you just tell us?



Sorry, but surely you must be tired of me banging on this gong by now?
It's not that I don't want to seem helpful, but I've been writing a
bit on this here already and don't want to be marked as spam for Topic
Maps.

In the Topic Maps world our global identifiers are called PSIs, for
Published Subject Indicators. There are a few subtleties here,
but they are not so different from any other identifier you'll find
elsewhere (RDF, library world, etc.) except of course they are
*always* URIs. Now, the thing here is that they should *always* be
published somewhere, whether as part of a list or on their own. The
next thing is that they should always resolve to something (the
standard doesn't require this, but I'd say you're doing it wrong
if you can't, even if skipping it is sometimes a necessary evil).

This last part is really the important bit, where any PSI will act as
1) a global identifier, and 2) resolve to human-readable text explaining
what it represents. Systems can just use it while at the same time
people can choose the right ones for their uses.

And, yes, the identifiers can be done any way you slice them. Some
might think that e.g. a PSI set for all dates is crazy, as you need to
produce identifiers for all dates (or times), and that would be
just way too much to deal with, but again, that's not an identification
problem, that's a resolver problem. If I can browse to a PSI and get
the text that this is the 3rd of June, 1971, using the whatnot calendar
style, then that's safe for me to use for my birthday. Let's pretend
the PSI is http://iso.org/datetime/03061971. By releasing a URI
template, computers can work with this automatically, no frills.
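
A minimal sketch of the URI-template idea, assuming the DDMMYYYY layout
of the pretend PSI above. The template string and the function names are
illustrative only; nothing here is published by iso.org.

```python
from datetime import date

# Hypothetical template modelled on the pretend PSI above (DDMMYYYY).
PSI_TEMPLATE = "http://iso.org/datetime/{day:02d}{month:02d}{year:04d}"


def date_psi(d: date) -> str:
    """Expand the template into a PSI for one calendar date."""
    return PSI_TEMPLATE.format(day=d.day, month=d.month, year=d.year)


def parse_date_psi(psi: str) -> date:
    """Recover the date from a PSI that follows the template."""
    prefix = PSI_TEMPLATE.split("{", 1)[0]
    if not psi.startswith(prefix):
        raise ValueError(f"not a date PSI from this set: {psi}")
    digits = psi[len(prefix):]
    if len(digits) != 8 or not digits.isdigit():
        raise ValueError(f"expected DDMMYYYY after the prefix: {psi}")
    # date() itself rejects impossible dates such as day 32.
    return date(int(digits[4:]), int(digits[2:4]), int(digits[:2]))


print(date_psi(date(1971, 6, 3)))  # http://iso.org/datetime/03061971
```

Because the template is published, no one has to mint every date up
front; a resolver only has to answer for the PSIs people actually use.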

Now a bit more technical; any topic (which is a Topic Map
representation of any subject, where subject is defined as anything
you can ever hope to think of) can have more than one PSI, because I
might use the PSI http://someother.org/time/date/3/6/1971 for my date.
If my application understands only the former set of PSIs, I can't
merge and find similar cross-semantics (which really is the core of
the problem this thread has been talking about). But simply attach the
second PSI to the same Topic, and you can. In fact, both parties will
understand perfectly what you're talking about.
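
The merging described above can be sketched roughly as follows. This is
a simplified stand-in for Topic Maps merging, not the rules from the
standard: the Topic class and merge_topics function are made up for
illustration, and two topics are merged whenever they share a PSI.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    names: set = field(default_factory=set)
    psis: set = field(default_factory=set)


def merge_topics(topics):
    """Merge topics that share at least one PSI; return the merged list."""
    merged = []
    for topic in topics:
        current = Topic(set(topic.names), set(topic.psis))
        untouched = []
        for other in merged:
            if other.psis & current.psis:
                # Shared identifier: fold the earlier topic into this one.
                current.names |= other.names
                current.psis |= other.psis
            else:
                untouched.append(other)
        merged = untouched + [current]
    return merged


ISO = "http://iso.org/datetime/03061971"
OTHER = "http://someother.org/time/date/3/6/1971"

iso_topic = Topic({"date (iso.org PSI)"}, {ISO})
other_topic = Topic({"date (someother.org PSI)"}, {OTHER})
bridge = Topic({"Alex's birthday"}, {ISO, OTHER})

print(len(merge_topics([iso_topic, other_topic])))          # 2
print(len(merge_topics([iso_topic, other_topic, bridge])))  # 1
```

The two single-PSI topics stay apart until a topic carrying both PSIs
shows up, at which point all three collapse into one.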

More complex is that the definition of PSI sets doesn't have to
happen on the subject level, i.e. the Topic called Alex to which I
tried to attach my birthday. It can be moved to a meta model level,
where you say the Topic for Time and dates has the PSIs for both
organisations, and all Topics just use one or the other; we're
shifting the explicitness of identification up a notch.

Having multiple PSIs might seem a bit unordered, but it's based on the
notion of organic growth, just like the web. People will gravitate
towards using PSIs from the most trusted sources (or most accurate or
most whatever), shifting identification schemes around. This is a good
thing (organic growth) at the price of multiple identifiers, but if
the library world started creating PSIs, I betcha humanity and the
library world both could be saved in one fell swoop! (That's another
gong I like to bang)

I'm kinda anticipating Jonathan saying this is all so complex now. :)
But it's not really; your application only has to have complexity in
the small meta model you set up, *not* for every single Topic you've
got in your map. And they're mergeable and shareable, and as such can
be merged and fixed (or cleaned or sobered or made less complex) for
all your various needs also.

Anyway, that's the basics. Let me know if you want me to bang on. :)
For me, the problem libraries face isn't really the mechanics of
this (because this is solvable, and I guess you just have to trust
that the Topic Maps community has been doing this for the last 10
years or so already :), but how you're going to fit existing
resources into FRBR and RDA. That's a separate discussion, though.


Regards,

Alex
  


[CODE4LIB] Curious about Cell Phone Barcode Scanning Apps

2009-05-08 Thread Matt Amory
I'm interested in some advice on building an app to pick up barcode data
through a cell phone camera and return OPAC/LibraryThing/WorldCat etc.
results to a mobile interface.
I know that Android has a UPC barcode reader linked to a shopping app, and
I'm wondering if this can be used or repurposed, or if there's a better
place to begin.

Thanks!


Re: [CODE4LIB] Curious about Cell Phone Barcode Scanning Apps

2009-05-08 Thread Jonathan Rochkind
I started to do just a bit of web research on this. Open source barcode 
photo recognition software looks like it's _just_ starting to become 
realistically available. This was the product that looked most 
promising in my web research (not sure if it's what the Android app is 
using):


http://code.google.com/p/zxing/

My Umlaut software would be an _ideal_ end-point of barcode recognition, 
which is why I started to look into it. Umlaut is designed specifically to 
meet the goal of taking a known item citation (such as an ISBN, sure), 
and returning a range of library availability and services for that 
item.  http://wiki.code4lib.org/index.php/Umlaut
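
As a rough illustration of the hand-off from a scanned barcode to an
ISBN lookup: the barcode printed on most books is an EAN-13 starting
with 978, and its digits map directly onto an ISBN. The sketch below
only does the arithmetic; the function names are made up, and it says
nothing about how zxing or Umlaut are actually called.

```python
def ean13_is_valid(ean: str) -> bool:
    """Check the EAN-13 check digit (weights 1,3,1,3,... from the left)."""
    if len(ean) != 13 or not ean.isdigit():
        return False
    total = sum(int(ch) * (3 if i % 2 else 1) for i, ch in enumerate(ean))
    return total % 10 == 0


def ean13_to_isbn10(ean: str) -> str:
    """Convert a 978-prefixed EAN-13 to the equivalent ISBN-10."""
    if not ean13_is_valid(ean):
        raise ValueError(f"bad EAN-13 check digit or length: {ean}")
    if not ean.startswith("978"):
        raise ValueError("only 978-prefixed EANs have an ISBN-10 form")
    core = ean[3:12]  # drop the prefix and the EAN check digit
    total = sum(int(ch) * w for ch, w in zip(core, range(10, 1, -1)))
    check = (11 - total % 11) % 11
    return core + ("X" if check == 10 else str(check))


print(ean13_to_isbn10("9780306406157"))  # 0306406152
```

Validating the check digit before the lookup is a cheap way to throw
away misreads from a blurry phone photo.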


The next step, which I haven't figured out yet, is how to get your 
software to participate in MMS/SMS architecture -- in particular to 
receive MMS/SMS messages in a way that's affordable to you and 
convenient to your users. (It looks like some but not all cell phones 
can send MMS messages to email, but not necessarily as conveniently as 
sending MMS to a cell number; but I'm not sure if there's a cheap way to 
have software receive MMS messages at a cell number. The Android app of 
course performs all its processing on the Android itself, which you can 
do on a device-by-device basis for devices powerful enough for that; but 
I too am attracted to the idea of an MMS solution that would work on any 
MMS capable device, with no need to customize per device).


I also haven't actually looked at the zxing code yet.

But I'd love to have Umlaut able to receive an MMS message, and give the 
user back a concise list of library services/links. So many interesting 
projects, not enough time.


Jonathan



Re: [CODE4LIB] One Data Format Identifier (and Registry) to Rule Them All

2009-05-08 Thread Alexander Johannesen
On Sat, May 9, 2009 at 00:32, Jonathan Rochkind rochk...@jhu.edu wrote:
 I don't understand from your description how Topic Maps solve the
 "identifying multiple versions of a standard" problem.

It's the mechanism of having multiple identifiers for Topics, so, in pseudo-code:

Topic MARC21
  psi info:ofi/fmt:xml:xsd:MARC21
  psi http://loc.org/stuff/marc21
  property #mime-type whatever for the binary

Topic MARC 1.1
  is_a MARC
  psi info:srw/schema/1/marcxml-v1.1
  psi http://loc.org/stuff/marcxml-v1.1
  property #mime-type whatever 1.1

Topic MARC 1.2
  is_a MARC
  psi info:srw/schema/1/marcxml-v1.2
  psi http://bingo.com/psi/marcxml
  property #mime-type whatever 1.2

Or, if MARC 1.2 is backwards compatible with 1.1 ;

Topic MARC 1.2
  is_a MARC 1.1
  psi info:srw/schema/1/marcxml-v1.2

Or, if I make my own unofficial version ;

Topic MARC 2.0
  is_a MARC 1.2
  psi http://alex.com/psi/marc-2.0

This is enough to cobble together what is and isn't compatible in
types of formats, so if your application is Topic Maps aware, this
should be trivial (including what format to ignore or react to). The
point is that you don't need *one* identifier for things; Topics are
proxies for knowledge, and part of the notion of knowledge is what
identifies that knowledge. Multiple PSIs help us leverage both rigid
and fuzzy systems.
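
To make the compatibility check concrete, here is a small sketch that
walks the is_a links. The data is loosely transcribed from the
pseudo-code above, using the backwards-compatible variant, and it
assumes the base topic (written MARC21 above) is the MARC that the
versioned topics point at. The dictionary layout and function names are
invented for illustration and are not a Topic Maps API.

```python
# Topic name -> parent topic (is_a) and the PSIs attached to it.
TOPICS = {
    "MARC": {"is_a": None,
             "psis": {"info:ofi/fmt:xml:xsd:MARC21",
                      "http://loc.org/stuff/marc21"}},
    "MARC 1.1": {"is_a": "MARC",
                 "psis": {"info:srw/schema/1/marcxml-v1.1",
                          "http://loc.org/stuff/marcxml-v1.1"}},
    "MARC 1.2": {"is_a": "MARC 1.1",
                 "psis": {"info:srw/schema/1/marcxml-v1.2",
                          "http://bingo.com/psi/marcxml"}},
    "MARC 2.0": {"is_a": "MARC 1.2",
                 "psis": {"http://alex.com/psi/marc-2.0"}},
}


def topic_for_psi(psi):
    """Find the topic that carries this PSI, or None if unknown."""
    for name, topic in TOPICS.items():
        if psi in topic["psis"]:
            return name
    return None


def is_kind_of(name, ancestor):
    """Follow is_a links upward from name, looking for ancestor."""
    while name is not None:
        if name == ancestor:
            return True
        name = TOPICS[name]["is_a"]
    return False


def accepts(offered_psi, understood_psi):
    """True if data tagged offered_psi is a kind of what we understand."""
    offered = topic_for_psi(offered_psi)
    understood = topic_for_psi(understood_psi)
    if offered is None or understood is None:
        return False
    return is_kind_of(offered, understood)


# A client that only knows "generic MARC" can still take Alex's 2.0.
print(accepts("http://alex.com/psi/marc-2.0", "http://loc.org/stuff/marc21"))
```

A client that cares about versions asks with a versioned PSI; one that
doesn't asks with any PSI of the base topic and ignores the rest.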

As to the identifiers themselves (as in, the formatting), is that important?

Anyway, I suspect I don't see what the problem is. Creating
the best identifier for things seems a bit of a strange
notion to me; is this based on the idea that there is only (or rather, that
you're trying to create) one identifier for any one thing?


Alex
-- 
---
 Project Wrangler, SOA, Information Alchemist, UX, RESTafarian, Topic Maps
-- http://shelter.nu/blog/ 


Re: [CODE4LIB] Curious about Cell Phone Barcode Scanning Apps

2009-05-08 Thread Eric Lease Morgan

On May 8, 2009, at 10:39 AM, Matt Amory wrote:

I'm interested in some advice on building an app to pick up barcode data
through a cell phone camera and return OPAC/LibraryThing/WorldCat etc.
results to a mobile interface.
I know that Android has a UPC barcode reader linked to a shopping app, and
I'm wondering if this can be used or repurposed, or if there's a better
place to begin.



Consider Semapedia, funny-looking barcodes that map physical-world  
objects to the Webbed world:


  http://en.semapedia.org/

--
Eric Lease Morgan


Re: [CODE4LIB] Curious about Cell Phone Barcode Scanning Apps

2009-05-08 Thread Joe Atzberger
Google provided the barcode-recognition line-interpolation software as open
source for Android developers to build on. That explains why I have about 4
barcode-scanning apps on the G1.

Note that most common cellphone cameras haven't advanced enough to get
reliable resolution for barcodes, particularly at the up-close macro-like
distances at which you would use a scanner.  My old Nokia, despite the 3 MP
camera, couldn't get focus up close.

In a year or two that should be different for the currently available
models.

--Joe




[CODE4LIB] exploiting z39.50

2009-05-08 Thread Eric Lease Morgan
How might I go about exploiting Z39.50 to extract specific MARC  
records from a library catalog?


More precisely, I am trying to download sets of MARC records from  
remote library catalogs destined for a sort of union catalog. I see  
each of these records being identified with a specific string in a  
local note or local subject field. This string might be CRRA, and my  
question is three-fold:


  1. What MARC field/subfield might I put this string?
  2. How would I go about getting the string indexed?
  3. How might I go about querying the server for records with this  
string?


For example, I might put CRRA in 599 $a. I would cross my fingers  
hoping this field gets indexed, and I might be able to search for it  
like this [1]:


 @attr 1=63 2=3 3=3 4=2 5=100 6=1 CRRA

Am I on the right track?

[1] http://www.collectionscanada.gc.ca/bath/tp-bath2.23-e.htm#b
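
For anyone wanting to try the query with a YAZ-based client, a hedged
sketch of building it as a PQF string follows. YAZ-style PQF expects
each type=value pair to be introduced by its own @attr, so the helper
spells that out. The helper name and keyword names are made up here,
and whether a given server indexes 599 $a under use attribute 63 (Note)
is entirely up to that server.

```python
# Bib-1 attribute types, keyed by a readable name.
ATTRIBUTE_TYPES = {
    "use": 1,
    "relation": 2,
    "position": 3,
    "structure": 4,
    "truncation": 5,
    "completeness": 6,
}


def pqf_query(term: str, **attrs: int) -> str:
    """Build a PQF query with one @attr per Bib-1 attribute."""
    if '"' in term:
        raise ValueError("this sketch does not handle quotes in the term")
    parts = [f"@attr {ATTRIBUTE_TYPES[name]}={value}"
             for name, value in attrs.items()]
    parts.append(f'"{term}"')  # quote so multi-word terms stay together
    return " ".join(parts)


# Use 63 = Note, relation 3 = equal, position 3 = any position in field,
# structure 2 = word, truncation 100 = do not truncate,
# completeness 1 = incomplete subfield.
query = pqf_query("CRRA", use=63, relation=3, position=3,
                  structure=2, truncation=100, completeness=1)
print(query)
# @attr 1=63 @attr 2=3 @attr 3=3 @attr 4=2 @attr 5=100 @attr 6=1 "CRRA"
```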

--
Eric Morgan


Re: [CODE4LIB] exploiting z39.50

2009-05-08 Thread Ray Denenberg, Library of Congress

From: Eric Lease Morgan emor...@nd.edu

  1. What MARC field/subfield might I put this string?
  2. How would I go about getting the string indexed?
  3. How might I go about querying the server for records with this 
string?


I can at least talk about the third question.  There was work on a MARC 
attribute set, though it was never completed.  If you look at the OID register at 
http://www.loc.gov/z3950/agency/defns/oids.html you'll see that the latest 
work on it (second draft) was in 2000, 
http://www.nlc-bnc.ca/iso/z3950/MARC_attribute_set_2.doc. So if someone 
actually wanted to put it to use it would have to be completed.


For SRU there is a complete MARC context set, 
http://www.loc.gov/standards/sru/resources/marc-context-set.html.
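
A rough sketch of what an SRU request for the CRRA records might look
like. Several things here are assumptions: the server URL is a
placeholder, the index name marc.599$a is only a guess at how the
context set would name field 599 subfield a (check the context set
document above), and the marcxml record schema name depends on the
server.

```python
from urllib.parse import parse_qs, urlencode, urlparse

# Placeholder endpoint; substitute the catalog's real SRU base URL.
BASE_URL = "http://catalog.example.org/sru"


def sru_search_url(base_url: str, cql_query: str, max_records: int = 10) -> str:
    """Build an SRU 1.1 searchRetrieve URL for a CQL query."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.1",
        "query": cql_query,
        "maximumRecords": str(max_records),
        "recordSchema": "marcxml",  # assumed schema short name
    }
    return base_url + "?" + urlencode(params)


# Hypothetical index name for MARC 599 $a in the marc context set.
url = sru_search_url(BASE_URL, 'marc.599$a = "CRRA"')
print(url)

# Round-trip check that the CQL survives URL encoding intact.
print(parse_qs(urlparse(url).query)["query"][0])
```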


--Ray


Re: [CODE4LIB] exploiting z39.50

2009-05-08 Thread Jonathan Rochkind
I wonder how xID handles superseded OCLCnums, if it'll still successfully 
find the right matches for you?




Re: [CODE4LIB] exploiting z39.50

2009-05-08 Thread Xiaoming Liu
On Fri, May 8, 2009 at 3:08 PM, Jonathan Rochkind rochk...@jhu.edu wrote:

 I wonder how xID handles superseded OCLCnums, if it'll still successfully
 find the right matches for you?


This is documented in
http://xisbn.worldcat.org/xisbnadmin/xoclcnum/api.htm#deleted


WorldCat uses OCLC Control Number Cross-Reference to track deleted OCLC
numbers. When an OCLC number is deleted, it's still searchable from this
service. In the response, we use presentOclcnum to specify the present OCLC
number. For example, 2416076 was merged into 24991049, so a request for the
deleted number 2416076 will return:

  <rsp xmlns="http://worldcat.org/xid/xoclcnum/" stat="ok">
  <oclcnum lccn="34025476" presentOclcnum="24991049">2416076</oclcnum>
  </rsp>


The presentOclcnum field is omitted when an OCLC number is active, so
a request for the current OCLC number 24991049 returns:

  <rsp xmlns="http://worldcat.org/xid/xoclcnum/" stat="ok">
  <oclcnum lccn="34025476">24991049</oclcnum>
  </rsp>
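
On the client side, picking the current number out of either response
could look something like this. It is a sketch that assumes the
responses are well-formed XML in the namespace shown above; the function
name is made up, and no request is made to the service.

```python
import xml.etree.ElementTree as ET

NS = {"x": "http://worldcat.org/xid/xoclcnum/"}

MERGED = """<rsp xmlns="http://worldcat.org/xid/xoclcnum/" stat="ok">
<oclcnum lccn="34025476" presentOclcnum="24991049">2416076</oclcnum>
</rsp>"""

ACTIVE = """<rsp xmlns="http://worldcat.org/xid/xoclcnum/" stat="ok">
<oclcnum lccn="34025476">24991049</oclcnum>
</rsp>"""


def current_oclcnum(response_xml: str) -> str:
    """Return the present OCLC number, following a merge if there is one."""
    root = ET.fromstring(response_xml)
    if root.get("stat") != "ok":
        raise ValueError(f"unexpected response status: {root.get('stat')}")
    record = root.find("x:oclcnum", NS)
    if record is None or record.text is None:
        raise ValueError("no oclcnum element in response")
    # presentOclcnum is only present when the requested number was deleted.
    return record.get("presentOclcnum") or record.text.strip()


print(current_oclcnum(MERGED))  # 24991049
print(current_oclcnum(ACTIVE))  # 24991049
```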


Xiaoming




