from:"Jonathan Gorman"

Re: [CODE4LIB] PHP5 Help

2008-07-01 Thread Jonathan Gorman

There are some complaints about this on the mailing list, but it's not clear 
whether it was ever solved.  I would start by checking that you have the latest 
version.  The home page for this project seems to have a few dead links, so I 
don't know how active it is.

Like Alex said, cast $seconds, or do something like adding zero or 
multiplying by 1 to force the conversion into a numeric type.

Hopefully though the latest version addresses this.  (It would be interesting 
to try to figure out why PHP5 is casting just that part as a string, but sadly 
I don't have time for that.)


Jon Gorman

 Original message 
>Date: Tue, 1 Jul 2008 07:42:31 -0400
>From: Nicole Engard <[EMAIL PROTECTED]>  
>Subject: [CODE4LIB] PHP5 Help  
>To: CODE4LIB@LISTSERV.ND.EDU
>
>I am missing something right in front of my eyes.  I'm rusty on my
>PHP, I'm wondering if someone can help me with this error:
>
>Warning: gmmktime() expects parameter 3 to be long, string given in
>/public_html/magpierss-0.72/rss_utils.inc on line 35
>
>I went through the manual and didn't see anything wrong with the code below.
>
>###FROM MY PHP:
>
>$del_user = 'nengard'; # del.icio.us username
>
># Use magpie to get del.icio.us links via RSS
>$feed = fetch_rss('http://del.icio.us/rss/' . $del_user);
>
># Only make a post if there are any links today
>if (count($feed->items) > 0) {
>
>$content = "\n";
>
>foreach ($feed->items as $link) {
>
>$publishdate = parse_w3cdtf($link['dc']['date']);
>
>###CODE CUT HERE##
>
>##FROM magpierss-0.72/rss_utils.inc
>
>function parse_w3cdtf ( $date_str ) {
>
># regex to match wc3dtf
>$pat = 
> "/(\d{4})-(\d{2})-(\d{2})T(\d{2}):(\d{2})(:(\d{2}))?(?:([-+])(\d{2}):?(\d{2})|(Z))?/";
>
>if ( preg_match( $pat, $date_str, $match ) ) {
>list( $year, $month, $day, $hours, $minutes, $seconds) =
>array( $match[1], $match[2], $match[3], $match[4],
>$match[5], $match[6]);
>
># LINE 35 BELOW HERE - calc epoch for current date assuming GMT
>$epoch = gmmktime( $hours, $minutes, $seconds, $month, $day, $year);
>
>$offset = 0;
>if ( $match[10] == 'Z' ) {
># zulu time, aka GMT
>}
>else {
>list( $tz_mod, $tz_hour, $tz_min ) =
>array( $match[8], $match[9], $match[10]);
>
># zero out the variables
>if ( ! $tz_hour ) { $tz_hour = 0; }
>if ( ! $tz_min ) { $tz_min = 0; }
>
>$offset_secs = (($tz_hour*60)+$tz_min)*60;
>
># is timezone ahead of GMT?  then subtract offset
>#
>if ( $tz_mod == '+' ) {
>$offset_secs = $offset_secs * -1;
>}
>
>$offset = $offset_secs;
>}
>$epoch = $epoch + $offset;
>return $epoch;
>}
>else {
>return -1;
>}
>}
>
>Nicole C. Engard
>Open Source Evangelist, LibLime
>(888) Koha ILS (564-2457) ext. 714
>[EMAIL PROTECTED]
>AIM/Y!/Skype: nengard
>
>http://liblime.com
>http://blogs.liblime.com/open-sesame/

Re: [CODE4LIB] planet.code4lib.org -- 3 suggestions

2008-05-21 Thread Jonathan Gorman

Catching up on some of Mark's posts I can see why some might want him off.  
Perhaps someone who's more emotionally attached to the issue of removal might 
just want to contact him and see if he knows he's on the list or if he wants to 
remain on?

I realized I don't honestly care enough about the planet one way or the other.  
I'd be sad to see it go, but I wouldn't wail in misery.

Jon

 Original message 
>Date: Wed, 21 May 2008 17:31:03 -0400
>From: Jonathan Rochkind <[EMAIL PROTECTED]>
>Subject: Re: [CODE4LIB] planet.code4lib.org -- 3 suggestions
>To: CODE4LIB@LISTSERV.ND.EDU
>
>No one other than me is managing it at present. Pretty much the only
>'management' I do is adding blogs whenever someone asks me to. (I also
>did just a bit of fine-tuning of the CSS for the html version).  I think
>it may be the planet software that decides what order to display
>lastname and firstname, but feel free to email me ones that are
>displaying oddly, and I'll see if I can fix them. I'm not going to get
>into serious hacking of the planet software though, or replacing it with
>other software (I _maybe_ could be convinced to upgrade it if there's an
>upgrade available).  (if anyone else wants to do any of that stuff,
>raise your hand on the list, and we can probably get you access).
>
>An unanswered question is when or if the community ever expects me to
>_remove_ blogs from the planet.  It's not clear. I don't want to remove
>them if people are going to see it as an abuse of power or something, as
>some have indicated they would. (Most could probably care less either way).
>
>Other blogs people have suggested I remove from the code4lib aggregator,
>as consisting of mainly nontopical content for code4lib, are Mark
>Lindner and Meredith Farkas.  I guess say so if you'd like to LEAVE
>those on the aggregator, and if nobody says so, I'll leave them. If
>someone does say so... then I have no idea. :)
>
>Jonathan
>
>Jodi Schneider wrote:
>> I'm a big fan of the planet aggregator. Normally I make suggestions on
>> #code4lib. However, Jonathan Rochkind asked me to bring them up onlist
>> this time. (Who besides Jonathan is managing the planet at present?)
>>
>> (1) Bjorn Tipling suggested removing him, since he's going to focus on
>> politics:
>> "Some of the places where my blog is being tracked, such as code4lib and
>> netlamers, might want to look at whether or not they want to continue to
>> follow me."
>> http://bjorn.tipling.com/2008/05/17/blog-pundits/
>> Can we remove his blog please?
>>
>> (2) I'd really like a changelog--which might further justify
>> adding/dropping blogs without discussion.
>>
>> (3) Could we please label blogs consistently? For individuals, we have
>> mostly lastname, firstname with a few firstname lastname. Either way
>> works. But the mixture rankles (sad, I know!).
>>
>> Thanks!
>>
>> -Jodi
>>
>> Jodi Schneider
>> Science Library Specialist
>> Amherst College
>> 413-542-2076
>>
>>
>
>--
>Jonathan Rochkind
>Digital Services Software Engineer
>The Sheridan Libraries
>Johns Hopkins University
>410.516.8886
>rochkind (at) jhu.edu

Re: [CODE4LIB] planet.code4lib.org -- 3 suggestions

2008-05-21 Thread Jonathan Gorman

I guess I'd be fine removing Bjorn, as he's changing the focus of his blog and 
he's actually suggested we remove him.  I don't necessarily agree with removing 
either Meredith's or Mark's blog.  Sure, those two might have more personal 
content, but there are certainly others on there that have done that as well.  
A better solution seems to be just truncating the posts (or offering both a 
"full" and a truncated feed).  If there's a particular person you don't like, 
then filter them out with Yahoo Pipes or something similar.

A change log might be useful as well if it's not too much of a hassle to 
maintain.

What software does the planet run on?  Some sort of Drupal module?

Jon Gorman


Re: [CODE4LIB] free movie cover images?

2008-05-19 Thread Jonathan Gorman

 Original message 
>Date: Mon, 19 May 2008 17:02:06 -0500
>From: Peter Keane <[EMAIL PROTECTED]>
>Subject: Re: [CODE4LIB] free movie cover images?
>To: CODE4LIB@LISTSERV.ND.EDU
>
>On Mon, May 19, 2008 at 05:29:38PM -0400, Jonathan Rochkind wrote:
>> But I would agree that it is our duty as libraries to be pushing the
>> boundaries of these grey areas in a world where much of copyright _is_
>> currently a gray area, not automatically taking the most expansive
>> perspective with regard to copyright holders' rights, out of fear.  Not
>> just "society", but I think we have a special duty as libraries whose
>> mission involves expanding access to information. [That is, all public
>> and most academic libraries; a private or corporate library may not share
>> this mission and this duty.].
>>
>
>I would say it's something of a moral obligation (in the academic/public
>side) to go ahead and use the thumbnails ('cause it's the right
>thing to do AND models good information practices), in the face of
>fear/uncertainty/doubt.
>

I've argued something similar, but I think thumbnails are a low ambition.  If 
we're going to do this, I don't want to just grab a few of other people's 
scattered images and squirrel them away.

I want to see us digitizing our own stuff, using the full text to do our own 
indexing, and generating images of covers, title pages, and indexes for our 
workflow.  Then storing the rest as appropriate.  (I'll reserve judgement on 
how freely we should share it.)  Certainly we should be analysing the heck out 
of the full text to try to extract every nugget of data, and also looking for 
relations between books.  (Never mind just saying how they are related; what if 
we also tried to find similar writing styles, etc.?)  All of this is derivable 
from the full text.

>I wonder what the effect of this very thread will be on folks wondering
>if they should or shouldn't use thumbnails? Honestly, folks, this is our
>profession. (Where's Larry Lessig when you need him... ;-).

Most likely, absolutely none.  Sorry to be depressing, but I don't know how many 
people on this list really disagree with you; sadly, many of us lack the means 
and resources to do this.  However, I remain somewhat optimistic that we might 
soon start getting some images through better and less entangled sources than 
Amazon, thanks to the efforts of the OCA.

It's late.  Perhaps I'll be nostalgic and reread a favorite Sandman story 
about how enough cats dreaming would allow them to rule the world.  Think the 
same happens with visionaries?

Jon Gorman

Re: [CODE4LIB] free movie cover images?

2008-05-19 Thread Jonathan Gorman

>Actually, this is one of a number of links out there (esp. regarding the
>Arriba Soft case) suggesting that fair use, regarding thumbnail images,
>is quite often the applicable standard, the key (often) being that there
>is no "Effect of the use upon the potential market for or value of the
>copyrighted work".
>

I'm not trying to argue against the heart of your argument, just perhaps 
suggesting that we should be careful about terminology.  Fair use of something 
is not the same as saying the source is not copyrightable.  It's an important 
distinction to keep.  It is, after all, how licenses are enforced.  It's fair 
use for me to cite a passage.  I may even be able to reproduce the whole work 
under the conditions of a license.  This does not immediately propagate 
downstream to those who might copy my work.

>It's just depressing to me that the society, in the shadow of DMCA, RIAA
>action, etc. has essentially cowered in the face of these copyright
>issues, and I would go so far as to say that we librarians often abrogate
>our duty. I mean it is our job to *create* access to information
>not *prevent* it. Right? Geez, nothing like the free flow of information
>getting privatized. My aim is just to promote the idea of assuming that
>"information wants to be free" and proceed under that assumption unless
>there is clear and obvious proof otherwise.
>

I agree that many have let fear blind them or make them hesitate to provide 
certain services.  But at the same time there still is a lot that is undefined 
or tenuous.  It doesn't help to cite material that talks about creating 
thumbnails as fair use and then take another step and claim thumbnails are not 
copyrightable.  If you're going to make that next step, it would be nice to see 
material supporting it.

I agree we have a responsibility to our users, and I do feel that universities 
and other academic organizations should be fighting, even on legal fronts, to 
protect reasonable use (such as using thumbnails and automatically derived 
metadata).  However, as reasonable individuals we also have to evaluate the 
best ways to do this, the likelihood of legal ramifications, and their costs.  
If you believe in this, there are plenty of ways to act besides possible civil 
disobedience.  You can lobby Congress and the Copyright Office to establish 
laws and policies to protect this use.  Certainly, risking an institution's 
financial and legal status in this day and age should be carefully considered.

>Looked at another way: a thumbnail is just a bit of "visual" metadata,
>and you cannot copyright metadata.

At what point does something become a thumbnail?  50%?  75%? 100% but with poor 
resolution?  If it's cropped?   Missing colors?  I would also point out there 
are many who include the work itself as metadata, in which case it most 
certainly falls under copyright.  Certain visualizations may also fall into a 
gray area, regardless of whether the source is text or an image.

The cases seem to me to point to two important points.  First, Google is not 
responsible when it creates fair use thumbnails of images from someone else who 
has already infringed.  The infringement only applies to the original person who 
copied the image.  I'm not sure how this compares to other case law at this 
point.  I'm also not sure how it would deal with a service that refused to take 
down the derived thumbnail if the original image is illegal or violates 
copyright.

Second, if someone used Google's service to profit on their own end (took the 
images and then sold them) the judge might regard that as not fair use.

But again, I'm not a lawyer.  So I'm going to stop thinking about this 
particular issue right now.

Jon Gorman

Re: [CODE4LIB] free movie cover images?

2008-05-19 Thread Jonathan Gorman

>Another link about thumbnail images not being copyright-able:
>
>http://www.publicknowledge.org/node/947
>

I don't think this particular case is saying thumbnail images are not 
copyrightable, but rather that the creation of them is fair use.  I haven't 
read it closely, but if you look at the case and some of the description, it's 
talking about the thumbnail images created by Google itself to represent 
another source.  The key words here are "highly transformative".  Google is 
transforming an existing work and creating a derived work for a non-competitive 
purpose (as the judge ruled).  In much the same way, indexes and the like have 
traditionally been protected by copyright.

Just copying another source's thumbnail does not seem to be quite the same.  
After all, you are then not doing anything to the thumbnail, just copying it.  
How do you know that what you are then printing/publishing counts as 
transformative work, or that the "new work" derived from the existing one is 
not in itself copyrighted to the person who originally transformed it?  For 
example, were I to compose a play and then you made a series of paintings 
inspired by it, the result is different enough that you would probably not be 
infringing on my copyright.  That doesn't mean your paintings are now not under 
copyright.

Of course, I'm not a lawyer, but it does seem a leap to make off of what I have 
read in these documents.

Jon Gorman

 Original message 
>Date: Mon, 19 May 2008 15:23:49 -0500
>From: Peter Keane <[EMAIL PROTECTED]>
>Subject: Re: [CODE4LIB] free movie cover images?
>To: CODE4LIB@LISTSERV.ND.EDU
>
>Hi All-
>
>Perhaps for some reason these precedents do not apply here (although I
>doubt it) -- I am no lawyer. But I DO think that it is our responsibility
>as librarians and educators to *not* shy away from cases where copyright
>issues are not clear and obvious. It is our job to provide the highest
>possible service to our users, not to be timid in the face of false
>and/or faulty claims about copyright infringement.
>
>--peter keane
>
>On Mon, May 19, 2008 at 03:00:08PM -0500, Charles Ledvina wrote:
>> I know my suggestion is probably filled with copyright infringements but
>> you could use your Amazon API to get links to all of their images. Your
>> url would look something like this:
>>
>> http://webservices.amazon.com/onca/xml?Service=AWSECommerceService&SubscriptionId=[your_api_code]&Operation=ItemSearch&SearchIndex=Blended&Keywords=[upc_code]&ResponseGroup=Images
>>
>> Using the 024 will usually generate a unique result and then you can
>> choose from a variety of image sizes.  I have a kind of API of an API
>> service running at chopac.org as an example.  Simply enter a UPC or ISBN
>> and you get back an xml file with cover link and product link. Small,
>> medium (default) and large images are available by adding s, m or l at
>> the end of the UPC.
>>
>> Examples:
>>
>> Simpsons Movie-- http://chopac.org/cgi-bin/tools/upc2image.pl?024543484271
>> Simpsons Movie (small image)--
>> http://chopac.org/cgi-bin/tools/upc2image.pl?024543484271s
>>
>>
>> The product link is supplied to somewhat fulfill Amazon's requirements
>> to link to their items.
>>
>> Later,
>> Charles Ledvina
>> infosoup.org
>> chopac.org
>>
>>
>>
>>
>> Ken Irwin wrote:
>>> Hi folks,
>>>
>>> With some limitations, the Google Books API allows folks to access book
>>> covers for free. (How's that working out? Anyone having luck with it?)
>>> -- what about movie/DVD/VHS covers? Are there any free sources for those
>>> images?
>>>
>>> I'd like to work up a virtual-browsing interface for our library's
>>> pretty small collection of feature films, and I'd love to include
>>> covers. Any ideas on how I might get them? Anyone else doing this?
>>>
>>> Thanks
>>> Ken
>>>
>>> --
>>> Ken Irwin
>>> Reference Librarian
>>> Thomas Library, Wittenberg University

Re: [CODE4LIB] coverage of google book viewability API

2008-05-07 Thread Jonathan Gorman

>
>The Google API returns sufficient information to NOT point people to
>books with no preview--it tells if full view, partial view, or no view
>is provided for a given book. I agree that our software that uses this
>API ought to either suppress no-preview books entirely, or present them
>in a particular way that makes it clear that they're no preview (if
>there's any point to this at all).

What I've done in my very rough demo is to say something like
"Full text available", "Parts of text available" and "Additional information 
available".  (Not exactly, but it's not worth the time to look it up).  The 
general consensus around here seems to be even the minimal records tend to have 
useful information, more so than if Google was just repeating the catalog 
entry.  The links don't even show up at all if there is no Google information.
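
The three-way labelling described above could be sketched roughly as follows. This is only an illustration: the values full, partial and noview are an assumption about what the viewability API reports for each book, preview_label is a hypothetical helper name, and the label text is the approximate wording from the demo.

```shell
#!/bin/sh
# Map a viewability value to the link text shown in the catalog.
# Assumption: the API reports one of "full", "partial" or "noview".
preview_label() {
    case "$1" in
        full)    echo "Full text available" ;;
        partial) echo "Parts of text available" ;;
        noview)  echo "Additional information available" ;;
        *)       : ;;  # no Google information: print nothing, show no link
    esac
}
```

An empty result is the signal to leave the link out entirely, which matches the behaviour where nothing shows up when Google has no record.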

Jon Gorman

Re: [CODE4LIB] Exporting RSS Source from a Blog

2008-05-06 Thread Jonathan Gorman

Have a link to your server?

Hopefully your system has a pretty flexible feed system and lets you specify 
date ranges or the like.

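Then just do something like
wget url_to_rss
That will create the rss file.

Similarly, lynx -source url_to_rss > some_file will do the same (-source 
rather than -dump, since -dump renders the page as text instead of giving you 
the raw source).

(You'll want to be careful about overwriting files.)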
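
A minimal sketch of the wget approach that avoids the overwriting problem, assuming wget is installed. The function names safe_name and fetch_feed are made up for this example, and url_to_rss remains a placeholder for whatever feed URL your system exposes.

```shell
#!/bin/sh
# Build a distinct local file name from a feed URL, so that several
# feeds that all end in ".../feed" do not overwrite one another.
safe_name() {
    name=$(printf '%s\n' "$1" | sed -e 's|^[a-z]*://||' -e 's|[^A-Za-z0-9._-]|_|g')
    printf '%s.xml\n' "$name"
}

# Download one feed into directory $2 (default: current directory),
# skipping it if a copy is already there.
fetch_feed() {
    url=$1
    dir=${2:-.}
    mkdir -p "$dir"
    out="$dir/$(safe_name "$url")"
    if [ -e "$out" ]; then
        echo "skipping $url: $out already exists" >&2
        return 0
    fi
    # -q: quiet, -O: write to the named file instead of wget's default name.
    # Remove the partial file if the download fails.
    wget -q -O "$out" "$url" || { rm -f "$out"; return 1; }
}

# Usage (url_to_rss is a placeholder):
#   fetch_feed url_to_rss ./feeds
```

Deriving the file name from the whole URL is a design choice for this sketch; a date stamp or a counter would work just as well.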

The other thing you may have to do is crawl/filter your website to get the 
links to each post, then request the rss2 version of that.

I don't have any actual samples now, but I might be able to give you some after 
work.

I can create a hypothetical scenario, though.  Let's say there's a well-known 
blog system that will always give you an rss version of a document if you 
merely add feed to the end of the url.  It also contains links on the front 
page to monthly archives.

In a Unixish environment I might do something like
lynx -dump url_to_front_page | grep http | sed -e 's/^[^h]*h/h/' -e 's/ *$/\/feed/' > temp.txt
(edit temp.txt to include just the monthly archives)
then cat temp.txt | xargs wget -r -l 1
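
The link-rewriting step of that pipeline can be sketched as a small filter. This is a variation under stated assumptions, not the exact command above: it assumes the numbered "References" list that lynx -dump prints at the end of its output (lines like "   1. http://..."), it drops a trailing slash before appending /feed so you don't end up with //feed, and links_to_feeds is a made-up name.

```shell
#!/bin/sh
# Read "lynx -dump" output on stdin and print one feed URL per line.
# Only the numbered reference lines are kept; body text is ignored.
links_to_feeds() {
    grep -E '^ *[0-9]+\. http' |
        sed -e 's/^ *[0-9]*\. *//' \
            -e 's/ *$//' \
            -e 's|/*$|/feed|'
}

# Usage (url_to_front_page is a placeholder for the blog's front page):
#   lynx -dump url_to_front_page | links_to_feeds > temp.txt
```

You would still want to edit the resulting file by hand to keep only the monthly archive links before handing it to wget.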

Of course, without knowing how much you can actually get out of your system as 
rss it's a gamble.

(Sorry, this probably isn't all that helpful, but if you give your actual url I 
might be able to give something more meaningful tonight.)

Jon Gorman


 Original message 
>Date: Tue, 6 May 2008 11:23:29 -0400
>From: "The Ford Library at Fuqua" <[EMAIL PROTECTED]>
>Subject: Re: [CODE4LIB] Exporting RSS Source from a Blog
>To: [EMAIL PROTECTED]
>Cc: CODE4LIB@listserv.nd.edu
>
>   Hi John,
>
>   Thanks for the quick response. I tried accessing the
>   feed with lynx to no avail. It's been quite a while
>   since I worked w/ lynx. I'll take a quick look at
>   wget as well and see if it's deployed here and
>   usable.
>
>   Can you spare a few moments to send an example of
>   your "quick and dirty" method?
>
>   Feel free to do this on- or off-list if you have the
>   time.
>   Thanks again!
>   --
>   Carlton Brown
>   Associate Director & IT Services Manager
>   Ford Library - Fuqua School of Business
>   Duke University
>
>   On Tue, May 6, 2008 at 10:11 AM, Jonathan Gorman
>   <[EMAIL PROTECTED]> wrote:
>
> The quick and dirty way I've done something similar in the past is to
> download individual rss pages by running something like wget. Other
> command-line browsers/spiders could do something similar.
>
> After all, the mechanisms for pulling rss feeds are really at the base
> the same mechanisms for pulling web pages of any type.
>
> Jon Gorman
>
>  Original message 
> >Date: Tue, 6 May 2008 10:01:48 -0400
> >From: The Ford Library at Fuqua <[EMAIL PROTECTED]>
> >Subject: [CODE4LIB] Exporting RSS Source from a Blog
> >To: CODE4LIB@LISTSERV.ND.EDU
> >
> > Hello All,
> >
> >We're attempting to migrate our java-based Blojsom blog to the more
> >user-friendly WordPress software. WordPress has built import wizards for
> >many popular blog platforms; but there isn't one for Blojsom which is
> >different from *bloxsom* which does have an import wizard. Blojsom does have
> >an export blog plugin; but the data is not in RSS 2.0 and would require more
> >Perl than I know to convert.
> >
> >WP can import data in RSS 2.0, and I can grab the RSS source of some posts
> >by simply viewing/copying the source in my browser. But I need to migrate
> >more than the limited number of posts that can be extracted by viewing the
> >RSS source in the browser.
> >
> >Does anyone know of a tool or hack to extract - export the entire contents,
> >or a large fixed number of posts from a blog as RSS 2.0? Google Reader and
> >some others will grab a large number of posts; but I can't view the RSS
> >source.
> >
> >I've done considerable googling already and the few scripts/tools I've
> >located call for PHP or Ruby -- neither of which are deployed in our
> >environment.
> >
> >Thanks in advance for any tips or pointers.
> >
> >--
> >Carlton Brown
> >Associate Director & IT Services Manager
> >Ford Library - Fuqua School of Business
> >Duke University

Re: [CODE4LIB] Exporting RSS Source from a Blog

2008-05-06 Thread Jonathan Gorman

The quick and dirty way I've done something similar in the past is to download 
individual rss pages by running something like wget. Other command-line 
browsers/spiders could do something similar.

After all, the mechanisms for pulling rss feeds are really at the base the same 
mechanisms for pulling web pages of any type.

Jon Gorman

 Original message 
>Date: Tue, 6 May 2008 10:01:48 -0400
>From: The Ford Library at Fuqua <[EMAIL PROTECTED]>
>Subject: [CODE4LIB] Exporting RSS Source from a Blog
>To: CODE4LIB@LISTSERV.ND.EDU
>
> Hello All,
>
>We're attempting to migrate our java-based Blojsom blog to the more
>user-friendly WordPress software. WordPress has built import wizards for
>many popular blog platforms; but there isn't one for Blojsom which is
>different from *bloxsom* which does have an import wizard. Blojsom does have
>an export blog plugin; but the data is not in RSS 2.0 and would require more
>Perl than I know to convert.
>
>WP can import data in RSS 2.0, and I can grab the RSS source of some posts
>by simply viewing/copying the source in my browser. But I need to migrate
>more than the limited number of posts that can be extracted by viewing the
>RSS source in the browser.
>
>Does anyone know of a tool or hack to extract - export the entire contents,
>or a large fixed number of posts from a blog as RSS 2.0? Google Reader and
>some others will grab a large number of posts; but I can't view the RSS
>source.
>
>I've done considerable googling already and the few scripts/tools I've
>located call for PHP or Ruby -- neither of which are deployed in our
>environment.
>
>Thanks in advance for any tips or pointers.
>
>--
>Carlton Brown
>Associate Director & IT Services Manager
>Ford Library - Fuqua School of Business
>Duke University

Re: [CODE4LIB] Announcement: Open Source In Libraries Website

2008-03-27 Thread Jonathan Gorman

I don't suppose we could see a post on some of the new FUD, could we? *nudge, 
nudge*.  I hear some things from my current position, but I haven't really heard 
anything new or stronger in the past few years.  If you've heard some more, I'd 
be curious what it is.  (In no small part so I'm somewhat prepared for it if 
similar rhetoric is used to justify unwise decisions.)

Jon Gorman

 Original message 
>Date: Thu, 27 Mar 2008 15:20:44 -0400
>From: "K.G. Schneider" <[EMAIL PROTECTED]>
>Subject: Re: [CODE4LIB] Announcement: Open Source In Libraries Website
>To: CODE4LIB@LISTSERV.ND.EDU
>
>Keep in mind that anti-OSS FUD has reached new levels, now that vendors
>see that it is gaining traction. So OSS has to be presented
>strategically and in context of the dumb statements I hear, which
>include all the stereotypes and b.s. I discussed in my 2007 c4l keynote
>but now go beyond it.
>
>K.G. Schneider

Re: [CODE4LIB] poll of javascript libraries

2008-03-26 Thread Jonathan Gorman

Cool.  Thanks for doing this.

Jon

 Original message 
>Date: Wed, 26 Mar 2008 11:58:49 -0400
>From: Keith Jenkins <[EMAIL PROTECTED]>
>Subject: Re: [CODE4LIB] poll of javascript libraries
>To: CODE4LIB@LISTSERV.ND.EDU
>
>As of right now, the results of the informal poll of Javascript
>libraries stands as follows:
>
>jQuery = 23 votes
>Prototype = 17 votes
>Scriptaculous = 10 votes
>YUI = 9 votes
>ExtJS = 5 votes
>Dojo = 2 votes
>MooTools = 2 votes
>MochiKit = 1 votes
>LowPro = 0 votes
>
>Note that these poll results are completely unscientific and
>necessarily incomplete (superdelegates have not been counted yet...),
>but hopefully not entirely uninformative.
>
>If you still want to add your input, the poll is here:
>   http://doodle.ch/sr5z4vusiwi4yssi
>
>Maybe someone wants to write an article for the
>code4lib journal, or present at next year's conference about their
>favorite javascript library...
>
>Cheers,
>Keith

Re: [CODE4LIB] Free covers from Google

2008-03-17 Thread Jonathan Gorman

 Original message 
>Date: Mon, 17 Mar 2008 11:13:58 -0400
>From: Tim Spalding <[EMAIL PROTECTED]>
>Subject: Re: [CODE4LIB] Free covers from Google
>To: CODE4LIB@LISTSERV.ND.EDU
>
>>  limits. I don't think it's a strict hits-per-day, I think it's heuristic
>>  software meant to stop exactly what we'd be trying to do, server-side
>>  machine-based access.
>
>Aren't we still talking about covers? I see *no* reason to go
>server-side on that. Browser-side gets you what you want—covers from
>Google—without the risk they'll shut you down over overuse.


I could see one reason to do cover images server-side.  Say a library has a 
"new titles" list.  These books (and hence the images) are going to be the same 
for quite a while.  It might make sense to optimize by downloading said images 
once and caching them on a local server.  Otherwise, every time someone hits 
the new titles list they'll have to wait for Google to respond.  
Not sure how much of an advantage it really would be to host them server-side.
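
A rough sketch of that download-once-and-cache idea, assuming wget is available and that the cover URL for each title is already known. The names CACHE_DIR, MAX_AGE_DAYS, is_fresh and cached_cover are invented for this example, and whether this kind of server-side access is permitted is a separate question from whether it is faster.

```shell
#!/bin/sh
CACHE_DIR=${CACHE_DIR:-./cover-cache}
MAX_AGE_DAYS=${MAX_AGE_DAYS:-7}

# True when a cached copy exists, is non-empty, and was modified
# no more than MAX_AGE_DAYS days ago.
is_fresh() {
    [ -s "$1" ] && [ -z "$(find "$1" -mtime +"$MAX_AGE_DAYS")" ]
}

# Print the local path of the cover for an ISBN, downloading it only
# when there is no fresh copy in the cache.
cached_cover() {
    isbn=$1
    url=$2
    file="$CACHE_DIR/$isbn.jpg"
    mkdir -p "$CACHE_DIR"
    if ! is_fresh "$file"; then
        # Download to a temporary name so a failed fetch never
        # replaces an older copy that is still usable.
        if wget -q -O "$file.tmp" "$url"; then
            mv "$file.tmp" "$file"
        else
            rm -f "$file.tmp"
        fi
    fi
    [ -s "$file" ] && printf '%s\n' "$file"
}

# Usage (both arguments are placeholders):
#   cached_cover some_isbn url_to_cover_image
```

With something like this, the new titles page only waits on the remote server the first time a cover is requested or after the cached copy has aged out.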

Jon Gorman

[CODE4LIB] Code4Lib 2008: Haunted Tour Gave Up the Ghost

2008-02-19 Thread Jonathan Gorman

Hi all,

The mob has most emphatically not spoken out about the tour or registered 
for it.  As such, it's been canceled.  If you're one of the few, few folks 
who signed up, sorry.  I'm sure you'll find some fun people to hang out with at 
the conference.

I thought I would let people know.

Jon Gorman

[CODE4LIB] Code4Lib 2008: Haunted Tours, Breakout sessions, Signups for dinners, and reminders

2008-02-12 Thread Jonathan Gorman

Hi all,

I thought I would send out another update on some of the social activities.
Hopefully most people got the email recently about a Haunted Tour. If you
didn't, and are interested I've included the text at the bottom of this email.

We have only had a trickle of signups for the tour so far and if we don't get
more conference services may end up cancelling it. So try to sign up in the
next few days if you're interested.

Some may remember earlier I was talking about the Shanghai Tunnel Tour
(http://cgsstore.tripod.com/id18.html/index.html) and that this isn't quite the
same one. There was a miscommunication somewhere along the process and we
ended up with this tour. If anyone feels like they would actually go on that
one but not the Haunted Tour let me know.

If you prefer eating and drinking to walking though, don't feel obliged to sign
up ;).

Meanwhile, I'll be compiling a list of dinner places that have been posted at
various spots and we'll try to start setting up ways for individuals and groups
to let people know where they are planning on eating dinner. (It's optional,
but it could help us from flooding a place or at least not be surprised when
it's full). We'll hopefully have those up soon.

I've been asked to remind people about the breakout sessions. For those who
haven't attended previous years, breakout sessions are a pretty loose block of
time where a group may gather for a more involved presentation, a group
discussion, or create some piece of software. Want to take on Casey Durfee and
do a whole ILS in 250 lines or less? Want to talk about Library 3.2? See who
can gather the most MARC records in the shortest amount of time? Discuss the
impact of archiving of valuable digital historical materials such as
Breakout (http://en.wikipedia.org/wiki/Breakout)? Suggest them on the breakout
signup sheet at http://code4lib.org/conference/2008/breakout.

As a reminder, we've got some interesting lists for social activities created
by volunteers:

Things to do, places to go
http://groups.google.com/group/code4libcon/web/portland-in-late-february

Some information gathered about size of certain places
http://groups.google.com/group/code4libcon/web/possible-code4lib-dinner-locations

A map of interesting spots
http://maps.google.com/maps/ms?hl=en&ie=UTF8&msa=0&msid=107913207927802716313.0004447d18ac57a8c07d8&z=12&om=0

Till next time,

Jon Gorman

== About Haunted Tour ==
Interested in seeing a different side of Portland? Let off some
steam after a long day at the conference by going on a spooky walking
tour! Recognized by USA Today, The Los Angeles Times, the Wall Street
Journal, and Fox TV as the "Best City Tour".

Wednesday February 27 at 6:30pm.

For a complete description, visit the website at
http://www.portlandwalkingtours.com/tours/bizarre.htm

There are only 25 spots available, cost is $20 per person. Please
sign up using our registration system, payment will only be accepted
by Visa, MasterCard or Discover.

https://secure.oregonstate.edu/ocs/register.php?event=290

[CODE4LIB] Social Activities for Code4Lib 2008

2008-02-01 Thread Jonathan Gorman

Hi all,

I thought I'd give a brief update on how social activities are going for
Code4Lib 2008. I'm posting to the general list because it doesn't seem
everyone's on the code4libcon list. (Of course, I believe that list is
primarily for volunteers and planning, so that's not surprising).

Right now Jeremy Frumkin and I are working with OSU's Conference Services to
try to set up a social activity and some signups. The signups will be for at
least the activity that we've planned and possibly to organize trips to
restaurants for dinner. The activity that we're going to try to organize is
the Shanghai Tunnel tours:
http://www.shanghaitunnels.info/. They're pretty flexible and were pretty
eager to work with us. We're still trying to see if we can't get some other
activities going, but with the limited funds and time we have at this point we
can't promise much.

There's also a happy hour at the hotel that we're working on making a social
event. We'll update on that as we get more information.

I'd strongly encourage anyone who has a social event or activity they want to
do to forge ahead and start organizing it. I'm willing to help, so send me an
email. Some people have already taken the initiative and started themselves,
as evidenced by the recent talk on code4libcon about a PGP signing party.

To help make your planning go smoother, several volunteers have put a lot of
work into a couple of pages. I'm afraid of listing them here in case I miss
someone. I'll try to do so later when I have a bit more time.
A huge thanks to those who did contribute.

A page of events, restaurants, and places to visit.
http://groups.google.com/group/code4libcon/web/portland-in-late-february

Some volunteers were working on seeing if it would be possible to try to do a
dinner. At this point it doesn't look too likely, but we can hopefully use
this information to organize signups for those who might want to go with a
group of new people to a particular restaurant.
http://groups.google.com/group/code4libcon/web/possible-code4lib-dinner-locations

Michael Giarlo started a google map to keep track of some places he was
interested in and he and a few other volunteers are busy adding even more
information. (Again, sorry, don't know all the volunteers).
http://maps.google.com/maps/ms?hl=en&ie=UTF8&msa=0&msid=107913207927802716313.0004447d18ac57a8c07d8&z=12&om=0

Jon Gorman

Feel free to email my gmail account as well, [EMAIL PROTECTED]

Re: [CODE4LIB] perl questions

2008-01-23 Thread Jonathan Gorman

Don't know.  I'm on both lists, as I imagine most people are.  I didn't pay 
much attention to the various threads to see which list they were on ;).

The perl4lib list doesn't get much traffic, that's for sure.

Jon Gorman

 Original message 
>Date: Wed, 23 Jan 2008 14:15:04 -0600
>From: "Doran, Michael D" <[EMAIL PROTECTED]>
>Subject: [CODE4LIB] perl questions
>To: CODE4LIB@listserv.nd.edu
>
>> Subject: [CODE4LIB] perl question
>> Sent: Tuesday, January 22, 2008 1:54 PM
>
>> Subject: [CODE4LIB] perl6
>> Sent: Monday, January 21, 2008 7:01 AM
>
>There *is* still a perl4lib list and these would have been relevant postings 
>[1].  Do the code connoisseurs on *this* list now consider perl4lib déclassé 
>or redundant for perl questions and discussions?  I'm not trying to dictate 
>where people post -- I'm just curious.
>
>Always the last one to know...
>-- Michael
>
>[1] The perl4lib page
>http://perl4lib.perl.org/
>
># Michael Doran, Systems Librarian
># University of Texas at Arlington
># 817-272-5326 office
># 817-688-1926 mobile
># [EMAIL PROTECTED]
># http://rocky.uta.edu/doran/

Re: [CODE4LIB] perl question

2008-01-22 Thread Jonathan Gorman

This may sound like a stupid question, but is the issue that the files aren't 
being generated at all?  Or are the files actually being generated, but 
something is wrong with your Apache configuration so that you can't access the 
webpage?

That's the first thing that pops into my head.  I haven't looked at the script 
too closely though.  I'd step through it step by step and see what files are 
getting created/destroyed.

Also, use
#!/usr/bin/perl -w
so that warnings are turned on.

Jon
 Original message 
>Date: Tue, 22 Jan 2008 14:53:56 -0500
>From: "Iglesias, Edward G. (Library)" <[EMAIL PROTECTED]>
>Subject: [CODE4LIB] perl question
>To: CODE4LIB@listserv.nd.edu
>
>Okay,  I should be able to solve this on my own but I can't.  I've had
>to move to a new web server and can't get an old script to work.  All it
>does is convert a fund activity report to a web page.  It worked fine on
>the old server.  I updated the directory paths and checked permissions.
>All are fine.  What's more I don't get an error.  It just works and
>returns nothing.  Any help would be appreciated.
>
>EI
>
>
>
>#!/usr/bin/perl
>## usage: cat inputfile | abiglobal.pl
>
>
>
>$i=0;
>while ($_ = ) {
>
>if(/(FUND ACTIVITY REPORT)/ ) {
>
>$input=$_;
>chop($input);
>$fundcode=substr($input,0,5);
>$fundcode=~ s/ //g;
>$input=~ tr/\"//;
>$input=~ tr/\'//;
>#   $input=~ tr/,/\t/;
>$input=~s/REPORT,/REPORT  /g;
>$input=~s/   /  /g;
>@in=split(/  /,$input);
>
>#   @fund=$in[0];
>
>if($i=="0"){
>$final=$in[3].".html";
>open (OUT,">>body.txt") || die "I am not able to write to file";
>open (HEAD,">>head.txt") || die "not able to open temp header";
>print HEAD "Fund Activity
>ReportFund Activity
>Reports:   $in[3]href='#$fundcode'>$fundcode \n";
>$i=99;
>}
>print HEAD "$fundcode \n";
>print OUT "Fund: $in[0]
>FundInfo: $in[1] $in[2]\n";
>
>}
>else {
>print OUT "$_\n";
>}
>
>
>}  end while stdin
>print HEAD "\n"; print OUT "\n";
>close OUT; close HEAD;
>
>#$outfile=~tr/ /_/;
>#$outfile=~tr/,/_/;
>print "$outfile\n";
>system "cat head.txt body.txt > /data/www/htdocs/fundlist/$final";
>system "rm head.txt";
>system "rm body.txt";
>
>opendir(DIR, "/public_html");
>open (OUT,"> /data/www/htdocs/fundlist/index.html") || die "I am
>not able to write to file"; print OUT ("Fund List
>Directory\n");
>print OUT ("Directory Listing\n");
>
>while($file = readdir(DIR) ) {
>
>print OUT ("$file\n");
>
>}
>print OUT ("\n");
>closedir(DIR);
>close (OUT);
>
>
>Edward Iglesias
>Systems Librarian
>Central Connecticut State University
>860.832.2082

Re: [CODE4LIB] CODE4LIB Web archive??

2007-11-23 Thread Jonathan Gorman

Quick google search turns up this one, haven't used it though ;)

http://serials.infomotions.com/code4lib/

Also, I think this is the official page and it has a couple of links

http://dewey.library.nd.edu/mailing-lists/code4lib/

Jon
 Original message 
>Date: Fri, 23 Nov 2007 13:31:35 -0600
>From: "Hahn, Harvey" <[EMAIL PROTECTED]>
>Subject: [CODE4LIB] CODE4LIB Web archive??
>To: CODE4LIB@listserv.nd.edu
>
>Is there a Web-accessible archive of CODE4LIB messages?  If so, what's
>the URL?  Thanks!
>
>Harvey
>
>--
>===
>Harvey E. Hahn, Manager, Technical Services Department
>Arlington Heights (Illinois) Memorial Library
>847/506-2644 - FX: 847/506-2650 - Email: hhahn(at)ahml(dot)info
>OML & Scripts web pages: http://www.ahml.info/oml/
>Personal web pages: http://users.anet.com/~packrat

Re: [CODE4LIB] Library Software Manifesto

2007-11-06 Thread Jonathan Gorman

 Original message 
>Date: Tue, 6 Nov 2007 14:16:05 -0500
>From: Tim McGeary <[EMAIL PROTECTED]>
>Subject: Re: [CODE4LIB] Library Software Manifesto
>To: CODE4LIB@listserv.nd.edu
>
>I think this depends entirely on what type of developer we are talking
>about.  Let's say it is a large ILS vendor who promises that their
>software will do all things for all types of library.  When a promised
>feature or a discovered bug that only applies to a small subset of their
>customer base (let's say academic or public or government) is found, the
>reason that is does not benefit a large enough community to put the
>expense is simply bogus.
>

At some point there simply isn't enough time/money etc.
I think the flaw here isn't in priorities, but in the vendor
being too broad in the first place.

The thrust of my point is more that it's bothersome when a
developer honestly decides that they cannot do the feature
with their given resources, and then a customer pulls some
strings and the manager tells them they have to get it done
anyhow.  Do this consistently and you start seeing things like
training, support, and internal organization slip as you have
developers waiting for the "next sign from above".  It's too
dangerous to try to stick to schedules because of the
likelihood of disruptions.  I've had to deal with similar
situations from several sides of the issue and it can be
extremely frustrating.

I'm not advocating that the number of people always be the
sole basis for adding a feature or fixing a bug.  I'd
advocate a more complicated algorithm, similar to ones I've
seen advocated by most software design books.

This would count a couple of factors:
* Number of people affected
* Severity of problem
* Resources required to solve problem
* Risk of not solving problem

I've seen some differentiate risk from severity, usually
taking a more business-like approach.


So, maybe there's issues with screen readers.  This affects a
very small group of users.  However, the impact on those
users is quite severe.  On top of that, it's likely an
indicator of bad design since other tools.  The risk of not
solving the problem is also quite high from a legal
standpoint.

Compare this with some feature request that might really only
apply to a small group, but has a workaround, even if
uncomfortable.  The severity is low, since they have an
existing workaround.  They're the only ones who want the
feature.  The cost of fixing it might be high, since it
requires some redesign.
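
The weighing described above can be sketched in a few lines of Python.
This is only an illustration: the 1-to-5 scales, the weights, and the
names Request and priority are all made up for the example and do not
come from any particular software design book.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """One bug or feature request, scored on hypothetical 1-5 scales."""
    name: str
    people_affected: int   # 1 = a handful of users, 5 = nearly everyone
    severity: int          # 1 = has a workaround, 5 = blocks the user
    cost: int              # 1 = trivial fix, 5 = needs a redesign
    risk_if_ignored: int   # 1 = none, 5 = e.g. legal exposure


def priority(r: Request) -> float:
    """Benefit of acting divided by cost; higher means do it sooner.

    The weights (severity and risk count double) are an assumption.
    """
    benefit = r.people_affected + 2 * r.severity + 2 * r.risk_if_ignored
    return benefit / r.cost


# The two cases from the message, with guessed scores.
screen_reader = Request("screen reader support", 1, 5, 3, 5)
niche_feature = Request("niche feature with workaround", 1, 1, 4, 1)

ranked = sorted([niche_feature, screen_reader], key=priority, reverse=True)
for r in ranked:
    print(f"{r.name}: {priority(r):.2f}")
```

With these guessed scores the screen reader problem ranks well above
the niche feature even though both affect few people, which is the
point of counting more than head count.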

>The end result is that type of library essentially sitting on a product
>for years because there is no commitment to improve their service in
>their future.  This is happening frequently with "new" products that are
>introduced (at least in my ILS community) which, while are sold as
>usable to all types of libraries, are clearly designed for one specific
>or their largest base in mind only.
>

This is a huge issue.  The library vendors are trying to be
too many things to too many people.  That's a deeper issue
than customer responsibilities and a failing of the vendor.
I just sent Roy some suggestions of vendor responsibilities,
and that would have been a good one to add.  (As would the
vendor's responsibility to be open about decisions on these
types of requests and about future software development plans.)

There's an excellent chapter on this exact phenomenon in
Alan Cooper's "The Inmates are Running The Asylum".

>A smaller development company or cooperative team is a bit different.
>Hopefully they have communicated their product specifically for what it
>does, and communicated their organizational size, strength, and focus so
>that the consumer understands that going in.  Large library software
>corporations should really be doing the same, but that doesn't happen.

Yeah, I think we're talking about the same thing here.  The
issue is with the communication process.  It's the
responsibility of the vendor to be open and clear in its
communication process.  The customer should respect this.  The
clear communication should hopefully also give the customer an
idea of what measures they have to take: devote time to a
workaround, try to revise their case for the fix, or simply
accept it.

Right now we're seeing the large library vendors having a host
of problems, including not taking a long-term look at
their software.  This is leading to software created largely
by political maneuvering and consensus.  That's not quite
the same as evaluating needed features and bugs.


Tim, the response I'm sending to the list is a bit different
from the one I sent to you earlier.  Some minor improvements
and hopefully clarified a bit more, but the general thrust is
the same.

Jon Gorman

Re: [CODE4LIB] Library Software Manifesto

2007-11-06 Thread Jonathan Gorman

Some ideas for vendors' responsibilities:

1) Care for the software as a whole. This means sometimes not giving your 
customers what they want in the short term to make a better product in the 
long term.

2) Care about the end user, despite whoever your customers are.  Frequently 
they're not the same.

3) Make it easy for customers to request features and report bugs.  Work with 
them to do so, since it appears extremely difficult for people to gauge what 
information is needed.  I suspect it has something to do with the 
non-physicality of software and poor mental models of software.  For some 
reason, people who would say "Well, the engine makes a put-put sound whenever I 
accelerate, especially on hills." have difficulty saying "Every time I try to 
send an email to these sites, I get this error message".  They just say things 
like "the email is broken" or "your website links aren't working".  They seem 
to have a difficult time just giving details about what isn't working.

Quick example: in an in-house web application there's a "report issues" link.  
It takes you to a form that also lets you know that there's some diagnostic 
information being included about the current state of the application to help 
the developers.  Frequently we can learn more from the diagnostic information 
than what the users supply.

4) Offer real information, not just marketing bull.  Can I call you up and ask 
questions about how many developers you have?  Projects they're working on?  
Timetables and goals?
This is more a pipe dream than anything; I've never seen any vendor offer this 
amount of information.  I can stand and watch my mechanic tinker, but I can't 
do the same with my software.

5) Keep your staff well-trained, review their work, and don't let things rot.  
Even if it means charging more money, because otherwise your company will 
become mediocre and depend on the inertia of existing customers more than 
expansion.

Well, that's enough for now, I got other work to do ;).

Jon Gorman

 Original message 
>Date: Tue, 6 Nov 2007 10:33:33 -0800
>From: Roy Tennant <[EMAIL PROTECTED]>
>Subject: Re: [CODE4LIB] Library Software Manifesto
>To: CODE4LIB@listserv.nd.edu
>
>On 11/6/07 10:27 AM, "Jonathan Gorman" <[EMAIL PROTECTED]> wrote:
>
>> How about an equivalent list from the vendor/software developer's 
>> perspective?
>> I think that would help balance the picture, but perhaps that's already in
>> your plans ;).
>
>Funny you should ask...I had originally intended to do this, but then I was
>wondering if it start to be redundant -- that is, would a number of points
>simply be restated from the vendor's viewpoint? But if there are unique
>points to make from that perspective it would be worthwhile to include them.
>This is an area where I consider myself even more ignorant than usual, so if
>those of you who work on that side of the fence would like to chime in with
>relevant manifesto points from the perspective of developers and vendors,
>I'm all ears. Thanks,
>Roy

Re: [CODE4LIB] Library Software Manifesto

2007-11-06 Thread Jonathan Gorman

Hmmm, I'm tempted to add something to responsibilities along the lines of "Seek 
to understand the priorities of the software developers".  Similar to 
"requesting features responsibly".  I can see an important difference.  
Sometimes it's important to let people know of a desired feature, even if in 
the end the vendor/developers decide resources can't be dedicated to fixing 
that bug or adding that feature.  Often it's difficult for "customers" to know 
the relative difficulty of adding a feature or doing a bug fix.  We don't want 
them not to request.  When they're requesting features for others, they do have 
a responsibility to document those desires (usability testing, interviews, etc).

However, sometimes fixing a bug or adding a particular feature will only have a 
small benefit to a small community, be simply too expensive given its 
priority, or may be in a part of the system that requires a more radical 
rewrite.  When these conclusions are reached it's helpful for the customer not 
to try to do a "run-around" or pull strings to get that feature added anyhow.  
Say, by calling their buddy the CEO and convincing him the developers are just 
avoiding work unnecessarily.

How about an equivalent list from the vendor/software developer's perspective?  
I think that would help balance the picture, but perhaps that's already in your 
plans ;).


Jon Gorman

 Original message 
>Date: Tue, 6 Nov 2007 10:07:45 -0800
>From: Roy Tennant <[EMAIL PROTECTED]>
>Subject: [CODE4LIB] Library Software Manifesto
>To: CODE4LIB@listserv.nd.edu
>
>I have a presentation coming up and I'm considering doing what I'm calling a
>"Library Software Manifesto". Some of the following may not be completely
>understandable on the face of it, and I would be explaining the meaning
>during the presentation, but this is what I have so far and I'd be
>interested in other ideas this group has or comments on this. Thanks,
>Roy
>
>Consumer Rights
>
>- I have a right to use what I buy
>- I have a right to the API if I've bought the product
>- I have a right to accurate, complete documentation
>- I have a right to my data
>- I have a right to not have simple things needlessly complicated
>
>Consumer Responsibilities
>
>- I have a responsibility to communicate my needs clearly and specifically
>- I have a responsibility to report reproducible bugs in a way as to
>facilitate reproducing it
>- I have a responsibility to report irreproducible bugs with as much detail
>as I can provide
>- I have a responsibility to request new features responsibly
>- I have a responsibility to view any adjustments to default settings
>critically

Re: [CODE4LIB] Cannot use windows search text inside .java .jsp or .bas files?

2007-07-20 Thread Jonathan Gorman

A better link than the one I just sent, still not great

http://support.microsoft.com/kb/309173.





 Original message 
>Date: Fri, 20 Jul 2007 16:18:12 -0400
>From: Joe Atzberger <[EMAIL PROTECTED]>
>Subject: Re: [CODE4LIB] Cannot use windows search text inside .java .jsp or 
>.bas files?
>To: CODE4LIB@listserv.nd.edu
>
>I can corroborate your experience here.  Search for filename "*.java" and
>get hits.  View one of those .java files, copy a string out of it, and go
>back to the search.  Search for filename "*.java" again, with contents
>matching the string you paste in.  Get zero hits.  Lame!
>
>Google Desktop search does the trick for me, however.  Try that instead.
>
>-- joe atzberger
>
>On 7/20/07, Jeffrey Barnett <[EMAIL PROTECTED]> wrote:
>>
>> Yes, I know I real programmers use grep ;-)
>> But I still want an explanation!
>>
>> Jeffrey Barnett wrote:
>> > Is this a well known feature or something I've managed to bring on
>> > myself through an excess of customization?
>> >
>> > Try this:  In the windows search tool specify
>> > All or part of file name: .java
>> > A word or phrase in the file: import
>> > Look in: 
>> >
>> > I've tried this on three different work stations and the result has
>> > always been:
>> >
>> > "Search Complete: No results to display"
>> >
>> > Same thing happens searching for common statements inside .jsp and .bas
>> > files.
>> >
>> > PS: I also have "search system files" enabled, so they are not being
>> > skipped for that reason
>>

Re: [CODE4LIB] Cannot use windows search text inside .java .jsp or .bas files?

2007-07-20 Thread Jonathan Gorman

There's a registry setting that controls which file extensions Windows will 
search inside.

Found reference to it in this thread, it's somewhere halfway down

http://forum.java.sun.com/thread.jspa?threadID=673595&messageID=3935013.

I assume it's to avoid searching in binary files.  Of course, grep will still 
say there's a match and warn you that it seems to be a binary file.
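
For anyone without grep on a Windows workstation, the same kind of
content search can be sketched in Python.  This is a rough stand-in
and not a description of how Windows search or grep work internally;
the function name find_in_files is made up for the example.

```python
from pathlib import Path


def find_in_files(root, suffix, needle):
    """Return paths under root ending in suffix whose contents contain needle."""
    hits = []
    for path in sorted(Path(root).rglob(f"*{suffix}")):
        if not path.is_file():
            continue
        # Read as bytes so an odd encoding or a binary file cannot raise
        # a decoding error; the needle is compared as UTF-8 bytes.
        if needle.encode("utf-8") in path.read_bytes():
            hits.append(path)
    return hits
```

Calling find_in_files(".", ".java", "import") would list every .java
file below the current directory that contains the word, which is the
search the original poster was attempting.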

Jon Gorman


 Original message 
>Date: Fri, 20 Jul 2007 16:18:12 -0400
>From: Joe Atzberger <[EMAIL PROTECTED]>
>Subject: Re: [CODE4LIB] Cannot use windows search text inside .java .jsp or 
>.bas files?
>To: CODE4LIB@listserv.nd.edu
>
>I can corroborate your experience here.  Search for filename "*.java" and
>get hits.  View one of those .java files, copy a string out of it, and go
>back to the search.  Search for filename "*.java" again, with contents
>matching the string you paste in.  Get zero hits.  Lame!
>
>Google Desktop search does the trick for me, however.  Try that instead.
>
>-- joe atzberger
>
>On 7/20/07, Jeffrey Barnett <[EMAIL PROTECTED]> wrote:
>>
>> Yes, I know I real programmers use grep ;-)
>> But I still want an explanation!
>>
>> Jeffrey Barnett wrote:
>> > Is this a well known feature or something I've managed to bring on
>> > myself through an excess of customization?
>> >
>> > Try this:  In the windows search tool specify
>> > All or part of file name: .java
>> > A word or phrase in the file: import
>> > Look in: 
>> >
>> > I've tried this on three different work stations and the result has
>> > always been:
>> >
>> > "Search Complete: No results to display"
>> >
>> > Same thing happens searching for common statements inside .jsp and .bas
>> > files.
>> >
>> > PS: I also have "search system files" enabled, so they are not being
>> > skipped for that reason
>>

Re: [CODE4LIB] Code4Lib listserv archives [munging]

2007-07-18 Thread Jonathan Gorman

 Original message 
>Date: Wed, 18 Jul 2007 14:46:48 -0400
>From: Eric Lease Morgan <[EMAIL PROTECTED]>
>Subject: Re: [CODE4LIB] Code4Lib listserv archives [munging]
>To: CODE4LIB@listserv.nd.edu
>
>On Jul 18, 2007, at 1:44 PM, Eric Lease Morgan wrote:
>
>>> I say 'yes', as long as the archives don't expose email addresses
>>> in a
>>> format suitable for harvesting.
>>
>> Hmmm... I do not know whether or not the mailing list software
>> (LISTSERV) supports the munging of addresses, nor do I know whether
>> or not the archives will even be crawlable by robots. I will check.
>
>
>According to our technical support folks here at Notre Dame, the
>mailing list archives are not suppose to be crawlable by things like
>Google, etc, but just as importantly, according to the same people,
>it is not possible to munge email addresses in archives so they are
>not harvestable/readable.
>

It is not possible to do that?  That's odd, I thought LISTSERV had an option 
for that.  Of course, finding it in their documentation may not be an easy 
exercise.

I guess if no email munging is there I'm a little torn.  On one hand, I'm not 
sure if having my email address a little less visible will help much with spam.

I guess my vote is still yes.  Meanwhile I guess I'll start working with 
filters ;).
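
For reference, the kind of munging being discussed can be sketched as
a small post-processing step over archive text.  This is not a
LISTSERV feature or setting; the function name munge and the
(at)/(dot) output style are assumptions made for the example, and the
pattern is deliberately loose.

```python
import re

# Deliberately loose pattern: good enough for typical archive text,
# not a full RFC 5322 address parser.
EMAIL = re.compile(r"\b([A-Za-z0-9._%+-]+)@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)\b")


def munge(text):
    """Rewrite user@host.tld as user(at)host(dot)tld."""
    def _rewrite(match):
        user, host = match.group(1), match.group(2)
        return f"{user}(at){host.replace('.', '(dot)')}"
    return EMAIL.sub(_rewrite, text)
```

A determined harvester can undo this kind of rewrite, so at best it
filters out the laziest crawlers.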

Jon Gorman

Re: [CODE4LIB] Code4Lib listserv archives

2007-07-18 Thread Jonathan Gorman

Open your archive and let the light shine in.


 Original message 
>Date: Wed, 18 Jul 2007 13:14:52 -0400
>From: Eric Lease Morgan <[EMAIL PROTECTED]>
>Subject: Re: [CODE4LIB] Code4Lib listserv archives
>To: CODE4LIB@listserv.nd.edu
>
>On Jul 18, 2007, at 11:52 AM, Jonathan Rochkind wrote:
>
>> At present, the Code4Lib listserv archives at:
>> http://listserv.nd.edu/cgi-bin/wa?A0=code4lib
>>
>> Require one to be a subscriber in order to view.
>>
>> Can this be changed?
>
>
>I, as maintainer of the mailing list, have no problem allowing non-
>subscribers to view the "official" archives. If, by noon tomorrow, I
>get more "yes" votes as opposed to "no" votes regarding this issue,
>then I will change things accordingly. (I'm going on vacation as of
>tomorrow afternoon.) Feel free to send your votes to me or the list.
>
>--
>Eric Lease Morgan <[EMAIL PROTECTED]>
>University Libraries of Notre Dame
>
>(574) 631-8604

Re: [CODE4LIB] Code4Lib journal idea revival?

2007-04-11 Thread Jonathan Gorman

 Original message 
>Date: Wed, 11 Apr 2007 11:13:58 -0400
>From: Ryan Eby <[EMAIL PROTECTED]>
>Subject: Re: [CODE4LIB] Code4Lib journal idea revival?
>To: CODE4LIB@listserv.nd.edu
>
>I think there was also a plan for an anthology that could be wrapped
>into the first few issues. I'm not sure if that project fizzled or not
>either. The woes of a volunteer force with little free time.
>

Do you mean the anthology of blog postings?

In either case, I'd be interested in working on it, although I'm a bit nervous 
that I wouldn't have much time for it either or wouldn't be able to keep up 
with it.  It does sound like an interesting project.

>I'm still interested in the idea and I could probably set up an
>installation on code4lib in the near future if people are interested
>and approve. journal.code4lib.org I think would be appropriate.


That would rock.

Re: [CODE4LIB] Everything okay in Georgia

2007-03-02 Thread Jonathan Gorman

It rained pretty hard, but I think the worst that happened is a few of us got a 
bit soaked.  I haven't heard anything yet.


Jon Gorman

 Original message 
>Date: Fri, 2 Mar 2007 07:12:20 -0500
>From: Peter Murray <[EMAIL PROTECTED]>
>Subject: [CODE4LIB] Everything okay in Georgia
>To: CODE4LIB@listserv.nd.edu
>
>-BEGIN PGP SIGNED MESSAGE-
>Hash: SHA1
>
>NPR is talking about the line of storms that went through Georgia last
>night.  Is everyone okay?
>
>
>Peter
>- --
>NOTE: New Position ... http://dltj.org/2007/01/new-title-new-challenges/
>
>Peter Murrayhttp://www.pandc.org/peter/work/
>Assistant Director, New Service Development  tel:+1-614-728-3600;ext=338
>OhioLINK: the Ohio Library and Information NetworkColumbus, Ohio
>The Disruptive Library Technology Jesterhttp://dltj.org/
>Attrib-Noncomm-Share   http://creativecommons.org/licenses/by-nc-sa/2.5/
>-BEGIN PGP SIGNATURE-
>Version: GnuPG v1.4.5 (Darwin)
>Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
>iD8DBQFF6BR44+t4qSfPIHIRAhy3AJ9xQzTjBB/p5KsZZISIhjsrqwt0uACePiB8
>E40U4u4z9Z/RA2Rn/hE4y/w=
>=J0Cj
>-END PGP SIGNATURE-

[CODE4LIB] Book swap?

2007-03-02 Thread Jonathan Gorman

Hiya folks, I'm here at the conference and was wondering if anyone was 
interested in a bookswap.  I've got a copy of Stephen King's "Cell" and 
Burroughs "Running With Scissors".  I'm finished with them and wondering if 
anyone else had anything interesting to read.  I'll bring them with me 
downstairs.


Jon Gorman

Re: [CODE4LIB] 2007 Conference Attendee List--Dumb question

2007-02-23 Thread Jonathan Gorman


I made the same mistake; read the instructions at the top of the page.


Jonathan T. Gorman
Research Information Specialist
University of Illinois at Champaign-Urbana
216 Main Library - MC522
1408 West Gregory Drive
Urbana, IL 61801
Phone: (217) 244-4688


On Fri, 23 Feb 2007, Joan Starr wrote:


Okay--I admit I forgot my password. So, I went through the
change-your-password flow, but the ListofAddressees page still won't
accept it. Is there a time lag involved?

--Joan Starr

-Original Message-
From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of
Roy Tennant
Sent: Thursday, February 22, 2007 7:46 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: [CODE4LIB] 2007 Conference Attendee List

To better enable networking of Code4Lib 2007 Conference attendees
before, during, and after the meeting, I have put the list of attendees
on the Code4Lib wiki:

http://code4lib.org/wiki/ListofAttendees

Only names are on the list, so if you wish to make your contact
information available please edit the page to include the information
you care to include. Basic access and editing instructions are on the
page. A link to this page can be found on the conference web page at

http://www.code4lib.org/2007

I've edited my entry as an example, but feel free to do whatever feels
right. Thanks, Roy

Re: [CODE4LIB] Polls open for Code4Lib 2007 T-Shirt design

2007-01-29 Thread Jonathan Gorman


US copyright law is absolutely confounding.  I think the defining
factor here is whether this image went through the paperwork for American
copyright.  I believe then the status is out of copyright by two years.
Notice that a work in compliance with US formalities
that was in copyright in its home country as of Jan. 1, 1996 is under
copyright for 95 years after publication date.

Of course, I'm not a lawyer.  I'm basing this mostly off this cheat sheet
from Cornell:
http://www.copyright.cornell.edu/training/Hirtle_Public_Domain.htm.


Jonathan T. Gorman
Research Information Specialist
University of Illinois at Champaign-Urbana
216 Main Library - MC522
1408 West Gregory Drive
Urbana, IL 61801
Phone: (217) 244-4688


On Mon, 29 Jan 2007, Rob Styles wrote:


The photo is an original WWII photo from 1944, it's outside of the 50
years covered by copyright here in the UK and is in use by several
different organisations. I believe we don't need any clearance.

rob


Rob Styles
Programme Manager, Data Services, Talis
tel: +44 (0)870 400 5000
fax: +44 (0)870 400 5001
direct: +44 (0)870 400 5004
mobile: +44 (0)7971 475 257
msn: [EMAIL PROTECTED]
irc: irc.freenode.net/mrob,isnick



-Original Message-
From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of
Roy Tennant
Sent: 26 January 2007 21:10
To: CODE4LIB@listserv.nd.edu
Subject: Re: [CODE4LIB] Polls open for Code4Lib 2007 T-Shirt design

I hate to be the one to raise this, but it seems like I must since the
design is leading in the polls, but do we have (or can we obtain) the
right
to reproduce that photo?
Roy



Re: [CODE4LIB] MINUS in MySQL (was Re: Thanks!) -- also intesection with LIKE?

2007-01-26 Thread Jonathan Gorman


On Fri, 26 Jan 2007, Kohler, Andy wrote:


Ken Irwin asked:

I wonder if it's possible to use LIKE with
the results of a subquery, eg.:
SELECT * FROM table WHERE ip [NOT LIKE ANYTHING IN] (SELECT
ip_range FROM known_ips) where [NOT LIKE ANYTHING IN] is
probably some different wording.


In general, you'd do this like (hah):

SELECT *
FROM table t
WHERE NOT EXISTS
(  SELECT *
  FROM known_ips
  WHERE ip = t.ip
)



I think what Ken is asking, though, is whether there is some combination of
the IN operator and the LIKE operator.  He's trying to exclude a set of
patterns, i.e. condensing (ip NOT LIKE "127.%" AND ip NOT LIKE "143.123.%").

Off the top of my head I can't think of anything like that (which isn't
saying much), but if by some stroke of luck you can filter on a fixed
part of the address, you could use a substring function.  (You'd have to
check the MySQL manual for the exact syntax.)  I.e.:

SELECT *
FROM iptables
WHERE SUBSTRING(ip, 1, 7) NOT IN ('127.475', ...)
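
One way to get the "NOT LIKE anything in a subquery" effect is a correlated
NOT EXISTS whose inner WHERE uses LIKE.  The following is only a sketch: it
assumes known_ips.ip_range holds LIKE patterns such as '127.%', it uses a
made-up table called hits for the rows being filtered, and it runs against
SQLite through Python's sqlite3 module rather than MySQL, so the syntax
should be checked against the MySQL manual before relying on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE hits (ip TEXT);              -- hypothetical table being filtered
    CREATE TABLE known_ips (ip_range TEXT);   -- one LIKE pattern per row
    INSERT INTO hits VALUES ('127.0.0.1'), ('143.123.5.9'), ('10.1.2.3');
    INSERT INTO known_ips VALUES ('127.%'), ('143.123.%');
""")

# Keep only the rows whose ip matches none of the stored patterns.
rows = conn.execute("""
    SELECT h.ip
    FROM hits h
    WHERE NOT EXISTS (
        SELECT 1 FROM known_ips k WHERE h.ip LIKE k.ip_range
    )
    ORDER BY h.ip
""").fetchall()

unmatched = [ip for (ip,) in rows]
print(unmatched)
```

The patterns live in a table, so excluding another range means adding a
row rather than editing the query.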

Jon Gorman

Re: [CODE4LIB] Polls open for Code4Lib 2007 T-Shirt design

2007-01-26 Thread Jonathan Gorman


On Fri, 26 Jan 2007, Emily Lynema wrote:


Even though I didn't vote for this design, I would vote for the shorter
wording. But I will point out that it seems slightly less than
democratic to have folks choose a design, and then re-do that design w/o
putting it up for another vote on code4lib.

Maybe I'm being too picky?


I don't think so.  Part of the reason I didn't vote for that particular
design is solely because I didn't like the phrasing of it.  Had it a
caption that doesn't throw my thinking process off in the middle of it I
probably would have voted for it ;).

That being said, if it does win the first round perhaps we should have a
"pick the caption?"  vote?  Or just redo the whole thing?  I imagine we're
starting to run into timeline issues now, but I leave that to the wise
heads running the conference to determine.

Jon Gorman

Re: [CODE4LIB] Thanks! Re: [CODE4LIB] SQL query

2007-01-26 Thread Jonathan Gorman


Last I checked MySQL doesn't support MINUS, but it's been a few years
since I used it.  I vaguely remember talk about the developers planning on
adding it.  I took a quick glance at the docs, but I can't seem to find
anything one way or another.  Is it in one of the later versions of MySQL?
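
Where MINUS isn't available, the same set difference can usually be
written another way.  A rough sketch, using Python's sqlite3 module as a
stand-in for MySQL: SQLite spells the operator EXCEPT, and the NOT IN form
needs no set operator at all.  The lib_books columns follow the example
quoted below; the book_id column and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE lib_books (book_id INTEGER PRIMARY KEY,
                            good_thing TEXT, bad_thing TEXT);
    INSERT INTO lib_books VALUES
        (1, 'TRUE',  'FALSE'),
        (2, 'TRUE',  'TRUE'),
        (3, 'FALSE', 'TRUE');
""")

# Set difference written with EXCEPT (SQLite has no MINUS keyword).
except_ids = [row[0] for row in conn.execute("""
    SELECT book_id FROM lib_books WHERE good_thing = 'TRUE'
    EXCEPT
    SELECT book_id FROM lib_books WHERE bad_thing = 'TRUE'
    ORDER BY book_id
""")]

# The same rows selected with a NOT IN sub-query instead of a set operator.
not_in_ids = [row[0] for row in conn.execute("""
    SELECT book_id FROM lib_books
    WHERE good_thing = 'TRUE'
      AND book_id NOT IN (SELECT book_id FROM lib_books WHERE bad_thing = 'TRUE')
    ORDER BY book_id
""")]

print(except_ids, not_in_ids)
```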


On Fri, 26 Jan 2007, Jeffrey Barnett wrote:


You have gotten a lot of suggestions, but here is one more.

select * from lib_books where good_thing = 'TRUE'
MINUS
select * from lib_books where bad_thing = 'TRUE'

I think MINUS is faster than JOIN.

Other SET OPERATIONS include UNION and INTERSECT.

Set operations require that the underlying result sets be "compatible":
Same number of columns.
Corresponding columns have matching datatypes.


Ken Irwin wrote:

Hi all,

Thanks for these myriad responses! I've gotten at least three distinct
approaches to try. I knew there had to be a better way.

your sql-fu is appreciated!

joys
Ken

Re: [CODE4LIB] RE: [CODE4LIB] RE: [CODE4LIB] Polls open for Code4Lib 2007 T-Shirt design

2007-01-26 Thread Jonathan Gorman


Well, if it's open for a rewrite, perhaps something like:

It was hopeless.  Maude and Agnes had cracked top-secret
messages during World War II, but even Bletchley Park's
finest cryptographers were mystified by the enigmatic "008".


O, I like.  Ben++


Jon Gorman

Re: [CODE4LIB] RE: [CODE4LIB] Polls open for Code4Lib 2007 T-Shirt design

2007-01-26 Thread Jonathan Gorman


On Fri, 26 Jan 2007, Ben Ostrowsky wrote:


If the Bletchley Park design idea wins, may I offer a friendly
amendment?

"It was hopeless.  Despite years at Bletchley Park, Maude and Agnes
still had three characters of the 008 that they could not understand."

Ben



There were actually some discussions on the #code4lib channel recently that
came to the same conclusion and the same sentence structure.
There's also been some discussion that the last phrase should be redone to
make it clear that the problem doesn't lie with the two women.  People who
don't know of Bletchley Park might misinterpret it.  Bletchley was one of
the locations in WWII for codebreaking.  I think the suggestion was
something like "had three characters that remained encrypted".  Someone
can read through the logs if they want to find out what the exact
suggestion was.

I apologize if I've pointed out the obvious though ;).



Jon Gorman

Re: [CODE4LIB] Polls open for Code4Lib 2007 T-Shirt design

2007-01-25 Thread Jonathan Gorman


On Thu, 25 Jan 2007, Joan Starr wrote:


Maybe. I didn't get it, and I saw from the message at the bottom of the
thread that it got sent to a Google list, [EMAIL PROTECTED]
Not sure about that one.

--Joan


Right, I believe that's the list for the conference organizers.  The
original email had both groups on the cc line, so when Peter responded to
whichever one he responded, the response was sent to both.  Probably
because he did a reply all or something like that.

Jon Gorman

Re: [CODE4LIB] Polls open for Code4Lib 2007 T-Shirt design

2007-01-25 Thread Jonathan Gorman


On Thu, 25 Jan 2007, Joan Starr wrote:


Why didn't the invitation to vote go out to the Code4Lib listserv? I
only found out about it because of Peter's email below.

--Joan Starr



There's an email (actually, for some reason I have two identical ones)
from Ross Singer on Tuesday announcing it; Peter's email is a response
to that one.

I have it in my inbox, perhaps there's a listserv issue if people aren't
getting emails.


Jonathan T. Gorman
Research Information Specialist
University of Illinois at Champaign-Urbana
216 Main Library - MC522
1408 West Gregory Drive
Urbana, IL 61801
Phone: (217) 244-4688



-Original Message-
From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of
Binkley, Peter
Sent: Thursday, January 25, 2007 12:59 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Polls open for Code4Lib 2007 T-Shirt design

I get this error when I try to visit the tshirt page:

Fatal error: Call to undefined function format_name() in
/var/www/code4lib.org/htdocs/themes/sunflower/sunflower.theme on line
183

Peter

-Original Message-
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]
On Behalf Of Ross Singer
Sent: Tuesday, January 23, 2007 10:56 AM
To: Code for Libraries; [EMAIL PROTECTED]
Subject: Polls open for Code4Lib 2007 T-Shirt design


Cast your vote for the Code4Lib 2007 T-Shirt!

Polls will be open until ~5PM EST on Friday, January 26th.

http://www.code4lib.org/node/150

May the best design win.

-Ross.


Re: [CODE4LIB] SQL query: looking for NON-intersection of tables

2007-01-25 Thread Jonathan Gorman


I want to generate a list of books that are in lib.books that doesn't
have any subjects assigned to it.



There are a couple of ways to do this.  I'm a little rusty on MySQL, so this
will just be generic SQL; you may have to adapt it.

The always popular left outer join with a null.  Outer joins will join
tables even when there isn't a matching row in one of the tables.  So, in
our fictional example (LibBooks has the book ids, and BooksSubjects has
the relation of books to subject ids):

SELECT *
FROM LibBooks LEFT OUTER JOIN BooksSubjects ON (LibBooks.BookId =
BooksSubjects.BookId)
WHERE BooksSubjects.BookId IS NULL

The left outer join here will still produce a row for each BookId in
LibBooks even if there is no match in BooksSubjects, but all the columns
that would come from the table on the "right" will be set to null.  Hence,
the WHERE clause keeps only those books that appear in LibBooks alone.

You can also do a NOT IN sub-query (syntax here might vary between SQL
servers), something like:

SELECT *
FROM LibBooks
WHERE LibBooks.BookId NOT IN (SELECT DISTINCT BookId FROM BooksSubjects)
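
To see the two approaches side by side, here is a small self-contained
sketch using Python's sqlite3 module.  The table names, columns and rows
are invented for the demo, and SQLite is only standing in for MySQL, so
treat it as an illustration rather than a tested MySQL recipe.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE LibBooks (BookId INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE BooksSubjects (BookId INTEGER, SubjCode TEXT);
    INSERT INTO LibBooks VALUES (1, 'Has a subject'), (2, 'No subject'), (3, 'Also none');
    INSERT INTO BooksSubjects VALUES (1, 'HIST');
""")

# Left outer join: unmatched books come back with NULLs on the right side.
left_join = [row[0] for row in conn.execute("""
    SELECT LibBooks.BookId
    FROM LibBooks
    LEFT OUTER JOIN BooksSubjects ON LibBooks.BookId = BooksSubjects.BookId
    WHERE BooksSubjects.BookId IS NULL
    ORDER BY LibBooks.BookId
""")]

# NOT IN sub-query: drop every book whose id appears in the relation table.
not_in = [row[0] for row in conn.execute("""
    SELECT BookId
    FROM LibBooks
    WHERE BookId NOT IN (SELECT DISTINCT BookId FROM BooksSubjects)
    ORDER BY BookId
""")]

print(left_join, not_in)
```

One caveat with the NOT IN form: if the sub-query can return a NULL BookId,
the comparison never evaluates to true and no rows come back, so the outer
join is the safer choice when the relation table allows NULLs.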


There's a few other methods, but I can't remember them off the top of my
head.


Jonathan T. Gorman
Research Information Specialist
University of Illinois at Champaign-Urbana
216 Main Library - MC522
1408 West Gregory Drive
Urbana, IL 61801
Phone: (217) 244-4688


On Thu, 25 Jan 2007, Ken Irwin wrote:


Hi folks,

I'm trying to put together a MySQL query to do something I don't know
how to do: get a list of materials that DON'T show up in a relational table.

For example, 3 tables:

1) lib.books : lots of bib data including book_id
2) lib.subjects: subj_code, selector, subject_name
3) relational: lists book_id & subj_code


I could do this with 2 queries, but it gets unwieldy: get a list of
distinct book_ids and AND/NOT them all together like:
SELECT * FROM books WHERE book_id != '4' and book_id != '7'...
That works on really small sets, but I don't want to go that route.

Is there a savvy way to structure this MySQL query. I don't even know
the language to use to look for this information.

Thanks for any help you can provide!
Ken

--
Ken Irwin
Reference Librarian
Thomas Library, Wittenberg University

Re: [CODE4LIB] Getting data from Voyager into XML?

2007-01-25 Thread Jonathan Gorman


There has never been any contractual issue that I am aware of with going
under the Oracle hood and using the data. It is your data...


I think Rob's point isn't that it's a license issue for the end user, but
rather for the vendors who make the ILS software.  It used to be more typical
for there to be varying license agreements depending on how you wanted to
"sell" applications that depended on another vendor's database.  So the ILS
vendor, if it wanted its database structure to be "open" to its
end-users for local customization, would essentially have to pay more to
get this and end up passing this cost on to the end-user.  Essentially,
every customer would end up paying indirectly for the ability to do this,
regardless of whether they did it or not.

We're not talking about data here, but how a third party's programs and
software are used in the vendor's software.  It's a complex issue and can
be one of the things that make companies nervous about "viral" licenses
such as the GNU GPL as opposed to BSD licenses.

A quick example: Imagine a service for recommending books.  They give all
the libraries APIs and web services for accessing this service, but they
keep it on their own machines.  Now they only need one license from the
database vendor.  They later decide to develop a way to install local
servers.  Now that means they either have to renegotiate with the database
vendor or just tell those installing the local servers that they have to
negotiate for the database software and support themselves.


Jonathan T. Gorman
Research Information Specialist
University of Illinois at Champaign-Urbana
216 Main Library - MC522
1408 West Gregory Drive
Urbana, IL 61801
Phone: (217) 244-4688

Re: [CODE4LIB] Getting data from Voyager into XML?

2007-01-19 Thread Jonathan Gorman


On Fri, 19 Jan 2007, Erik Hatcher wrote:


Tod,

Great information.  I apologize for being a late comer to the game
and bringing up FAQs.

What about date normalization?

One thing that must be considered when doing faceted browsing is that
it works best with some pre-processed data, such as years rather than
full dates.  The question becomes where does the logic for stripping
out the years belong?  Solr could do it if configured with a custom
analyzer for certain fields, or the client could do it.  Is there
XSLT to do this sort of thing with dates available?


I know XSLT 2.0 can handle them far better due to the support for types.
However, MARC still has oddities which would probably need to be addressed
directly.  If doing it entirely in XSLT I'd probably actually pipeline it
and do several transformations in a row.

There's also been work done to provide libraries and the like in XSLT.
EXSLT comes to mind right away.

One example of a MARC oddity I had recently is that a report required the
260 $c subfield.  I got complaints that the dates were malformed.  Why?
They appeared like 1922].  Those with some catalog experience can guess the
problem.  The whole 260 field looks like this: $a [Chicago: $b some
publisher $c 1922].

I'm not entirely sure how that would get parsed into MARCXML in the first
place.

There are techniques to deal with this in XSLT, but the string manipulations
are generally more cumbersome in that language than in a scripting
language, as you mention.

In XSLT 2.0 I'd probably have a template/function to parse out
punctuation, then something to possibly normalize dates.
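
For comparison, this is roughly what those two steps look like in a
scripting language.  It is a minimal Python sketch, not a complete MARC
date parser: the function names are made up, and it assumes a facet only
needs the first four-digit year that appears in the 260 $c value.

```python
import re

def strip_isbd(value):
    """Trim the brackets and punctuation ISBD leaves around a subfield value."""
    return value.strip(" [].,:;/")

def extract_year(subfield_c):
    """Return the first four-digit year in a 260 $c value, or None."""
    match = re.search(r"(?<!\d)(\d{4})(?!\d)", subfield_c)
    return match.group(1) if match else None

for raw in ["1922].", "c1985.", "[19--?]"]:
    print(repr(raw), "->", extract_year(raw))
```

Values with no complete year, such as [19--?], come back as None, so the
caller has to decide whether to skip them or file them under an "unknown
date" facet.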

Which reminds me, I need to start reviewing some XSLT/Cocoon for the
pre-conference ;).


Jonathan T. Gorman
Research Information Specialist
University of Illinois at Champaign-Urbana
216 Main Library - MC522
1408 West Gregory Drive
Urbana, IL 61801
Phone: (217) 244-4688




  Erik


On Jan 19, 2007, at 5:58 AM, Tod Olson wrote:


On Jan 19, 2007, at 4:07 AM, Erik Hatcher wrote:


On Jan 17, 2007, at 3:26 PM, Andrew Nagy wrote:

One thing I am hoping that can come out of the preconference is a
standard XSLT doc.  I sat down with my metadata librarian to develop our
XSLT doc -- determining what fields are to be searchable what fields
should be left out to help speed up results, etc.

It's pretty easy, I think you will be amazed how fast you can have a
functioning system with very little effort.


You're quite right with that last statement.

I am, however, skeptical of a purely MARC -> XSLT -> Solr solution.
The MARC data I've seen requires some basic cleanup (removing dots at
the end of subjects, normalizing dates, etc) in order to be useful as
facets.  While XSLT is powerful, this type of data manipulation is
better (IMO) done with scripting languages that allow for easy
tweaking in a succinct way.  I'm sure XSLT could do everything that
you'd want done; you can also drive screws in with a hammer :)


So the punctuation stripping has already been done in XSLT.

LoC has a MARCXML -> MODS XSLT stylesheet [1] which strips out the evil
ISBD punctuation. I've generally found mapping from MODS to be more
convenient than mapping from MARC, so while it's an extra step, it does
save a little programmer time since some of the hidden hierarchy in the
MARC data is made explicit in the MODS structure.

If hopping through MODS is unacceptable, the LoC has the punctuation-
stripping nicely tucked away into a MARC Conversion Utility Stylesheet
that you could use directly in a MARC XML -> Solr transformation. [2]

[1] http://www.loc.gov/standards/mods/v3/MARC21slim2MODS.xsl
[2] http://www.loc.gov/marcxml/xslt/MARC21slimUtils.xsl


Tod Olson <[EMAIL PROTECTED]>
Programmer/Analyst
University of Chicago Library

Re: [CODE4LIB] Getting data from Voyager into XML?

2007-01-17 Thread Jonathan Gorman


On Wed, 17 Jan 2007, Doran, Michael D wrote:


The mfhd has a location, but in Voyager
I find the item perm_location to be more
accurate, at least with our practice


For Voyager, I've found this to be a useful algorithm for getting accurate 
location information:

if item record
then
 if item temp_location not null
 then
   use item temp_location,
 else
   use item perm_location
else
 use mfhd location



Ah, true, for the online catalog you might actually want to display
both items of information.  (IE temporarily shelved at reserves,
normally found in Vet Med).
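
The precedence above, plus the "show both" idea, might be sketched like
this in Python.  The dictionary keys mirror the Voyager column names
mentioned in the thread, but the record shape and the function names are
assumptions made for the example.

```python
def display_location(item, mfhd_location):
    """Pick one location: item temp, then item perm, then the mfhd location.

    item is a dict with optional 'temp_location' and 'perm_location' keys,
    or None when the holding has no item record.
    """
    if item is None:
        return mfhd_location
    if item.get("temp_location"):
        return item["temp_location"]
    return item.get("perm_location")

def location_note(item, mfhd_location):
    """Build display text that mentions both locations when an item has both."""
    if item and item.get("temp_location") and item.get("perm_location"):
        return (f"temporarily shelved at {item['temp_location']}, "
                f"normally found in {item['perm_location']}")
    return display_location(item, mfhd_location)

print(location_note({"temp_location": "Reserves", "perm_location": "Vet Med"}, "Main Stacks"))
```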


Jonathan T. Gorman
Research Information Specialist
University of Illinois at Champaign-Urbana
216 Main Library - MC522
1408 West Gregory Drive
Urbana, IL 61801
Phone: (217) 244-4688

Re: [CODE4LIB] Getting data from Voyager into XML?

2007-01-17 Thread Jonathan Gorman


Hi Nate,

I've just started playing around with this a bit myself, although I have
to confess that it's quite recent and a hack.  But here's what I've been
doing.

Due to consortium issues and the like here's the track I'm taking.  I'm
using Perl, since I haven't yet started playing with Ruby.  I know, I
know, lame ;).

First, I run a query to grab the bib ids/mfhd ids of some groups of
records that I'm interested in.  I'm going for a smaller set probably,
since our whole collection is rather large.  Then I run it through a small
script that runs a query (where $recordtype = BIB or MFHD)

my $qry  = "SELECT ".$recordtype."_DATA.".$recordtype."_id, record_segment, "
  . "seqnum FROM UIUDB.".$recordtype."_DATA "
  . "  WHERE ".$recordtype."_DATA.".$recordtype."_ID = ? "
  . "  ORDER BY seqnum ASC " ;


with the ? set to the id.  The results are merged into one string.  This
is used to create a MARC::Record object.  I add this to a MARC::File::XML
object.
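
The merge step is the only part with any logic to it: the segments have to
be concatenated in seqnum order.  A tiny Python sketch of just that step,
assuming each row arrives as a (record_id, record_segment, seqnum) tuple as
in the query above; the function name and sample rows are made up.

```python
def assemble_record(rows):
    """Join (record_id, record_segment, seqnum) rows into one raw MARC string."""
    ordered = sorted(rows, key=lambda row: row[2])
    return "".join(segment for _record_id, segment, _seqnum in ordered)

# Rows deliberately out of order to show that seqnum decides the result.
rows = [(7, "BBB", 2), (7, "AAA", 1), (7, "CC", 3)]
print(assemble_record(rows))
```

Sorting in code is redundant when the query already has ORDER BY seqnum,
but it keeps the function correct if the rows come from somewhere else.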

I have in some notes that I need to pad out the bib records when using
this method, but I'm not sure why.  I'm not currently doing that
and it hasn't caused any issues yet.

That's it ;).  Gives me some nice marcxml.  And I haven't tested it at all
yet, just started playing around with it.


I've included some of the code below.  Like I said, I just came up with
this really quickly, haven't had a chance to really test it heavily yet.


use strict;
use warnings;
use DBI;
use MARC::Record;
use MARC::File::XML;
use Getopt::Std;

# set the database connection options
#removed :P
# my $dbh = DBI->connect( ... );  # connection details removed; $dbh must be defined here

my %opts;

# -f <file of ids, one per line>   -t <BIB or MFHD>
getopt('ft',\%opts);

my $recordtype="BIB";
if (uc($opts{'t'} || '') eq "MFHD" ) {
  $recordtype="MFHD";
}

#need to change so can take from command line
die "must set -f input ids" unless $opts{'f'};
open(IDS, "<", $opts{'f'}) or die "can't open $opts{'f'}: $!";

my $qry  = "SELECT ".$recordtype."_DATA.".$recordtype."_id, record_segment, "
  . "seqnum FROM UIUDB.".$recordtype."_DATA "
  . "  WHERE ".$recordtype."_DATA.".$recordtype."_ID = ? "
  . "  ORDER BY seqnum ASC " ;
my $sth = $dbh->prepare($qry) or die "preparing SQL query $qry.";

my $file = MARC::File::XML->out( $recordtype.'records.xml'  );

# one id per input line
while (<IDS>) {
  chomp;
  s/ *$//;
  s/^ *//;
  my $id = $_;
  my $MARC = "";

  # glue the record segments back together in seqnum order
  $sth->execute($id);
  while (my ($rec_id, $recseg, $seqnum) = $sth->fetchrow_array) {
    $MARC .= $recseg ;
    #not sure that the below is needed...there is some
    #sort of issue with the "last" record, maybe?
    #  $MARC .= ' ' x (990-length($recseg)); # Pad each segment to 990 chars.
  }

  $sth->finish;

  my $record = MARC::Record->new_from_usmarc($MARC);
  $file->write($record);
}

close(IDS);
$file->close();



Jonathan T. Gorman
Research Information Specialist
University of Illinois at Champaign-Urbana
216 Main Library - MC522
1408 West Gregory Drive
Urbana, IL 61801
Phone: (217) 244-4688

Re: [CODE4LIB] Getting data from Voyager into XML?

2007-01-17 Thread Jonathan Gorman


On Wed, 17 Jan 2007, Nathan Vack wrote:


On Jan 17, 2007, at 2:26 PM, Andrew Nagy wrote:


Nate, it's pretty easy.  Once you dump your records into a giant marc
file, you can run marc2xml
(http://search.cpan.org/~kados/MARC-XML-0.82/bin/marc2xml).  Then run an
XSLT against the marcxml file to create your SOLR xml docs.


Unless I'm totally, hugely mistaken, MARC doesn't say anything about
holdings data, right? If I want to facet on that, would it make more
sense to add holdings data to the MARC XML data, or keep separate xml
files for holdings that reference the item data?


Depends a bit what you mean about holding data.  There are MARC holding
records (mfhds) that do provide some of this information.  However, much
of the information you want to know on a Voyager system is held in the
database item's record.  (The mfhd has a location, but in Voyager I find
the item perm_location to be more accurate, at least with our practice.).

Some information might be considered too "real time" to index, but it's
worth considering trying it anyway.  I'm hoping to get some queries to get
some of the most useful stuff dumped out and added to yet a third file,
but haven't started yet.  That's a personal project at this point.




In a lot of cases, location data might not be a hugely important
facet; at Madison, we have something like 42 libraries spread thinly
across campus (gah!) -- each with different loan policies -- as well
as a few request-only storage facilities. So there's a lot of "Stuff
I Can't Check Out" and a lot of "Stuff I'll Need To Wait For" in our
collection.



I feel your pain ;).  And worse, there isn't always a good way to know
if a location actually circulates from within Voyager besides sometimes
misleading naming conventions.  (For us anyway.)  I've been considering
setting up a database that will just act as a mapping for individual
libraries and their shelving locations and preferred policies, if only to
help keep track of them all.

Jon Gorman

Re: [CODE4LIB] Code4lib 2007 Registration Open

2006-11-16 Thread Jonathan Gorman


On Thu, 16 Nov 2006, Darla Grediagin wrote:


Do many school librarians go to this conference?  I have a scholarship
to attend a conference in the Spring of 2007.  I am getting more and
more interested in the technology side of libraries.  I have just helped
with the installation of Koha for our library circulation/catalog
program.  I am trying to feel out whether this would be the conference
to choose.



Hi Darla,

It's only the second year of the conference.  I managed to make it to
the first ever code4lib last year and enjoyed it.  It is definitely on the
technology side of librarianship.  Most of the people there, from my
impression, were involved in academic (university level) libraries one way
or another, with a sprinkling of people from various vendors.
Overall though it's quite an inviting conference.  If you've helped
install Koha I'd imagine people would be interested in hearing about
that (I know I am).  It sounds like excellent material for at least a
lightning talk.  It seems a pretty good group of people and a mix of
topics.  I will say some were on the more technical and
technology-oriented side, but if that's what you're interested in I think
the networking opportunities will be pretty good.

I think that some of the more general "library technology" conferences are
Internet Librarian and Computers in Libraries, but I've never made it to
either of those.  Access is supposed to be great, but it's later in the
year.  I haven't had a chance to go there yet either.


I think some of the audio files of presentations for code4lib2006 are up
at the website (www.code4lib.org)

Here's the schedule from the conference last year
(http://www.code4lib.org/2006/schedule).


I'm not sure what the conference will be like this year ;).  I'm guessing
the next few years of this conference might be a bit bumpy, but it'll smooth
out as it becomes more established.


Jonathan T. Gorman
Research Information Specialist
University of Illinois at Champaign-Urbana
216 Main Library - MC522
1408 West Gregory Drive
Urbana, IL 61801
Phone: (217) 244-4688

[CODE4LIB] Session Timeouts in OPACs

2006-09-22 Thread Jonathan Gorman


Hi everyone,

There are several of us at Urbana-Champaign that are interested in getting
a better picture of how our current management of sessions for our OPAC
affects our users.  We're at a pretty early stage in the project and
thought it might be useful to gather some rough numbers about other
institutions as well for comparison.  The major issue we're wondering
about is how long you let your sessions go for.  We time out sessions
after 5 minutes of inactivity; any more and the system may become
unstable.

We also have recently done a rough metric by counting how many times our
timeout warning page was accessed to find out how many users were timed
out of our system.  On a slow day the timeout page is accessed around 300
times, on a busy day over 1,000 times.  This page is only accessed if the
user attempts a search on a page but their current session has timed out.
Some of these are likely to be due to people trying to use another
person's session at a public terminal.  In other words, this data includes
the situation where someone does a search, leaves, and another person
comes up and tries to click "new search" but the session has already timed
out.
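
For anyone who wants to gather the same rough number, a count like this can
be pulled from web server logs.  The following Python sketch assumes
Common Log Format lines and a hypothetical /timeout.html path for the
warning page; both are placeholders, not details of our setup.

```python
import re
from collections import Counter

# Hypothetical path of the timeout warning page.
TIMEOUT_PATH = "/timeout.html"

# Captures the date and the request path from a Common Log Format line.
LINE_RE = re.compile(r'\[(\d{2}/\w{3}/\d{4}):[^\]]*\] "(?:GET|POST) (\S+)')

def timeouts_per_day(lines, path=TIMEOUT_PATH):
    """Count requests for the timeout page, keyed by log date."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group(2).split("?")[0] == path:
            counts[match.group(1)] += 1
    return counts

sample = [
    '10.0.0.1 - - [22/Sep/2006:10:00:01 -0500] "GET /timeout.html HTTP/1.1" 200 512',
    '10.0.0.2 - - [22/Sep/2006:10:05:44 -0500] "GET /search?q=marc HTTP/1.1" 200 900',
    '10.0.0.3 - - [23/Sep/2006:09:12:10 -0500] "GET /timeout.html?from=search HTTP/1.1" 200 512',
]
print(timeouts_per_day(sample))
```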

So some useful information might be: size of your school/institution or
average users a day, how long is your timeout, and how many timeouts do
you have?

(Note - I'm cross-posting on Web4Lib, Code4Lib, LITA-L using my work
email, [EMAIL PROTECTED], and LITA-L using my gmail account
[EMAIL PROTECTED]  Feel free to send to either.)



Jonathan T. Gorman
Research Information Specialist
University of Illinois at Champaign-Urbana
216 Main Library - MC522
1408 West Gregory Drive
Urbana, IL 61801
Phone: (217) 244-4688

Re: [CODE4LIB] zoom

2006-09-06 Thread Jonathan Gorman


Been a while since I tried installing ZOOM on a windows machine.  Last I
did even with YAZ installed it wasn't working.  Any
pointers to better documentation would be nice.  (I remember the CPAN
module flaking out even with paths set to the YAZ libraries).


Jonathan T. Gorman
Research Information Specialist
University of Illinois at Champaign-Urbana
216 Main Library - MC522
1408 West Gregory Drive
Urbana, IL 61801
Phone: (217) 244-4688


On Wed, 6 Sep 2006, Joshua Ferraro wrote:


On Wed, Sep 06, 2006 at 01:43:50PM -0500, Jonathan Gorman wrote:

Now if only it could work on our windows servers.

It absolutely works on a windows server. Just make sure you've
got Yaz installed and go ahead and install the Perl module
on CPAN.

Cheers,

--
Joshua Ferraro   SUPPORT FOR OPEN-SOURCE SOFTWARE
President, Technology   migration, training, maintenance, support
LibLimeFeaturing Koha Open-Source ILS
[EMAIL PROTECTED] |Full Demos at http://liblime.com/koha |1(888)KohaILS

Re: [CODE4LIB] zoom

2006-09-06 Thread Jonathan Gorman


Now if only it could work on our windows servers.

Jon Gorman

On Wed, 6 Sep 2006, Eric Lease Morgan wrote:



This is just a posting in praise of the ZOOM Perl module.

Using ZOOM is it possible to query bunches o' things using a
consistent API. I used it to create a rudimentary Z39.50 client to
our local catalog, attached. I think ZOOM is a bit of an unsung hero.

--
Eric Lease Morgan
University Libraries of Notre Dame

Re: [CODE4LIB] "Expect" OPAC output to web?

2006-07-07 Thread Jonathan Gorman


3) Is there a much smarter way to do this?


What OPAC system do you use?  There certainly might be a better way,
but I can't be sure.  (I suppose the "logging in" should be a clue that
it's perhaps III, but I'm not sure).  I've never used Expect, so I can't
really comment on that.


Jonathan T. Gorman
Visiting Research Information Specialist
University of Illinois at Champaign-Urbana
216 Main Library - MC522
1408 West Gregory Drive
Urbana, IL 61801
Phone: (217) 244-4688





What think you all?

expect "ingenious replies"
send "Ken"
interact

--
Ken Irwin
Reference Librarian
Thomas Library, Wittenberg University

Re: [CODE4LIB] next generation opac mailing list

2006-06-06 Thread Jonathan Gorman


On Tue, 6 Jun 2006, Michael J. Giarlo wrote:


On 6/5/06, Alexander Johannesen <[EMAIL PROTECTED]> wrote:



Oh, this one is easy to answer; we need to get away from MARC. No, not
the content of MARC, nor the idea of it, nor necessarily even the MARC
format and standard itself, but we need to get away from "we need
MARC" and the idea that knowledge sharing in libraries are best done
through MARC and that Z39.50 must be part of our requirements.



Lib Tek: The Next Generation -- The Wrath of MARC.


Is it too geeky to point out that The Wrath of MARC would be more properly
associated with Lib Tek, the original?

You'd need something like

Lib Tek: The Next Generation -- MARC-Nemesis

(If we allow adaptation of episode titles maybe something like MARC-pid.
Thanks to Wikipedia for being an ever useful source of pop culture.)

Jon Gorman

Re: [CODE4LIB] OAI Static Repository

2006-06-06 Thread Jonathan Gorman


Knowing nothing about OAI, little about XML Schema, and out of practice
with validation in general, I still attempted to at least validate your
docs.  (It was an interesting question in the back of my head exactly how
to validate some of the XML schema docs, so it was worth it.)

I downloaded oai.xml and oai_dc.xsd, ran xmllint --schema oai_dc.xsd oai.xml,
and got a couple of errors.  They all look like parser errors and not
anything wrong with the schema.  I'm assuming that's because I don't know
what the heck I'm doing in that respect.

oai.xml:4: namespace error : Namespace prefix oai on repositoryName is not
defined
Lunar and Planetary Institute
   ^
oai.xml:5: namespace error : Namespace prefix oai on baseURL is not
defined
http://www.lpi.usra.edu/library/oai.xml
^
oai.xml:6: namespace error : Namespace prefix oai on protocolVersion is
not defined
2.0
^
oai.xml:7: namespace error : Namespace prefix oai on adminEmail is not
defined
[EMAIL PROTECTED]
   ^
oai.xml:8: namespace error : Namespace prefix oai on earliestDatestamp is
not defined
2006-05-20
  ^
oai.xml:9: namespace error : Namespace prefix oai on deletedRecord is not
defined
no
  ^
oai.xml:10: namespace error : Namespace prefix oai on granularity is not
defined
YYY-MM-DD
^
oai.xml:13: namespace error : Namespace prefix oai on matadataFormat is
not defined

   ^
oai.xml:14: namespace error : Namespace prefix oai on metadataPrefix is
not defined
oai:dc
   ^
oai.xml:15: namespace error : Namespace prefix oai on schema is not
defined
http://www.openarchives.org/OAI/2.0/oai_dc.xsd
   ^
oai.xml:16: namespace error : Namespace prefix oai on metadataNamespace is
not defined
http://www.openarchives.org/OAI/2.0/oai_dc/
 ^
oai.xml:100: parser error : Premature end of data in tag dc line 92

^
oai.xml:100: parser error : Premature end of data in tag metadataNamespace
line 16

^
oai.xml:100: parser error : Premature end of data in tag matadataFormat
line 13

^
oai.xml:100: parser error : Premature end of data in tag
ListMetadataFormats line 12

^
oai.xml:100: parser error : Premature end of data in tag Repository line 2

^
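
The two kinds of message above can be reproduced in miniature.  The
following Python sketch uses the standard library's ElementTree parser on
made-up three-line documents, not on the real oai.xml: a prefix that is
used without an xmlns declaration is rejected, the same element parses once
the prefix is bound, and an opening tag spelled differently from its
closing tag is rejected too.  The namespace URI is the OAI-PMH 2.0 one;
whether the Repository element in the actual file declares it is something
to check there.

```python
import xml.etree.ElementTree as ET

OAI_NS = "http://www.openarchives.org/OAI/2.0/"

# The oai: prefix is used but never declared.
undeclared = "<Repository><oai:protocolVersion>2.0</oai:protocolVersion></Repository>"

# Same element, with the prefix bound on the root element.
declared = (
    f'<Repository xmlns:oai="{OAI_NS}">'
    "<oai:protocolVersion>2.0</oai:protocolVersion>"
    "</Repository>"
)

# Opening and closing tag names that don't match.
mismatched = "<Repository><matadataFormat></metadataFormat></Repository>"

def parses(xml_text):
    """Return True if the text is well-formed, namespace-aware XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

for name, doc in [("undeclared", undeclared), ("declared", declared), ("mismatched", mismatched)]:
    print(name, parses(doc))
```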


Jonathan T. Gorman
Visiting Research Information Specialist
University of Illinois at Champaign-Urbana
216 Main Library - MC522
1408 West Gregory Drive
Urbana, IL 61801
Phone: (217) 244-4688


On Tue, 6 Jun 2006, Bigwood, David wrote:


I'm hoping someone can spare the time to take a look at the OAI Static
Repository file I've created and let me know where I've gone wrong.

I've created an OAI Static Repository for my institution's publications.

http://www.lpi.usra.edu/library/oai.xml

However, it is not quite correct. I've looked at it and can't seem to
find the error. Anyone willing to take a look and see where I've left
off the " or > or whatever? Getting this up is a big deal, I've been
fighting for a year now to get this OKed. Now that it is up, the LANL
static repository gateway rejects it. I used MarcEdit to create the file
BTW.

Thanks,
David Bigwood
[EMAIL PROTECTED]
Lunar & Planetary Institute
http://www.lpi.usra.edu/library

Re: [CODE4LIB] Musings on using xISBN in our Horizon catalog

2006-05-24 Thread Jonathan Gorman


On Wed, 24 May 2006, David J. Fiander wrote:


Since the collection of ISBNs can be treated as an equivalence class,
can't any arbitrary member of the class be designated as the group
identifier?  This eliminates the need to create a synthetic id, and it
means that, for singular items, there's no need to create a separate
group id.


I can see two drawbacks:

1) It could be confusing.  If someone doesn't read the documentation
properly they could be misled into believing that the table is an
un-normalized one.

2) One could optimize by only storing those numbers that have more than
one in a group.  From Thom's description it seems like that is what xISBN
does.  (I could be wrong here, going from memory.)  The problem, of course,
if you don't store the 1 to 1 mapping is that you don't know whether there
really are no known relationships or whether that particular isbn hasn't
been examined for relationships with other material yet.  Of course, even
if you examined it there might be a relationship that hasn't been caught,
so the difference might not be that huge.  In either case you could say it
is really saying we don't know of any relation between this isbn and other
materials.

Jon Gorman

Re: [CODE4LIB] Musings on using xISBN in our Horizon catalog

2006-05-23 Thread Jonathan Gorman



That's why I'd love to know whether the xISBN database uses a common
identifier for each set of ISBNs, and whether (and I know 'pretty
please' is a poor justification for changing an API) it might be exposed
for this reason.



Hopefully the OCLC people can answer that.  It might be in the work Andy
suggested yesterday.  One idea I had yesterday was that if you don't
care that much about the id internally you could use an auto-increment.

To clarify, we'll assume that any isbn in a set will return the same set
in xISBN.
I.e., asking for isbns related to a returns a, b and c.  Asking for b or c
should return a, b, c.

So we can do as Andy suggested and start building our table by taking the
set of all current isbns, normalized a bit I'd imagine.

In a computationally-expensive method:

Start with the first isbn (x) and get the set of related isbns from xISBN
(A).  Iterate over every member of A, testing whether the member is
already assigned to a group.  If it is, stop the loop and assign x to the
same group.  If none in A have been assigned a group, start a new group
and add x.

You'll have to do this every once in a while to make sure you're getting
all the new books.
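
The loop above could be sketched roughly like this in Python.  The
`related` callable is hypothetical, a stand-in for whatever actually asks
xISBN for the set of related isbns, and the counter stands in for an
auto-increment column:

```python
from itertools import count


def assign_groups(isbns, related):
    """Assign each isbn a group id with the lookup-and-check loop.

    `related` is a hypothetical callable: given an isbn it returns the
    set of related isbns (including the isbn itself).
    """
    groups = {}          # isbn -> group id
    next_id = count(1)   # stand-in for an auto-increment column
    for x in isbns:
        if x in groups:
            continue     # duplicate in the input, already handled
        found = None
        for member in related(x):
            if member in groups:
                found = groups[member]
                break    # one assigned member is enough
        groups[x] = found if found is not None else next(next_id)
    return groups


# Toy data in place of the real service.
SETS = [{"a", "b", "c"}, {"d"}]


def fake_related(isbn):
    for s in SETS:
        if isbn in s:
            return s
    return {isbn}


groups = assign_groups(["a", "b", "c", "d"], fake_related)
```

With the toy data, a, b and c end up in group 1 and d in group 2.  Every
isbn costs one lookup against the service plus a scan of its set, which is
the backward-lookup cost that makes this the expensive version.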

Hopefully this makes up for the advice I gave yesterday ;).  I'm
sure you can probably come up with a better algorithm though,
something about the backward-lookup every time makes me think that
there's a better way.


ps.  Andy's right, normalization is a good, good thing.  The only reason I
suggested looking at the costs was that I was thinking it would be a lot
easier than trying to come up with a method to generate unique ids for a
"group".  Since my grasp of FRBR/xISBN is a little shaky I'll avoid any
specific terminology.
(Like I said in my original email, having an identifier for groups is a
definite advantage.)

Re: [CODE4LIB] Musings on using xISBN in our Horizon catalog

2006-05-22 Thread Jonathan Gorman


On Mon, 22 May 2006, Houghton,Andrew wrote:



I don't think a two column table relating ISBN to ISBN is the
proper data model.  As Ben pointed out in his message: "But if
we cross-reference every ISBN to every other, we'll have a
factorial number of rows, which is probably less than optimal."
It also means that the SQL table isn't fully normalized.



No, I don't either, but it's simpler.  Which was why I was a bit curious
about it.  Also, I missed the important fact that they were only going to
display the links for those contained within the same catalog.  My mind
was already going to an OpenURL-type service: a scenario where one might
want to know the related ISBNs, but there might be relatively few source
ISBNs compared to "related" ISBNs.  There the performance cost might be
less than the effort of creating and maintaining a "FRBR" key for each work.
Of course, I'm happy to find some of the OCLC work on FRBR groups that I
didn't know about.  Looks like something to go back and look at some more
during lunch.

So sorry for the cruddy advice Ben ;).

Jonathan T. Gorman
Visiting Research Information Specialist
University of Illinois at Champaign-Urbana
216 Main Library - MC522
1408 West Gregory Drive
Urbana, IL 61801
Phone: (217) 244-4688

Re: [CODE4LIB] Musings on using xISBN in our Horizon catalog

2006-05-22 Thread Jonathan Gorman


So the ideal use case for our xISBN cache is that we would be querying
only a local database, and that database would only return ISBNs (or bib
numbers) of other editions which are actually in our catalog.


Very nice idea.



My guess is that we should have a row for each ISBN in the system, along
with a column that links that ISBN to some common identifier that will
symbolically mean e.g. "SQL for Dummies, any version".  We could then
ask "What is the identifier for the first ISBN of the item being
displayed?" and then do a second query that asks "What other ISBNs in
our catalog have that identifier?"


My gut instinct is that while this is nice, I'm wondering if for this
particular project it couldn't be simpler.  Why not simply have a
two column table, both being ISBNs?  Something like source,related.  Then
you simply feed in the results for every hit from the xISBN service.

And of course you might want to refresh this over time, or try to use some
other techniques.  (I.e., occasionally look for places where source isbn
has related isbn, but related does not have source.)
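
As a rough sketch of the source,related table and that one-way check,
here it is in Python with an in-memory SQLite database standing in for
the real one.  The table name `xisbn` and the single-letter isbns are made
up for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE xisbn ("
    " source TEXT NOT NULL,"
    " related TEXT NOT NULL,"
    " PRIMARY KEY (source, related))"
)

# Sample rows; note there is deliberately no ("c", "a") row.
rows = [("a", "b"), ("b", "a"), ("a", "c")]
conn.executemany("INSERT INTO xisbn VALUES (?, ?)", rows)

# Pairs where source has related, but related does not have source.
one_way = conn.execute(
    """
    SELECT x.source, x.related
    FROM xisbn AS x
    LEFT JOIN xisbn AS y
      ON y.source = x.related AND y.related = x.source
    WHERE y.source IS NULL
    """
).fetchall()
```

Anything that comes back in `one_way` is a candidate for a refresh against
the service.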

Of course, there would be a definite advantage to being able to have the
identifier.  I don't know what number you'd use.  I would think you'd just
have to use an internal id (auto-incrementing or something of that nature).



The only remaining question, I think, is whether we should have one
table with ISBN as primary key and identifier as a second column (one
row per ISBN), and then another table with identifier as primary key and
ISBN as second column (again, one row per ISBN).  This could make our
queries faster at the cost of doubling storage space.  On the other
hand, perhaps we don't need to optimize that far.  It depends on how
MySQL reacts to searching the entire table for a non-primary-key column
value.  I'll have to fiddle with it and find out, unless one of you
knows already.



Nope, sounds like a fiddling thing to me.  I'd suspect that a two column
table should be optimized for lookups in either direction since they tend
to be used in joining queries (i.e. they are a relation).  But I don't
know the behavior of MySQL by default.
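
For what it's worth, the usual way to get lookups in both directions from
one table is a secondary index on the identifier column rather than a
second copy of the data.  A sketch in Python, with SQLite as a stand-in (so
it says nothing about what MySQL does by default); the table, index and
function names are made up:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE isbn_group ("
    " isbn TEXT PRIMARY KEY,"
    " group_id INTEGER NOT NULL)"
)
# Secondary index so the group_id lookup need not scan the whole table.
conn.execute("CREATE INDEX idx_group ON isbn_group (group_id)")
conn.executemany(
    "INSERT INTO isbn_group VALUES (?, ?)",
    [("a", 1), ("b", 1), ("c", 1), ("d", 2)],
)


def other_editions(conn, isbn):
    """Return the other isbns sharing a group with `isbn`."""
    # First query: what is the identifier for this isbn?
    row = conn.execute(
        "SELECT group_id FROM isbn_group WHERE isbn = ?", (isbn,)
    ).fetchone()
    if row is None:
        return []
    # Second query: what other isbns have that identifier?
    found = conn.execute(
        "SELECT isbn FROM isbn_group"
        " WHERE group_id = ? AND isbn <> ? ORDER BY isbn",
        (row[0], isbn),
    ).fetchall()
    return [r[0] for r in found]


editions = other_editions(conn, "a")
```

The two queries are the two questions from the original post.  Whether
the index alone is fast enough on the real data is still something to
measure.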

Jon Gorman

Re: [CODE4LIB] code4lib journal

2006-05-04 Thread Jonathan Gorman


I suppose that I shall have to write an article for the journal
entitled "Code 4 dealing gracefully with idiotic journal names".



Actually, it sounds like a great article for the journal.  :)  Could
subtitle it: "For magazines named by people thinking they're clever."  Of
course, I'm not sure that the title I liked the best, "Indexed", would
have improved findability any.

Ah, the fun of serials tracking.  We could just have a couple of series
names, ala some of the German publishers with their monographic series.
(Ah, memories of the year I spent as a serials cataloging GA)

So every fourth (and A) issue would be a "code4lib journal", "/lib/dev/"
and "Indexed" series issue, while those with articles containing the word
"Regex" would also be in the "Possibly Perl Series" which will be
incremented according to the m . (n . i) where n is the previous number of
i in the last of the series and m is the phase of the moon.

:P

Jon Gorman

Re: [CODE4LIB] code4lib journal

2006-05-04 Thread Jonathan Gorman


On Wed, 3 May 2006, Eric Hellman wrote:


Here's the latest on the code4lib journal:

"/lib/dev: A Journal for Library Programmers" won the journal name
vote.  (See http://www.code4lib.org/node/96 for more details.)


The idea of a journal name that contains punctuation in the title is
so breathtakingly idiotic that I can only assume that it is a
reference to the bug in the name of the computer language  C++


First, thanks to Jeff Davis for all the hard work on getting the journal
up and running.

Second, I'm tempted to make a snarky reply to Eric, but I'll try to be
civil.  I'm not a fan of the name either and didn't vote for it.  But the
community did.  And it seems insulting to follow up on the account of
Jeff's hard work with a complaint that could have been (and was) made
weeks ago.  It also could have been done in a much more polite manner.
Well, ok, on the irc you probably could have said this ;).  But the
atmosphere is a bit different than the mailing list.

Jon Gorman

Re: [CODE4LIB] code4lib journal

2006-05-04 Thread Jonathan Gorman


The editors of THE Journal suggest we establish a 246 field for "Slash
Lib Slash Dev".


Did you know that there really is/was a journal named THE Journal?
THE was a pseudo-acronym for Technology in Higher Education.





Ummm, who do you think the editors were that made the suggestions?  I
assume they meant the editors of Technology in Higher Education gave
suggestions due to the similar problems they have had.  As for /lib/dev, I
don't know if we even have any volunteers for editors yet.


Jon Gorman

Re: [CODE4LIB] tagging

2006-03-08 Thread Jonathan Gorman


Although tagging is the hot new term for it, I remember reading about
similar ideas way, way back in my undergraduate days.  (Ok, that was only
around four years ago, but still).  I'd have to dig to get some of the
research, but my overall impression is that there are a couple of
qualities that could make tagging very useful:

1) Community tagging of all records

2) Experts applying a hierarchical controlled vocabulary on top of the
tagging, assisted by various statistical analyses.  This
tends to be done somewhat poorly by community tagging.

3) Multi-word tags (with normalization applied in both directions)

4) Thesauri to map between tagging, similar concepts, and the controlled
vocabulary.  This is really an extension of point 2.  I remember reading a
study that indicated people don't like to use thesauri when they have to
select it as an option but they like the results when it's done for them.
(Think Google's suggestion to search term X instead of Y).

5) Use some statistical crunching of user-submitted reviews and
information to suggest possible tag words.  This can be tricky.  In Spring
02, right after Google opened up their api, I worked on a project with some
other people that used somewhat circular logic: it tried to find a
combination of words from a webpage that would bring that page up in
Google in the rankings.  The idea was if the combination of words was
first, it would be a decent description to find similar pages.  The
problem is if you do it too perfectly, the result is likely to be
nonsensical.  (An old IR problem, forget the name.)

For example, for many large hobby sites it was fine since it offered
things like "model train", but for small sites with little content we got
things like "steel factory singleton".  In that case it was the home page
of a professor who offhandedly mentioned a trip to a steel factory.  The
professor rarely used words that would be the most useful, such as
computer science.  But it's useful to offer tips to human beings that can
quickly throw out garbage like "steel factory" as a suggestion for
the page.


Of course, if you're asking for the nitty-gritty implementation details of
what I would do, I'd have to think a little longer ;).  But not too
much.  At its heart it would just be an index.  The statistical analysis
could be harvested by playing around with the large body of IR research
already out there.  Ah, and lots of promoting to get the critical mass of
people.
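
To make "just an index" a bit more concrete, here is a minimal sketch in
Python of a tag index with multi-word tags normalized on the way in and on
the way out.  The normalization rule (lowercase; collapse spaces, hyphens
and underscores) and the record ids are my own assumptions, nothing more:

```python
import re
from collections import defaultdict


def normalize(tag):
    """Lowercase; collapse runs of spaces, hyphens, underscores to one space."""
    return re.sub(r"[\s_-]+", " ", tag.strip().lower())


class TagIndex:
    """Inverted index from normalized tag to the records carrying it."""

    def __init__(self):
        self._records = defaultdict(set)   # normalized tag -> record ids

    def add(self, record_id, tag):
        # Normalize when storing ...
        self._records[normalize(tag)].add(record_id)

    def lookup(self, tag):
        # ... and again when searching, so both directions agree.
        return sorted(self._records.get(normalize(tag), set()))


index = TagIndex()
index.add("rec1", "Model Train")
index.add("rec2", "model-train")
index.add("rec3", "steel factory")
```

Here `index.lookup("model  train")` finds both rec1 and rec2.  The thesauri
and the statistical suggestions would sit on top of something like this.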


I typed this response rather quickly, so sorry if it doesn't make a
whole lot of sense.  This stuff is part of the reason I got into library
science so I get carried away on occasion.



Jonathan T. Gorman
Visiting Research Information Specialist
University of Illinois at Champaign-Urbana
216 Main Library - MC522
1408 West Gregory Drive
Urbana, IL 61801
Phone: (217) 244-4688


On Wed, 8 Mar 2006, Eric Lease Morgan wrote:


Over the weekend I had the opportunity to chat with a friend about
"tagging" -- a sort of self-keyword cataloging as implemented by
del.icio.us and Flickr.

I'm wondering, to what degree does this group here think tagging
would be beneficial in Library Land? For example, we could allow
tagging to be done against items in a library catalog or against a
personalized collection of Internet resources. If it were beneficial,
then how would y'all implement it?

--
Eric Lease Morgan
University Libraries of Notre Dame

[CODE4LIB] Registration Materials?

2006-02-12 Thread Jonathan Gorman

Hi all,

Probably not the best forum for this, but I was preparing
some stuff for the conference this week and it dawned on me
I was assuming I'd pick up any registration materials at the
conference.  But looking at the scheduling info doesn't
really seem to indicate any time to pick the material up or
place to do so.

I'm hoping it's just before the conference starts.  I'm a
little nervous it was supposed to be sent or something.  (Or
that there was a mistake in the registration.)  If there was
any info at the end of the process I wasn't told about it.  (A fund
covered part of the conference costs including registration
so the person in charge of the fund had to put in the
financials and do the registration form).

Anyone have an idea of what's going on?   Hopefully I
haven't made myself look silly by overlooking something.
(Well, I'm really hoping that I actually am registered for
the conference.  I meant to double-check it last week but
forgot.  Guess I'll be calling Monday).

Don't know if this is the best forum for this but I figure
at least a few of you will be attending code4lib ;).

Jon Gorman

Re: [CODE4LIB] Greasemonkey Script for Tulsa City-County Library

2006-01-24 Thread Jonathan Gorman


On Tue, 24 Jan 2006, kevin smith wrote:


Hello,

I am trying to adapt the Amazon SPL Linky for a friend who is a patron
of the Tulsa City-County Library.


Huh, well, I just started playing with Greasemonkey myself last week to do
something pretty similar.  Some things I noticed right away:

1) the script doesn't have the .user.js suffix on the end.  I think
Greasemonkey and probably userscripts.org expect this.  I would think they
would have a more meaningful error message for this.


I am no coder, I just tried to
modify the variables at the top, but I get an error from
userscripts.org when I try to install the script:


So you uploaded it to userscripts.org?  That seems like a painful way to
do development.  Instead, just save the JavaScript file to your machine
and open it with Firefox.  There should be a little yellow line on top that
says something like "hello do you want to install this userscript?"  (The
exact words vary a bit).

I installed it in a similar manner after changing the script name to be
atcclinky.user.js.  It does install.  My guess is you'll need to do some
debugging though.  Perhaps a friendly soul might help you out there.

Also, do you have the latest version of Greasemonkey?

Jon Gorman

Re: [CODE4LIB] PHP and SSL

2006-01-20 Thread Jonathan Gorman


A quick look at that site and it looks like it was made with MediaWiki.
Looking for info about setting up SSL with PHP might be a bit of a red
herring.  I don't think there's anything special about SSL and PHP.  It
depends on the web server that's providing it.

I'd recommend looking for documentation on enabling SSL with MediaWiki.
Also check out the documentation for his web server.  SSL can be a bit
confusing to set up, but there's a decent amount of info out there.


Jonathan T. Gorman
Visiting Research Information Specialist
University of Illinois at Champaign-Urbana
216 Main Library - MC522
1408 West Gregory Drive
Urbana, IL 61801
Phone: (217) 244-4688


On Fri, 20 Jan 2006, Jeffrey Barnett wrote:


Can someone tell me how to enable https for a particular php script?  I
was just looking at the newly created Library Success Wiki
http://www.libsuccess.org/ and noticed that its login page is
unencrypted.  I mentioned this to the Wiki admin who then asked me how
to fix it.  Unfortunately I know zilch about php.  Is there a simple answer?

PS: The site is "powered by MediaWiki" http://www.mediawiki.org/ but a
search of their documentation for https returns no result.
