Re: [CODE4LIB] [Fwd: z39.50 holdings schema]
just here: http://blogs.talis.com/panlibus/archives/2007/12/the_library_20_1.php rob On 17 Dec 2007, at 16:55, Andrew Nagy wrote: Where is Roy and his manifesto when you need him!
Re: [CODE4LIB] pspell aspell: make your own word lists/dictionaries
I've had sight of this and generated a dictionary from the LoC bib data. It's very fast and the suggestions are excellent, including multi-word corrections. rob

Here's Martin's mail with the details - but I encourage you to join the group

But I'm sure you'd like to get your hands on it, so I packaged up a convenient way to quickly make a dictionary from your own data files, and run your own queries through it. Grab this jar: http://groups.google.com/group/spelt/web/spelt.jar Everything is included (Spelt and Lucene). The only requirement is that you have JDK 1.5 (5.0) -- it won't work with JDK 1.4. The utility includes a fast-but-dumb text ripper that does a deep directory scan for textual files, and pulls out all the words. It should be able to handle XML, HTML, and plain text files (provided they're in UTF-8 encoding). You can build a dictionary this way:

    java -jar spelt.jar -build your-src-dir speltDictDir

If you run it on a big data set, I'd suggest giving it more RAM, like this:

    java -Xmx750m -jar spelt.jar -build your-src-dir speltDictDir

You can run a set of test queries (e.g., http://groups.google.com/group/spelt/web/test.list) like this:

    java -jar spelt.jar -test speltDictDir test.list

Finally, if you are curious about how this compares with the existing code in Lucene, you can add the -old flag just before -build or -test. Warning: with -old the build process is about 35 times slower on my machine, so I'd suggest doing this on a small data set.

-Original Message- From: Code for Libraries on behalf of Jonathan Rochkind Sent: Tue 03/04/2007 7:01 PM To: CODE4LIB@listserv.nd.edu Subject: Re: [CODE4LIB] pspell aspell: make your own word lists/dictionaries

I haven't had time to look at it yet, but someone at Code4Lib conference proposed a more sophisticated approach to spell checking that sounded really interesting to me, and said he was going to share the code. I hope to have time to investigate at some point.
Let's see if I can find it on the conference page. Yeah, it was Martin Haye. You can watch his presentation here: http://video.google.com/videoplay?docid=4028600349627496246&hl=en Looks like he's *martin*.*haye*[at]gmail.com. During the lightning talk, he said he didn't want to distribute the code separately but wanted to include it in Lucene if possible---but later in the conference, he said he had been convinced by the interest in it to distribute the code as its own standalone thing, and planned to do that presently. If anyone is exploring or has explored using Martin's code, please let us know about your experience. Jonathan

Kevin Kierans wrote: Has anyone created their own dictionaries for aspell? We've created blank delimited lists of words from our opac. One for title, one for subjects, and one for authors. (We're thinking of a series one as well) We would like to use one of these word lists to offer suggestions depending on which search the patron is making. We're assuming we can make better suggestions if the words come from our actual opac. We've got it working with the dictionary that comes with aspell, but are having problems (we can't do it!) substituting our own dictionaries. Does anyone have any experience/knowledge/hints/pointers they can share with us? We are using linux, php 5, aspell 0.50.5, and php - pspell functions. Thanks, Kevin TNRD Library System, Kamloops, British Columbia, Canada

-- Jonathan Rochkind Sr. Programmer/Analyst The Sheridan Libraries Johns Hopkins University 410.516.8886 rochkind (at) jhu.edu

The very latest from Talis read the latest news at www.talis.com/news listen to our podcasts www.talis.com/podcasts see us at these events www.talis.com/events join the discussion here www.talis.com/forums join our developer community www.talis.com/tdn and read our blogs www.talis.com/blogs Any views or personal opinions expressed within this email may not be those of Talis Information Ltd.
The content of this email message and any files that may be attached are confidential, and for the usage of the intended recipient only. If you are not the intended recipient, then please return this message to the sender and delete it. Any use of this e-mail by an unauthorised recipient is prohibited. Talis Information Ltd is a member of the Talis Group of companies and is registered in England No 3638278 with its registered office at Knights Court, Solihull Parkway, Birmingham Business Park, B37 7YB.
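As an illustration of the "fast-but-dumb text ripper" idea in Martin's mail, and of the blank-delimited word lists Kevin describes, here is a minimal sketch in Java. It is not Spelt's code: the class name `WordRipper`, the file-extension filter, and the definition of a word (a run of Unicode letters, lowercased) are all assumptions made for the example. It needs Java 8 or later.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

public class WordRipper {
    // Crude markup stripper: anything between '<' and '>' is treated as a tag.
    private static final Pattern TAG = Pattern.compile("<[^>]*>");
    // Assumption: a "word" is a run of Unicode letters.
    private static final Pattern WORD = Pattern.compile("\\p{L}+");

    /** Adds the lowercased words found in text to counts and returns counts. */
    public static Map<String, Integer> countWords(String text, Map<String, Integer> counts) {
        Matcher m = WORD.matcher(TAG.matcher(text).replaceAll(" "));
        while (m.find()) {
            counts.merge(m.group().toLowerCase(Locale.ROOT), 1, Integer::sum);
        }
        return counts;
    }

    /** Deep directory scan over .xml, .html/.htm and .txt files, read as UTF-8. */
    public static Map<String, Integer> scan(Path root) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        try (Stream<Path> paths = Files.walk(root)) {
            paths.filter(Files::isRegularFile)
                 .filter(p -> p.toString().toLowerCase(Locale.ROOT).matches(".*\\.(xml|html?|txt)"))
                 .forEach(p -> {
                     try {
                         countWords(new String(Files.readAllBytes(p), StandardCharsets.UTF_8), counts);
                     } catch (IOException e) {
                         throw new UncheckedIOException(e);
                     }
                 });
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        Path root = Paths.get(args.length > 0 ? args[0] : ".");
        // One "word count" pair per line, sorted alphabetically.
        scan(root).forEach((word, n) -> System.out.println(word + " " + n));
    }
}
```

Running `java WordRipper your-src-dir > words.txt` would give one word and its frequency per line; dropping the count column leaves a plain word list of the kind that could be fed to a dictionary builder.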
Re: [CODE4LIB] GNU Metadata Exchange Utilities
Laurence, Does your work draw on any of the work Devon has been doing over at OCLC? http://www.code4lib.org/2006/smith rob Rob Styles Programme Manager, Data Services, Talis tel: +44 (0)870 400 5000 fax: +44 (0)870 400 5001 direct: +44 (0)870 400 5004 mobile: +44 (0)7971 475 257 msn: [EMAIL PROTECTED] irc: irc.freenode.net/mrob,isnick -Original Message- From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of Laurence Finston Sent: 23 March 2007 20:38 To: CODE4LIB@listserv.nd.edu Subject: Re: [CODE4LIB] GNU Metadata Exchange Utilities Eric Lease Morgan wrote: On Mar 21, 2007, at 5:07 AM, Laurence Finston wrote: Cool, and interesting. I think I speak for the community when I sincerely say, Good luck. Thank you. The goals you desire to achieve with the software are the same sorts of goals many of us have. I'm sure some of us will install and experiment with the Exchange Utilities when they are easily installable on the platforms we support. Alas, many of us simply do not have access to Microsoft products. This list is clearly not the place to discuss my personal situation, but I will say that I need to find work in order to continue working on this package. If I'm not employed _to_ work on it, I would work on it in my free time. I think this kind of project could receive funding from some institution, but I'm not in a position to apply for it. One problem with Free Software is finding someone to finance it. Using Microsoft products wasn't my choice. Visual Studio promotes a style of programming that starts with the GUI and then adds functionality to the buttons, edit boxes, etc. This is the opposite of what I think is the right way to go about it. That's why I'm building the new package around an interpreter that I've written using GNU Bison. (Just in case anyone isn't familiar with this topic, Bison is the GNU version of the UNIX utility `yacc'. Bison and yacc are compiler generators.) The sub-package `scantest' can be installed on GNU/Linux systems. 
It should work on other UNIX-like systems, but I haven't tested this. At present, it's a toy program, since it doesn't perform a useful function, but it might be fun to try. I find it quite enjoyable watching the output from Bison parsers, but perhaps I'm easily amused. I don't think a GUI is necessary for this package, but one could be written. However, I would use a free library and certainly not Visual Studio. I'm not personally a big fan of GUIs, although they can be useful. An interpreter would also be useful in combination with a GUI. However, for this purpose, I think an interpreter for a machine-like language, and a scanner that reads binary files, would be more useful. I've planned to write an interpreter like this for my other package, GNU 3DLDF, but have never had the time. There are a lot of free tools, libraries, etc., for some of the tasks involved, notably `libxml' for handling XML data and YAZ for accessing Z39.50 servers. Much of the work will just be a matter of combining them. I believe that a good approach would be to program filters for the individual tasks I want to solve, i.e., programs that read from their standard input and write to their standard output. Such filters can be chained using pipes. As I'm sure many of you know, this is a typical style of programming in UNIX-like programming environments. Of course, the filter programs could also have side effects, such as writing files. A great deal of my previous work has involved Donald Knuth's TeX and related packages. It's very easy to write programs that output TeX input files, and it's possible to produce very high quality printable output using TeX, usually in the form of PostScript or PDF files. I will probably use TeX to represent the contents of the databases, along with HTML. At present, I'm very occupied with job applications. I also have to perform some tasks resulting from the package having been accepted by the GNU Project. 
For example, I must add the required options, change copyright notices, work on preparing a release, etc. When I've done something that might be of interest to readers of this list, I will post an announcement. Under the circumstances, it may be awhile before I'm able to devote the necessary time to programming. Laurence Finston
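The filter style Laurence describes (read standard input, write standard output, chain with pipes) can be sketched as follows. This is not code from the GNU Metadata Exchange Utilities; the class name `LineFilter` and the toy transformation (wrapping each non-empty line in a `<record>` element) are hypothetical and only show the shape of such a program.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintStream;
import java.nio.charset.StandardCharsets;

public class LineFilter {
    /** Escapes the three characters that are unsafe in XML text content. */
    public static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Returns one <record> element for a line, or null if the line is blank. */
    public static String toRecord(String line) {
        String trimmed = line.trim();
        return trimmed.isEmpty() ? null : "<record>" + escape(trimmed) + "</record>";
    }

    public static void main(String[] args) throws IOException {
        BufferedReader in = new BufferedReader(
                new InputStreamReader(System.in, StandardCharsets.UTF_8));
        PrintStream out = new PrintStream(System.out, true, "UTF-8");
        String line;
        // Read until end of input so the program can sit anywhere in a pipeline.
        while ((line = in.readLine()) != null) {
            String record = toRecord(line);
            if (record != null) {
                out.println(record);
            }
        }
    }
}
```

In a pipeline this might be used as `cat titles.txt | java LineFilter | sort`, where `titles.txt` is a made-up input file; each stage only needs to agree on the line-oriented format passed between them.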
Re: [CODE4LIB] Screencast editing advice?
I've used Camtasia; it has a few quirks, but is OK. Although not available for Mac :-( Rob Styles Programme Manager, Data Services, Talis tel: +44 (0)870 400 5000 fax: +44 (0)870 400 5001 direct: +44 (0)870 400 5004 mobile: +44 (0)7971 475 257 msn: [EMAIL PROTECTED] irc: irc.freenode.net/mrob,isnick -Original Message- From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of Eric Lease Morgan Sent: 06 March 2007 23:29 To: CODE4LIB@listserv.nd.edu Subject: Re: [CODE4LIB] Screencast editing advice? On Mar 6, 2007, at 6:04 PM, Nathan Vack wrote: Can anyone recommend something to this end? I'm willing to spend a little, but, say, Final Cut Express is probably out of my budget (plus major feature overkill). Quicktime Pro, perhaps? I have had a lot of success with QuickTime Pro. It doesn't do transitions (fade-in, fade-out, etc.), but it does a great job of letting you cut out the stuff before, after, and in between your video. Once you create your movies you can add sound. When you are done you can save/export your movie file to a whole bunch o' formats. QuickTime++ -- Eric Lease Morgan
Re: [CODE4LIB] Videos?
http://www.jonathancoulton.com/ Jonathan Coulton publishes all his work under a liberal CC license, which feels appropriate. One of his biggest hits, which did the rounds a few months back, was Code Monkey, which also felt appropriate. Perhaps a mashup of code4lib videos, photos and code monkey would be in order? rob Rob Styles Programme Manager, Data Services, Talis tel: +44 (0)870 400 5000 fax: +44 (0)870 400 5001 direct: +44 (0)870 400 5004 mobile: +44 (0)7971 475 257 msn: [EMAIL PROTECTED] irc: irc.freenode.net/mrob,isnick -Original Message- From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of Noel Peden Sent: 05 March 2007 19:40 To: CODE4LIB@listserv.nd.edu Subject: Re: [CODE4LIB] Videos? Hi all, I'm finally back in the office today and the videos are in process... I'm not sure where they'll go, but they'll be up somewhere. BTW, if anybody has any ideas for royalty free title music (a short 3+ second thing), I'm open. I'll whip up something if needed. Noel K.G. Schneider wrote: I just wondered, are the videos going online anywhere? Karen G. Schneider [EMAIL PROTECTED]
Re: [CODE4LIB] auto-anthologizing
(Does your feedreader lose its flavor on your next post overnight?) If your readers say don't chew on it, but you edit it in spite? Or if the comments say you're wrong, but you edit so you're right? Wikipoem. Rob Styles Programme Manager, Data Services, Talis tel: +44 (0)870 400 5000 fax: +44 (0)870 400 5001 direct: +44 (0)870 400 5004 mobile: +44 (0)7971 475 257 msn: [EMAIL PROTECTED] irc: irc.freenode.net/mrob,isnick -Original Message- From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of Walter Lewis Sent: 15 February 2007 12:55 To: CODE4LIB@listserv.nd.edu Subject: Re: [CODE4LIB] auto-anthologizing Daniel Chudnov wrote: (Does your feedreader lose its flavor on your next post overnight?) If your readers say don't chew on it, but you edit it in spite? Walter
Re: [CODE4LIB] RE: [CODE4LIB] Polls open for Code4Lib 2007 T-Shirt design
ROTF, LMAO Daniel Chudnov++ ...in the face of MARC? By which, of course, we mean MARC with ISBD punctuation and AACR2 rules, the combination of which might make sense still to some members of our community but to us snarky geek types is really quite difficult to work with, so much so that we love dropping metaphorical bombs on it even while we don't really have a decent solution for replacing it quite yet hence the insecurity we share in great evidence by mocking it on a conference t-shirt. I vote for this whole paragraph on the back :- Rob Styles Programme Manager, Data Services, Talis tel: +44 (0)870 400 5000 fax: +44 (0)870 400 5001 direct: +44 (0)870 400 5004 mobile: +44 (0)7971 475 257 msn: [EMAIL PROTECTED] irc: irc.freenode.net/mrob,isnick -Original Message- From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of Daniel Chudnov Sent: 26 January 2007 15:59 To: CODE4LIB@listserv.nd.edu Subject: Re: [CODE4LIB] RE: [CODE4LIB] Polls open for Code4Lib 2007 T- Shirt design On Jan 26, 2007, at 10:19 AM, Roy Tennant wrote: Those of you (are there any?) who can't or don't want to login to see berick's response should know that it is: aren't we all morons in the face of MARC? Why so negative toward a spec that's served our community for nigh unto forty years? I'd much rather see something fun or silly and not disparaging of anything and would vote against this if we were set up to vote against something. I mean, wouldn't you really then mean for it to say: ...in the face of MARC? By which, of course, we mean MARC with ISBD punctuation and AACR2 rules, the combination of which might make sense still to some members of our community but to us snarky geek types is really quite difficult to work with, so much so that we love dropping metaphorical bombs on it even while we don't really have a decent solution for replacing it quite yet hence the insecurity we share in great evidence by mocking it on a conference t-shirt. 
For good or ill, our profession is invested in MARC, and to parade around in clothes disparaging it (or in a shirt mocking any particular individual, which was, apparently, my first annual t-shirt suggestion objection, last year) seems like the wrong impression to give. Just because we blow off steam sometimes by acting like junior high schoolers on-channel doesn't mean we have to document similar behavior on clothing we'll all wear to other conferences.
Re: [CODE4LIB] Getting data from Voyager into XML?
I can't speak for other vendors, but historically for Talis it's been a limitation of the contract with the RDBMS vendor. We ship Sybase as the RDBMS for our ILS and until recently that license was restricted to use of the RDBMS by our own product. We've recently re-negotiated that to allow much more freedom for our customers and to cover what was common practice anyway. Obviously the RDBMS landscape has changed somewhat and database independence is commonplace now, but how many ILSs are able to work with MySQL? Ours can't (yet). rob

-Original Message- From: Code for Libraries on behalf of Peter Schlumpf Sent: Fri 19/01/2007 2:26 PM To: CODE4LIB@listserv.nd.edu Subject: Re: [CODE4LIB] Getting data from Voyager into XML?

Are there such limitations in contractual agreements with ILS vendors? That is weird. I agree generally that such a limitation should be intolerable. But I can understand their point of view. The vendor is probably trying to avoid situations where users muck with their systems and call for support when they break things. This reminds me of the first Macintosh computers. Those suckers were pretty much welded shut and one could only open the computer with a special tool. Two different motivations at work though. I think in the former the situation is likely a vendor trying to protect users from mucking around with an inherently fragile system. In the latter it's trying to provide a consistent user experience with something well designed. There is something to be said for presenting solid and safe interfaces to a well designed system that users shouldn't feel the need to drill through. Peter

-Original Message- From: Eric Lease Morgan [EMAIL PROTECTED] Sent: Jan 19, 2007 7:01 AM To: CODE4LIB@listserv.nd.edu Subject: Re: [CODE4LIB] Getting data from Voyager into XML?

On Jan 19, 2007, at 6:37 AM, Birkin James Diana wrote: Since we can't SQL-query our own ILS data directly... (ok, blood pressure is fine again) this solved a lot of issues.
I don't know why we tolerate such limitations in our contractual agreements. Maybe we should charge a fee or demand a reduction in fees for living with this. It's like this, No, you are not allowed to look under the hood of your car or take apart your radio. Weird. -- Earache
Re: [CODE4LIB] Anyone from UK/Europe going to the Code4Lib Conference?
David, A couple of us from Talis are going over (Richard Wallis is sorting our registration as I type). Rob Styles Technical Lead, Talis -Original Message- From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of David Kane Sent: 04 January 2007 10:17 To: CODE4LIB@listserv.nd.edu Subject: [CODE4LIB] Anyone from UK/Europe going to the Code4Lib Conference? Just curious. Thanks, David Kane WIT Libraries http://library.wit.ie/ ++353.51302838
Re: [CODE4LIB] java application on a cd
When I've seen this done before (MS specific with an access db running from CD) it was done by having a lightweight web server running from the CD. This can be started automatically under Windows using an autorun.inf file, not sure how you'd auto-start it under Linux. So, given Eric's steps we can replace 7. with:

7. Write a Java program to act as a web server and search the index and return a search results page.

Then the form looks like:

    <form action='http://localhost:_some_port_/search/' method='get'>
      <input type='text' name='query' />
      <input type='submit' />
    </form>

Rob Styles

-Original Message- From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of Binkley, Peter Sent: 17 October 2006 16:22 To: CODE4LIB@listserv.nd.edu Subject: Re: [CODE4LIB] java application on a cd

This was more or less what I was thinking of in my hackfest suggestion to embed Lucene in a Firefox extension; but I hadn't thought of using it to access pre-distributed Lucene indexes. That might be very handy. (Though a Firefox-only approach probably isn't what Eric has in mind). Would it be stretching METS too far to encode the digital objects, the Lucene index, and Firefox and the extension as the software needed to access the stuff? (XULRunner would provide a non-browser-based way to deploy the same functionality). Peter

-Original Message- From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of Hickey,Thom Sent: Monday, October 16, 2006 7:31 AM To: CODE4LIB@listserv.nd.edu Subject: Re: [CODE4LIB] java application on a cd

Seems to me you need a JavaScript version of the Lucene search engine. I've done search-only subsets of search engines, and they are a lot less complex than the whole thing. People have done similar things (like Google's JavaScript version of XSLT). It takes some work, but then all you need to run is a JavaScript browser.
--Th

-Original Message- From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of Eric Lease Morgan Sent: Friday, October 13, 2006 1:52 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] java application on a cd

Can someone here tell me about the feasibility of implementing a particular Java application on a CD, described below. For a good time I would like to distribute my Alex Catalogue of Electronic Texts on an operating system independent CD. Here is how I see it being implemented:

1. Collect electronic texts
2. Mark them up in TEI
3. Transform them into HTML and/or PDF
4. Create an author index in HTML
5. Create a title index in HTML
6. Use Lucene to index the texts
7. Write a Java program to search the index and return hyperlinks to the texts
8. Put the whole lot on a CD
9. Give it away

With the exception of Step #7, I know the plan is implementable, but how can I do Step #7? This is what I want to do with Step #7. First I create an HTML form looking something like this:

    <form action='search.java' method='get'>
      <input type='text' name='query' />
      <input type='submit' />
    </form>

When people click the submit button the contents of query get passed to search.java and executed. The search results are formatted into HTML and returned to the browser for display. Is such a program implementable? Can a program like search.java get input from a form like this without the need of an intermediate HTTP server? Apparently Java applet technology will not work in this environment because applets are not allowed to read from the local file system.
-- Eric Wishing I Was @ Access2006 Morgan University Libraries of Notre Dame
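Rob's suggestion in this thread (replace step 7 with a small Java program that acts as a local web server) can be sketched roughly as below. This is only an illustration under several assumptions: it uses the JDK's built-in `com.sun.net.httpserver` package (available since Java 6; the code as written needs Java 8), the class name `CdSearchServer`, the port 8089, and the two sample titles are invented, and a naive title match stands in for the Lucene query. Serving the text files themselves is left out.

```java
import com.sun.net.httpserver.HttpExchange;
import com.sun.net.httpserver.HttpServer;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

public class CdSearchServer {
    // Stand-in for the Lucene index: path on the CD -> title (hypothetical data).
    static final Map<String, String> TEXTS = new LinkedHashMap<>();
    static {
        TEXTS.put("texts/walden.html", "Walden");
        TEXTS.put("texts/moby-dick.html", "Moby Dick");
    }

    /** Extracts and URL-decodes the "query" parameter from a raw query string. */
    public static String queryParam(String rawQuery) {
        if (rawQuery == null) return "";
        for (String pair : rawQuery.split("&")) {
            int eq = pair.indexOf('=');
            if (eq > 0 && pair.substring(0, eq).equals("query")) {
                try {
                    return URLDecoder.decode(pair.substring(eq + 1), "UTF-8");
                } catch (UnsupportedEncodingException e) {
                    throw new IllegalStateException(e); // UTF-8 is always supported
                }
            }
        }
        return "";
    }

    /** Naive case-insensitive title match; a real version would query Lucene here. */
    public static List<String> search(String query) {
        List<String> paths = new ArrayList<>();
        String q = query.trim().toLowerCase(Locale.ROOT);
        if (q.isEmpty()) return paths;
        for (Map.Entry<String, String> e : TEXTS.entrySet()) {
            if (e.getValue().toLowerCase(Locale.ROOT).contains(q)) paths.add(e.getKey());
        }
        return paths;
    }

    /** Builds the results page for one GET request to /search/?query=... */
    static void handle(HttpExchange ex) throws IOException {
        StringBuilder html = new StringBuilder("<html><body><ul>");
        for (String path : search(queryParam(ex.getRequestURI().getRawQuery()))) {
            html.append("<li>").append(TEXTS.get(path))
                .append(" (").append(path).append(")</li>");
        }
        html.append("</ul></body></html>");
        byte[] body = html.toString().getBytes(StandardCharsets.UTF_8);
        ex.getResponseHeaders().set("Content-Type", "text/html; charset=utf-8");
        ex.sendResponseHeaders(200, body.length);
        try (OutputStream os = ex.getResponseBody()) {
            os.write(body);
        }
    }

    public static void main(String[] args) throws IOException {
        int port = 8089; // arbitrary choice standing in for _some_port_
        // Bind to the loopback address only, so the server is not visible on the network.
        HttpServer server = HttpServer.create(
                new InetSocketAddress(InetAddress.getLoopbackAddress(), port), 0);
        server.createContext("/search/", CdSearchServer::handle);
        server.start();
        System.out.println("Listening on http://localhost:" + port + "/search/");
    }
}
```

With this running, a form whose action is `http://localhost:8089/search/` would receive a results page for its `query` field. Binding to the loopback address is a deliberate choice here, since the server exists only to serve the browser on the same machine.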