Re: [CODE4LIB] Good advanced search screens
Although I don't always agree with him, Jakob Nielsen has advice on the provision of 'Advanced Search' - essentially, most users cannot use it effectively - http://www.useit.com/alertbox/20010513.html The only problem with this is that the short report doesn't make it very clear what 'advanced search' might consist of, and where users have a problem (it mentions that most users don't do Boolean, but I'm not sure this is what I'd regard as 'advanced search'). The longer (charged for) report might have more detailed advice - has anyone tried it?

Owen

Owen Stephens Assistant Director: eStrategy and Information Resources Central Library Imperial College London South Kensington Campus London SW7 2AZ t: +44 (0)20 7594 8829 e: [EMAIL PROTECTED]

-Original Message- From: Code for Libraries [mailto:[EMAIL PROTECTED] On Behalf Of Sean Hannan Sent: 15 November 2008 16:19 To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Good advanced search screens

If you haven't already, I'd suggest that you poke around in the IxDA mailing list archives (http://www.ixda.org/). I find that list (and its members) invaluable for design/usability best practices (often backed up with published research). Luke Wroblewski's blog (http://www.lukew.com/ff/index.asp) and book (http://rosenfeldmedia.com/books/webforms/info/description/) might be other good places to look for inspiration.

-Sean

Sean Hannan Web Developer Sheridan Libraries Johns Hopkins University [EMAIL PROTECTED]

Walker, David [EMAIL PROTECTED] 11/14/2008 4:48 PM

I'm working on an advanced search screen as part of our WorldCat API project. WorldCat has dozens of indexes and a ton of limiters. So many, in fact, that it's rather daunting trying to design it all in a way that isn't just a big dump of fields and check boxes that only a cataloger could decipher. So I'm looking for examples of good advanced search screens (for bibliographic databases or otherwise) to gain some inspiration. Thanks!

--Dave == David Walker Library Web Services Manager California State University http://xerxes.calstate.edu
Re: [CODE4LIB] Reference string parsing software available: ParsCit v080402
Hi Jonathan, PS: And indeed, mapping to OpenURL 1.0 is _exactly_ what I need to do. Sounds like I should look into L8X? There is a demo/testing site at http://www.lemon8.org ; you might want to try playing around there with some citations to get a feel for how it works without having to download or install anything. It would be convenient if there were a way to choose which parsers to use with L8X, via an API or configuration if I install the software locally. I'm not sure I'll need to pass the citation to _all_ of them. I am going to be doing this in realtime while the user is waiting, so speed matters. But just ParsCit alone isn't doing the job, perhaps ParsCit+regex plus maybe one more would be good enough. Absolutely -- setting a list of default parsers to use, and the ability to turn them on/off on-the-fly (ie. while editing any particular citation) is something that's been on the to-do list for a while. I'm hoping to have it done in the next week or two. I should add that having just added ParsCit, I've actually found that it doesn't do nearly as good a job as some of the other parsers, but that may just be on the citation formats that I happen to work with. Part of the way L8X is designed is to assign a simple statistical score to estimate how accurately each parser performs; one feature I've been planning is to simply allow a threshold to ignore results from parsers which have done a poor job on that particular citation. There is some additional functionality to take a parsed citation and look it up in a number of online indexes, and attempt to fetch correct information, both to supplement, say, an incomplete citation, and provide an additional level of quality improvement, but that's a somewhat more complex topic that I'm hoping to make the subject of a submission to the Code4Lib journal. 
:-) MJ MJ Suhonos [EMAIL PROTECTED] 11/14/08 3:18 PM Hi all, John, the supplemented approach you describe is how we go about it in our Lemon8-XML (L8X) software (http://pkp.sfu.ca/lemon8); The way L8X handles parsing is it passes the original unparsed string to a number of different parsers in turn (Freecite, each of the 3 Paracite parsers, and a home-grown regex parser), does a little cleaning and normalization, and then hands the results to the user to select the correct values for each element. Most of the time, it actually does a pretty good job of detecting the right elements -- in fact, numeric stuff like volume, issue, pages, etc. tend to be more accurate than names and titles, mostly because of the larger variance in the latter. Our experience has been that relying on a single approach (machine-learning vs. format-rule-based vs. regular-expression) is less reliable than getting partial matches from various approaches, and then assembling them. In this case, the whole is in fact greater than the sum of the parts. I haven't added the ParsCit web service explicitly since a SOAP-based interface is a bit more cumbersome in PHP than FreeCite's POST-type interface, but I'll make a point of doing so now. Incrementally adding services that all map to the same citation elements (we use the OpenURL 1.0 fields, with a few aberrations) means it's very easy to increase the accuracy by simply adding another parsing plugin/service. You'd have to pull out the relevant classes from L8X to get a standalone parser, but since this is one of the more appealing aspects of the software for many people, we're looking at making a simple API in L8X to just do the citation parsing, possibly without the UI to take it from semi-automated to completely automatic. MJ On 14-Nov-08, at 12:07 AM, Jonathan Rochkind wrote: Thanks Min, this is a great project, that I keep trying to find time to investigate more. Don't apologize for keeping us updated, please continue to! 
Do you know if any of the improvements have improved detection of volume/issue/page# information? For what I want to use it for, reasonably accurate parsing of volume/issue/page# is needed, and so far whenever I've looked at demos, this seems to be something that all of these machine-learning-type approaches do pretty awfully at. (I wonder if you are not including this in your training much, because it isn't necessary for your purposes to have volume/issue/page#?) I also have wondered if it would make sense to take a machine-learning-type approach to begin with, but then supplement it with formal-rule-based parsing to attempt to get vol/issue/page# according to common patterns? I don't have too much time to try to work on this myself, but if anyone who is working on these various citation parsing efforts could improve volume/issue/page# to a reasonable level, it would make the libraries useful for a much greater range of applications.

Jonathan

Min-Yen Kan [EMAIL PROTECTED] 11/13/08 8:30 PM

Dear all: (Sorry to resurrect an old thread...) We've seen the release of several new freely
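For anyone who wants to experiment with the supplemented approach discussed in this thread (several parsers, a per-parser score, a threshold, and a rule-based pass for volume/issue/pages), here is a minimal sketch in Python. It is not L8X or ParsCit code. The regular expression covers only one common layout, the scores are made-up numbers, and the field names (`volume`, `issue`, `spage`, `epage`, `atitle`) are borrowed from the OpenURL 1.0 journal fields purely for illustration.

```python
import re
from collections import Counter

# Hypothetical pattern for one common layout, e.g. "12(3): 45-67" or "12(3), 45-67".
VOL_ISSUE_PAGES = re.compile(
    r"(?P<volume>\d+)\s*\((?P<issue>\d+)\)\s*[:,]\s*"
    r"(?P<spage>\d+)\s*[-\u2013]\s*(?P<epage>\d+)"
)


def regex_parser(citation):
    """Rule-based pass: return volume/issue/page fields if the pattern matches."""
    match = VOL_ISSUE_PAGES.search(citation)
    return match.groupdict() if match else {}


def merge_results(results, threshold=0.5):
    """Combine (score, fields) pairs from several parsers.

    Parsers scoring below the threshold are ignored. For each field, the
    value with the highest total score among the remaining parsers wins.
    """
    votes = {}
    for score, fields in results:
        if score < threshold:
            continue
        for name, value in fields.items():
            if value:
                votes.setdefault(name, Counter())[value.strip()] += score
    return {name: counter.most_common(1)[0][0] for name, counter in votes.items()}


citation = "Smith, J. (2007). Parsing references. Journal of Examples, 12(3): 45-67."
results = [
    (0.9, regex_parser(citation)),  # assumed score for the rule-based pass
    (0.7, {"volume": "12", "issue": "3", "atitle": "Parsing references"}),
    (0.2, {"volume": "2007"}),  # low-scoring parser, dropped by the threshold
]
merged = merge_results(results)
```

In a real pipeline the scores would come from however each parser's accuracy is estimated, and the final values would still be shown to a person for confirmation, as MJ describes.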
[CODE4LIB] Announcement: LuSql: Database to Lucene indexing
I am proud to announce LuSql: LuSql is a simple but powerful tool for building Lucene indexes from relational databases. It is a command-line Java application for the construction of a Lucene index from an arbitrary SQL query of a JDBC-accessible SQL database. It allows a user to control a number of parameters, including the SQL query to use, individual indexing/storage/term-vector nature of fields, analyzer, stop word list, and other tuning parameters. In its default mode it uses threading to take advantage of multiple cores. LuSql can handle complex queries, allows for additional per record sub-queries, and has a plug-in architecture for arbitrary Lucene document manipulation. Its only dependencies are three Apache Commons libraries, the Lucene core itself, and a JDBC driver. LuSql has been extensively tested, including a large 6+ million full-text article metadata document collection, producing an 86GB Lucene index. http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql If you have any questions, please contact me. Thanks, Glen Newton :-) -- Glen Newton | [EMAIL PROTECTED] Researcher, Information Science, CISTI Research NRC W3C Advisory Committee Representative http://tinyurl.com/yvchmu tel/tél: 613-990-9163 | facsimile/télécopieur 613-952-8246 Canada Institute for Scientific and Technical Information (CISTI) National Research Council Canada (NRC)| M-55, 1200 Montreal Road http://www.nrc-cnrc.gc.ca/ Institut canadien de l'information scientifique et technique (ICIST) Conseil national de recherches Canada | M-55, 1200 chemin Montréal Ottawa, Ontario K1A 0R6 Government of Canada | Gouvernement du Canada --
Re: [CODE4LIB] Good advanced search screens
Peter Morville has been putting search examples into a flickr collection for an upcoming book he's writing: http://www.flickr.com/photos/morville/collections/72157603785835882/ and http://www.findability.org/archives/000194.php There are some great examples of both simple and complex search Best, -- Susan Teague Rector Web Applications Manager Library Information Systems, VCU Libraries 804.827.3554 | [EMAIL PROTECTED] Walker, David [EMAIL PROTECTED] 11/14/2008 4:48 PM I'm working on an advanced search screen as part of our WorldCat API project. WorldCat has dozens of indexes and a ton of limiters. So many, in fact, that it's rather daunting trying to design it all in a way that isn't just a big dump of fields and check boxes that only a cataloger could decipher. So I'm looking for examples of good advanced search screens (for bibliographic databases or otherwise) to gain some inspiration. Thanks! --Dave == David Walker Library Web Services Manager California State University http://xerxes.calstate.edu
Re: [CODE4LIB] Good advanced search screens
I think there are ways to dispense with it without actually dispensing with it. David, you've of course seen what I did with Xerxes, where instead of calling it 'advanced search' I call it 'more options'. That was of course a much simpler case--the difference between Metalib's advanced search and simple search is pretty small, and there aren't that many features available even using all the features in Metalib search. WorldCat is different: the gap between a 'start' screen and a 'full' list of options is greater.

I think it makes sense that you might need an initial search screen with fewer options, and a way to then get more search options (more options typically including more fields to search/limit, as well as more complicated ways to combine those searches with Boolean operators). But I think we should find ways to provide that additional functionality other than the typical 'click here to see advanced search' pattern. I think it should never be called 'advanced search'. Something like 'more options' is better. But maybe not even just one 'more options' link. What exactly is it that this additional functionality is providing, and what are the use cases for it? Maybe provide links to add different components of this advanced functionality to the search page, identified by explaining what they are/are for, instead of just calling them advanced. They don't necessarily need to be added all at once. For instance, if you want to add another search field to the screen for Boolean combination, a button that says 'add another search field' seems appropriate. Click it to add a second one, click it again to add a third one, etc. No 'advanced search', just offering functionality.

I also like the idea of allowing a syntax for expressing fielded search and Boolean combination even in the initial 'simple' search box. Even Google does this. Most users might not use it, but for power users it's awfully convenient.
CQL would be one potential choice for a textual query syntax that can be entered in the initial search field. Jonathan Walker, David wrote: How about dispensing altogether with the basic/advanced dichotomy in a search interface? I'm not sure I can dispense with it completely, Peter. As Peter Morville said on the site Susan posted: [I]t may be worth offering advanced features that are useful to a small yet important subset of users. I'll give you three guesses as to who my small yet important subset of users are, and the first two don't count. ;-) --Dave == David Walker Library Web Services Manager California State University http://xerxes.calstate.edu From: Code for Libraries [EMAIL PROTECTED] On Behalf Of Peter Schlumpf [EMAIL PROTECTED] Sent: Saturday, November 15, 2008 5:45 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: Re: [CODE4LIB] Good advanced search screens How about dispensing altogether with the basic/advanced dichotomy in a search interface? Just create a well designed interface that's consistent and works well for all users. The basic/advanced dichotomy is really quite arbitrary, and exists in the mind of the designer. One thing that seems to be underappreciated these days is a straightforward and flexible search syntax. A command line in the search field may be a much more elegant and consistent solution than trying to make all options available and visible in a GUI. Make the basic features of the search interface clear and easy to use, but design the interface in such a way that more advanced users can easily discover the features they need as they use it. With this approach Basic and Advanced exist on a continuum. There's a little learning curve but all users will have the motivation to learn to use the interface to the level that satisfies their needs, and in the long run probably find it much easier to use. 
Peter Peter Schlumpf [EMAIL PROTECTED] http://www.avantilibrarysystems.com -Original Message- From: Walker, David [EMAIL PROTECTED] Sent: Nov 14, 2008 4:48 PM To: CODE4LIB@LISTSERV.ND.EDU Subject: [CODE4LIB] Good advanced search screens I'm working on an advanced search screen as part of our WorldCat API project. WorldCat has dozens of indexes and a ton of limiters. So many, in fact, that it's rather daunting trying to design it all in a way that isn't just a big dump of fields and check boxes that only a cataloger could decipher. So I'm looking for examples of good advanced search screens (for bibliographic databases or otherwise) to gain some inspiration. Thanks! --Dave == David Walker Library Web Services Manager California State University http://xerxes.calstate.edu -- Jonathan Rochkind Digital Services Software Engineer The Sheridan Libraries Johns Hopkins University 410.516.8886 rochkind (at) jhu.edu
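As a rough illustration of the 'fielded syntax in the simple search box' idea raised in this thread, here is a minimal Python sketch that splits a one-box query into fielded and keyword clauses. It is not CQL and not anything from Xerxes or WorldCat. The `field:term` syntax and the field names in `KNOWN_FIELDS` are assumptions made up for the example.

```python
import re

# Hypothetical field names; a real interface would map these to its own indexes.
KNOWN_FIELDS = {"title", "author", "subject", "isbn"}

# Matches field:"quoted phrase", field:word, "quoted phrase", or a bare word.
TOKEN = re.compile(r'(?:(?P<field>\w+):)?(?:"(?P<phrase>[^"]*)"|(?P<word>\S+))')


def parse_query(query):
    """Split a one-box query into (field, term) pairs; field is None for keywords."""
    clauses = []
    for match in TOKEN.finditer(query):
        field = match.group("field")
        phrase = match.group("phrase")
        term = phrase if phrase is not None else match.group("word")
        if field is not None and field.lower() not in KNOWN_FIELDS:
            # Unknown prefix (e.g. a URL): treat the whole token as a keyword.
            field, term = None, match.group(0)
        elif field is not None:
            field = field.lower()
        clauses.append((field, term))
    return clauses


clauses = parse_query('title:"moby dick" author:melville whales')
```

Users who never type a colon get plain keyword search, while those who know the prefixes get fielded search from the same box, which is the continuum Peter describes.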
[CODE4LIB] last week for presentation proposals
We have some great ones, and want more. Submit! http://library.brown.edu/code4libcon09/proposals/ --- Birkin James Diana Programmer, Integrated Technology Services Brown University Library [EMAIL PROTECTED]
[CODE4LIB] Drupal4Lib Camp
This is going out to a couple of lists. Apologies for the duplication. --- Darien Library will be hosting a Drupal4Lib Camp on Friday, February 27, 2009 from 9 am to 4 pm. The camp will be an opportunity for libraries who are working with Drupal, or interested in implementing Drupal, to get together, share experiences, solve problems, and collaborate. This unconference will combine a series of 10-minute lightning talks given by Drupal veterans in the morning with break-out sessions in the afternoon. Audio and video from Drupal4Lib Camp sessions will also be streamed live online. There is no registration fee. However, participation is limited to 70. Please register for the Drupal4Lib Camp at http://drupalib.interoperating.info/node/167 --- ae-j -- Amanda Etches-Johnson User Experience Librarian McMaster University Library Mills L504H | 905.525.9140 x26006
Re: [CODE4LIB] djatoka
At Fri, 14 Nov 2008 06:10:45 -0500, Birkin James Diana [EMAIL PROTECTED] wrote:

Yesterday I attended a session of the DLF Fall Forum at which Ryan Chute presented on djatoka, the open-source JPEG 2000 image-server he and Herbert Van de Sompel just released. It's very cool and near the top of my crowded list of things to play with. If any of you have had the good fortune to experiment with it or implement it into some workflow, get over to the code4libcon09 presentation-proposal page pronto! And if you're as jazzed about it as I am, and know it'll be as big in our community as I think it will, consider a pre-conf proposal, too.

Hi - This is a very cool tool. I am glad to see JPEG2k stuff hitting the open source world. Very nice! That said - It would be nice if somebody could make this work without OpenURL. Frankly I would much prefer the normal URI:

http://an.example.org/ds/CB_TM_QQ432?level=4&rotate=0&y=899&x=1210&h=657&w=1106 [1]

to the OpenURL:

http://an.example.org/djatoka/resolver?
url_ver=Z39.88-2004
&rft_id=info:lanl-repo/
&svc_id=info:lanl-repo/svc/getRegion
&svc_val_fmt=info:ofi/fmt:kev:mtx:jpeg2000
&svc.format=image/jpeg
&svc.level=4
&svc.rotate=0
&svc.region=899,1210,657,1106

and so does the web, generally: consider that nobody uses OpenURL.

I notice also that the example ajax tool puts a duplicate URI box in the lower left hand corner for permanent URIs. It would be nice to have a ‘bookmark this’ type link - as in google maps - if the current bookmarkable URI is not going to be reflected in the location bar.

best, Erik

1. I have left out the HTTP Accept header, which is part of the HTTP request but not part of the URI, and which is a more expressive replacement for the svc.format=image/jpeg parameter.
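To make the comparison concrete, here is a small Python sketch that builds both styles of URL from the same region parameters. The parameter names and values are taken from the message above. The function names are made up, the `/ds/` path is only the shape Erik says he would prefer (not something djatoka is known to offer), and the `rft_id` value is left exactly as it appears in the message.

```python
from urllib.parse import urlencode

# Characters left unescaped so the output reads like the URLs in the message.
SAFE = ":/,"


def simple_region_uri(base, identifier, level, rotate, y, x, h, w):
    """Build the plain query-string form (a hypothetical scheme, not a djatoka API)."""
    query = urlencode(
        {"level": level, "rotate": rotate, "y": y, "x": x, "h": h, "w": w},
        safe=SAFE,
    )
    return f"{base}/ds/{identifier}?{query}"


def openurl_region_uri(resolver, rft_id, level, rotate, y, x, h, w, fmt="image/jpeg"):
    """Build the OpenURL key/value form using the keys shown in the message."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_id", rft_id),
        ("svc_id", "info:lanl-repo/svc/getRegion"),
        ("svc_val_fmt", "info:ofi/fmt:kev:mtx:jpeg2000"),
        ("svc.format", fmt),
        ("svc.level", level),
        ("svc.rotate", rotate),
        ("svc.region", f"{y},{x},{h},{w}"),
    ]
    return f"{resolver}?{urlencode(params, safe=SAFE)}"


region = dict(level=4, rotate=0, y=899, x=1210, h=657, w=1106)
simple = simple_region_uri("http://an.example.org", "CB_TM_QQ432", **region)
# rft_id is passed through as given in the message, where it appears incomplete.
openurl = openurl_region_uri(
    "http://an.example.org/djatoka/resolver", "info:lanl-repo/", **region
)
```

A thin wrapper along these lines could sit in front of the resolver and translate the short form into the OpenURL form, which would give bookmarkable URIs without changing the server.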