Re: [CODE4LIB] Good advanced search screens

2008-11-17 Thread Stephens, Owen
Although I don't always agree with him, Jakob Nielsen has advice on the
provision of 'Advanced Search': essentially, most users cannot use it
effectively (http://www.useit.com/alertbox/20010513.html). The only
problem is that the short report doesn't make it very clear what
'advanced search' might consist of, or where users have a problem (it
mentions that most users don't do Boolean, but I'm not sure this is
what I'd regard as 'advanced search').

The longer (charged for) report might have more detailed advice - anyone
tried it?

Owen

Owen Stephens
Assistant Director: eStrategy and Information Resources
Central Library
Imperial College London
South Kensington Campus
London
SW7 2AZ
 
t: +44 (0)20 7594 8829
e: [EMAIL PROTECTED]
 -----Original Message-----
 From: Code for Libraries [mailto:[EMAIL PROTECTED]] On Behalf Of
 Sean Hannan
 Sent: 15 November 2008 16:19
 To: CODE4LIB@LISTSERV.ND.EDU
 Subject: Re: [CODE4LIB] Good advanced search screens
 
 If you haven't already, I'd suggest that you poke around in the IxDA
 mailing list archives (http://www.ixda.org/).  I find that list (and
 its members) invaluable for design/usability best practices (often
 backed up with published research).
 
 Luke Wroblewski's blog (http://www.lukew.com/ff/index.asp) and book
 (http://rosenfeldmedia.com/books/webforms/info/description/) might be
 other good places to look for inspiration.
 
 -Sean
 
 Sean Hannan
 Web Developer
 Sheridan Libraries
 Johns Hopkins University
 [EMAIL PROTECTED]
 
  Walker, David [EMAIL PROTECTED] 11/14/2008 4:48 PM 
 I'm working on an advanced search screen as part of our WorldCat API
 project.
 
 WorldCat has dozens of indexes and a ton of limiters.  So many, in
 fact, that it's rather daunting trying to design it all in a way that
 isn't just a big dump of fields and check boxes that only a cataloger
 could decipher.
 
 So I'm looking for examples of good advanced search screens (for
 bibliographic databases or otherwise) to gain some inspiration.
 Thanks!
 
 --Dave
 
 ==
 David Walker
 Library Web Services Manager
 California State University
 http://xerxes.calstate.edu


Re: [CODE4LIB] Reference string parsing software available: ParsCit v080402

2008-11-17 Thread MJ Suhonos

Hi Jonathan,

PS: And indeed, mapping to OpenURL 1.0 is _exactly_ what I need to  
do. Sounds like I should look into L8X?


There is a demo/testing site at http://www.lemon8.org ; you might want  
to try playing around there with some citations to get a feel for how  
it works without having to download or install anything.


It would be convenient if there were a way to choose which parsers  
to use with L8X, via an API or configuration if I install the  
software locally. I'm not sure I'll need to pass the citation to  
_all_ of them. I am going to be doing this in realtime while the  
user is waiting, so speed matters. But just ParsCit alone isn't  
doing the job, perhaps ParsCit+regex plus maybe one more would be  
good enough.


Absolutely -- setting a list of default parsers to use, and the
ability to turn them on/off on the fly (i.e. while editing any
particular citation), is something that's been on the to-do list for a
while.  I'm hoping to have it done in the next week or two.


I should add that having just added ParsCit, I've actually found that  
it doesn't do nearly as good a job as some of the other parsers, but  
that may just be on the citation formats that I happen to work with.   
Part of the way L8X is designed is to assign a simple statistical  
score to estimate how accurately each parser performs; one feature  
I've been planning is to simply allow a threshold to ignore results  
from parsers which have done a poor job on that particular citation.
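To make the threshold idea concrete, here is a minimal sketch in Python. It is not L8X code: the 0-to-1 score scale, the function name and the sample scores are all assumptions made up for illustration.

```python
def filter_by_score(results, threshold=0.5):
    """Keep only parser results whose score meets the threshold.

    `results` maps a parser label to a (score, parsed_fields) pair, where
    score is a number between 0 and 1 (an assumed scale, not L8X's).
    """
    return {
        parser: fields
        for parser, (score, fields) in results.items()
        if score >= threshold
    }

# Made-up scores for one citation; the parser names are only labels here.
results = {
    "parscit": (0.35, {"volume": "7"}),
    "freecite": (0.80, {"volume": "12", "issue": "3"}),
    "regex": (0.65, {"volume": "12", "spage": "45"}),
}
kept = filter_by_score(results, threshold=0.6)
```

With a threshold of 0.6 only the second and third results survive, so a parser that did badly on this particular citation no longer adds noise to what the user has to review.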


There is some additional functionality to take a parsed citation and  
look it up in a number of online indexes, and attempt to fetch  
correct information, both to supplement, say, an incomplete  
citation, and provide an additional level of quality improvement, but  
that's a somewhat more complex topic that I'm hoping to make the  
subject of a submission to the Code4Lib journal.  :-)


MJ


MJ Suhonos [EMAIL PROTECTED] 11/14/08 3:18 PM 

Hi all,

John, the supplemented approach you describe is how we go about it in
our Lemon8-XML (L8X) software (http://pkp.sfu.ca/lemon8). L8X handles
parsing by passing the original unparsed string to a number of
different parsers in turn (FreeCite, each of the 3 ParaCite parsers,
and a home-grown regex parser), doing a little cleaning and
normalization, and then handing the results to the user to select the
correct values for each element.

Most of the time, it actually does a pretty good job of detecting the
right elements -- in fact, numeric stuff like volume, issue, pages,
etc. tends to be more accurate than names and titles, mostly because of
the larger variance in the latter.  Our experience has been that
relying on a single approach (machine-learning vs. format-rule-based
vs. regular-expression) is less reliable than getting partial matches
from various approaches, and then assembling them.  In this case, the
whole is in fact greater than the sum of the parts.
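As a rough illustration of assembling partial matches, the Python sketch below merges several parsers' output field by field. Note the swap: L8X hands the candidates to a person to choose from, whereas this sketch substitutes a simple majority vote so that it can run on its own. The function name and the sample parses are hypothetical; the field names follow the OpenURL journal keys.

```python
from collections import Counter

def merge_parses(parses):
    """Combine several partial parses into one by majority vote per field.

    `parses` is a list of dicts mapping a field name to the value one
    parser extracted; a parser simply omits fields it could not find.
    """
    merged = {}
    fields = sorted({name for parse in parses for name in parse})
    for name in fields:
        # Normalise lightly so "12 " and "12" count as the same answer.
        votes = Counter(str(p[name]).strip() for p in parses if p.get(name))
        if votes:
            merged[name] = votes.most_common(1)[0][0]
    return merged

# Hypothetical output of three parsers for the same citation string.
parses = [
    {"atitle": "Citation parsing", "volume": "12", "spage": "45"},
    {"volume": "12", "issue": "3", "spage": "45"},
    {"atitle": "Citation parsing", "volume": "21", "issue": "3"},
]
merged = merge_parses(parses)
```

No single parser above returns all four fields, and one of them gets the volume wrong, yet the merged record is complete and the stray "21" is outvoted.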

I haven't added the ParsCit web service explicitly since a SOAP-based
interface is a bit more cumbersome in PHP than FreeCite's POST-type
interface, but I'll make a point of doing so now.  Incrementally
adding services that all map to the same citation elements (we use the
OpenURL 1.0 fields, with a few aberrations) means it's very easy to
increase the accuracy by simply adding another parsing plugin/service.

You'd have to pull out the relevant classes from L8X to get a
standalone parser, but since this is one of the more appealing aspects
of the software for many people, we're looking at making a simple API
in L8X to just do the citation parsing, possibly without the UI to
take it from semi-automated to completely automatic.

MJ

On 14-Nov-08, at 12:07 AM, Jonathan Rochkind wrote:


Thanks Min, this is a great project, that I keep trying to find time
to investigate more. Don't apologize for keeping us updated, please
continue to!

Do you know if any of the improvements have improved detection of
volume/issue/page# information? For what I want to use it for,
reasonably accurate parsing of volume/issue/page# is needed, and so
far whenever I've looked at demos, this seems to be something that
all of these machine-learning-type approaches do pretty awfully at.
(I wonder if you are not including this in your training much,
because it isn't necessary for your purposes to have
volume/issue/page#?)

I also have wondered if it would make sense to take a machine-
learning-type approach to begin with, but then supplement it with
formal-rule-based parsing to attempt to get vol/issue/page#
according to common patterns?
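A minimal sketch of that supplementing step, in Python. Everything here is an assumption for illustration: the single pattern covers only citations shaped like "12(3), 45-67", the `parsed` dict stands in for whatever a machine-learning parser returned, and the field names are borrowed from OpenURL.

```python
import re

# One common pattern: "12(3): 45-67" or "12(3), 45-67"; the issue is optional.
VOL_ISSUE_PAGES = re.compile(
    r"(?P<volume>\d+)\s*"
    r"(?:\((?P<issue>[^)]+)\))?\s*"
    r"[:,]\s*"
    r"(?P<spage>\d+)\s*[-\u2013]\s*(?P<epage>\d+)"
)

def supplement(parsed, citation):
    """Fill in volume/issue/pages from a regex only where the parser left gaps."""
    match = VOL_ISSUE_PAGES.search(citation)
    if not match:
        return dict(parsed)
    found = {k: v for k, v in match.groupdict().items() if v is not None}
    # Values already present in `parsed` win; regex values fill the gaps.
    return {**found, **parsed}

citation = "Smith, J. (2005). An example title. Journal of Examples, 12(3), 45-67."
parsed = {"aulast": "Smith", "date": "2005"}   # hypothetical ML parser output
result = supplement(parsed, citation)
```

A real rule set would need several such patterns, one per common citation style, tried in order.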

I don't have too much time to try to work on this myself, but if anyone
who is working on these various citation parsing efforts could
improve volume/issue/page# to a reasonable level, it would make the
libraries useful for a much greater range of applications.

Jonathan



Min-Yen Kan [EMAIL PROTECTED] 11/13/08 8:30 PM 

Dear all:

(Sorry to resurrect an old thread...)

We've seen the release of several new freely 

[CODE4LIB] Announcement: LuSql: Database to Lucene indexing

2008-11-17 Thread Glen Newton - NRC/CNRC CISTI/ICIST Research
I am proud to announce LuSql:

LuSql is a simple but powerful tool for building Lucene indexes from 
relational databases. It is a command-line Java application for the 
construction of a Lucene index from an arbitrary SQL query of a
JDBC-accessible SQL database. It allows a user to control a number of
parameters, including the SQL query to use, individual
indexing/storage/term-vector nature of fields, analyzer, stop word
list, and other tuning parameters. In its default mode it uses
threading to take advantage of multiple cores.

LuSql can handle complex queries, allows for additional per record
sub-queries, and has a plug-in architecture for arbitrary Lucene
document manipulation. Its only dependencies are three Apache Commons
libraries, the Lucene core itself, and a JDBC driver.
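For readers who want a feel for the general idea, here is a small conceptual sketch in Python. It is not LuSql, shows none of LuSql's options, and builds no Lucene index: it only illustrates turning each row of an SQL query into a document of named fields, with an optional per-record sub-query. The table names, the `extra` field and the function name are all made up.

```python
import sqlite3

def rows_to_documents(conn, query, subquery=None):
    """Turn each row of `query` into a dict of field name -> value.

    If `subquery` is given, it is run once per record with the row's `id`
    bound as its only parameter; the first column of its results is
    stored as a list under the key "extra".
    """
    conn.row_factory = sqlite3.Row
    documents = []
    for row in conn.execute(query):
        doc = dict(row)
        if subquery is not None:
            doc["extra"] = [r[0] for r in conn.execute(subquery, (row["id"],))]
        documents.append(doc)
    return documents

# A throwaway in-memory database standing in for a real JDBC source.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE article (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE author (article_id INTEGER, name TEXT);
    INSERT INTO article VALUES (1, 'Indexing with Lucene');
    INSERT INTO author VALUES (1, 'A. Author'), (1, 'B. Writer');
""")
docs = rows_to_documents(
    conn,
    "SELECT id, title FROM article",
    "SELECT name FROM author WHERE article_id = ? ORDER BY name",
)
```

Each resulting dict is the kind of record an indexer would then analyze and store field by field.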

LuSql has been extensively tested, including on a large 6+ million
full-text article metadata document collection, producing an 86GB
Lucene index.

http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/index.php/LuSql

If you have any questions, please contact me.

Thanks,

Glen Newton :-)

-- 
Glen Newton | [EMAIL PROTECTED]
Researcher, Information Science, CISTI Research
 NRC W3C Advisory Committee Representative
http://tinyurl.com/yvchmu
tel/tél: 613-990-9163 | facsimile/télécopieur 613-952-8246
Canada Institute for Scientific and Technical Information (CISTI)
National Research Council Canada (NRC)| M-55, 1200 Montreal Road
http://www.nrc-cnrc.gc.ca/
Institut canadien de l'information scientifique et technique (ICIST) 
Conseil national de recherches Canada | M-55, 1200 chemin Montréal
Ottawa, Ontario K1A 0R6  
Government of Canada | Gouvernement du Canada   
--


Re: [CODE4LIB] Good advanced search screens

2008-11-17 Thread Susan Teague Rector
Peter Morville has been putting search examples into a flickr collection 
for an upcoming book he's writing:
http://www.flickr.com/photos/morville/collections/72157603785835882/ and 
http://www.findability.org/archives/000194.php


There are some great examples of both simple and complex search

Best,

--
Susan Teague Rector
Web Applications Manager
Library Information Systems, VCU Libraries
804.827.3554 | [EMAIL PROTECTED]








Re: [CODE4LIB] Good advanced search screens

2008-11-17 Thread Jonathan Rochkind
I think there are ways to dispense with it without actually dispensing 
with it.


David, you've of course seen what I did with Xerxes, where instead of
calling it 'advanced search' I call it 'more options'. That was of
course a much simpler case: the difference between Metalib's advanced
search and simple search is pretty small, and there aren't that many
features available even using all the features in Metalib search.
WorldCat is different: there, the difference between a 'start' screen
and a 'full' list of options is greater.


I think it makes sense that you might need an initial search screen with
fewer options, and a way to then get more search options (more options
typically including more fields to search/limit, as well as more
complicated ways to combine those searches with Boolean operators).


But I think we should find ways to provide that additional functionality
other than the typical "click here to see advanced search" pattern. I
think it should never be called 'advanced search'. Something like 'more
options' is better.  But maybe not even just one 'more options' link.
What exactly is it that this additional functionality is providing, and
what are the use cases for it? Maybe provide links to add different
components of this advanced functionality to the search page, identified
by explaining what they are/are for, instead of just calling them
'advanced'.  They don't need to be added all at once, necessarily. For
instance, if you want to add another search field to the screen for
Boolean combination, a button that says "add another search field" seems
appropriate. Click it to add a second one, click it again to add a third
one, etc.  No 'advanced search', just offering functionality.


I also like the idea of allowing a syntax for expressing fielded search 
and boolean combination even in the initial 'simple' search box. Even 
Google does this. Most users might not use it, but for power users it's 
awfully convenient.  CQL would be one potential choice for a textual 
query syntax that can be entered in the initial search field.
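To show roughly what a textual syntax in the simple box could look like, here is a small Python sketch that splits a query string into fielded clauses. It uses a simplified `field:term` notation chosen only for illustration; it is not CQL, which has a richer grammar, and the default field name and the default AND operator are assumptions.

```python
import re

# field:term or field:"quoted phrase", a bare term, or a Boolean operator.
TOKEN = re.compile(r'(?:(?P<field>\w+):)?(?:"(?P<phrase>[^"]*)"|(?P<word>\S+))')

def parse_query(text, default_field="keyword"):
    """Split a search-box string into (operator, field, term) clauses."""
    clauses = []
    operator = "AND"            # assumed default when no operator is typed
    for match in TOKEN.finditer(text):
        field = match.group("field")
        phrase = match.group("phrase")
        term = phrase if phrase is not None else match.group("word")
        if field is None and phrase is None and term.upper() in ("AND", "OR", "NOT"):
            operator = term.upper()
            continue
        clauses.append((operator, field or default_field, term))
        operator = "AND"
    return clauses

clauses = parse_query('title:"open source" OR author:smith libraries')
```

A user who types plain words gets an ordinary keyword search, while a power user can name fields and operators in the same box without visiting a separate screen.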


Jonathan

Walker, David wrote:

How about dispensing altogether with the
basic/advanced dichotomy in a search interface?



I'm not sure I can dispense with it completely, Peter.

As Peter Morville said on the site Susan posted: "[I]t may be worth offering advanced features
that are useful to a small yet important subset of users."  I'll give you three guesses as to
who my small yet important subset of users is, and the first two don't count. ;-)

--Dave
==
David Walker
Library Web Services Manager
California State University
http://xerxes.calstate.edu

From: Code for Libraries [EMAIL PROTECTED] On Behalf Of Peter Schlumpf [EMAIL PROTECTED]
Sent: Saturday, November 15, 2008 5:45 PM
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] Good advanced search screens

How about dispensing altogether with the basic/advanced dichotomy in a search 
interface?  Just create a well designed interface that's consistent and works 
well for all users.  The basic/advanced dichotomy is really quite arbitrary, 
and exists in the mind of the designer.

One thing that seems to be underappreciated these days is a straightforward and 
flexible search syntax.  A command line in the search field may be a much more 
elegant and consistent solution than trying to make all options available and 
visible in a GUI.

Make the basic features of the search interface clear and easy to use, but design the 
interface in such a way that more advanced users can easily discover the 
features they need as they use it.  With this approach Basic and Advanced exist on a 
continuum.  There's a little learning curve but all users will have the motivation to 
learn to use the interface to the level that satisfies their needs, and in the long run 
probably find it much easier to use.

Peter

Peter Schlumpf
[EMAIL PROTECTED]
http://www.avantilibrarysystems.com






  


--
Jonathan Rochkind
Digital Services Software Engineer
The Sheridan Libraries
Johns Hopkins University
410.516.8886 
rochkind (at) jhu.edu


[CODE4LIB] last week for presentation proposals

2008-11-17 Thread Birkin James Diana

We have some great ones, and want more. Submit!

http://library.brown.edu/code4libcon09/proposals/

---
Birkin James Diana
Programmer, Integrated Technology Services
Brown University Library
[EMAIL PROTECTED]


[CODE4LIB] Drupal4Lib Camp

2008-11-17 Thread Amanda Etches-Johnson

This is going out to a couple of lists. Apologies for the duplication.

---

Darien Library will be hosting a Drupal4Lib Camp on Friday, February  
27, 2009 from 9 am to 4 pm.


The camp will be an opportunity for libraries who are working with  
Drupal, or interested in implementing Drupal, to get together, share  
experiences, solve problems, and collaborate. This unconference will  
be a combination of a series of 10 min lightning talks given by Drupal  
veterans in the morning followed by break-out sessions in the afternoon.


Audio and video from Drupal4Lib Camp sessions will also be streamed
live online.


There is no registration fee. However, participation is limited to
70.  Please register for the Drupal4Lib Camp at
http://drupalib.interoperating.info/node/167


---



ae-j
--
Amanda Etches-Johnson
User Experience Librarian
McMaster University Library
Mills L504H | 905.525.9140 x26006


Re: [CODE4LIB] djatoka

2008-11-17 Thread Erik Hetzner
At Fri, 14 Nov 2008 06:10:45 -0500,
Birkin James Diana [EMAIL PROTECTED] wrote:
 
 Yesterday I attended a session of the DLF Fall Forum at which Ryan
 Chute presented on djatoka, the open-source JPEG 2000 image-server he
 and Herbert Van de Sompel just released.
 
 It's very cool and near the top of my crowded list of things to play  
 with.
 
 If any of you have had the good fortune to experiment with it or  
 implement it into some workflow, get over to the code4libcon09  
 presentation-proposal page pronto! And if you're as jazzed about it as  
 I am, and know it'll be as big in our community as I think it will,  
 consider a pre-conf proposal, too.

Hi -

This is a very cool tool. I am glad to see JPEG2k stuff hitting the
open source world. Very nice!

That said -

It would be nice if somebody could make this work without OpenURL.

Frankly I would much prefer the normal URI:

http://an.example.org/ds/CB_TM_QQ432?level=4&rotate=0&y=899&x=1210&h=657&w=1106 [1]

to the OpenURL:

http://an.example.org/djatoka/resolver?
url_ver=Z39.88-2004
&rft_id=info:lanl-repo/
&svc_id=info:lanl-repo/svc/getRegion
&svc_val_fmt=info:ofi/fmt:kev:mtx:jpeg2000
&svc.format=image/jpeg
&svc.level=4
&svc.rotate=0
&svc.region=899,1210,657,1106

and so does the web, generally, considering that nobody uses OpenURL.
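To compare the two forms side by side, the Python sketch below builds both from the same numbers. The host, paths and parameter names are copied from the examples in this message; the function names are made up, the y,x,h,w ordering of svc.region is inferred from the matching values above, and the rft_id passed in is a hypothetical identifier, since the one in the message is incomplete.

```python
from urllib.parse import urlencode

def plain_uri(base, image_id, level, rotate, y, x, h, w):
    """Build the short form: one path segment plus ordinary query parameters."""
    query = urlencode({"level": level, "rotate": rotate, "y": y, "x": x, "h": h, "w": w})
    return f"{base}/ds/{image_id}?{query}"

def openurl(base, rft_id, level, rotate, y, x, h, w, fmt="image/jpeg"):
    """Build the key/value OpenURL form of the same region request."""
    params = [
        ("url_ver", "Z39.88-2004"),
        ("rft_id", rft_id),
        ("svc_id", "info:lanl-repo/svc/getRegion"),
        ("svc_val_fmt", "info:ofi/fmt:kev:mtx:jpeg2000"),
        ("svc.format", fmt),
        ("svc.level", level),
        ("svc.rotate", rotate),
        ("svc.region", f"{y},{x},{h},{w}"),
    ]
    # safe=":/," leaves the identifiers unescaped so they stay readable.
    return f"{base}/djatoka/resolver?{urlencode(params, safe=':/,')}"

short = plain_uri("http://an.example.org", "CB_TM_QQ432", 4, 0, 899, 1210, 657, 1106)
# "info:example/CB_TM_QQ432" is a placeholder, not a real identifier.
long = openurl("http://an.example.org", "info:example/CB_TM_QQ432", 4, 0, 899, 1210, 657, 1106)
```

A thin rewrite layer in front of the resolver could apply the same mapping in the other direction, accepting the short form and issuing the OpenURL internally.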

I notice also that the example ajax tool put a duplicate URI box in
the lower left hand corner for permanent URIs. It would be nice to
have a ‘bookmark this’ type link - as in google maps, if the current
bookmarkable URI is not going to be reflected in the location bar.

best,
Erik

1. I have left out the HTTP Accept header, which is part of the HTTP
request but not part of the URI, and which is a more expressive
replacement for the svc.format=image/jpeg parameter.
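As a small illustration of the footnote, this Python sketch prepares a request that carries the desired format in the Accept header rather than in the URI. Nothing is sent over the network; the function name is made up, and whether a given image server honours the header is an assumption.

```python
from urllib.request import Request

def region_request(uri, media_type="image/jpeg"):
    """Prepare (but do not send) a GET request that asks for `media_type`."""
    return Request(uri, headers={"Accept": media_type}, method="GET")

req = region_request("http://an.example.org/ds/CB_TM_QQ432?level=4")
```

The header can also list several media types with preferences, which is what makes it more expressive than a single format parameter.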

