I absolutely agree that it's not a simple solution; I wasn't trying to imply that it was. That said, trotting out a half-assed (if half!) implementation is probably less helpful than just using fuzzy search.
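For the "just using fuzzy search" option, a rough sketch in Python of what that could look like when the candidates come from the collection's own terms. This is only an illustration, not how Solr or any vendor product does it: `build_vocabulary`, `suggest`, the sample titles, and the 0.8 cutoff are all made up here, and the matching is simply the standard library's difflib.

```python
from collections import Counter
from difflib import get_close_matches


def build_vocabulary(records):
    """Count every whitespace-separated term in the (hypothetical) records."""
    counts = Counter()
    for record in records:
        counts.update(record.lower().split())
    return counts


def suggest(term, vocabulary, cutoff=0.8, limit=3):
    """Return corpus terms similar to `term`, most frequent first.

    A term that already occurs in the corpus gets no suggestions, so an
    archaic spelling that is really in the collection is never "corrected".
    """
    term = term.lower()
    if term in vocabulary:
        return []
    candidates = get_close_matches(term, list(vocabulary), n=limit, cutoff=cutoff)
    # Stable sort: ties in frequency keep difflib's similarity order.
    return sorted(candidates, key=lambda word: -vocabulary[word])


vocab = build_vocabulary(["The American Civil War", "Civil war letters", "Sonnets"])
print(suggest("civl", vocab))   # ['civil']
print(suggest("civil", vocab))  # []
```

Every suggestion is a term that occurs in the records, so a single-term suggestion cannot lead to zero results. A multi-term query could still come back empty, so a real version would presumably need to check the hit count of the rewritten query before showing it.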
I think what you see with Google (and what I think would work 'better' in
libraries) is suggestions based on phrases, rather than individual terms.
After all, archaic spellings and misspellings are the proper spelling, in
context. Now, what constitutes a 'phrase' is probably open to debate...

At the very least, don't show suggestions for things that will still
produce zero results.

-Ross.

On Thu, Sep 6, 2012 at 9:44 AM, Jonathan Rochkind <rochk...@jhu.edu> wrote:
> Solr has a feature to make spelling suggestions based on the actual terms in
> the corpus... but it's hardly a panacea. A straightforward naive
> implementation of the Solr feature, on top of a large library catalog corpus,
> in many of our experiences still gives odd and unuseful suggestions
> (including sometimes suggesting typos from the corpus, or taking an already
> 'correct' word and suggesting an entirely different but lexicographically
> similar word as a 'correction'). And then there's figuring out the right UI
> (and managing to make it work on top of the Solr feature) for multi-term
> queries where each independent part may or may not have a 'correction'.
>
> Turns out spell suggestions is kind of hard. And it's kind of amazing that
> Google does it so well (and they use some fairly complex techniques to do so,
> I think, based on a whole bunch of data and metadata they have including past
> searches and clickthroughs, not just the corpus).
> ________________________________________
> From: Code for Libraries [CODE4LIB@LISTSERV.ND.EDU] on behalf of Ross Singer
> [rossfsin...@gmail.com]
> Sent: Thursday, September 06, 2012 9:37 AM
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] U of Baltimore, Final Usability Report, link
> resolvers -- MIA?
>
> On Thu, Sep 6, 2012 at 9:06 AM, Cindy Harper <char...@colgate.edu> wrote:
>> I was going to comment that some of the Encore shortcomings mentioned in
>> the PDF do seem to be addressed in current Encore versions, although some
>> of these issues have to be addressed - for instance, there is a
>> spell-check, but it can give some surprising suggestions, though
>> suggestions do clue the user in to the fact that they might have a
>> misspelling/typo.
>
> I wrote about the woeful state of "spelling suggestions" a couple of
> years ago (among a lot of other things):
>
> http://www.inthelibrarywiththeleadpipe.org/2009/were-gonna-geek-this-mother-out/
>
> (you can skip on down to the "In the Absence of Suggestion, There is
> Always Search…" - it's pretty TL;DR-worthy)
>
> Basically, the crux of it is, as long as spelling suggestions are
> based on standard dictionaries and not built /on the actual terms and
> phrases in the collection/ it's going to basically be a worthless
> feature.
>
> I do note there, though, that BiblioCommons apparently must build
> their dictionaries on the metadata in the system.
>
> -Ross.
>
>>
>> III's reaction to studies that report that users ignore the right-side
>> panel of search options was to provide a skin that has only two columns -
>> the facets on the left, and the search results on the middle-to-right.
>> This pushes important facets like the tag cloud very far down the page, and
>> causes a lot of scrolling, so I don't like this skin much.
>>
>> I recently asked a question on the Encore users' list about how the tag
>> cloud could be improved - currently it suggests the most common subfield a
>> of the subject headings. I would think it should include the general,
>> chronological, geographical subdivisions - subfields x,y,z. For instance,
>> it doesn't provide good suggestions for improving the search "civil war"
>> without these. A chronological subdivision would help a lot there.
>> But then again, I haven't seen a prototype of how many relevant
>> subdivisions this would result in - would the subdivisions drown out the
>> main headings in the tag cloud?
>>
>> Cindy Harper, Systems Librarian
>> Colgate University Libraries
>> char...@colgate.edu
>> 315-228-7363
>>
>>
>>
>> On Wed, Sep 5, 2012 at 5:30 PM, Jonathan LeBreton <lebre...@temple.edu> wrote:
>>
>>> Lucy Holman, Director of the U Baltimore Library, and a former colleague
>>> of mine at UMBC, got back to me about this. Her reply puts this
>>> particular document into context. It is an interesting reminder that not
>>> everything you find on the web is as it seems, and it certainly is not
>>> necessarily the final word. We gotta go buy the book!
>>> Lucy is off-list, but asked me to post this on her behalf.
>>> Her contact information is below, though....
>>>
>>> Very interesting discussion. This issue of what is right and feasible in
>>> discovery services and how to configure it is important stuff for many of
>>> our libraries and we should be able to build on the findings and
>>> experiences of others rather than re-inventing the wheel locally.... (We
>>> use Summon)
>>>
>>> - Jonathan LeBreton
>>>
>>>
>>> ------------------------ begin Lucy's explanation --------------
>>>
>>> The full study and analysis are included in Chapter 14 of a new book,
>>> Planning and Implementing Resource Discovery Tools in Academic Libraries,
>>> Mary P. Popp and Diane Dallis (Eds).
>>>
>>> The project was part of a graduate Research Methods course in the
>>> University of Baltimore's MS in Interaction Design and Information
>>> Architecture program. Originally groups within the course conducted
>>> task-based usability tests on EDS, Primo, Summon and Encore.
>>> Unfortunately, the test environment of Encore led to many usability issues
>>> that we believed were more a result of the test environment than the
>>> product itself; therefore we did not report on Encore in the final
>>> analysis.
>>> The study (and chapter) does offer findings on the other three
>>> discovery tools.
>>>
>>> There were six student groups in the course; each group studied two tools
>>> with the same user population (undergrad, graduate and faculty) so that
>>> each tool was compared against the other three with each user population
>>> overall. The .pdf that you found was the final report of one of those six
>>> groups, so it only addresses two of the four tools. The chapter is the
>>> only document that pulls the six portions of the study together.
>>>
>>> I would be happy to discuss this with any of you individually if you need
>>> more information.
>>>
>>> Thanks for your interest in the study.
>>>
>>>
>>> Lucy Holman, DCD
>>> Director, Langsdale Library
>>> University of Baltimore
>>> 1420 Maryland Avenue
>>> Baltimore, MD 21201
>>> 410-837-4333
>>>
>>> ------------------------- end insert --------------------
>>>
>>> Jonathan LeBreton
>>> Sr. Associate University Librarian
>>> Temple University Libraries
>>> Paley M138, 1210 Polett Walk, Philadelphia PA 19122
>>> voice: 215-204-8231
>>> fax: 215-204-5201
>>> mobile: 215-284-5070
>>> email: lebre...@temple.edu
>>> email: jonat...@temple.edu
>>>
>>>
>>> > -----Original Message-----
>>> > From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of
>>> > karim boughida
>>> > Sent: Tuesday, September 04, 2012 5:09 PM
>>> > To: CODE4LIB@LISTSERV.ND.EDU
>>> > Subject: Re: [CODE4LIB] U of Baltimore, Final Usability Report, link
>>> > resolvers -- MIA?
>>> >
>>> > Hi Tom,
>>> > Top players are EDS, Primo and Summon....the only reason I see Encore in
>>> > the mix is if you have other III products which is not the case of Ubalt
>>> > library. They have now WorldCat? Encore vs Summon is an easy win for
>>> > Summon.
>>> >
>>> > Let's wait for Jonathan LeBreton (Thanks BTW).
>>> >
>>> > Karim Boughida
>>> >
>>> > On Tue, Sep 4, 2012 at 4:26 PM, Tom Pasley <tom.pas...@gmail.com> wrote:
>>> > > Yes, I'm curious to know too! Due to database/resource matching or
>>> > > coverage perhaps (anyone's guess).
>>> > >
>>> > > Tom
>>> > >
>>> > > On Wed, Sep 5, 2012 at 7:50 AM, karim boughida <kbough...@gmail.com>
>>> > > wrote:
>>> > >
>>> > >> Hi All,
>>> > >> Initially EDS, Primo, Summon, and Encore were considered but only
>>> > >> Encore and Summon were tested. Do we know why?
>>> > >>
>>> > >> Thanks
>>> > >> Karim Boughida
>>> > >>
>>> > >>
>>> > >> On Tue, Sep 4, 2012 at 10:44 AM, Jonathan Rochkind <rochk...@jhu.edu>
>>> > >> wrote:
>>> > >> > Hi helpful code4lib community, at one point there was a report
>>> > >> > online at:
>>> > >> >
>>> > >> > http://student-iat.ubalt.edu/students/kerber_n/idia642/Final_Usability_Report.pdf
>>> > >> >
>>> > >> > David Walker tells me the report at that location included findings
>>> > >> > about SFX and/or other link resolvers.
>>> > >> >
>>> > >> > I'm really interested in reading it. But it's gone from that
>>> > >> > location, and I'm not sure if it's somewhere else (I don't have a
>>> > >> > title/author to search for other than that URL, which is not in
>>> > >> > Google cache or Internet Archive).
>>> > >> >
>>> > >> > Is anyone reading this familiar with the report? Perhaps one of the
>>> > >> > authors is reading this, or someone reading it knows one of the
>>> > >> > authors and can put me in touch? Or knows someone likely in the
>>> > >> > relevant dept at ubalt and can put me in touch? Or has any other
>>> > >> > information about this report or ways to get it?
>>> > >> >
>>> > >> > Thanks!
>>> > >> >
>>> > >> > Jonathan
>>> > >>
>>> > >>
>>> > >>
>>> > >> --
>>> > >> Karim B Boughida
>>> > >> kbough...@gmail.com
>>> > >> kbough...@library.gwu.edu
>>> > >>
>>> >
>>> >
>>> >
>>> > --
>>> > Karim B Boughida
>>> > kbough...@gmail.com
>>> > kbough...@library.gwu.edu
>>>