Dear Karen,

I am conducting a research experiment on automatic text classification, and I am trying to retrieve the top matching bib records (which include DDC fields) for a set of keyphrases extracted from a given document. So, I suppose this is a rather exceptional use case. In fact, the right approach for this experiment would be to process a full dump of the WorldCat database directly rather than send a limited number of queries via the API.

I read here: http://dltj.org/article/worldcat-lld-may-become-available under-odc-by/ that WorldCat might become available as open linked data in future, which would solve my problem and help similar text mining projects. However, I wonder if it is currently available to researchers under a research/non-commercial use license agreement.

Regards,
Arash

-----Original Message-----
From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Karen Coombs
Sent: 17 May 2012 08:37
To: CODE4LIB@LISTSERV.ND.EDU
Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set

I forwarded this thread to the Product Manager for the WorldCat Search API. She responded back that unfortunately this query is not possible using the API at this time. FYI, the SRU interface to WorldCat Search API doesn't currently support any scan type searches either.

Is there a particular use case you're trying to support? Knowing that would help us document this as a possible enhancement.

Karen

Karen Coombs
Senior Product Analyst
Web Services
OCLC
coom...@oclc.org

On Wed, May 16, 2012 at 9:49 PM, Arash.Joorabchi <arash.joorab...@ul.ie> wrote:
> Hi Andy,
>
> I am a SRU newbie myself, so I don't know how this could be achieved
> using scan operations and could not find much info on SRU website
> (http://www.loc.gov/standards/sru/).
>
> As for the wildcards, according to this guide:
> http://www.oclc.org/support/documentation/worldcat/searching/refcard/searchworldcatquickreference.pdf
> the symbols should be preceded by at least 3 characters, and therefore
> clauses like:
>
> ... AND srw.dd=*
> ... AND srw.dd=?.*
> ... AND srw/dd=###.*
> ... AND srw/dd=?3.*
>
> do not work and result in the following error:
>
> Diagnostics
> Identifier: info:srw/diagnostic/1/9
> Meaning:
> Details:
> Message: Not enough chars in truncated term:Truncated words too short(9)
>
> Thanks,
> Arash
>
> ________________________________
>
> From: Houghton,Andrew [mailto:hough...@oclc.org]
> Sent: 16 May 2012 11:58
> To: Arash.Joorabchi
> Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
>
> I'm not an SRU guru, but is it possible to do a scan and look for a
> postings of zero?
>
> Andy.
>
> On May 16, 2012, at 6:39, "Arash.Joorabchi" <arash.joorab...@ul.ie> wrote:
>
> Hi mark,
>
> Srw.dd=* does not work either:
>
> Identifier: info:srw/diagnostic/1/27
> Meaning:
> Details: srw.dd
> Message: The index [srw.dd] did not include a searchable value
>
> I suppose the only option left is to retrieve everything and
> filter the results on the client side.
>
> Thanks for your quick reply.
> Arash
>
> -----Original Message-----
> From: Code for Libraries [mailto:CODE4LIB@LISTSERV.ND.EDU] On Behalf Of Mike Taylor
> Sent: 16 May 2012 10:43
> To: CODE4LIB@LISTSERV.ND.EDU
> Subject: Re: [CODE4LIB] WorldCat SRU queries - elimination of records without a DDC no from the result set
>
> There is no standard way in CQL to express "field X is not empty".
> Depending on implementations, NOT srw.dd="" might work (but evidently
> doesn't in this case). Another possibility is srw.dd=*, but again
> that may or may not work, and might be appallingly inefficient if it
> does. NOT srw.dd=null will definitely not work: "null" is not a
> special word in CQL.
>
> -- Mike.
>
> On 16 May 2012 10:32, Arash.Joorabchi <arash.joorab...@ul.ie> wrote:
> > Hi all,
> >
> > I am sending SRU queries to the WorldCat in the following form:
> >
> > String host = "http://worldcat.org/webservices/catalog/search/";
> > String query = "sru?query=srw.kw=\"" + keyword + "\""
> >         + " AND srw.ln exact \"eng\""
> >         + " AND srw.mt all \"bks\""
> >         + " AND srw.nt=\"" + keyword + "\""
> >         + "&servicelevel=full"
> >         + "&maximumRecords=100"
> >         + "&sortKeys=relevance,,0"
> >         + "&wskey=[wskey]";
> >
> > And it is working fine, however I'd like to limit the results to those
> > records that have a DDC number assigned to them, but I don't know what's
> > the right way to specify this limit in the query.
> >
> > NOT srw.dd=""
> > NOT srw.dd=null
> >
> > Neither of above work
> >
> > Thanks,
> > Arash
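
Since the thread concludes that the API cannot exclude records without a DDC number, the fallback mentioned above (retrieve everything and filter on the client side) could look roughly like the sketch below. It is only an illustration under stated assumptions, not OCLC's recommended method: it assumes the SRU response carries MARCXML records in the http://www.loc.gov/MARC21/slim namespace and that the DDC number sits in MARC field 082, subfield a. The class name DdcFilter and the methods buildUrl and ddcNumbers are hypothetical names introduced here. The sketch also percent-encodes the CQL query, which the original snippet does not do.

```java
import java.io.StringReader;
import java.net.URLEncoder;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of any OCLC library.
class DdcFilter {
    // Assumption: records are returned as MARCXML in this namespace.
    static final String MARC_NS = "http://www.loc.gov/MARC21/slim";

    // Builds the request URL from the thread's query, percent-encoding the CQL part.
    static String buildUrl(String keyword, String wskey) throws Exception {
        String cql = "srw.kw=\"" + keyword + "\""
                + " AND srw.ln exact \"eng\""
                + " AND srw.mt all \"bks\""
                + " AND srw.nt=\"" + keyword + "\"";
        return "http://worldcat.org/webservices/catalog/search/sru?query="
                + URLEncoder.encode(cql, "UTF-8")
                + "&servicelevel=full&maximumRecords=100&sortKeys=relevance,,0"
                + "&wskey=" + URLEncoder.encode(wskey, "UTF-8");
    }

    // Returns the first non-empty 082 $a value of every record that has one;
    // records without a DDC number are skipped.
    static List<String> ddcNumbers(String responseXml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        Document doc = factory.newDocumentBuilder()
                .parse(new InputSource(new StringReader(responseXml)));
        List<String> result = new ArrayList<>();
        NodeList records = doc.getElementsByTagNameNS(MARC_NS, "record");
        for (int i = 0; i < records.getLength(); i++) {
            String ddc = firstDdc((Element) records.item(i));
            if (ddc != null) {
                result.add(ddc);
            }
        }
        return result;
    }

    // Looks for datafield tag="082" / subfield code="a" inside one record.
    private static String firstDdc(Element record) {
        NodeList fields = record.getElementsByTagNameNS(MARC_NS, "datafield");
        for (int j = 0; j < fields.getLength(); j++) {
            Element field = (Element) fields.item(j);
            if (!"082".equals(field.getAttribute("tag"))) {
                continue;
            }
            NodeList subfields = field.getElementsByTagNameNS(MARC_NS, "subfield");
            for (int k = 0; k < subfields.getLength(); k++) {
                Element subfield = (Element) subfields.item(k);
                String value = subfield.getTextContent().trim();
                if ("a".equals(subfield.getAttribute("code")) && !value.isEmpty()) {
                    return value;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) throws Exception {
        // Prints the encoded request URL for a sample keyword and a placeholder key.
        System.out.println(buildUrl("text mining", "[wskey]"));
    }
}
```

Fetching the URL and paging through results (for example with a startRecord parameter, if the service supports it) is left out. The cost of this approach is that records without a DDC number still count against maximumRecords, so more requests may be needed to collect enough classified records.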