On 19 Mar 2008, at 2:23 PM, A wrote:

>
> On 19 Mar 2008, at 11:14, Christiaan Hofman wrote:
>> Search groups are mostly custom stuff. Each type of search group is
>> based on a server object that gets a search string (from the search
>> field) and should return a list of publication items. It should get a
>> string representation of the items that can be parsed by one of our
>> string parsers (like bibtex, JSTOR, MARC, etc). This is not the case
>> with what gets out of this: this is a page that must be parsed with
>> much more complicated methods, including downloading links. It's not
>> just parsing the string you get back. Moreover, it does not accept a
>> general query string, but a very specific request. (Unless there is
>> another form that does support a query string?)
>
> What are the exact specifications of a search string? Do you mean a
> complex query with boolean connectors (as specified in
> http://bibdesk.sourceforge.net/manual/BibDesk%20Help_10.html#SEC34)

It depends on the server. What I mean is: a query string that the user
can type in the search field.

> ? What do you mean by saying DBLP is limited to "a very specific
> request"?

I did not see any URL that accepts something like a "?query=searchterm"
query component or anything similar. It is a URL with a particular
syntax for passing the name. You cannot expect a user to type the query
in exactly that form. So basically, you need to be able to translate a
query string the user types into a request (URL) you can send to the
server. I did not see anything like that in what you've told us.

> As for the result, I think again that it should not be a problem. I
> know I could very quickly code in Java a "protocol" class that takes a
> (therefore limited) query, that posts it in the form on the DBLP site,
> that takes back the html result, that iterates over each entry of the
> table, that, for one such entry, looks for the DBLP reference, that
> gets the corresponding web page, that extracts from it the bibtex
> entry, that adds it in the result list. All this would be transparent
> to BibDesk. Again, I do not know how to do this in Obj-C, but in Java,
> it would definitely not be a "much more complicated method" for me.
>

I really don't want to mix with Java. That's messy, and moreover the
Java-Cocoa bridge is deprecated.

What I'm saying is that *search groups* are much more complicated to
implement. A search group needs to keep track of state information,
send notifications when it is done, etc. (because some search groups
may work asynchronously). It's much, much more than just parsing. In a
web group the only thing that is needed is the scraper; all the other
code is generic and done by the group itself.

> Also it could be worth mentioning that there is a raw XML file of all
> the DBLP database (http://dblp.uni-trier.de/xml/dblp.xml). Again, I
> have no real-life experience of developing under Cocoa, but there is
> surely an API to easily build and query databases.
> The question is whether or not that is possible without having to
> download the actual xml file (which takes more than 400MB IIRC). At
> the very least, there could be a local index file? Again, I don't know
> what I am talking about, I am just mentioning this in case it's useful.
>

There are generic APIs to do all kinds of things. I'm very much against
building a local index file.

>> Obj-C is pretty easy, and if you know Java, most of it is just a
>> slightly different syntax. It uses [anObject method:argument] instead
>> of anObject.method(argument), etc.
>
> Yes indeed, I know pretty well the specifications of ObjC (I used to
> master Smalltalk quite well as well), but that isn't enough to
> (quickly) code what we're talking about (for example, I have no idea
> which library I should use to manipulate HTML). But I *will* at least
> give a shot at the Google scraper.
>
> A

I think you can get a lot of ideas from the Google scrapers or the
other web scrapers (all the parsers referenced in BDSKWebParser),
because they are very similar to what you need. Also, the Xcode
developer tools contain a lot of documentation.

Christiaan

_______________________________________________
Bibdesk-develop mailing list
Bibdesk-develop@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/bibdesk-develop
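
To make the query-translation point above concrete, here is a minimal
sketch of turning what a user might type in the search field into a
request URL with a fixed name syntax. It is only an illustration: the
URL template, the function name, and the name-splitting rule are all
assumptions, not the actual DBLP address scheme and not BibDesk code.

```python
from urllib.parse import quote

# Hypothetical URL template: the real server's address syntax is not
# given in this thread, so example.org stands in for it.
AUTHOR_URL_TEMPLATE = "https://example.org/authors/{initial}/{last}:{first}.html"


def author_query_to_url(query: str) -> str:
    """Translate a free-form "First Last" or "Last, First" query into a URL."""
    text = " ".join(query.split())  # collapse runs of whitespace
    if not text:
        raise ValueError("empty query")
    if "," in text:
        # "Last, First" form
        last, first = (part.strip() for part in text.split(",", 1))
    else:
        # "First [Middle ...] Last" form: the final word is taken as the surname
        first, _, last = text.rpartition(" ")
    if not last:
        raise ValueError("could not find a surname in the query")
    return AUTHOR_URL_TEMPLATE.format(
        initial=quote(last[0].lower()),
        last=quote(last),
        first=quote(first.replace(" ", "_")),
    )


if __name__ == "__main__":
    print(author_query_to_url("Donald Knuth"))
```

Even this toy version has to guess where the surname starts, which is
the kind of ambiguity that makes a server with no general query string
awkward to put behind a search field.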