[ https://issues.apache.org/jira/browse/NUTCH-932?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12928355#action_12928355 ]
Andrzej Bialecki commented on NUTCH-932:
-----------------------------------------

Examples (with the db equivalent to the one in db.formatted.gz):

{code}
$ curl -s 'http://localhost:8192/nutch/db?fields=url&end=http://www.freebsd.org/&start=http://www.egothor.org/' | ./json_pp
[
   {
      "url": "http://www.egothor.org/"
   },
   {
      "url": "http://www.freebsd.org/"
   }
]
{code}

{code}
$ curl -s 'http://localhost:8192/nutch/db?fields=url,outlinks,markers,protocolStatus,parseStatus,contentType&start=http://www.getopt.org/&end=http://www.getopt.org/' | ./json_pp
[
   {
      "contentType": "text/html",
      "url": "http://www.getopt.org/",
      "markers": {
         "_updmrk_": "1288890451-1134865895"
      },
      "parseStatus": "success/ok (1/0), args=[]",
      "protocolStatus": "SUCCESS, args=[]",
      "outlinks": {
         "http://www.getopt.org/luke/": "Luke",
         "http://www.getopt.org/ecimf/contrib/ONTO/REA": "REA Ontology page",
         "http://www.getopt.org/CV.pdf": "CV here",
         "http://www.getopt.org/utils/build/api": "API",
         "http://svn.apache.org/viewvc/hadoop/hbase/trunk/src/java/org/apache/hadoop/hbase/util/JenkinsHash.java": "available here",
         "http://www.getopt.org/murmur/MurmurHash.java": "MurmurHash.java",
         "http://www.ebxml.org/": "ebXML / ebTWG",
         "http://www.freebsd.org/": "FreeBSD",
         "http://www.getopt.org/luke/webstart.html": "Launch with Java WebStart",
         "http://www.freebsd.org/%7Epicobsd": "PicoBSD",
         "http://home.comcast.net/~bretm/hash/6.html": "this discussion",
         "http://protege.stanford.edu/": "Protege",
         "http://jakarta.apache.org/lucene": "Lucene",
         "http://www.getopt.org/ecimf/contrib/ONTO/ebxml": "ebXML Ontology",
         "http://www.getopt.org/ecimf/": "here",
         "http://www.isthe.com/chongo/tech/comp/fnv/": "his website",
         "http://www.getopt.org/stempel/index.html": "Stempel",
         "http://www.sigram.com/": "SIGRAM",
         "http://www.egothor.org/": "Egothor",
         "http://thinlet.sourceforge.net/": "Thinlet",
         "http://www.getopt.org/utils/dist/utils-1.0.jar": "binary",
         "http://www.ecimf.org/": "ECIMF"
      }
   }
]
{code}

> Bulk REST API to retrieve crawl results as JSON
> -----------------------------------------------
>
>                 Key: NUTCH-932
>                 URL: https://issues.apache.org/jira/browse/NUTCH-932
>             Project: Nutch
>          Issue Type: New Feature
>          Components: REST_api
>    Affects Versions: 2.0
>            Reporter: Andrzej Bialecki
>            Assignee: Andrzej Bialecki
>         Attachments: db.formatted.gz, NUTCH-932.patch, NUTCH-932.patch
>
>
> It would be useful to be able to retrieve results of a crawl as JSON. There
> are a few things that need to be discussed:
> * how to return bulk results using Restlet (WritableRepresentation subclass?)
> * what should be the format of results?
> I think it would make sense to provide a single record retrieval (by primary
> key), all records, and records within a range. This incidentally matches well
> the capabilities of the Gora Query class :)

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
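
For anyone who wants to script the curl examples from the comment above, here is a minimal Python sketch of the same range query. The endpoint, host/port, and the `fields`, `start` and `end` parameters are copied from those examples. The helper names `build_db_url` and `fetch_records` are hypothetical and not part of the patch. It is also an assumption that `start` and `end` may be omitted; the comment only shows requests that pass both.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Endpoint as it appears in the curl examples; adjust for your setup.
BASE_URL = "http://localhost:8192/nutch/db"


def build_db_url(fields, start=None, end=None, base_url=BASE_URL):
    """Build a query URL for the db resource (hypothetical helper).

    fields: list of record field names, e.g. ["url", "outlinks"]
    start/end: range keys, as in the curl examples. Leaving them out
    is an assumption; the comment does not show such a request.
    """
    params = {"fields": ",".join(fields)}
    if start is not None:
        params["start"] = start
    if end is not None:
        params["end"] = end
    # Keep ':', '/' and ',' unescaped so the URL looks like the curl examples.
    return base_url + "?" + urlencode(params, safe=":/,")


def fetch_records(fields, start=None, end=None):
    """Fetch the matching records and decode the JSON array (hypothetical helper)."""
    with urlopen(build_db_url(fields, start, end)) as resp:
        return json.load(resp)
```

With a server running as in the examples, `fetch_records(["url"], "http://www.egothor.org/", "http://www.freebsd.org/")` should issue the same request as the first curl command, with the parameters in a different order.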