[jira] [Commented] (NUTCH-2153) Nutch REST API (DB) uses POST instead of GET to request
[ https://issues.apache.org/jira/browse/NUTCH-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14978854#comment-14978854 ]

Chris A. Mattmann commented on NUTCH-2153:
------------------------------------------

Yeah, I think we may want to do something async here too and use GET. Let's think about this. It may be a 1.12+ improvement though. At a minimum I think we can update to GET for 1.11.

> Nutch REST API (DB) uses POST instead of GET to request
> -------------------------------------------------------
>
>                 Key: NUTCH-2153
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2153
>             Project: Nutch
>          Issue Type: Bug
>          Components: REST_api
>    Affects Versions: 1.11
>            Reporter: Aron Ahmadia
>            Priority: Trivial
>              Labels: memex
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
[jira] [Commented] (NUTCH-2153) Nutch REST API (DB) uses POST instead of GET to request
[ https://issues.apache.org/jira/browse/NUTCH-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14978841#comment-14978841 ]

Aron Ahmadia commented on NUTCH-2153:
-------------------------------------

If it's asynchronous, use a POST and return a crawldb_job identifier that can be used to query if the job is complete.

I've got mixed feelings on the synchronous case. I'm happy to follow Mattmann's guidance on this.
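The asynchronous pattern described in this comment (a POST that returns an identifier, then a separate query for completion) could look roughly like the sketch below. This is not Nutch code: `CrawlDbJobRegistry`, `submit`, `status`, and the state names are hypothetical, and the HTTP layer is left out so that only the submit-then-poll logic is shown.

```python
import threading
import uuid


class CrawlDbJobRegistry:
    """Hypothetical in-memory job registry; not part of Nutch."""

    def __init__(self):
        self._jobs = {}
        self._lock = threading.Lock()

    def submit(self, work):
        """POST analogue: start `work` in the background, return an id at once."""
        job_id = uuid.uuid4().hex
        with self._lock:
            self._jobs[job_id] = {"state": "RUNNING", "result": None}

        def run():
            try:
                outcome = {"state": "FINISHED", "result": work()}
            except Exception as exc:  # record the failure instead of losing it
                outcome = {"state": "FAILED", "result": str(exc)}
            with self._lock:
                self._jobs[job_id] = outcome

        threading.Thread(target=run, daemon=True).start()
        return job_id

    def status(self, job_id):
        """GET analogue: report the current state without blocking."""
        with self._lock:
            return dict(self._jobs[job_id])
```

A client would call `submit` once, keep the returned identifier, and call `status` until the state is no longer `RUNNING`. That keeps the initial request short even when the crawldb job itself is slow.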
[jira] [Commented] (NUTCH-2153) Nutch REST API (DB) uses POST instead of GET to request
[ https://issues.apache.org/jira/browse/NUTCH-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14978826#comment-14978826 ]

Sujen Shah commented on NUTCH-2153:
-----------------------------------

Hi [~ahmadia] and [~chrismattmann],

Currently, when using the Nutch REST services in local mode, the crawldb job executes quickly. In distributed mode, however, the same crawldb job can take a fair amount of time, so a GET request would make the client wait a long time for the response.

A POST request was used because the crawldb resource is created when a user issues a request; it is not precomputed (which is usually the case when a GET is used). The /db endpoint still needs development so that it can spin up threads for computation like the /job endpoint does, and then provide a GET interface to query results.

I have tried to use the same concept in the commoncrawldump service, as that might also take longer as the amount of data crawled increases.

I would like to know your thoughts on handling such cases, where issuing a GET requires computation of the resource.

Thanks!
[jira] [Commented] (NUTCH-2153) Nutch REST API (DB) uses POST instead of GET to request
[ https://issues.apache.org/jira/browse/NUTCH-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14978769#comment-14978769 ]

Chris A. Mattmann commented on NUTCH-2153:
------------------------------------------

Gotcha, thanks [~ahmadia]

> Nutch REST API (DB) uses POST instead of GET to request
> -------------------------------------------------------
>
>                 Key: NUTCH-2153
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2153
>             Project: Nutch
>          Issue Type: Bug
>          Components: REST_api
>    Affects Versions: 1.10
>            Reporter: Aron Ahmadia
>            Priority: Trivial
>              Labels: memex
>
[jira] [Commented] (NUTCH-2153) Nutch REST API (DB) uses POST instead of GET to request
[ https://issues.apache.org/jira/browse/NUTCH-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14978764#comment-14978764 ]

Aron Ahmadia commented on NUTCH-2153:
-------------------------------------

The API from https://wiki.apache.org/nutch/Nutch_1.X_RESTAPI:

POST /db/crawldb with the following

{
   "type":"stats",
   "confId":"default",
   "crawlId":"crawl01",
   "args":{"someParam":"someValue"}
}

uses a POST to request information (stats). This should be a GET.
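To make the difference concrete, the sketch below builds the documented POST request next to one possible GET form, without sending either. The base URL is an assumption about where a Nutch server might be listening. The query-string layout of the GET variant is purely hypothetical: the thread does not specify one, and the nested `args` object is simply left out here.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "http://localhost:8081"  # assumption: adjust to your Nutch server

# Request body exactly as quoted from the wiki page above.
payload = {
    "type": "stats",
    "confId": "default",
    "crawlId": "crawl01",
    "args": {"someParam": "someValue"},
}


def build_post_request(base_url, body):
    """The documented form: parameters travel in a JSON body."""
    return urllib.request.Request(
        base_url + "/db/crawldb",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def build_get_request(base_url, body):
    """Hypothetical GET form: flat parameters moved into the query string."""
    flat = {key: value for key, value in body.items() if key != "args"}
    query = urllib.parse.urlencode(flat)
    return urllib.request.Request(base_url + "/db/crawldb?" + query, method="GET")


post_req = build_post_request(BASE_URL, payload)
get_req = build_get_request(BASE_URL, payload)
print(post_req.get_method(), post_req.full_url)
print(get_req.get_method(), get_req.full_url)
```

Passing either object to `urllib.request.urlopen` would actually send it. The sketch stops short of that so it can be run without a server.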
[jira] [Commented] (NUTCH-2153) Nutch REST API (DB) uses POST instead of GET to request
[ https://issues.apache.org/jira/browse/NUTCH-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14978748#comment-14978748 ]

Chris A. Mattmann commented on NUTCH-2153:
------------------------------------------

can you be more specific here, [~ahmadia]?