[jira] [Commented] (NUTCH-2149) REST endpoint to read Nutch sequence files
[ https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974207#comment-14974207 ]

Aron Ahmadia commented on NUTCH-2149:
-------------------------------------

Hi Sujen and Chris. I think I'm missing something in this API.

Looking through, most of the services seem to require a "full path" to the file that should be read. This seems like both a security risk (allowing the user to read arbitrary files on the file system) and an unnecessary load on the API. Does Nutch have some concept of a "root" directory for each configuration?

Also, it's not clear how to get a listing of the link, node, and sequence files. Is this available somewhere else in the REST API? How would I (as a REST interface user) know what path to provide?

> REST endpoint to read Nutch sequence files
> ------------------------------------------
>
>                 Key: NUTCH-2149
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2149
>             Project: Nutch
>          Issue Type: New Feature
>          Components: REST_api
>            Reporter: Sujen Shah
>            Assignee: Sujen Shah
>              Labels: memex
>             Fix For: 1.12
>
>
> This endpoint enables reading of the webgraph data like nodes, links and any
> other sequence file in the Nutch ecosystem via a RESTful interface.
> The current API documentation for this Reader endpoint is available at -
> http://docs.nutchpytonutchrestapi.apiary.io/
> Thanks to https://github.com/ContinuumIO/nutchpy for the initial work.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
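
To make the "full path" concern in the comment above concrete, here is a minimal sketch of how a service could confine a user-supplied path to a configured root directory. This is not code from the NUTCH-2149 patch: the class `SafePathResolver`, the method `resolveUnderRoot`, and the root `/data/nutch/crawl` are hypothetical names chosen for illustration, and the sketch assumes the reader endpoint would accept a path relative to such a root.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Nutch: resolves a requested path
// against a fixed root and rejects anything that would leave it.
public class SafePathResolver {

    public static Path resolveUnderRoot(Path root, String requested) {
        // Normalize the root once so the comparison below is meaningful.
        Path base = root.toAbsolutePath().normalize();
        // resolve() returns the requested path unchanged if it is absolute;
        // normalize() collapses "." and ".." segments.
        Path candidate = base.resolve(requested).normalize();
        // startsWith() compares whole path elements, not string prefixes.
        if (!candidate.startsWith(base)) {
            throw new IllegalArgumentException(
                "Path escapes the configured root: " + requested);
        }
        return candidate;
    }

    public static void main(String[] args) {
        // Assumed root directory, for illustration only.
        Path root = Paths.get("/data/nutch/crawl");
        System.out.println(resolveUnderRoot(root, "webgraph/nodes"));
        try {
            resolveUnderRoot(root, "../../etc/passwd");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The check is purely lexical: it does not follow symbolic links, so a deployment that cares about links inside the root would need an additional check (for example with `Path.toRealPath()` on existing files).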
[jira] [Commented] (NUTCH-2149) REST endpoint to read Nutch sequence files
[ https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973383#comment-14973383 ]

Hudson commented on NUTCH-2149:
-------------------------------

SUCCESS: Integrated in Nutch-trunk #3295 (See [https://builds.apache.org/job/Nutch-trunk/3295/])
NUTCH-2149 REST endpoint to read Nutch sequence files (Sujen Shah) (sujen: [http://svn.apache.org/viewvc/nutch/trunk/?view=rev&rev=1710468])
* trunk/CHANGES.txt
* trunk/src/java/org/apache/nutch/service/NutchReader.java
* trunk/src/java/org/apache/nutch/service/NutchServer.java
* trunk/src/java/org/apache/nutch/service/impl/LinkReader.java
* trunk/src/java/org/apache/nutch/service/impl/NodeReader.java
* trunk/src/java/org/apache/nutch/service/impl/SequenceReader.java
* trunk/src/java/org/apache/nutch/service/model/request/ReaderConfig.java
* trunk/src/java/org/apache/nutch/service/resources/ReaderResouce.java
[jira] [Commented] (NUTCH-2149) REST endpoint to read Nutch sequence files
[ https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973372#comment-14973372 ]

Sujen Shah commented on NUTCH-2149:
-----------------------------------

Ohh I didn't know that, will do that from now on. Thanks :)
[jira] [Commented] (NUTCH-2149) REST endpoint to read Nutch sequence files
[ https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973368#comment-14973368 ]

Chris A. Mattmann commented on NUTCH-2149:
------------------------------------------

[~sujenshah] for the future, reference the GitHub issue (i.e. say "this closes #80") in your commit message and the asfgit user will close the issue on GitHub for ya.
[jira] [Commented] (NUTCH-2149) REST endpoint to read Nutch sequence files
[ https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973360#comment-14973360 ]

Sujen Shah commented on NUTCH-2149:
-----------------------------------

Committed 1710468
[jira] [Commented] (NUTCH-2149) REST endpoint to read Nutch sequence files
[ https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973355#comment-14973355 ]

ASF GitHub Bot commented on NUTCH-2149:
---------------------------------------

GitHub user sujen1412 closed the pull request at:

    https://github.com/apache/nutch/pull/80
[jira] [Commented] (NUTCH-2149) REST endpoint to read Nutch sequence files
[ https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14970581#comment-14970581 ]

ASF GitHub Bot commented on NUTCH-2149:
---------------------------------------

GitHub user sujen1412 opened a pull request:

    https://github.com/apache/nutch/pull/80

    NUTCH-2149 REST endpoint to read Nutch sequence files

    Removed debug statement
    Corrected indentation and line width

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sujen1412/nutch NUTCH-2149

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nutch/pull/80.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #80

commit 0f71d8d6cab9f88421176beb61ed82efc4e20662
Author: Sujen Shah
Date:   2015-10-23T02:49:50Z

    Reader endpoint for Nutch REST

    Removed debug statement

    Corrected indentation and line width