[jira] [Commented] (NUTCH-2149) REST endpoint to read Nutch sequence files

2015-10-26 Thread Aron Ahmadia (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14974207#comment-14974207
 ] 

Aron Ahmadia commented on NUTCH-2149:
-

Hi Sujen and Chris.

I think I'm missing something in this API.  Looking through, most of the 
services seem to require a "full path" to the file that should be read.  This 
seems like both a security risk (allowing the user to read arbitrary files on 
the file system) and an unnecessary load on the API.  Does Nutch have some 
concept of a "root" directory for each configuration?  

Also, it's not clear how to get a listing of the link, node, and sequence 
files.  Is this available somewhere else in the REST API?  How would I (as a 
REST interface user) know what path to provide?
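
To illustrate the concern about full paths, here is a minimal sketch (not Nutch's actual implementation) of how a service could resolve a client-supplied relative path against a configured root directory and reject anything that escapes it. The function name `resolve_under_root` and the notion of a per-configuration root are assumptions for illustration only; nothing in this issue confirms that Nutch provides either.

```python
from pathlib import Path


def resolve_under_root(root: str, requested: str) -> Path:
    """Resolve `requested` relative to `root`; reject paths that escape it.

    Hypothetical helper: `root` stands in for a per-configuration
    "root" directory, which is an assumption, not a known Nutch setting.
    """
    root_path = Path(root).resolve()
    # Joining with an absolute `requested` discards root_path entirely,
    # and resolve() collapses ".." segments, so both kinds of escape
    # end up outside root_path and are caught by the check below.
    candidate = (root_path / requested).resolve()
    if candidate != root_path and root_path not in candidate.parents:
        raise ValueError(f"path escapes root: {requested!r}")
    return candidate
```

With a check like this, a request for `webgraph/nodes/part-00000` would be resolved inside the root, while `../../etc/passwd` or `/etc/passwd` would raise `ValueError` instead of being read.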

> REST endpoint to read Nutch sequence files
> --
>
> Key: NUTCH-2149
> URL: https://issues.apache.org/jira/browse/NUTCH-2149
> Project: Nutch
> Issue Type: New Feature
> Components: REST_api
> Reporter: Sujen Shah
> Assignee: Sujen Shah
> Labels: memex
> Fix For: 1.12
>
>
> This endpoint enables reading of the webgraph data like nodes, links and any 
> other sequence file in the Nutch ecosystem via a RESTful interface. 
> The current API documentation for this Reader endpoint is available at - 
> http://docs.nutchpytonutchrestapi.apiary.io/
> Thanks to https://github.com/ContinuumIO/nutchpy for the initial work. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (NUTCH-2149) REST endpoint to read Nutch sequence files

2015-10-25 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973383#comment-14973383
 ] 

Hudson commented on NUTCH-2149:
---

SUCCESS: Integrated in Nutch-trunk #3295 (See 
[https://builds.apache.org/job/Nutch-trunk/3295/])
NUTCH-2149 REST endpoint to read Nutch sequence files (Sujen Shah) (sujen: 
[http://svn.apache.org/viewvc/nutch/trunk/?view=rev&rev=1710468])
* trunk/CHANGES.txt
* trunk/src/java/org/apache/nutch/service/NutchReader.java
* trunk/src/java/org/apache/nutch/service/NutchServer.java
* trunk/src/java/org/apache/nutch/service/impl/LinkReader.java
* trunk/src/java/org/apache/nutch/service/impl/NodeReader.java
* trunk/src/java/org/apache/nutch/service/impl/SequenceReader.java
* trunk/src/java/org/apache/nutch/service/model/request/ReaderConfig.java
* trunk/src/java/org/apache/nutch/service/resources/ReaderResouce.java




[jira] [Commented] (NUTCH-2149) REST endpoint to read Nutch sequence files

2015-10-25 Thread Sujen Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973372#comment-14973372
 ] 

Sujen Shah commented on NUTCH-2149:
---

Oh, I didn't know that. I'll do that from now on. Thanks :)



[jira] [Commented] (NUTCH-2149) REST endpoint to read Nutch sequence files

2015-10-25 Thread Chris A. Mattmann (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973368#comment-14973368
 ] 

Chris A. Mattmann commented on NUTCH-2149:
--

In future commit messages, [~sujenshah], reference the GitHub pull request
(e.g. say "This closes #80") and the asfgit user will close it on GitHub
for you.



[jira] [Commented] (NUTCH-2149) REST endpoint to read Nutch sequence files

2015-10-25 Thread Sujen Shah (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973360#comment-14973360
 ] 

Sujen Shah commented on NUTCH-2149:
---

Committed 1710468



[jira] [Commented] (NUTCH-2149) REST endpoint to read Nutch sequence files

2015-10-25 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973355#comment-14973355
 ] 

ASF GitHub Bot commented on NUTCH-2149:
---

Github user sujen1412 closed the pull request at:

https://github.com/apache/nutch/pull/80




[jira] [Commented] (NUTCH-2149) REST endpoint to read Nutch sequence files

2015-10-23 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/NUTCH-2149?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14970581#comment-14970581
 ] 

ASF GitHub Bot commented on NUTCH-2149:
---

GitHub user sujen1412 opened a pull request:

https://github.com/apache/nutch/pull/80

NUTCH-2149 REST endpoint to read Nutch sequence files

Removed debug statement

Corrected indentation and line width

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/sujen1412/nutch NUTCH-2149

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/nutch/pull/80.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #80


commit 0f71d8d6cab9f88421176beb61ed82efc4e20662
Author: Sujen Shah 
Date:   2015-10-23T02:49:50Z

Reader endpoint for Nutch REST

Removed debug statement

Corrected indentation and line width



