[jira] [Commented] (NUTCH-2141) Change the InteractiveSelenium plugin handler Interface to return page content

2015-10-15 Thread Balaji Gurumurthy (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959627#comment-14959627 ] Balaji Gurumurthy commented on NUTCH-2141: -- When we concatenate the content from multiple pages

[jira] [Commented] (NUTCH-2141) Change the InteractiveSelenium plugin handler Interface to return page content

2015-10-15 Thread Michael Joyce (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959659#comment-14959659 ] Michael Joyce commented on NUTCH-2141: -- Cool, makes sense. Do you have any examples? I'd like to poke

[jira] [Created] (NUTCH-2142) Nutch File Dump - FileNotFoundException (Invalid Argument) Error

2015-10-15 Thread Karanjeet Singh (JIRA)
Karanjeet Singh created NUTCH-2142: -- Summary: Nutch File Dump - FileNotFoundException (Invalid Argument) Error Key: NUTCH-2142 URL: https://issues.apache.org/jira/browse/NUTCH-2142 Project: Nutch

[jira] [Commented] (NUTCH-2139) Basic plugin to index inlinks and outlinks

2015-10-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959565#comment-14959565 ] ASF GitHub Bot commented on NUTCH-2139: --- Github user jorgelbg commented on a diff in the pull

[jira] [Created] (NUTCH-2143) GeneratorJob ignores batch id passed as argument

2015-10-15 Thread Sebastian Nagel (JIRA)
Sebastian Nagel created NUTCH-2143: -- Summary: GeneratorJob ignores batch id passed as argument Key: NUTCH-2143 URL: https://issues.apache.org/jira/browse/NUTCH-2143 Project: Nutch Issue

[GitHub] nutch pull request: Fix for NUTCH-2139 contributed by jorgelbg

2015-10-15 Thread jorgelbg
Github user jorgelbg commented on a diff in the pull request: https://github.com/apache/nutch/pull/78#discussion_r42176268 --- Diff: conf/nutch-default.xml --- @@ -1896,4 +1896,33 @@ CAUTION: Set the parser.timeout to -1 or a bigger value than 30, when using this

[GitHub] nutch pull request: Fix for NUTCH-2139 contributed by jorgelbg

2015-10-15 Thread jorgelbg
GitHub user jorgelbg opened a pull request: https://github.com/apache/nutch/pull/78 Fix for NUTCH-2139 contributed by jorgelbg Basic indexing capabilities for inlinks and outlinks. You can merge this pull request into a Git repository by running: $ git pull

[jira] [Updated] (NUTCH-2139) Basic plugin to index inlinks and outlinks

2015-10-15 Thread Jorge Luis Betancourt Gonzalez (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jorge Luis Betancourt Gonzalez updated NUTCH-2139: -- External issue ID: https://github.com/apache/nutch/pull/78 >

[jira] [Commented] (NUTCH-2139) Basic plugin to index inlinks and outlinks

2015-10-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959186#comment-14959186 ] ASF GitHub Bot commented on NUTCH-2139: --- GitHub user jorgelbg opened a pull request:

[GitHub] nutch pull request: Fix for NUTCH-2139 contributed by jorgelbg

2015-10-15 Thread jorgelbg
Github user jorgelbg commented on a diff in the pull request: https://github.com/apache/nutch/pull/78#discussion_r42170226 --- Diff: conf/nutch-default.xml --- @@ -1896,4 +1896,33 @@ CAUTION: Set the parser.timeout to -1 or a bigger value than 30, when using this

[jira] [Commented] (NUTCH-2139) Basic plugin to index inlinks and outlinks

2015-10-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959489#comment-14959489 ] ASF GitHub Bot commented on NUTCH-2139: --- Github user jorgelbg commented on a diff in the pull

[jira] [Commented] (NUTCH-2141) Change the InteractiveSelenium plugin handler Interface to return page content

2015-10-15 Thread Michael Joyce (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2141?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959345#comment-14959345 ] Michael Joyce commented on NUTCH-2141: -- This was actually brought up in NUTCH-2108. There's also an

[jira] [Commented] (NUTCH-2139) Basic plugin to index inlinks and outlinks

2015-10-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959349#comment-14959349 ] ASF GitHub Bot commented on NUTCH-2139: --- Github user sebastian-nagel commented on a diff in the pull

[GitHub] nutch pull request: Fix for NUTCH-2139 contributed by jorgelbg

2015-10-15 Thread sebastian-nagel
Github user sebastian-nagel commented on a diff in the pull request: https://github.com/apache/nutch/pull/78#discussion_r42160948 --- Diff: src/plugin/index-links/src/java/org/apache/nutch/indexer/links/LinksIndexingFilter.java --- @@ -0,0 +1,168 @@ +/** + * Licensed to

[GitHub] nutch pull request: Fix for NUTCH-2139 contributed by jorgelbg

2015-10-15 Thread sebastian-nagel
Github user sebastian-nagel commented on a diff in the pull request: https://github.com/apache/nutch/pull/78#discussion_r42160759 --- Diff: src/plugin/index-links/src/java/org/apache/nutch/indexer/links/LinksIndexingFilter.java --- @@ -0,0 +1,168 @@ +/** + * Licensed to

[GitHub] nutch pull request: Fix for NUTCH-2139 contributed by jorgelbg

2015-10-15 Thread jorgelbg
Github user jorgelbg commented on a diff in the pull request: https://github.com/apache/nutch/pull/78#discussion_r42168120 --- Diff: src/plugin/index-links/src/java/org/apache/nutch/indexer/links/LinksIndexingFilter.java --- @@ -0,0 +1,168 @@ +/** + * Licensed to the

[GitHub] nutch pull request: Fix for NUTCH-2139 contributed by jorgelbg

2015-10-15 Thread sebastian-nagel
Github user sebastian-nagel commented on a diff in the pull request: https://github.com/apache/nutch/pull/78#discussion_r42163943 --- Diff: conf/nutch-default.xml --- @@ -1896,4 +1896,33 @@ CAUTION: Set the parser.timeout to -1 or a bigger value than 30, when using this

[jira] [Commented] (NUTCH-2139) Basic plugin to index inlinks and outlinks

2015-10-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959396#comment-14959396 ] ASF GitHub Bot commented on NUTCH-2139: --- Github user sebastian-nagel commented on a diff in the pull

[GitHub] nutch pull request: Fix for NUTCH-2139 contributed by jorgelbg

2015-10-15 Thread sebastian-nagel
Github user sebastian-nagel commented on a diff in the pull request: https://github.com/apache/nutch/pull/78#discussion_r42164537 --- Diff: conf/nutch-default.xml --- @@ -1896,4 +1896,33 @@ CAUTION: Set the parser.timeout to -1 or a bigger value than 30, when using this

[jira] [Commented] (NUTCH-2139) Basic plugin to index inlinks and outlinks

2015-10-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959407#comment-14959407 ] ASF GitHub Bot commented on NUTCH-2139: --- Github user sebastian-nagel commented on a diff in the pull

[jira] [Commented] (NUTCH-2139) Basic plugin to index inlinks and outlinks

2015-10-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959460#comment-14959460 ] ASF GitHub Bot commented on NUTCH-2139: --- Github user jorgelbg commented on a diff in the pull

[jira] [Commented] (NUTCH-2139) Basic plugin to index inlinks and outlinks

2015-10-15 Thread ASF GitHub Bot (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-2139?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14959352#comment-14959352 ] ASF GitHub Bot commented on NUTCH-2139: --- Github user sebastian-nagel commented on a diff in the pull

[GitHub] nutch pull request: Fix for NUTCH-2139 contributed by jorgelbg

2015-10-15 Thread sebastian-nagel
Github user sebastian-nagel commented on a diff in the pull request: https://github.com/apache/nutch/pull/78#discussion_r42161108 --- Diff: src/plugin/index-links/src/java/org/apache/nutch/indexer/links/LinksIndexingFilter.java --- @@ -0,0 +1,168 @@ +/** + * Licensed to