[
https://issues.apache.org/jira/browse/NUTCH-2091?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Asitang Mishra updated NUTCH-2091:
--
Priority: Major (was: Minor)
> Increase robustness and crawling versatility of Nutch for the De
[
https://issues.apache.org/jira/browse/NUTCH-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933969#comment-14933969
]
Hudson commented on NUTCH-2086:
---
SUCCESS: Integrated in Nutch-trunk #3285 (See
[https://bui
[
https://issues.apache.org/jira/browse/NUTCH-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933845#comment-14933845
]
Asitang Mishra commented on NUTCH-2110:
---
To keep everything under one single url in
[
https://issues.apache.org/jira/browse/NUTCH-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933808#comment-14933808
]
Lewis John McGibbney commented on NUTCH-2086:
-
Folks this is committed @revisi
Asitang Mishra created NUTCH-2127:
-
Summary: Provide the selenium protocol with basic authentication
capabilities.
Key: NUTCH-2127
URL: https://issues.apache.org/jira/browse/NUTCH-2127
Project: Nutch
[
https://issues.apache.org/jira/browse/NUTCH-2110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Asitang Mishra updated NUTCH-2110:
--
Description: Create the capability to provide seeds in the form of
"url+xpath(including option t
[
https://issues.apache.org/jira/browse/NUTCH-2126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Asitang Mishra updated NUTCH-2126:
--
Summary: Use selenium protocol for specific sites (was: Use selenium
protocol for specific site
Asitang Mishra created NUTCH-2126:
-
Summary: Use selenium protocol for specific sites when switched on
Key: NUTCH-2126
URL: https://issues.apache.org/jira/browse/NUTCH-2126
Project: Nutch
Is
[
https://issues.apache.org/jira/browse/NUTCH-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Asitang Mishra updated NUTCH-2108:
--
Priority: Major (was: Minor)
> Add a function to the selenium interactive plugin interface to d
[
https://issues.apache.org/jira/browse/NUTCH-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2124:
---
Patch Info: Patch Available
> redirect following same link again and again , max redirect exce
[
https://issues.apache.org/jira/browse/NUTCH-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2124:
---
Attachment: NUTCH-2124.patch
Somehow the fix for NUTCH-1939 got lost when Fetcher was refacto
[
https://issues.apache.org/jira/browse/NUTCH-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933626#comment-14933626
]
ASF GitHub Bot commented on NUTCH-2108:
---
GitHub user asitang opened a pull request:
https://github.com/apache/nutch/pull/67
NUTCH-2108
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/asitang/nutch NUTCH-2108
Alternatively you can review and apply these changes as the p
[
https://issues.apache.org/jira/browse/NUTCH-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933605#comment-14933605
]
ASF GitHub Bot commented on NUTCH-2108:
---
Github user asitang closed the pull request at:
https://github.com/apache/nutch/pull/66
[
https://issues.apache.org/jira/browse/NUTCH-2108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933601#comment-14933601
]
ASF GitHub Bot commented on NUTCH-2108:
---
GitHub user asitang opened a pull request:
https://github.com/apache/nutch/pull/66
Added support for NUTCH-2108 and NUTCH-2109
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/asitang/nutch NUTCH-2091
Alternatively you can review
[
https://issues.apache.org/jira/browse/NUTCH-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kim Whitehall updated NUTCH-2125:
-
Summary: Metrics tool for relevancy (was: Metrics)
> Metrics tool for relevancy
> ---
[
https://issues.apache.org/jira/browse/NUTCH-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Kim Whitehall updated NUTCH-2125:
-
Description:
Purpose: a metric for determining if the “relevancy” of a crawl after each
round and
Kim Whitehall created NUTCH-2125:
Summary: Metrics
Key: NUTCH-2125
URL: https://issues.apache.org/jira/browse/NUTCH-2125
Project: Nutch
Issue Type: Improvement
Components: tool
I don't see any null pointer exceptions coming up in your log. Do you have
any more info or perhaps I'm missing something?
-- Jimmy
On Sun, Sep 27, 2015 at 3:04 PM, mithun wrote:
> Hi All
>
> While crawling my seed list, I bumped into this Null Pointer Exception for
> a few URLs. What could be t
shouldProcessURL simply takes a URL and returns true or false to indicate
whether the handler should process that URL. What logic your handler uses to
decide whether to process a URL is up to you. You'll note that the simple
example in the codebase [1] simply returns true, a.k.a.
proc
[
https://issues.apache.org/jira/browse/NUTCH-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2124:
---
Fix Version/s: 1.11
> redirect following same link again and again , max redirect exceed and w
[
https://issues.apache.org/jira/browse/NUTCH-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14933369#comment-14933369
]
Sebastian Nagel commented on NUTCH-2124:
Confirmed. Thanks! To reproduce with the
[
https://issues.apache.org/jira/browse/NUTCH-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Sebastian Nagel updated NUTCH-2124:
---
Priority: Blocker (was: Major)
> redirect following same link again and again , max redirect
[
https://issues.apache.org/jira/browse/NUTCH-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yogendra Kumar Soni updated NUTCH-2124:
---
Description:
Hello, followredirect is not working in trunk. Please see the log below.
[
https://issues.apache.org/jira/browse/NUTCH-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yogendra Kumar Soni updated NUTCH-2124:
---
Flags: Important
Labels: db_gone fetcher redirect (was: )
Descripti
Yogendra Kumar Soni created NUTCH-2124:
--
Summary: redirect following same link again and again , max
redirect exceed and went db_gone
Key: NUTCH-2124
URL: https://issues.apache.org/jira/browse/NUTCH-2124