[jira] [Commented] (NUTCH-3013) Employ commons-lang3's StopWatch to simplify timing logic

2023-10-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778206#comment-17778206 ] Hudson commented on NUTCH-3013: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #134 (See

[jira] [Resolved] (NUTCH-3013) Employ commons-lang3's StopWatch to simplify timing logic

2023-10-21 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney resolved NUTCH-3013. - Resolution: Fixed Thanks for the review [~snagel]  > Employ commons-lang3's

[jira] [Closed] (NUTCH-3013) Employ commons-lang3's StopWatch to simplify timing logic

2023-10-21 Thread Lewis John McGibbney (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney closed NUTCH-3013. --- > Employ commons-lang3's StopWatch to simplify timing logic >

[jira] [Commented] (NUTCH-3013) Employ commons-lang3's StopWatch to simplify timing logic

2023-10-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3013?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778181#comment-17778181 ] ASF GitHub Bot commented on NUTCH-3013: --- lewismc merged PR #788: URL:

Re: [PR] NUTCH-3013 Employ commons-lang3's StopWatch to simplify timing logic [nutch]

2023-10-21 Thread via GitHub
lewismc merged PR #788: URL: https://github.com/apache/nutch/pull/788 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@nutch.apache.org

[jira] [Commented] (NUTCH-3012) SegmentReader when dumping with option -recode: NPE on unparsed documents

2023-10-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778135#comment-17778135 ] Hudson commented on NUTCH-3012: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #133 (See

[jira] [Commented] (NUTCH-3011) HttpRobotRulesParser: handle HTTP 429 Too Many Requests same as server errors (HTTP 5xx)

2023-10-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778136#comment-17778136 ] Hudson commented on NUTCH-3011: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #133 (See

[jira] [Commented] (NUTCH-2990) HttpRobotRulesParser to follow 5 redirects as specified by RFC 9309

2023-10-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778123#comment-17778123 ] Hudson commented on NUTCH-2990: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #132 (See

[jira] [Commented] (NUTCH-3002) Protocol-okhttp HttpResponse: HTTP header metadata lookup should be case-insensitive

2023-10-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778124#comment-17778124 ] Hudson commented on NUTCH-3002: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #132 (See

[jira] [Commented] (NUTCH-3009) Upgrade to Hadoop 3.3.6

2023-10-21 Thread Hudson (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778125#comment-17778125 ] Hudson commented on NUTCH-3009: --- SUCCESS: Integrated in Jenkins build Nutch » Nutch-trunk #132 (See

Re: [PR] NUTCH-3012 SegmentReader when dumping with option -recode: NPE on unparsed documents [nutch]

2023-10-21 Thread via GitHub
sebastian-nagel merged PR #787: URL: https://github.com/apache/nutch/pull/787

[jira] [Commented] (NUTCH-3012) SegmentReader when dumping with option -recode: NPE on unparsed documents

2023-10-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3012?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778116#comment-17778116 ] ASF GitHub Bot commented on NUTCH-3012: --- sebastian-nagel merged PR #787: URL:

[jira] [Resolved] (NUTCH-3012) SegmentReader when dumping with option -recode: NPE on unparsed documents

2023-10-21 Thread Sebastian Nagel (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3012?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-3012. Resolution: Fixed > SegmentReader when dumping with option -recode: NPE on unparsed

[jira] [Resolved] (NUTCH-3011) HttpRobotRulesParser: handle HTTP 429 Too Many Requests same as server errors (HTTP 5xx)

2023-10-21 Thread Sebastian Nagel (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-3011. Resolution: Implemented > HttpRobotRulesParser: handle HTTP 429 Too Many Requests same as

[jira] [Commented] (NUTCH-3011) HttpRobotRulesParser: handle HTTP 429 Too Many Requests same as server errors (HTTP 5xx)

2023-10-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778115#comment-17778115 ] ASF GitHub Bot commented on NUTCH-3011: --- sebastian-nagel merged PR #786: URL:

Re: [PR] NUTCH-3011 HttpRobotRulesParser: handle HTTP 429 Too Many Requests same as server errors (HTTP 5xx) [nutch]

2023-10-21 Thread via GitHub
sebastian-nagel merged PR #786: URL: https://github.com/apache/nutch/pull/786

[jira] [Commented] (NUTCH-2990) HttpRobotRulesParser to follow 5 redirects as specified by RFC 9309

2023-10-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778108#comment-17778108 ] ASF GitHub Bot commented on NUTCH-2990: --- sebastian-nagel merged PR #779: URL:

[jira] [Resolved] (NUTCH-2990) HttpRobotRulesParser to follow 5 redirects as specified by RFC 9309

2023-10-21 Thread Sebastian Nagel (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-2990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-2990. Resolution: Implemented Thanks, everybody! > HttpRobotRulesParser to follow 5 redirects

Re: [PR] NUTCH-2990 HttpRobotRulesParser to follow 5 redirects as specified by RFC 9309 [nutch]

2023-10-21 Thread via GitHub
sebastian-nagel merged PR #779: URL: https://github.com/apache/nutch/pull/779

[jira] [Commented] (NUTCH-3009) Upgrade to Hadoop 3.3.6

2023-10-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778107#comment-17778107 ] ASF GitHub Bot commented on NUTCH-3009: --- sebastian-nagel merged PR #782: URL:

[jira] [Assigned] (NUTCH-3009) Upgrade to Hadoop 3.3.6

2023-10-21 Thread Sebastian Nagel (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel reassigned NUTCH-3009: -- Assignee: Sebastian Nagel > Upgrade to Hadoop 3.3.6 > --- > >

[jira] [Resolved] (NUTCH-3009) Upgrade to Hadoop 3.3.6

2023-10-21 Thread Sebastian Nagel (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3009?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-3009. Resolution: Implemented > Upgrade to Hadoop 3.3.6 > --- > >

Re: [PR] NUTCH-3009 Upgrade to Hadoop 3.3.6 [nutch]

2023-10-21 Thread via GitHub
sebastian-nagel merged PR #782: URL: https://github.com/apache/nutch/pull/782

[jira] [Resolved] (NUTCH-3006) Downgrade Tika dependency to 2.2.1 (core and parse-tika)

2023-10-21 Thread Sebastian Nagel (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3006?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-3006. Fix Version/s: (was: 1.20) Resolution: Abandoned > Downgrade Tika dependency to

[jira] [Assigned] (NUTCH-3002) Protocol-okhttp HttpResponse: HTTP header metadata lookup should be case-insensitive

2023-10-21 Thread Sebastian Nagel (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel reassigned NUTCH-3002: -- Assignee: Sebastian Nagel > Protocol-okhttp HttpResponse: HTTP header metadata lookup

[jira] [Resolved] (NUTCH-3002) Protocol-okhttp HttpResponse: HTTP header metadata lookup should be case-insensitive

2023-10-21 Thread Sebastian Nagel (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sebastian Nagel resolved NUTCH-3002. Resolution: Fixed > Protocol-okhttp HttpResponse: HTTP header metadata lookup should be >

[jira] [Commented] (NUTCH-3002) Protocol-okhttp HttpResponse: HTTP header metadata lookup should be case-insensitive

2023-10-21 Thread ASF GitHub Bot (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3002?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778105#comment-17778105 ] ASF GitHub Bot commented on NUTCH-3002: --- sebastian-nagel merged PR #777: URL:

Re: [PR] NUTCH-3002 Protocol-okhttp HttpResponse: HTTP header metadata lookup should be case-insensitive [nutch]

2023-10-21 Thread via GitHub
sebastian-nagel merged PR #777: URL: https://github.com/apache/nutch/pull/777

[jira] [Commented] (NUTCH-3014) Standardize NutchJob job names

2023-10-21 Thread Sebastian Nagel (Jira)
[ https://issues.apache.org/jira/browse/NUTCH-3014?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17778103#comment-17778103 ] Sebastian Nagel commented on NUTCH-3014: If there is a single data name/directory (CrawlDb,

[jira] [Created] (NUTCH-3014) Standardize NutchJob job names

2023-10-21 Thread Lewis John McGibbney (Jira)
Lewis John McGibbney created NUTCH-3014: --- Summary: Standardize NutchJob job names Key: NUTCH-3014 URL: https://issues.apache.org/jira/browse/NUTCH-3014 Project: Nutch Issue Type: