Re: [PR] [NUTCH-3025] urlfilter-fast to filter based on the length of the URL [nutch]

2023-11-08 Thread via GitHub
sebastian-nagel commented on PR #796: URL: https://github.com/apache/nutch/pull/796#issuecomment-1802531264 Thanks, @jnioche! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific

Re: [PR] [NUTCH-3025] urlfilter-fast to filter based on the length of the URL [nutch]

2023-11-08 Thread via GitHub
sebastian-nagel merged PR #796: URL: https://github.com/apache/nutch/pull/796

Re: [PR] [NUTCH-3025] urlfilter-fast to filter based on the length of the URL [nutch]

2023-11-08 Thread via GitHub
jnioche commented on PR #796: URL: https://github.com/apache/nutch/pull/796#issuecomment-1801938355 @sebastian-nagel merged the changes from master and made a few improvements

Re: [PR] [NUTCH-3025] urlfilter-fast to filter based on the length of the URL [nutch]

2023-11-07 Thread via GitHub
jnioche commented on PR #796: URL: https://github.com/apache/nutch/pull/796#issuecomment-1798221743 Writing a test for this is an absolute pain. In real use, a filter's setConf method is called and the rules are loaded using _getConfResourceAsReader_, i.e.

Re: [PR] [NUTCH-3025] urlfilter-fast to filter based on the length of the URL [nutch]

2023-11-07 Thread via GitHub
jnioche commented on code in PR #796: URL: https://github.com/apache/nutch/pull/796#discussion_r1384621727 ## src/plugin/urlfilter-fast/src/java/org/apache/nutch/urlfilter/fast/FastURLFilter.java: ## @@ -97,9 +97,17 @@ public class FastURLFilter implements URLFilter {

Re: [PR] [NUTCH-3025] urlfilter-fast to filter based on the length of the URL [nutch]

2023-11-07 Thread via GitHub
sebastian-nagel commented on code in PR #796: URL: https://github.com/apache/nutch/pull/796#discussion_r1384536930 ## src/plugin/urlfilter-fast/src/java/org/apache/nutch/urlfilter/fast/FastURLFilter.java: ## @@ -97,9 +97,17 @@ public class FastURLFilter implements URLFilter {