[jira] [Updated] (NUTCH-2856) Implement a protocol-smb plugin based on hierynomus/smbj
[ https://issues.apache.org/jira/browse/NUTCH-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Lewis John McGibbney updated NUTCH-2856:
    Summary: Implement a protocol-smb plugin based on hierynomus/smbj (was: Implement a protocol-smb plugin based on )
> Implement a protocol-smb plugin based on hierynomus/smbj
>
> Key: NUTCH-2856
> URL: https://issues.apache.org/jira/browse/NUTCH-2856
> Project: Nutch
> Issue Type: New Feature
> Components: external, plugin, protocol
> Reporter: Hiran Chaudhuri
> Assignee: Lewis John McGibbney
> Priority: Major
> Fix For: 1.19
>
> The plugin protocol-smb advertized on [https://cwiki.apache.org/confluence/display/NUTCH/PluginCentral] actually refers to the JCIFS library. According to this library's homepage [https://www.jcifs.org/]:
> _If you're looking for the latest and greatest open source Java SMB library, this is not it. JCIFS has been in maintenance-mode-only for several years and although what it does support works fine (SMB1, NTLMv2, midlc, MSRPC and various utility classes), jCIFS does not support the newer SMB2/3 variants of the SMB protocol which is slowly becoming required (Windows 10 requires SMB2/3). JCIFS only supports SMB1 but Microsoft has deprecated SMB1 in their products. *So if SMB1 is disabled on your network, JCIFS' file related operations will NOT work.*_
> Looking at [https://en.wikipedia.org/wiki/Server_Message_Block#SMB_/_CIFS_/_SMB1:|https://en.wikipedia.org/wiki/Server_Message_Block#SMB_/_CIFS_/_SMB1]
> _Microsoft added SMB1 to the Windows Server 2012 R2 deprecation list in June 2013. Windows Server 2016 and some versions of Windows 10 Fall Creators Update do not have SMB1 installed by default._
> As a conclusion, the chances that SMB1 protocol is installed and/or configured are getting vastly smaller. Therefore some migration towards SMB2/3 is required. Luckily the JCIFS homepage lists alternatives:
> * [jcifs-codelibs|https://github.com/codelibs/jcifs]
> * [jcifs-ng|https://github.com/AgNO3/jcifs-ng]
> * [smbj|https://github.com/hierynomus/smbj]
-- This message was sent by Atlassian Jira (v8.20.1#820001)
[jira] [Commented] (NUTCH-2856) Implement an appropriately licensed protocol-smb plugin
[ https://issues.apache.org/jira/browse/NUTCH-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17467111#comment-17467111 ]
Lewis John McGibbney commented on NUTCH-2856:
Adding some notes from my research.
* The smbj API looks very intuitive; I think it will be a great fit.
* I was concerned about acquiring an SMB server which could be used for integration tests. Luckily the smbj project does have integration tests which show how this can be done, but there were some missing pieces. They create an SMB (Samba) server via Docker; however, they did not publish the image. Luckily a fellow Tika PMC member took the initiative to [clone|https://github.com/nddipiazza/smbj-docker] and [publish|https://hub.docker.com/r/ndipiazza/smbj-inttest] it.
* In the Gora project, we've been using [testcontainers|https://www.testcontainers.org/] for some time. This allows us to perform integration testing easily, as you can either run a precanned container or [arbitrarily define one|https://www.testcontainers.org/features/creating_container/]. In this case, I can simply reference _ndipiazza/smbj-inttest_ and then test against it. There is a downside to this, however: the host running the tests must have Docker installed. I therefore need to figure out a means of running this particular integration test only if the host has Docker installed and skipping it otherwise.
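The "run only if Docker is installed" requirement from the comment above could be approached roughly as follows. This is a minimal sketch using only the JDK, not the approach the plugin actually takes: the class `DockerProbe` and its methods are hypothetical names, and it assumes that a successful `docker info` call is a good enough signal that a Docker daemon is reachable.

```java
import java.io.IOException;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of Nutch, smbj or Testcontainers.
class DockerProbe {

  /** Returns true only if the command starts, finishes within the timeout and exits with 0. */
  static boolean commandSucceeds(long timeoutSeconds, String... command) {
    try {
      Process process = new ProcessBuilder(command)
          .redirectErrorStream(true)                        // merge stderr into stdout
          .redirectOutput(ProcessBuilder.Redirect.DISCARD)  // we only care about the exit code
          .start();
      if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
        process.destroyForcibly();                          // hung command counts as "not available"
        return false;
      }
      return process.exitValue() == 0;
    } catch (IOException e) {
      return false;                                         // binary missing or not executable
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  /** Assumption: "docker info" exits with 0 only when the CLI exists and a daemon answers. */
  static boolean isDockerAvailable() {
    return commandSucceeds(10, "docker", "info");
  }

  public static void main(String[] args) {
    if (!isDockerAvailable()) {
      System.out.println("Docker not available, skipping the SMB integration test");
      return;
    }
    System.out.println("Docker available, the SMB integration test could run here");
  }
}
```

In a JUnit test the same boolean could feed an assumption (for example `org.junit.Assume.assumeTrue(...)`), so that the test is reported as skipped rather than failed on hosts without Docker.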
[jira] [Updated] (NUTCH-2856) Implement a protocol-smb plugin based on
[ https://issues.apache.org/jira/browse/NUTCH-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2856: Summary: Implement a protocol-smb plugin based on (was: Implement an appropriately licensed protocol-smb plugin)
[jira] [Work started] (NUTCH-2856) Implement an appropriately licensed protocol-smb plugin
[ https://issues.apache.org/jira/browse/NUTCH-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Work on NUTCH-2856 started by Lewis John McGibbney.
[jira] [Updated] (NUTCH-2856) Implement an appropriately licensed protocol-smb plugin
[ https://issues.apache.org/jira/browse/NUTCH-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2856: Issue Type: New Feature (was: Bug)
[jira] [Commented] (NUTCH-2429) Fix Plugin System to allow protocol plugins to bundle their URLStreamHandlers
[ https://issues.apache.org/jira/browse/NUTCH-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17467069#comment-17467069 ]
ASF GitHub Bot commented on NUTCH-2429:
lewismc commented on pull request #720:
URL: https://github.com/apache/nutch/pull/720#issuecomment-1003244488
This should also pave the way for me to work on [NUTCH-2856](https://issues.apache.org/jira/browse/NUTCH-2856)
-- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@nutch.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
> Fix Plugin System to allow protocol plugins to bundle their URLStreamHandlers
>
> Key: NUTCH-2429
> URL: https://issues.apache.org/jira/browse/NUTCH-2429
> Project: Nutch
> Issue Type: Improvement
> Components: commoncrawl
> Affects Versions: 1.14
> Environment: Tested on both Nutch 1.13 and 1.14 in Ubuntu Linux with OpenJDK 1.8.
> Reporter: Hiran Chaudhuri
> Priority: Major
> Fix For: 1.19
>
> While trying to use the protocol-smb plugin (which is not part of the Nutch distribution) I realized there are four steps to successfully make use of a protocol plugin:
> 1 - put the artifact into the plugins directory
> 2 - modify Nutch configuration files to allow smb:// urls plus include the plugin to the loaded list
> 3 - extract jcifs.jar and place it on the system classpath
> 4 - run nutch with the correct system property
> While steps 1 and 2 seem obvious, 3 and 4 require knowledge of plugin internals which does not feel right for nutch and plugin users. Even more, the jcifs.jar would exist twice on the classpath and could even cause further problems during runtime.
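Steps 3 and 4 of the issue description exist because `java.net.URL` rejects any scheme for which the JVM cannot find a `URLStreamHandler`. The issue does not name the system property involved; the JDK's standard hooks are the `java.protocol.handler.pkgs` system property and `URL.setURLStreamHandlerFactory`, and which one applies here is an assumption. The sketch below is not Nutch code: `SmbUrlSketch`, `SmbHandler` and `SketchFactory` are hypothetical names, and the handler only makes `smb://` URLs parseable without being able to open a connection.

```java
import java.io.IOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.net.URLStreamHandler;
import java.net.URLStreamHandlerFactory;

// Hypothetical illustration, not the Nutch plugin system.
class SmbUrlSketch {

  /** Placeholder handler: enough for URL parsing, deliberately unable to connect. */
  static class SmbHandler extends URLStreamHandler {
    @Override
    protected URLConnection openConnection(URL url) throws IOException {
      throw new IOException("smb:// connections are not implemented in this sketch");
    }

    @Override
    protected int getDefaultPort() {
      return 445; // standard SMB-over-TCP port
    }
  }

  /** A factory like this is what a plugin would need to contribute for its scheme. */
  static class SketchFactory implements URLStreamHandlerFactory {
    @Override
    public URLStreamHandler createURLStreamHandler(String protocol) {
      return "smb".equals(protocol) ? new SmbHandler() : null; // null: let the JVM decide
    }
  }

  /** Parses the spec with an explicit handler, so no JVM-wide state is changed. */
  static URL parseWithHandler(String spec) {
    URLStreamHandler handler = new SketchFactory().createURLStreamHandler("smb");
    try {
      return new URL(null, spec, handler);
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    URL url = parseWithHandler("smb://fileserver/share/doc.txt");
    System.out.println(url.getProtocol() + " " + url.getHost() + " " + url.getPath());
  }
}
```

Passing the handler to the `URL` constructor keeps the example free of global side effects. Registering the factory once via `URL.setURLStreamHandlerFactory` would make plain `new URL("smb://...")` calls work everywhere in the JVM, but that method may only be called once per JVM.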
[GitHub] [nutch] lewismc commented on pull request #720: NUTCH-2429 Fix Plugin System to allow protocol plugins to bundle their URLStreamHandlers
lewismc commented on pull request #720: URL: https://github.com/apache/nutch/pull/720#issuecomment-1003244488 This should also pave the way for me to work on [NUTCH-2856](https://issues.apache.org/jira/browse/NUTCH-2856)
[jira] [Updated] (NUTCH-2856) Implement an appropriately licensed protocol-smb plugin
[ https://issues.apache.org/jira/browse/NUTCH-2856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-2856: Summary: Implement an appropriately licensed protocol-smb plugin (was: protocol-smb plugin is outdated)
[jira] [Commented] (NUTCH-2429) Fix Plugin System to allow protocol plugins to bundle their URLStreamHandlers
[ https://issues.apache.org/jira/browse/NUTCH-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17467066#comment-17467066 ] ASF GitHub Bot commented on NUTCH-2429: lewismc commented on pull request #222: URL: https://github.com/apache/nutch/pull/222#issuecomment-1003242122 Superseded by #720
[jira] [Commented] (NUTCH-2429) Fix Plugin System to allow protocol plugins to bundle their URLStreamHandlers
[ https://issues.apache.org/jira/browse/NUTCH-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17467065#comment-17467065 ] ASF GitHub Bot commented on NUTCH-2429: lewismc closed pull request #222: URL: https://github.com/apache/nutch/pull/222
[GitHub] [nutch] lewismc commented on pull request #222: NUTCH-2429 Fix Plugin System to allow protocol plugins to bundle their URLStreamHandlers
lewismc commented on pull request #222: URL: https://github.com/apache/nutch/pull/222#issuecomment-1003242122 Superseded by #720
[GitHub] [nutch] lewismc closed pull request #222: NUTCH-2429 Fix Plugin System to allow protocol plugins to bundle their URLStreamHandlers
lewismc closed pull request #222: URL: https://github.com/apache/nutch/pull/222
[jira] [Commented] (NUTCH-2429) Fix Plugin System to allow protocol plugins to bundle their URLStreamHandlers
[ https://issues.apache.org/jira/browse/NUTCH-2429?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17467064#comment-17467064 ]
ASF GitHub Bot commented on NUTCH-2429:
lewismc opened a new pull request #720:
URL: https://github.com/apache/nutch/pull/720
This issue addresses [NUTCH-2429](https://issues.apache.org/jira/browse/NUTCH-2429). Some notes
* supersedes #222 by updating everything which was done there (excellent work @ HiranChaudhuri )
* incorporates @sebastian-nagel work a la [/sebastian-nagel/nutch/tree/NUTCH-2429](https://github.com/sebastian-nagel/nutch/commit/e589f05ef42486892427d347ecd10abfa9e380d7)
* organizes the imports for each Class touched in this pull request
* addresses a couple of rogue Classes which declared `public static final Logger` --> `private static final Logger`
[GitHub] [nutch] lewismc opened a new pull request #720: NUTCH-2429 Fix Plugin System to allow protocol plugins to bundle their URLStreamHandlers
lewismc opened a new pull request #720:
URL: https://github.com/apache/nutch/pull/720
This issue addresses [NUTCH-2429](https://issues.apache.org/jira/browse/NUTCH-2429). Some notes
* supersedes #222 by updating everything which was done there (excellent work @ HiranChaudhuri )
* incorporates @sebastian-nagel work a la [/sebastian-nagel/nutch/tree/NUTCH-2429](https://github.com/sebastian-nagel/nutch/commit/e589f05ef42486892427d347ecd10abfa9e380d7)
* organizes the imports for each Class touched in this pull request
* addresses a couple of rogue Classes which declared `public static final Logger` --> `private static final Logger`
[jira] [Commented] (NUTCH-2924) Generate maxCount expr evaluated only once
[ https://issues.apache.org/jira/browse/NUTCH-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17466942#comment-17466942 ]
Markus Jelsma commented on NUTCH-2924:
Updated patch: logging changed from INFO to DEBUG. Otherwise the reducers are slow due to excessive logging.
> Generate maxCount expr evaluated only once
>
> Key: NUTCH-2924
> URL: https://issues.apache.org/jira/browse/NUTCH-2924
> Project: Nutch
> Issue Type: Bug
> Components: generator
> Affects Versions: 1.16
> Reporter: Markus Jelsma
> Assignee: Markus Jelsma
> Priority: Major
> Fix For: 1.19
> Attachments: NUTCH-2924-1.patch, NUTCH-2924.patch
>
> The generate.maxCount expression is evaluated only once in the generator's reducer, instead, it must be set once per host.
[jira] [Updated] (NUTCH-2924) Generate maxCount expr evaluated only once
[ https://issues.apache.org/jira/browse/NUTCH-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-2924: Attachment: NUTCH-2924-1.patch
[jira] [Updated] (NUTCH-2924) Generate maxCount expr evaluated only once
[ https://issues.apache.org/jira/browse/NUTCH-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-2924: Attachment: NUTCH-2924.patch
[jira] [Commented] (NUTCH-2924) Generate maxCount expr evaluated only once
[ https://issues.apache.org/jira/browse/NUTCH-2924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17466926#comment-17466926 ] Markus Jelsma commented on NUTCH-2924: Patch, again, only for 1.15, for now.
[jira] [Created] (NUTCH-2924) Generate maxCount expr evaluated only once
Markus Jelsma created NUTCH-2924:
    Summary: Generate maxCount expr evaluated only once
    Key: NUTCH-2924
    URL: https://issues.apache.org/jira/browse/NUTCH-2924
    Project: Nutch
    Issue Type: Bug
    Components: generator
    Affects Versions: 1.16
    Reporter: Markus Jelsma
    Assignee: Markus Jelsma
    Fix For: 1.19
The generate.maxCount expression is evaluated only once in the generator's reducer, instead, it must be set once per host.
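The difference between "evaluated only once" and "once per host" can be illustrated with a small stand-alone sketch. It does not reproduce the Nutch Generator or the attached patch: `MaxCountSketch`, `HostStats` and the sample expression are hypothetical, and a plain Java function stands in for the configured expression.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;

// Hypothetical illustration of the bug class described in NUTCH-2924.
class MaxCountSketch {

  /** Made-up per-host input that the expression depends on. */
  static final class HostStats {
    final String host;
    final int fetchedPages;

    HostStats(String host, int fetchedPages) {
      this.host = host;
      this.fetchedPages = fetchedPages;
    }
  }

  /** Stand-in expression: hosts with more than 100 fetched pages get a limit of 50, others 5. */
  static final ToIntFunction<HostStats> SAMPLE_EXPR = h -> h.fetchedPages > 100 ? 50 : 5;

  static List<HostStats> sampleHosts() {
    return List.of(new HostStats("a.example", 1000), new HostStats("b.example", 10));
  }

  /** Buggy variant: the result for the first host is cached and reused for all hosts. */
  static Map<String, Integer> evaluateOnce(List<HostStats> hosts, ToIntFunction<HostStats> expr) {
    Map<String, Integer> limits = new LinkedHashMap<>();
    Integer cached = null;
    for (HostStats h : hosts) {
      if (cached == null) {
        cached = expr.applyAsInt(h);
      }
      limits.put(h.host, cached);
    }
    return limits;
  }

  /** Intended variant: the expression is re-evaluated for every host. */
  static Map<String, Integer> evaluatePerHost(List<HostStats> hosts, ToIntFunction<HostStats> expr) {
    Map<String, Integer> limits = new LinkedHashMap<>();
    for (HostStats h : hosts) {
      limits.put(h.host, expr.applyAsInt(h));
    }
    return limits;
  }

  public static void main(String[] args) {
    System.out.println("once:     " + evaluateOnce(sampleHosts(), SAMPLE_EXPR));
    System.out.println("per host: " + evaluatePerHost(sampleHosts(), SAMPLE_EXPR));
  }
}
```

With the sample data, the cached variant assigns the first host's limit to `b.example` as well, while the per-host variant gives each host the limit its own statistics call for.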