Build failed in Hudson: Nutch-trunk #363
See http://hudson.zones.apache.org/hudson/job/Nutch-trunk/363/changes Changes: [kubes] NUTCH-44 - Too many search results. Configurable limit on max number of search results returned. Thanks Emilijan Mirceski and Susam Pal. -- [...truncated 4599 lines...] copy-generated-lib: [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-regex init: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-suffix [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-suffix/classes [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-suffix/test init-plugin: deps-jar: compile: [echo] Compiling plugin: urlfilter-suffix [javac] Compiling 1 source file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-suffix/classes [javac] Note: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/src/plugin/urlfilter-suffix/src/java/org/apache/nutch/urlfilter/suffix/SuffixURLFilter.java uses unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. 
jar: [jar] Building jar: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-suffix/urlfilter-suffix.jar deps-test: deploy: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-suffix [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-suffix copy-generated-lib: [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-suffix init: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-validator [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-validator/classes [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-validator/test init-plugin: deps-jar: compile: [echo] Compiling plugin: urlfilter-validator [javac] Compiling 1 source file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-validator/classes jar: [jar] Building jar: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-validator/urlfilter-validator.jar deps-test: deploy: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-validator [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-validator copy-generated-lib: [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-validator init: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-basic [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-basic/classes [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-basic/test init-plugin: deps-jar: 
compile: [echo] Compiling plugin: urlnormalizer-basic [javac] Compiling 1 source file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-basic/classes jar: [jar] Building jar: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-basic/urlnormalizer-basic.jar deps-test: deploy: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlnormalizer-basic [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlnormalizer-basic copy-generated-lib: [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlnormalizer-basic init: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-pass [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-pass/classes [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-pass/test init-plugin: deps-jar: compile: [echo] Compiling plugin: urlnormalizer-pass [javac] Compiling 1 source file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-pass/classes jar: [jar] Building jar: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-pass/urlnormalizer-pass.jar deps-test: deploy: [mkdir] Created dir:
[jira] Commented: (NUTCH-44) too many search results
[ https://issues.apache.org/jira/browse/NUTCH-44?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12570305#action_12570305 ]

Hudson commented on NUTCH-44:

Integrated in Nutch-trunk #363 (See [http://hudson.zones.apache.org/hudson/job/Nutch-trunk/363/])

too many search results

Key: NUTCH-44
URL: https://issues.apache.org/jira/browse/NUTCH-44
Project: Nutch
Issue Type: Bug
Components: web gui
Environment: web environment
Reporter: Emilijan Mirceski
Assignee: Dennis Kubes
Attachments: NUTCH-44-2-20080215.patch, NUTCH-44.patch

There should be a user-defined limit on the number of results the search engine can return. For example, if one modifies the search URL as:

http://my/search.jsp?query=some quiery&hitsPerPage=2&hitsPerSite=0

The search will try to return 20,000 pages, which isn't good for server-side performance. Is it possible to have a setting in the config XML files to control this?

Thanks,
Emilijan

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
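The limit requested in this issue can be illustrated with a minimal sketch. It is not the attached patch: the property name `max.hits.per.page`, the default of 1000, and the class and method names are all hypothetical, and plain `java.util.Properties` stands in for the Nutch configuration. The idea is only that a user-supplied `hitsPerPage` is clamped to a configured ceiling before the search runs.

```java
import java.util.Properties;

/** Sketch: clamp a user-supplied hitsPerPage to a configured maximum. */
class SearchLimitSketch {

    // Hypothetical property name and default, chosen for illustration only.
    static final String MAX_HITS_KEY = "max.hits.per.page";
    static final int DEFAULT_MAX_HITS = 1000;

    /** Reads the configured cap; a missing, malformed, or non-positive value falls back to the default. */
    static int maxHits(Properties conf) {
        String raw = conf.getProperty(MAX_HITS_KEY);
        if (raw == null) {
            return DEFAULT_MAX_HITS;
        }
        try {
            int value = Integer.parseInt(raw.trim());
            return value > 0 ? value : DEFAULT_MAX_HITS;
        } catch (NumberFormatException e) {
            return DEFAULT_MAX_HITS;
        }
    }

    /** Returns the number of hits to actually request: at least 1, at most max. */
    static int clampHitsPerPage(int requested, int max) {
        if (requested < 1) {
            return 1;
        }
        return Math.min(requested, max);
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(MAX_HITS_KEY, "1000");
        // A request for 20000 hits is reduced to the configured cap.
        System.out.println(clampHitsPerPage(20000, maxHits(conf)));
    }
}
```

Clamping on the server side means a hand-edited URL cannot request more results than the operator allows, whatever the query string says.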
Next release?
Hi all,

I propose to start planning for the next release, and tentatively I propose to schedule it for the beginning of April.

I'm going to close a lot of old and outdated issues in JIRA - other committers, please do the same if you know that a given issue no longer applies.

Out of the remaining open issues, we should resolve all issues of type bug with blocker or major status. Then we can resolve as many as we can from the remaining categories, depending on the votes and perceived importance of the issue.

Any other suggestions?

--
Best regards,
Andrzej Bialecki
Information Retrieval, Semantic Web
Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com
[jira] Updated: (NUTCH-614) Order Inlinks by OPIC score of parent page
[ https://issues.apache.org/jira/browse/NUTCH-614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dennis Kubes updated NUTCH-614:

Patch Info: [Patch Available]

Order Inlinks by OPIC score of parent page

Key: NUTCH-614
URL: https://issues.apache.org/jira/browse/NUTCH-614
Project: Nutch
Issue Type: Improvement
Affects Versions: 0.9.0
Environment: All
Reporter: Dennis Kubes
Assignee: Dennis Kubes
Fix For: 0.9.0, 1.0.0
Attachments: NUTCH-614-1-20080219.patch

Currently when saving inlinks there is a (configurable) maximum number of inlinks that get saved, and very little logic goes into deciding which inlinks get saved. This patch uses the OPIC score of the encompassing page to set a score for each inlink. Inlinks are then reverse sorted according to score and the best inlinks are saved first. The logic behind this is that pages with higher OPIC scores should point to better links.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
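The ordering described in this issue can be sketched roughly as follows. This is not the attached patch and does not use the Nutch `Inlink` classes: `ScoredInlink`, `topInlinks`, and the example URLs are hypothetical stand-ins, and the score is assumed to have been copied from the OPIC score of the page that contains the link.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: keep only the best-scored inlinks, highest score first. */
class InlinkOrderingSketch {

    /** Hypothetical stand-in for an inlink plus the score of the page it came from. */
    static final class ScoredInlink {
        final String fromUrl;
        final String anchor;
        final float score; // assumed: OPIC score of the parent page

        ScoredInlink(String fromUrl, String anchor, float score) {
            this.fromUrl = fromUrl;
            this.anchor = anchor;
            this.score = score;
        }
    }

    /** Returns at most maxInlinks inlinks, sorted by descending score. */
    static List<ScoredInlink> topInlinks(List<ScoredInlink> inlinks, int maxInlinks) {
        List<ScoredInlink> sorted = new ArrayList<>(inlinks);
        // Reverse sort: compare b to a so that higher scores come first.
        sorted.sort((a, b) -> Float.compare(b.score, a.score));
        int keep = Math.max(0, Math.min(maxInlinks, sorted.size()));
        return new ArrayList<>(sorted.subList(0, keep));
    }

    public static void main(String[] args) {
        List<ScoredInlink> inlinks = new ArrayList<>();
        inlinks.add(new ScoredInlink("http://a.example/", "low", 0.2f));
        inlinks.add(new ScoredInlink("http://b.example/", "mid", 0.5f));
        inlinks.add(new ScoredInlink("http://c.example/", "high", 0.9f));
        for (ScoredInlink link : topInlinks(inlinks, 2)) {
            System.out.println(link.fromUrl + " " + link.score);
        }
    }
}
```

Sorting before truncating is what makes the configured maximum meaningful: the inlinks that get dropped are the ones from the lowest-scored parent pages rather than whichever happened to arrive last.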
[jira] Updated: (NUTCH-614) Order Inlinks by OPIC score of parent page
[ https://issues.apache.org/jira/browse/NUTCH-614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dennis Kubes updated NUTCH-614:

Attachment: NUTCH-614-1-20080219.patch

Orders inlinks by parent's OPIC score.

Order Inlinks by OPIC score of parent page

Key: NUTCH-614
URL: https://issues.apache.org/jira/browse/NUTCH-614
Project: Nutch
Issue Type: Improvement
Affects Versions: 0.9.0
Environment: All
Reporter: Dennis Kubes
Assignee: Dennis Kubes
Fix For: 0.9.0, 1.0.0
Attachments: NUTCH-614-1-20080219.patch

Currently when saving inlinks there is a (configurable) maximum number of inlinks that get saved, and very little logic goes into deciding which inlinks get saved. This patch uses the OPIC score of the encompassing page to set a score for each inlink. Inlinks are then reverse sorted according to score and the best inlinks are saved first. The logic behind this is that pages with higher OPIC scores should point to better links.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Created: (NUTCH-614) Order Inlinks by OPIC score of parent page
Order Inlinks by OPIC score of parent page

Key: NUTCH-614
URL: https://issues.apache.org/jira/browse/NUTCH-614
Project: Nutch
Issue Type: Improvement
Affects Versions: 0.9.0
Environment: All
Reporter: Dennis Kubes
Assignee: Dennis Kubes
Fix For: 1.0.0, 0.9.0
Attachments: NUTCH-614-1-20080219.patch

Currently when saving inlinks there is a (configurable) maximum number of inlinks that get saved, and very little logic goes into deciding which inlinks get saved. This patch uses the OPIC score of the encompassing page to set a score for each inlink. Inlinks are then reverse sorted according to score and the best inlinks are saved first. The logic behind this is that pages with higher OPIC scores should point to better links.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Work started: (NUTCH-614) Order Inlinks by OPIC score of parent page
[ https://issues.apache.org/jira/browse/NUTCH-614?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on NUTCH-614 started by Dennis Kubes.

Order Inlinks by OPIC score of parent page

Key: NUTCH-614
URL: https://issues.apache.org/jira/browse/NUTCH-614
Project: Nutch
Issue Type: Improvement
Affects Versions: 0.9.0
Environment: All
Reporter: Dennis Kubes
Assignee: Dennis Kubes
Fix For: 0.9.0, 1.0.0
Attachments: NUTCH-614-1-20080219.patch

Currently when saving inlinks there is a (configurable) maximum number of inlinks that get saved, and very little logic goes into deciding which inlinks get saved. This patch uses the OPIC score of the encompassing page to set a score for each inlink. Inlinks are then reverse sorted according to score and the best inlinks are saved first. The logic behind this is that pages with higher OPIC scores should point to better links.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
Build failed in Hudson: Nutch-trunk #364
See http://hudson.zones.apache.org/hudson/job/Nutch-trunk/364/changes -- [...truncated 4599 lines...] copy-generated-lib: [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-regex init: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-suffix [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-suffix/classes [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-suffix/test init-plugin: deps-jar: compile: [echo] Compiling plugin: urlfilter-suffix [javac] Compiling 1 source file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-suffix/classes [javac] Note: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/src/plugin/urlfilter-suffix/src/java/org/apache/nutch/urlfilter/suffix/SuffixURLFilter.java uses unchecked or unsafe operations. [javac] Note: Recompile with -Xlint:unchecked for details. 
jar: [jar] Building jar: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-suffix/urlfilter-suffix.jar deps-test: deploy: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-suffix [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-suffix copy-generated-lib: [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-suffix init: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-validator [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-validator/classes [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-validator/test init-plugin: deps-jar: compile: [echo] Compiling plugin: urlfilter-validator [javac] Compiling 1 source file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-validator/classes jar: [jar] Building jar: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlfilter-validator/urlfilter-validator.jar deps-test: deploy: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-validator [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-validator copy-generated-lib: [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlfilter-validator init: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-basic [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-basic/classes [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-basic/test init-plugin: deps-jar: 
compile: [echo] Compiling plugin: urlnormalizer-basic [javac] Compiling 1 source file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-basic/classes jar: [jar] Building jar: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-basic/urlnormalizer-basic.jar deps-test: deploy: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlnormalizer-basic [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlnormalizer-basic copy-generated-lib: [copy] Copying 1 file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlnormalizer-basic init: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-pass [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-pass/classes [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-pass/test init-plugin: deps-jar: compile: [echo] Compiling plugin: urlnormalizer-pass [javac] Compiling 1 source file to http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-pass/classes jar: [jar] Building jar: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/urlnormalizer-pass/urlnormalizer-pass.jar deps-test: deploy: [mkdir] Created dir: http://hudson.zones.apache.org/hudson/job/Nutch-trunk/ws/trunk/build/plugins/urlnormalizer-pass [copy] Copying 1 file to