[GitHub] nutch pull request: NUTCH-2136
GitHub user asitang opened a pull request:

    https://github.com/apache/nutch/pull/71

    NUTCH-2136

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/asitang/nutch NUTCH-2136

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nutch/pull/71.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #71

commit f9aa3f50a113c22702abad9926e2488f87485722
Author: Asitang Mishra
Date:   2015-10-12T09:27:47Z

    dependencies removed from ivy.xml and plugin.xml. Changed the implementation of Naive Bayes ParseFilter

commit 5a3cc9b4ac5f250983ca81e62f9dff63ab5ead3f
Author: Asitang Mishra
Date:   2015-10-12T09:33:38Z

    made some cosmetic changes to the code

---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---
[jira] [Commented] (NUTCH-2136) Implement a different version of Naive Bayes Parse Filter
[ https://issues.apache.org/jira/browse/NUTCH-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14952854#comment-14952854 ]

ASF GitHub Bot commented on NUTCH-2136:
---------------------------------------

GitHub user asitang opened a pull request:

    https://github.com/apache/nutch/pull/71

    NUTCH-2136

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/asitang/nutch NUTCH-2136

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nutch/pull/71.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #71

commit f9aa3f50a113c22702abad9926e2488f87485722
Author: Asitang Mishra
Date:   2015-10-12T09:27:47Z

    dependencies removed from ivy.xml and plugin.xml. Changed the implementation of Naive Bayes ParseFilter

commit 5a3cc9b4ac5f250983ca81e62f9dff63ab5ead3f
Author: Asitang Mishra
Date:   2015-10-12T09:33:38Z

    made some cosmetic changes to the code

> Implement a different version of Naive Bayes Parse Filter
> ---------------------------------------------------------
>
>                 Key: NUTCH-2136
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2136
>             Project: Nutch
>          Issue Type: Improvement
>          Components: parser
>            Reporter: Asitang Mishra
>             Fix For: 1.10
>
> There have been many dependency issues with the first implementation of the
> Naive Bayes Parse Filter. The major dependencies were Mahout and Lucene.
> Training also failed in distributed mode because a nested Hadoop job was
> unable to run on the cluster.
> To remove these issues and allow the filter to run in a distributed
> environment, I am going to implement my own version of Naive Bayes without
> any dependency on machine learning libraries.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
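The issue description above proposes a Naive Bayes implementation with no dependency on machine learning libraries. As a rough illustration of what that can look like, here is a minimal sketch. It is not the code from pull request #71: the class name `NaiveBayesSketch`, its method names, the tokenization rule, and the choice of a multinomial model with add-one (Laplace) smoothing are all assumptions made for this example, and none of them are confirmed by the thread.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch of a dependency-free multinomial Naive Bayes text classifier. */
class NaiveBayesSketch {
    // Word counts per label, total word count per label, document count per label.
    private final Map<String, Map<String, Integer>> wordCounts = new HashMap<>();
    private final Map<String, Integer> totalWords = new HashMap<>();
    private final Map<String, Integer> docCounts = new HashMap<>();
    private final Set<String> vocabulary = new HashSet<>();
    private int totalDocs = 0;

    /** Lower-cases the text and splits it on runs of non-alphanumeric characters. */
    static String[] tokenize(String text) {
        return text.toLowerCase(Locale.ROOT).split("[^a-z0-9]+");
    }

    /** Adds one labeled document to the model. */
    void train(String label, String text) {
        docCounts.merge(label, 1, Integer::sum);
        totalDocs++;
        Map<String, Integer> counts = wordCounts.computeIfAbsent(label, k -> new HashMap<>());
        for (String token : tokenize(text)) {
            if (token.isEmpty()) {
                continue; // split() can yield a leading empty token
            }
            counts.merge(token, 1, Integer::sum);
            totalWords.merge(label, 1, Integer::sum);
            vocabulary.add(token);
        }
    }

    /** Log prior plus the sum of add-one smoothed log likelihoods for each token. */
    double logScore(String label, String text) {
        Map<String, Integer> counts = wordCounts.getOrDefault(label, new HashMap<>());
        int labelTotal = totalWords.getOrDefault(label, 0);
        double score = Math.log((double) docCounts.getOrDefault(label, 0) / totalDocs);
        for (String token : tokenize(text)) {
            if (token.isEmpty()) {
                continue;
            }
            int count = counts.getOrDefault(token, 0);
            score += Math.log((count + 1.0) / (labelTotal + vocabulary.size()));
        }
        return score;
    }

    /** Returns the trained label with the highest score, or null if nothing was trained. */
    String classify(String text) {
        String best = null;
        double bestScore = Double.NEGATIVE_INFINITY;
        for (String label : docCounts.keySet()) {
            double score = logScore(label, text);
            if (score > bestScore) {
                bestScore = score;
                best = label;
            }
        }
        return best;
    }
}
```

Working in log space avoids numeric underflow on long documents, and keeping the model in plain maps means training needs no nested job or external library, which is the property the issue is after.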
[jira] [Resolved] (NUTCH-2136) Implement a different version of Naive Bayes Parse Filter
[ https://issues.apache.org/jira/browse/NUTCH-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asitang Mishra resolved NUTCH-2136. --- Resolution: Fixed
[GitHub] nutch pull request: Branch 2.3.1
Github user dyzsasd closed the pull request at: https://github.com/apache/nutch/pull/72
[GitHub] nutch pull request: Branch 2.3.1
GitHub user dyzsasd opened a pull request:

    https://github.com/apache/nutch/pull/72

    Branch 2.3.1

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/apache/nutch branch-2.3.1

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nutch/pull/72.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #72

commit fa88ac21de22536c7bd464d59204d8fbf034aa53
Author: Lewis John McGibbney
Date:   2013-06-27T17:21:35Z

    prepare for new development

    git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1497462 13f79535-47bb-0310-9956-ffa450edef68

commit 9728ed2267e359772c6e8aa61f0bde69b7237f2d
Author: Lewis John McGibbney
Date:   2013-06-27T18:01:56Z

    update for release report

    git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1497480 13f79535-47bb-0310-9956-ffa450edef68

commit e868ed8d22f0ff69f7fa0da60269d09f30698469
Author: lufeng
Date:   2013-07-01T13:34:23Z

    NUTCH-1594 count variable is never changed in ParseUtil class

    git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1498437 13f79535-47bb-0310-9956-ffa450edef68

commit fe9ea2aad1e75e419048d454992a5a56ceac8a1d
Author: Markus Jelsma
Date:   2013-07-05T10:27:47Z

    NUTCH-1595 Upgrade to Tika 1.4 (jnioche, markus)

    git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1499959 13f79535-47bb-0310-9956-ffa450edef68

commit d5cb787bead9589df0fe4f896fbb2ed17f059d9c
Author: Julien Nioche
Date:   2013-07-08T08:50:08Z

    NUTCH-1604 Protocol-factory not thread-safe

    git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1500610 13f79535-47bb-0310-9956-ffa450edef68

commit ccd793cd35768377231d77c01c5e9a9b700694f1
Author: Sebastian Nagel
Date:   2013-07-25T21:15:02Z

    NUTCH-1587 misspelled property "threshold" in conf/log4j.properties

    git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1507131 13f79535-47bb-0310-9956-ffa450edef68

commit d4deef989ffc41b9dd5e77683e73286d81e1178b
Author: Sebastian Nagel
Date:   2013-08-07T21:10:17Z

    NUTCH-911 protocol-file to return proper protocol status for notmodified, gone, access_denied

    git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1511496 13f79535-47bb-0310-9956-ffa450edef68

commit 46dae3c0f754f212f7260d897bbd0785c19cd418
Author: lufeng
Date:   2013-08-13T15:17:05Z

    NUTCH-1294 IndexClean job with solr implementation.

    git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1513543 13f79535-47bb-0310-9956-ffa450edef68

commit f7a76daaeb0c0f3686ececb1d946529f28f6ff17
Author: lufeng
Date:   2013-08-13T15:21:34Z

    NUTCH-1294 IndexClean job with solr implementation.

    git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1513548 13f79535-47bb-0310-9956-ffa450edef68

commit 0508944f9bfbbf5f6b6898a95d156d2977ab3137
Author: Lewis John McGibbney
Date:   2013-08-18T23:02:53Z

    NUTCH-1624 Typo in WebTableReader line 486

    git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1515240 13f79535-47bb-0310-9956-ffa450edef68

commit 86c1f5584a49d45ac1d150a8dafedbd2af7351c1
Author: Julien Nioche
Date:   2013-08-23T08:52:38Z

    NUTCH-1629 Injector skips empty lines

    git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1516752 13f79535-47bb-0310-9956-ffa450edef68

commit 936389646645b84816579f30c96077a678de5b1c
Author: Lewis John McGibbney
Date:   2013-08-23T19:47:16Z

    NUTCH-1631 Display Document Count Added to Solr Server

    git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1517003 13f79535-47bb-0310-9956-ffa450edef68

commit 33bed204bb922e9d5b3f3d67f2b61757ce3fdd9e
Author: lufeng
Date:   2013-08-24T15:21:20Z

    NUTCH-1619 Writes Dmoz Description and Title information to db with snippet argument.

    git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1517147 13f79535-47bb-0310-9956-ffa450edef68

commit a0030f4ef10f2866ccae90afadc8f3460911f88d
Author: lufeng
Date:   2013-08-24T15:50:01Z

    NUTCH-1619 Writes Dmoz Description and Title information to db with snippet argument.

    git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1517155 13f79535-47bb-0310-9956-ffa450edef68

commit 1d62b185abbd6f98c3dd644861bfb44d036bde8a
Author: lufeng
Date:   2013-09-05T14:40:25Z

    NUTCH-1556 enabling updatedb to accept batchId

    git-svn-id: https://svn.apache.org/repos/asf/nutch/branches/2.x@1520332
[GitHub] nutch pull request: NUTCH-2136
Github user asfgit closed the pull request at: https://github.com/apache/nutch/pull/71
[jira] [Commented] (NUTCH-2136) Implement a different version of Naive Bayes Parse Filter
[ https://issues.apache.org/jira/browse/NUTCH-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14953286#comment-14953286 ] ASF GitHub Bot commented on NUTCH-2136: --- Github user asfgit closed the pull request at: https://github.com/apache/nutch/pull/71
[jira] [Commented] (NUTCH-2136) Implement a different version of Naive Bayes Parse Filter
[ https://issues.apache.org/jira/browse/NUTCH-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14953367#comment-14953367 ]

Chris A. Mattmann commented on NUTCH-2136:
------------------------------------------

[~asitang]:

1. ALv2 headers missing from the source files
2. CHANGES.txt entry should include your name (either your SVN id, or your full name)

Other than that, looks great!
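For readers unfamiliar with the first review point, the snippet below shows the standard ASF source header (the ALv2 header) as it is normally placed at the very top of a Java source file. The class under it, `LicenseHeaderExample`, and its `pluginId` method are hypothetical placeholders for illustration only; they are not files or code from the plugin.

```java
/*
 * Licensed to the Apache Software Foundation (ASF) under one or more
 * contributor license agreements.  See the NOTICE file distributed with
 * this work for additional information regarding copyright ownership.
 * The ASF licenses this file to You under the Apache License, Version 2.0
 * (the "License"); you may not use this file except in compliance with
 * the License.  You may obtain a copy of the License at
 *
 *     http://www.apache.org/licenses/LICENSE-2.0
 *
 * Unless required by applicable law or agreed to in writing, software
 * distributed under the License is distributed on an "AS IS" BASIS,
 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */

// Hypothetical placeholder class. The header sits above everything else in
// the file, including the package declaration, which is omitted here so the
// snippet compiles on its own.
class LicenseHeaderExample {
    /** Returns the plugin directory name seen in the build notification paths. */
    static String pluginId() {
        return "parsefilter-naivebayes";
    }
}
```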
[jira] [Updated] (NUTCH-2137) add changes.txt and ALV2 headers to the Naive Bayes Parse Filter
[ https://issues.apache.org/jira/browse/NUTCH-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Asitang Mishra updated NUTCH-2137:
----------------------------------
    Issue Type: Task  (was: Bug)

> add changes.txt and ALV2 headers to the Naive Bayes Parse Filter
> ----------------------------------------------------------------
>
>                 Key: NUTCH-2137
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2137
>             Project: Nutch
>          Issue Type: Task
>            Reporter: Asitang Mishra
[jira] [Reopened] (NUTCH-2136) Implement a different version of Naive Bayes Parse Filter
[ https://issues.apache.org/jira/browse/NUTCH-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asitang Mishra reopened NUTCH-2136: --- Add ALv2 headers and add author to changes.txt
[GitHub] nutch pull request: Made changes to changes.txt and added AVL2 hea...
Github user asitang closed the pull request at: https://github.com/apache/nutch/pull/73
[jira] [Commented] (NUTCH-2136) Implement a different version of Naive Bayes Parse Filter
[ https://issues.apache.org/jira/browse/NUTCH-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14953351#comment-14953351 ]

Hudson commented on NUTCH-2136:
-------------------------------

SUCCESS: Integrated in Nutch-trunk #3289 (See [https://builds.apache.org/job/Nutch-trunk/3289/])
NUTCH-2136 Implement a different version of Naive Bayes Parse Filter this closes #71 (asitang: [http://svn.apache.org/viewvc/nutch/trunk/?view=rev=1708158])
* trunk/CHANGES.txt
* trunk/src/plugin/parsefilter-naivebayes/src/java/org/apache/nutch/parsefilter/naivebayes/Classify.java
* trunk/src/plugin/parsefilter-naivebayes/src/java/org/apache/nutch/parsefilter/naivebayes/NaiveBayesParseFilter.java
* trunk/src/plugin/parsefilter-naivebayes/src/java/org/apache/nutch/parsefilter/naivebayes/Train.java
[jira] [Commented] (NUTCH-2136) Implement a different version of Naive Bayes Parse Filter
[ https://issues.apache.org/jira/browse/NUTCH-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14953438#comment-14953438 ] Asitang Mishra commented on NUTCH-2136: --- Will do.
[GitHub] nutch pull request: Made changes to changes.txt and added AVL2 hea...
GitHub user asitang opened a pull request:

    https://github.com/apache/nutch/pull/73

    Made changes to changes.txt and added AVL2 headers

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/asitang/nutch NUTCH-2136

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nutch/pull/73.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #73

commit 07e8556c2632324c66edc6c0b9ea219cef8986cd
Author: Asitang Mishra
Date:   2015-10-12T17:45:18Z

    Made changes to changes.txt and added AVL2 headers
[GitHub] nutch pull request: NUTCH 2137
Github user asfgit closed the pull request at: https://github.com/apache/nutch/pull/74
[jira] [Updated] (NUTCH-2137) add changes.txt and ALV2 headers to the Naive Bayes Parse Filter
[ https://issues.apache.org/jira/browse/NUTCH-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Asitang Mishra updated NUTCH-2137:
----------------------------------
    Priority: Trivial  (was: Major)

> add changes.txt and ALV2 headers to the Naive Bayes Parse Filter
> ----------------------------------------------------------------
>
>                 Key: NUTCH-2137
>                 URL: https://issues.apache.org/jira/browse/NUTCH-2137
>             Project: Nutch
>          Issue Type: Task
>            Reporter: Asitang Mishra
>            Priority: Trivial
[jira] [Created] (NUTCH-2137) add changes.txt and ALV2 headers to the Naive Bayes Parse Filter
Asitang Mishra created NUTCH-2137:
-------------------------------------

             Summary: add changes.txt and ALV2 headers to the Naive Bayes Parse Filter
                 Key: NUTCH-2137
                 URL: https://issues.apache.org/jira/browse/NUTCH-2137
             Project: Nutch
          Issue Type: Bug
            Reporter: Asitang Mishra
[jira] [Resolved] (NUTCH-2136) Implement a different version of Naive Bayes Parse Filter
[ https://issues.apache.org/jira/browse/NUTCH-2136?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asitang Mishra resolved NUTCH-2136. --- Resolution: Fixed
[GitHub] nutch pull request: NUTCH 2137
GitHub user asitang opened a pull request:

    https://github.com/apache/nutch/pull/74

    NUTCH 2137

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/asitang/nutch NUTCH-2137

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/nutch/pull/74.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #74

commit bac9e529655ac94af576a7598e3469c26bd9e9b4
Author: Asitang Mishra
Date:   2015-10-12T18:14:01Z

    made changes to changes.txt and added license headers
[jira] [Resolved] (NUTCH-2137) add changes.txt and ALV2 headers to the Naive Bayes Parse Filter
[ https://issues.apache.org/jira/browse/NUTCH-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Asitang Mishra resolved NUTCH-2137. --- Resolution: Fixed
[jira] [Commented] (NUTCH-2137) add changes.txt and ALV2 headers to the Naive Bayes Parse Filter
[ https://issues.apache.org/jira/browse/NUTCH-2137?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14953566#comment-14953566 ]

Hudson commented on NUTCH-2137:
-------------------------------

SUCCESS: Integrated in Nutch-trunk #3290 (See [https://builds.apache.org/job/Nutch-trunk/3290/])
NUTCH-2137 add changes.txt and ALV2 headers to the Naive Bayes Parse Filter this closes #74 (asitang: [http://svn.apache.org/viewvc/nutch/trunk/?view=rev=1708189])
* trunk/CHANGES.txt
* trunk/src/plugin/parsefilter-naivebayes/src/java/org/apache/nutch/parsefilter/naivebayes/Classify.java
* trunk/src/plugin/parsefilter-naivebayes/src/java/org/apache/nutch/parsefilter/naivebayes/Train.java