[jira] [Updated] (NUTCH-1494) RSS feed plugin seems broken

2013-01-07 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1494?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil updated NUTCH-1494: --- Attachment: NUTCH-1494.3.patch @Lewis: it worked :) I have attached the patch. Please let me know

[jira] [Commented] (NUTCH-1031) Delegate parsing of robots.txt to crawler-commons

2013-01-07 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13546639#comment-13546639 ] Tejas Patil commented on NUTCH-1031: The current nutch robots parsing logic uses

[jira] [Created] (NUTCH-1514) Phase out the deprecated configuration properties (if possible)

2013-01-06 Thread Tejas Patil (JIRA)
Tejas Patil created NUTCH-1514: -- Summary: Phase out the deprecated configuration properties (if possible) Key: NUTCH-1514 URL: https://issues.apache.org/jira/browse/NUTCH-1514 Project: Nutch

[jira] [Updated] (NUTCH-1514) Phase out the deprecated configuration properties (if possible)

2013-01-06 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil updated NUTCH-1514: --- Attachment: NUTCH-1514.patch Attached the patch for changes in nutch trunk. Please let me know your

[jira] [Updated] (NUTCH-1514) Phase out the deprecated configuration properties (if possible)

2013-01-06 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1514?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil updated NUTCH-1514: --- Attachment: NUTCH-1514-v2.patch Thanks Sebastian !! I removed those references in nutch-default.xml

[jira] [Commented] (NUTCH-1513) Support Robots.txt for Ftp urls

2013-01-04 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13543720#comment-13543720 ] Tejas Patil commented on NUTCH-1513: For this to be supported, I have 2 approaches:

[jira] [Commented] (NUTCH-1494) RSS feed plugin seems broken

2013-01-03 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1494?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542798#comment-13542798 ] Tejas Patil commented on NUTCH-1494: I was working on

[jira] [Commented] (NUTCH-1053) Parsing of RSS feeds fails

2013-01-03 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1053?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13542805#comment-13542805 ] Tejas Patil commented on NUTCH-1053: The exception seen by Lewis wrt command line way

[jira] [Updated] (NUTCH-1274) Fix [cast] javac warnings

2013-01-03 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1274?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil updated NUTCH-1274: --- Attachment: NUTCH-1274-trunk.patch NUTCH-1274-2.x.patch PFA the patches for trunk

[jira] [Updated] (NUTCH-1224) Migrate FreeGenerator to MapReduce API

2012-12-29 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1224?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil updated NUTCH-1224: --- Attachment: NUTCH-1224.1.patch First attempt. Only remaining question is: Should I create a separate

[jira] [Updated] (NUTCH-1127) JUnit test for urlfilter-validator

2012-12-29 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil updated NUTCH-1127: --- Attachment: NUTCH-1127.patch Wrote a test case capturing a few scenarios. Attached the patch. Please let

[jira] [Updated] (NUTCH-1119) JUnit test for index-static

2012-12-23 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1119?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil updated NUTCH-1119: --- Attachment: NUTCH-1119.patch Wrote a test case which checks the following: 1. static data fields are

[jira] [Updated] (NUTCH-1284) Add site fetcher.max.crawl.delay as log output by default.

2012-12-22 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil updated NUTCH-1284: --- Attachment: NUTCH-1284.patch Patch for the fix Add site fetcher.max.crawl.delay as

[jira] [Commented] (NUTCH-1284) Add site fetcher.max.crawl.delay as log output by default.

2012-12-22 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13538725#comment-13538725 ] Tejas Patil commented on NUTCH-1284: I searched for the relevant mail thread[0] to get

[jira] [Comment Edited] (NUTCH-1284) Add site fetcher.max.crawl.delay as log output by default.

2012-12-22 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13538725#comment-13538725 ] Tejas Patil edited comment on NUTCH-1284 at 12/22/12 10:54 AM:

[jira] [Updated] (NUTCH-1118) JUnit test for index-basic

2012-12-22 Thread Tejas Patil (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1118?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tejas Patil updated NUTCH-1118: --- Attachment: NUTCH-1118.patch Wrote a test case which checks the following: 1. basic searchable fields
