Crawl images and store locally

2015-03-23 Thread Tizy Ninan
Hi, Does Nutch support crawling images from webpages? If so, what are the steps to retrieve the images and store them locally? Thanks and Regards, Tizy

[jira] [Commented] (NUTCH-1968) File Name too long issue of DumpFileUtil.java file

2015-03-23 Thread Giuseppe Totaro (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376432#comment-14376432 ] Giuseppe Totaro commented on NUTCH-1968: Of course. [~chrismattmann], I uploaded t

[jira] [Updated] (NUTCH-1974) keyPrefix option for CommonCrawlDataDumper tool

2015-03-23 Thread Giuseppe Totaro (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Giuseppe Totaro updated NUTCH-1974: --- Attachment: NUTCH-1974.patch > keyPrefix option for CommonCrawlDataDumper tool > -

[jira] [Created] (NUTCH-1974) keyPrefix option for CommonCrawlDataDumper tool

2015-03-23 Thread Giuseppe Totaro (JIRA)
Giuseppe Totaro created NUTCH-1974: -- Summary: keyPrefix option for CommonCrawlDataDumper tool Key: NUTCH-1974 URL: https://issues.apache.org/jira/browse/NUTCH-1974 Project: Nutch Issue Type:

[jira] [Comment Edited] (NUTCH-1941) Optional rolling http.agent.name's

2015-03-23 Thread Asitang Mishra (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376357#comment-14376357 ] Asitang Mishra edited comment on NUTCH-1941 at 3/23/15 6:41 PM:

[jira] [Commented] (NUTCH-1941) Optional rolling http.agent.name's

2015-03-23 Thread Asitang Mishra (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376357#comment-14376357 ] Asitang Mishra commented on NUTCH-1941: --- Hi @Sebastian Nagel, thanks for the comment

[jira] [Commented] (NUTCH-1958) Remove scoring-opic from nutch-default.xml

2015-03-23 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14376227#comment-14376227 ] Sebastian Nagel commented on NUTCH-1958: Scoring-opic is not that bad, scores are

[jira] [Commented] (NUTCH-1958) Remove scoring-opic from nutch-default.xml

2015-03-23 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375857#comment-14375857 ] Julien Nioche commented on NUTCH-1958: -- I agree but I think there could be benefits i

[jira] [Commented] (NUTCH-1958) Remove scoring-opic from nutch-default.xml

2015-03-23 Thread Markus Jelsma (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375787#comment-14375787 ] Markus Jelsma commented on NUTCH-1958: -- Hello Julien - neither. Scoring-depth does no

[jira] [Commented] (NUTCH-1958) Remove scoring-opic from nutch-default.xml

2015-03-23 Thread Julien Nioche (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375696#comment-14375696 ] Julien Nioche commented on NUTCH-1958: -- What would you suggest as a replacement? scor

Re: [ANNOUNCE] New Nutch committer and PMC - Mo Omer

2015-03-23 Thread Julien Nioche
Welcome Mo! On 22 March 2015 at 19:31, Markus Jelsma wrote: > Welcome Mohammad! > > -Original message- > From: Mohammed Omer > Sent: Sunday 22nd March 2015 18:55 > To: u...@nutch.apache.org > Cc: dev@nutch.apache.org > Subject: Re: [ANNOUNCE] New Nutch committer and PMC - Mo Omer > > Hel

[jira] [Commented] (NUTCH-1941) Optional rolling http.agent.name's

2015-03-23 Thread Sebastian Nagel (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1941?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375693#comment-14375693 ] Sebastian Nagel commented on NUTCH-1941: Hi [~asitangm], thanks! The patch needs s