[jira] [Commented] (NUTCH-1296) nutchgora fetcher does not show correct 'threads' and 'resuming' properties

2012-03-02 Thread Hudson (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221508#comment-13221508 ] Hudson commented on NUTCH-1296: --- Integrated in Nutch-nutchgora #181 (See [https://builds.ap

[jira] [Commented] (NUTCH-1295) nutchgora restlet dependencies failing when remote repos is down

2012-03-02 Thread Hudson (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221509#comment-13221509 ] Hudson commented on NUTCH-1295: --- Integrated in Nutch-nutchgora #181 (See [https://builds.ap

[jira] [Commented] (NUTCH-1292) Better exception logging and debugging during fetch.

2012-03-02 Thread Hudson (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221506#comment-13221506 ] Hudson commented on NUTCH-1292: --- Integrated in Nutch-nutchgora #181 (See [https://builds.ap

[jira] [Commented] (NUTCH-1263) FetcherJob must put 'fetchTime' on input

2012-03-02 Thread Hudson (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13221507#comment-13221507 ] Hudson commented on NUTCH-1263: --- Integrated in Nutch-nutchgora #181 (See [https://builds.ap

[jira] [Updated] (NUTCH-475) Adaptive crawl delay

2012-03-02 Thread Lewis John McGibbney (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-475: --- Attachment: NUTCH-475.patch Updated patch which brings this issue up to speed as of Do

[jira] [Commented] (NUTCH-1253) Incompatible neko and xerces versions

2012-03-02 Thread Ferdy Galema (Commented) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1253?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220984#comment-13220984 ] Ferdy Galema commented on NUTCH-1253: - I'll give this one a go.. > In

[jira] [Closed] (NUTCH-1292) Better exception logging and debugging during fetch.

2012-03-02 Thread Ferdy Galema (Closed) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1292?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema closed NUTCH-1292. --- Resolution: Fixed committed > Better exception logging and debugging during fetch. >

[jira] [Closed] (NUTCH-1263) FetcherJob must put 'fetchTime' on input

2012-03-02 Thread Ferdy Galema (Closed) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1263?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema closed NUTCH-1263. --- Resolution: Fixed Fix Version/s: nutchgora This one slipped under the radar. Committed.

Re: Nutch with Letor

2012-03-02 Thread Lewis John Mcgibbney
Also please ship this discussion to user@ as it seems to be more relevant there. Thanks On Fri, Mar 2, 2012 at 2:13 PM, Lewis John Mcgibbney < lewis.mcgibb...@gmail.com> wrote: > Hi, > > Would be great if you could provide some links to the dataset, exactly > what it is etc. > > Thank you > > >

Re: Nutch with Letor

2012-03-02 Thread Lewis John Mcgibbney
Hi, Would be great if you could provide some links to the dataset, exactly what it is etc. Thank you On Fri, Mar 2, 2012 at 1:19 PM, varunpandeyengg wrote: > Hey Guys, > > I am new to Nutch. I am part of a IR research team & need to create a setup > where in I need to crawl Microsoft's LETOR Da

[jira] [Closed] (NUTCH-1296) nutchgora fetcher does not show correct 'threads' and 'resuming' properties

2012-03-02 Thread Ferdy Galema (Closed) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema closed NUTCH-1296. --- Resolution: Fixed committed > nutchgora fetcher does not show correct 'threads' and

[jira] [Created] (NUTCH-1296) nutchgora fetcher does not show correct 'threads' and 'resuming' properties

2012-03-02 Thread Ferdy Galema (Created) (JIRA)
nutchgora fetcher does not show correct 'threads' and 'resuming' properties --- Key: NUTCH-1296 URL: https://issues.apache.org/jira/browse/NUTCH-1296 Project: Nutch Issu

Nutch with Letor

2012-03-02 Thread varunpandeyengg
Hey Guys, I am new to Nutch. I am part of an IR research team & need to create a setup wherein I need to crawl Microsoft's LETOR Dataset with Nutch. After googling for a while, I didn't get any tutorial or help. Could anyone guide me for the same? I am using Nutch 1.4 on Ubuntu 11.10 & Eclipse 3.

Re: Drawing an analogy between AdaptiveFetchSchedule and AdaptiveCrawlDelay

2012-03-02 Thread Lewis John Mcgibbney
Hi Andrzej, On Fri, Mar 2, 2012 at 12:37 PM, Andrzej Bialecki wrote: > Fetcher2 is the current Fetcher. The original Fetcher was temporarily > renamed OldFetcher and then removed. > So looks like this 'might' be more straightforward to implement than I originally thought. When I get a bit of t

[jira] [Updated] (NUTCH-1273) Fix [deprecation] javac warnings

2012-03-02 Thread Lewis John McGibbney (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1273?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Lewis John McGibbney updated NUTCH-1273: Attachment: NUTCH-1273-v2-trunk.patch This patch goes some length to address the is

Re: Drawing an analogy between AdaptiveFetchSchedule and AdaptiveCrawlDelay

2012-03-02 Thread Andrzej Bialecki
On 02/03/2012 12:45, Lewis John Mcgibbney wrote: Hi Guys, As there were some comments on the user list, I recently got digging with http redirects then stumbled across NUTCH-1042. Although these are individual issues e.g. redirects and crawl delays, I think they are certainly linked, however wha

Drawing an analogy between AdaptiveFetchSchedule and AdaptiveCrawlDelay

2012-03-02 Thread Lewis John Mcgibbney
Hi Guys, As there were some comments on the user list, I recently got digging with http redirects then stumbled across NUTCH-1042. Although these are individual issues e.g. redirects and crawl delays, I think they are certainly linked, however what is interesting is that users 'usually' don't cons

[jira] [Closed] (NUTCH-1295) nutchgora restlet dependencies failing when remote repos is down

2012-03-02 Thread Ferdy Galema (Closed) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema closed NUTCH-1295. --- Resolution: Fixed committed > nutchgora restlet dependencies failing when remote rep

[jira] [Updated] (NUTCH-1295) nutchgora restlet dependencies failing when remote repos is down

2012-03-02 Thread Ferdy Galema (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ferdy Galema updated NUTCH-1295: Attachment: NUTCH-1295.patch > nutchgora restlet dependencies failing when remote repos is down

[jira] [Created] (NUTCH-1295) nutchgora restlet dependencies failing when remote repos is down

2012-03-02 Thread Ferdy Galema (Created) (JIRA)
nutchgora restlet dependencies failing when remote repos is down Key: NUTCH-1295 URL: https://issues.apache.org/jira/browse/NUTCH-1295 Project: Nutch Issue Type: Bug

[jira] [Issue Comment Edited] (NUTCH-1024) Dynamically set fetchInterval by MIME-type

2012-03-02 Thread Markus Jelsma (Issue Comment Edited) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13220787#comment-13220787 ] Markus Jelsma edited comment on NUTCH-1024 at 3/2/12 9:05 AM: --

[jira] [Updated] (NUTCH-1024) Dynamically set fetchInterval by MIME-type

2012-03-02 Thread Markus Jelsma (Updated) (JIRA)
[ https://issues.apache.org/jira/browse/NUTCH-1024?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Markus Jelsma updated NUTCH-1024: - Attachment: NUTCH-1024-1.5-1.patch New patch for trunk! This also includes a change to the inject