On Sun, Aug 28, 2011 at 2:37 PM, lewis john mcgibbney <[email protected]> wrote:

> Hi Gabriele, can you expand on your last comment... are you running in
> deploy mode?

I'm running Nutch locally, from deploy/local/bin/nutch. In Fetcher.java
there's special code to set numFetchers to 1 for some reason.

> And to reply to your first point, yes you are correct, the FAQs need
> extensive updating. Please feel free to change anything you feel
> necessary; however, as a matter of retaining knowledge for the legacy of
> Nutch, we are now moving deprecated/old information resources to the
> archive section of the wiki.

Actually I was wrong: I somehow thought -numFetchers was a bin/nutch fetch
option, but it understandably was a bin/nutch generate option. It's a pity
that it's not possible to break big segments into smaller ones on local
machines.

> On Sun, Aug 28, 2011 at 7:58 AM, Gabriele Kahlout <[email protected]> wrote:
>
> > but that's no local solution:
> >
> > if ("local".equals(job.get("mapred.job.tracker")) && numLists != 1) {
> >   // override
> >   LOG.info("Generator: jobtracker is 'local', generating exactly one partition.");
> >   numLists = 1;
> > }
> >
> > On Sun, Aug 28, 2011 at 8:57 AM, Gabriele Kahlout <[email protected]> wrote:
> >
> > > it was a bin/nutch generate option.
> > >
> > > On Sun, Aug 28, 2011 at 6:24 AM, Gabriele Kahlout <[email protected]> wrote:
> > >
> > >> Hello,
> > >>
> > >> All over the FAQ <http://wiki.apache.org/nutch/FAQ> the bin/nutch
> > >> -numFetchers option is documented as a way to generate multiple
> > >> small segments. However, that option doesn't seem to be available
> > >> in either 1.3 or 1.4. So should the FAQ be updated, or am I missing
> > >> something? How else could I generate multiple small segments?
> > >> I can see doing that with -topN, but that's less convenient.
> > >>
> > >> --
> > >> Regards,
> > >> K. Gabriele
> > >>
> > >> --- unchanged since 20/9/10 ---
> > >> P.S. If the subject contains "[LON]" or the addressee acknowledges
> > >> the receipt within 48 hours then I don't resend the email.
> > >> subject(this) ∈ L(LON*) ∨ ∃x. (x ∈ MyInbox ∧ Acknowledges(x, this) ∧
> > >> time(x) < Now + 48h) ⇒ ¬resend(I, this).
> > >>
> > >> If an email is sent by a sender that is not a trusted contact or the
> > >> email does not contain a valid code then the email is not received.
> > >> A valid code starts with a hyphen and ends with "X".
> > >> ∀x. x ∈ MyInbox ⇒ from(x) ∈ MySafeSenderList ∨ (∃y. y ∈ subject(x) ∧
> > >> y ∈ L(-[a-z]+[0-9]X)).
> > >
> > > --
> > > Regards,
> > > K. Gabriele
> >
> > --
> > Regards,
> > K. Gabriele
>
> --
> *Lewis*

--
Regards,
K. Gabriele

