RE: RE: random sampling of crawlDb urls

2018-05-01 Thread Markus Jelsma
3:18
> To: user@nutch.apache.org
> Subject: Re: RE: random sampling of crawlDb urls
>
> Just to clarify: .99 does NOT work fine. It should have rejected most of the
> records when I specified "((Math.random())>=.99)".
>
> I have used expressions not involving M

Re: RE: random sampling of crawlDb urls

2018-05-01 Thread Michael Coffey
Just to clarify: .99 does NOT work fine. It should have rejected most of the records when I specified "((Math.random())>=.99)". I have used expressions not involving Math.random. For example, I can extract records above a specific score with "score>1.0". But the random thing doesn't work even
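
For reference, here is a minimal sketch of what a filter like "((Math.random())>=.99)" would be expected to do if the random draw really is evaluated once per record: roughly 1% of records pass and the rest are rejected. This is plain Java, not Nutch code. The class name SamplingCheck, the method countKept, and the fixed seed are assumptions made for illustration only, and java.util.Random stands in for Math.random so the run is repeatable.

```java
import java.util.Random;

public class SamplingCheck {
    // Counts how many of n simulated records pass the filter "random >= threshold".
    // Each record gets its own independent draw in [0.0, 1.0).
    static int countKept(int n, double threshold, long seed) {
        Random rng = new Random(seed); // seeded stand-in for Math.random()
        int kept = 0;
        for (int i = 0; i < n; i++) {
            if (rng.nextDouble() >= threshold) {
                kept++;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        int n = 100_000;
        int kept = countKept(n, 0.99, 42L);
        // With a per-record draw, the kept share should be close to 1%.
        System.out.printf("kept %d of %d records (%.2f%%)%n", kept, n, 100.0 * kept / n);
    }
}
```

If a real run keeps far more than about 1% of the records, one thing worth checking is whether the random value is actually being drawn per record or the expression is not being applied the way it reads.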