Re: Post Meetup Meetup was Re: Unit test lag?

2010-01-18 Thread Ted Dunning
On Mon, Jan 18, 2010 at 4:46 PM, Grant Ingersoll wrote:

>
> On Jan 18, 2010, at 12:34 PM, Benson Margulies wrote:
>
> > If it's SF on Thursday, someone will have to have a beer as my proxy.
>
> I volunteer ;-)
>

You're on.


> Sounds like we have a post meetup meetup brewing.  I'm not familiar with
> the area, anyone know where we can go afterwards?  Also, I'll need a ride
> back to San Mateo if possible.
>

There are lots of places if we can build a caravan to get to Castro street
(Mountain View, 0.5-1 miles) or Murphy Street (Sunnyvale 2 miles).  My house
could even be available, but is a bit more of a mess than guests are usually
allowed to see and doesn't have a beer tap.

Regarding the ride to San Mateo, I would be happy to help transport you to
the train which might satisfy your needs if there isn't somebody else headed
that way.  I am also happy to help transport anybody coming south on the
train to the Dojo.  Mountain View would probably be the better train stop to
aim for if you want to take me up on my offer.


-- 
Ted Dunning, CTO
DeepDyve


Post Meetup Meetup was Re: Unit test lag?

2010-01-18 Thread Grant Ingersoll

On Jan 18, 2010, at 12:34 PM, Benson Margulies wrote:

> If it's SF on Thursday, someone will have to have a beer as my proxy.

I volunteer ;-)

Sounds like we have a post meetup meetup brewing.  I'm not familiar with the 
area, anyone know where we can go afterwards?  Also, I'll need a ride back to 
San Mateo if possible.

-Grant

Re: Unit test lag?

2010-01-18 Thread Jake Mannix
Hmm, if all you guys are going to be there, I may need to push back my
flight. I'm scheduled to fly *out* of SFO right around the time of the
Meetup, but if I can push back that flight, I will.

  -jake

On Mon, Jan 18, 2010 at 1:24 PM, Ted Dunning  wrote:

> I'll be there.
>
> Sean, are you really going to be there?  That would be fantastic.
>
> On Mon, Jan 18, 2010 at 6:02 AM, Grant Ingersoll wrote:
>
> >
> > On Jan 17, 2010, at 8:35 PM, Ted Dunning wrote:
> >
> > > We should have a beer some time anyway and the beers we owe you for
> > cleaning
> > > up Colt more than cancel any potential beer on this issue so I will be
> > happy
> > > to buy (Sean, you are included for similar reasons if we ever see each
> > > other).
> >
> > After the Meetup (http://www.meetup.com/SFBay-Lucene-Solr-Meetup/) on
> > Thursday?  Looks like Sean will be there.  What other Mahouts are
> planning
> > on attending?
> >
> > -Grant
>
>
>
>
> --
> Ted Dunning, CTO
> DeepDyve
>


Re: Unit test lag?

2010-01-18 Thread Sean Owen
Yes, I'm on the west coast for a week from tomorrow for various
reasons and so will certainly stop in. Looking forward to it.

Sean

On Mon, Jan 18, 2010 at 9:24 PM, Ted Dunning  wrote:
> I'll be there.
>
> Sean, are you really going to be there?  That would be fantastic.


Re: Unit test lag?

2010-01-18 Thread Ted Dunning
I'll be there.

Sean, are you really going to be there?  That would be fantastic.

On Mon, Jan 18, 2010 at 6:02 AM, Grant Ingersoll wrote:

>
> On Jan 17, 2010, at 8:35 PM, Ted Dunning wrote:
>
> > We should have a beer some time anyway and the beers we owe you for
> cleaning
> > up Colt more than cancel any potential beer on this issue so I will be
> happy
> > to buy (Sean, you are included for similar reasons if we ever see each
> > other).
>
> After the Meetup (http://www.meetup.com/SFBay-Lucene-Solr-Meetup/) on
> Thursday?  Looks like Sean will be there.  What other Mahouts are planning
> on attending?
>
> -Grant




-- 
Ted Dunning, CTO
DeepDyve


Re: Unit test lag?

2010-01-18 Thread Benson Margulies
If it's SF on Thursday, someone will have to have a beer as my proxy.
I'll be back here in the snow.

On Mon, Jan 18, 2010 at 12:21 PM, Jeff Eastman wrote:
> I'm planning on attending
> Jeff
>
>
> Grant Ingersoll wrote:
>>
>> On Jan 17, 2010, at 8:35 PM, Ted Dunning wrote:
>>
>>
>>>
>>> We should have a beer some time anyway and the beers we owe you for
>>> cleaning
>>> up Colt more than cancel any potential beer on this issue so I will be
>>> happy
>>> to buy (Sean, you are included for similar reasons if we ever see each
>>> other).
>>>
>>
>> After the Meetup (http://www.meetup.com/SFBay-Lucene-Solr-Meetup/) on
>> Thursday?  Looks like Sean will be there.  What other Mahouts are planning
>> on attending?
>>
>> -Grant
>>
>
>


Re: Unit test lag?

2010-01-18 Thread Jeff Eastman

I'm planning on attending
Jeff


Grant Ingersoll wrote:
> On Jan 17, 2010, at 8:35 PM, Ted Dunning wrote:
>
>> We should have a beer some time anyway and the beers we owe you for cleaning
>> up Colt more than cancel any potential beer on this issue so I will be happy
>> to buy (Sean, you are included for similar reasons if we ever see each
>> other).
>
> After the Meetup (http://www.meetup.com/SFBay-Lucene-Solr-Meetup/) on Thursday?
> Looks like Sean will be there.  What other Mahouts are planning on attending?
>
> -Grant


Re: Unit test lag?

2010-01-18 Thread Drew Farris
On Mon, Jan 18, 2010 at 9:42 AM, Sean Owen  wrote:

> You can punt the choice all the way up to fix that. Then regular
> callers are forced to instantiate and supply the RNG in all cases, and
> the API has Randoms all over the place, and I suppose I don't quite
> like that aesthetically.

Point taken. I suspect there may be ways around this ugliness in many
cases, but you certainly know the code infinitely better than I do.

>> RandomUtils.useTestSeed() is called once in a VM all other callers of
>> RandomUtils.getRandom() will get a test seed.
>
> ... yep and I think this is cleaner than the option above. That may be
> the only delta between what we're saying.

Yes, this is the only difference. Thanks for taking the time to
understand my point of view and taking steps to resolve the slowness
issue. Both are very much appreciated.

Drew


Re: Unit test lag?

2010-01-18 Thread Sean Owen
On Mon, Jan 18, 2010 at 2:36 PM, Drew Farris  wrote:
> I'm suggesting that the instantiator/caller of the class choose
> between a regular and test-friendly RNG. In some classes that creator
> will be a unit test in other cases the creator will be another piece
> of production code. In some cases the decision as to which type of RNG
> to use will need to be made further up the object graph than the
> immediate instantiator/caller, and generally it should be made as
> close to main() or setUp() as possible.

Yes, but the problem is the production code. The test code knows it's
in test mode. The production code does not, since it's executed in
test and non-test mode. It can't make this choice independently.

You can punt the choice all the way up to fix that. Then regular
callers are forced to instantiate and supply the RNG in all cases, and
the API has Randoms all over the place, and I suppose I don't quite
like that aesthetically.


> RandomUtils essentially achieves the same thing in a static fashion.
> The class itself decides that it will always delegate to RandomUtils
> and random utils provides the different strategies. Currently if
> RandomUtils.useTestSeed() is called once in a VM all other callers of
> RandomUtils.getRandom() will get a test seed.

... yep and I think this is cleaner than the option above. That may be
the only delta between what we're saying.

(Separately I'd like to hijack MAHOUT-260 now to talk about the
still-existing repeatability problem, which is a different question.
Any thoughts on that? It patches this up pretty well but isn't
entirely pretty.)


Re: Unit test lag?

2010-01-18 Thread Drew Farris
On Mon, Jan 18, 2010 at 9:23 AM, Sean Owen  wrote:
> You're suggesting the class choose between a regular and test-friendly
> RNG, by calling one of two methods. Doesn't that put the decision with
> the class instead of externally? Right now it's already external.
> RandomUtils decides what to instantiate.

I'm suggesting that the instantiator/caller of the class choose
between a regular and test-friendly RNG. In some classes that creator
will be a unit test; in other cases the creator will be another piece
of production code. In some cases the decision as to which type of RNG
to use will need to be made further up the object graph than the
immediate instantiator/caller, and generally it should be made as
close to main() or setUp() as possible.

RandomUtils essentially achieves the same thing in a static fashion.
The class itself decides that it will always delegate to RandomUtils
and RandomUtils provides the different strategies. Currently, if
RandomUtils.useTestSeed() is called once in a VM all other callers of
RandomUtils.getRandom() will get a test seed.

Drew


Re: Unit test lag?

2010-01-18 Thread Sean Owen
You're suggesting the class choose between a regular and test-friendly
RNG, by calling one of two methods. Doesn't that put the decision with
the class instead of externally? Right now it's already external.
RandomUtils decides what to instantiate.

On Mon, Jan 18, 2010 at 2:21 PM, Drew Farris  wrote:
> You get it entirely. Moving around the injection in this case produces
> more testable code in that you don't have a class-defined behavior for
> the RNG. Instead it becomes an externally-defined behavior.


Re: Unit test lag?

2010-01-18 Thread Drew Farris
On Mon, Jan 18, 2010 at 9:06 AM, Sean Owen  wrote:

> (Separately you could argue we're going about this all wrong, by
> trying to depend on the exact output of the RNG..

No argument here. In practice I don't think we can really get around
using a pre-seeded RNG for tests.

> You've moved around the injection, but nothing else I think. Am I
> misunderstanding because that seems to be why I'm not following
> getTestRandom().

You get it entirely. Moving around the injection in this case produces
more testable code in that you don't have a class-defined behavior for
the RNG. Instead it becomes an externally-defined behavior.

> (Taking it as a constructor param is the conventional way to set up
> for injecting, but from an API perspective I don't quite like it. I
> understand why an evaluator necessarily needs a Recommender to exist,
> but why do I need to give it a Random, conceptually?)

It really depends on the evaluator implementation. In the case of
GenericRecommenderIRStatsEvaluator the evaluator happens to use
randomness to perform the evaluation function. I agree that these
sorts of injections should not be accommodated at the interface level
and shouldn't pollute the API.

Drew


Re: Unit test lag?

2010-01-18 Thread Sean Owen
On Mon, Jan 18, 2010 at 2:00 PM, Drew Farris  wrote:
> In what cases would you want to reset them all remotely, at the
> beginning of each test?

You pretty much said it -- tests should start from a known, fixed
state, so that the result is the same each time, and we can assert
about the output. This means setting the entire library and test
fixture state to a known state -- that's why there's a need to not
just control the initial seed but reset it.

(Separately you could argue we're going about this all wrong, by
trying to depend on the exact output of the RNG, and should be writing
tests that assert only what's true no matter what the outcome, or
else, assert things that should be true in 99.% of all RNG
sequences. But let's resort to that argument later.)
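
A rough sketch of what that second style of test could look like, for
illustration only. SampleFraction, its method names and the six-sigma
tolerance are all made up here, not anything in Mahout; the idea is just to
assert a bound that should hold for essentially every seed instead of
asserting the exact sampled values.

```java
import java.util.Random;

// Hypothetical helper: samples at a given rate and checks the result statistically.
final class SampleFraction {

  private SampleFraction() {}

  // Counts how many of n items a simple Bernoulli sampler would keep.
  static int countSampled(Random random, int n, double rate) {
    int count = 0;
    for (int i = 0; i < n; i++) {
      if (random.nextDouble() < rate) {
        count++;
      }
    }
    return count;
  }

  // True if count lies within `sigmas` standard deviations of the
  // binomial mean n * rate.
  static boolean withinTolerance(int count, int n, double rate, double sigmas) {
    double mean = n * rate;
    double stdDev = Math.sqrt(n * rate * (1.0 - rate));
    return Math.abs(count - mean) <= sigmas * stdDev;
  }
}
```

A test written this way does not care which seed it gets, at the price of a
looser assertion than comparing against a recorded exact output.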


> In tests you call
>
> Random r = RandomUtils.getTestRandom();
> ev = new GenericRecommenderIRStatsEvaluator(r);
>
> In production code you call:
>
> Random r = RandomUtils.getRandom();
> ev = new GenericRecommenderIRStatsEvaluator(r);

And you're suggesting getRandom() returns a randomly-seeded RNG? Then
this just returns to the original problem: the test is not repeatable.
You've moved around the injection, but nothing else I think. Am I
misunderstanding? That seems to be why I'm not following
getTestRandom().

(Taking it as a constructor param is the conventional way to set up
for injecting, but from an API perspective I don't quite like it. I
understand why an evaluator necessarily needs a Recommender to exist,
but why do I need to give it a Random, conceptually?)


Re: Unit test lag?

2010-01-18 Thread Grant Ingersoll

On Jan 17, 2010, at 8:35 PM, Ted Dunning wrote:

> We should have a beer some time anyway and the beers we owe you for cleaning
> up Colt more than cancel any potential beer on this issue so I will be happy
> to buy (Sean, you are included for similar reasons if we ever see each
> other).

After the Meetup (http://www.meetup.com/SFBay-Lucene-Solr-Meetup/) on Thursday?
Looks like Sean will be there.  What other Mahouts are planning on attending?

-Grant

Re: Unit test lag?

2010-01-18 Thread Drew Farris
On Mon, Jan 18, 2010 at 3:58 AM, Sean Owen  wrote:

> The real fix is centralizing management of Random, tracking them, and
> being able to reset them all "remotely".

In what cases would you want to reset them all remotely, at the
beginning of each test?

> It is injected already -- that's the purpose of having "getRandom()"
> and not "getTestRandom()". That's the means by which a different
> fixed-seed RNG can be provided when run in a test harness. You
> couldn't do that with two methods: they'd each return a normal or
> fixed RNG, and the code could only call one.

I was suggesting the RNG could be specified outside of the code
instead of inside of the code. For example, instead of:

GenericRecommenderIRStatsEvaluator() {
  random = RandomUtils.getRandom();
}

You could have:

GenericRecommenderIRStatsEvaluator(Random r) {
  random = r;
}

In tests you call:

Random r = RandomUtils.getTestRandom();
ev = new GenericRecommenderIRStatsEvaluator(r);

In production code you call:

Random r = RandomUtils.getRandom();
ev = new GenericRecommenderIRStatsEvaluator(r);

The alternative would be to leave the constructor for
GenericRecommenderIRStatsEvaluator as it is and provide a way to set
the value of the field 'random' to something for testing, e.g. the
return value of RandomUtils.getTestRandom().

For example, in tests:

Random r = RandomUtils.getTestRandom();
ev = new GenericRecommenderIRStatsEvaluator();
ev.setRandomSeed(r);

While production code would remain unchanged. This won't necessarily
work in all cases depending upon what sort of things happen in the
constructor.

As Ted pointed out, doing the above would require a fair amount of
thought and work, there's no single approach to introducing this type
of injection that will work everywhere.
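
To make the constructor variant concrete, here is a minimal self-contained
sketch. SamplingEvaluator is a hypothetical stand-in, not the real
GenericRecommenderIRStatsEvaluator, and the seed 42 is arbitrary; the only
point is that the caller, not the class, picks the RNG.

```java
import java.util.Random;

// Hypothetical stand-in for a class that needs randomness.
class SamplingEvaluator {
  private final Random random;

  // The caller decides which RNG this instance uses.
  SamplingEvaluator(Random random) {
    this.random = random;
  }

  // Returns true for roughly the given fraction of calls.
  boolean sample(double fraction) {
    return random.nextDouble() < fraction;
  }
}

class InjectionDemo {
  public static void main(String[] args) {
    // Test-style usage: a fixed seed makes the decisions repeatable.
    SamplingEvaluator a = new SamplingEvaluator(new Random(42L));
    SamplingEvaluator b = new SamplingEvaluator(new Random(42L));
    for (int i = 0; i < 5; i++) {
      System.out.println(a.sample(0.1) == b.sample(0.1));
    }
    // Production-style usage: default seeding.
    SamplingEvaluator prod = new SamplingEvaluator(new Random());
    System.out.println(prod.sample(0.1));
  }
}
```

Two instances given identically seeded generators make identical decisions,
which is what a repeatable test needs; production callers pass a normally
seeded Random instead.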

Drew


Re: Unit test lag?

2010-01-18 Thread Sean Owen
Same here, I don't like Spring myself as it smells like
overengineering -- certainly for this case. I'm otherwise a luddite
though and could more broadly be convinced.

On Mon, Jan 18, 2010 at 2:49 AM, Ted Dunning  wrote:
> I have had too many unpleasant experiences using Spring to be enthused about
> jumping fully into it for this one use case.


Re: Unit test lag?

2010-01-18 Thread Sean Owen
On Mon, Jan 18, 2010 at 2:24 AM, Drew Farris  wrote:
> On Sun, Jan 17, 2010 at 9:10 PM, Sean Owen  wrote:
>> There are already cases where code needs to control the seed (mostly
>> to serialize/deserialize the exact state of an object). I don't think
>> that's the issue per se? The issue is when an RNG lives beyond one
>> test, and there are legitimate reasons that may be so.
>
> Ahh, ok, I wasn't really considering this. Would it be sufficient to
> assign the RNG to a static field in the test class in this case? If it
> needed to live across multiple classes, it could be public.
> Nevertheless..

Well, that would create the problem rather than solve it, but since the
problem already exists (or will exist) legitimately in the main code,
you could say sure, why not? After all, in some class that's
instantiated a lot, it doesn't make sense from an efficiency standpoint to
initialize an RNG per instance, and so it's static, and there you go,
problem.

(This wouldn't be a good change for tests though since a
statically-initialized RNG would be created before setUp() set the
framework to test mode. But then again, that's another example of the
actual issue at hand.)

The real fix is centralizing management of Random, tracking them, and
being able to reset them all "remotely". I know how that could be
done.
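
As a sketch only of what central tracking with a remote reset could look
like: TrackedRandoms, its TEST_SEED value and the weak-reference list are
assumptions for illustration, not how RandomUtils is or will be implemented.
Every generator handed out is remembered, so switching to test mode can
re-seed generators that already exist, including ones created in static
initializers before setUp() ran.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical registry of every Random handed out to library code.
final class TrackedRandoms {
  private static final long TEST_SEED = 0xCAFEBABEL; // arbitrary fixed seed
  private static final List<WeakReference<Random>> INSTANCES = new ArrayList<>();
  private static boolean testMode = false;

  private TrackedRandoms() {}

  // Returns a new generator and remembers it so it can be reset later.
  static synchronized Random getRandom() {
    Random r = testMode ? new Random(TEST_SEED) : new Random();
    INSTANCES.add(new WeakReference<>(r));
    return r;
  }

  // Switches to test mode and re-seeds every generator still in use.
  static synchronized void useTestSeed() {
    testMode = true;
    Iterator<WeakReference<Random>> it = INSTANCES.iterator();
    while (it.hasNext()) {
      Random r = it.next().get();
      if (r == null) {
        it.remove(); // generator was garbage collected; stop tracking it
      } else {
        r.setSeed(TEST_SEED);
      }
    }
  }
}
```

Calling useTestSeed() from each test's setUp() would put every tracked
generator back into the same state. The weak references are there so the
registry does not keep short-lived generators alive.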


> I suspect I'm missing something here because I don't understand how
> randomness is used in the non-test code or specifically how the RNG's
> are managed. I was (falsely, likely) assuming that the non-test code
> didn't obtain the RNG itself but rather had it provided/injected by an
> external source. In the context of a test, something from
> getTestRandom() which uses a fixed seed could be injected, while in
> production code something else would be.
>

Randomness is used outside of tests to, for example, sample 10% of a
data set. Or for k-means.

It is injected already -- that's the purpose of having "getRandom()"
and not "getTestRandom()". That's the means by which a different
fixed-seed RNG can be provided when run in a test harness. You
couldn't do that with two methods: they'd each return a normal or
fixed RNG, and the code could only call one.


Re: Unit test lag?

2010-01-17 Thread Ted Dunning
The Guice user guide is also very good at describing the benefits of
injection.

http://code.google.com/docreader/#p=google-guice&s=google-guice&t=Motivation

I also like the level of complexity that Guice introduces (nearly zero).  My
major problem with Spring is that it introduces and mixes a bunch of
different concepts at the same time.  This makes it hard to take a small
bite.  Guice looks like a small bite and just defining constructors for
hand-done injection is a still smaller bite.

On Sun, Jan 17, 2010 at 6:59 PM, Drew Farris  wrote:

> However, we can support the concept of injection without having to
> commit to using one framework or another. Every class is instantiated
> somewhere, so manual injection can be performed sans framework at that
> point. Speaking specifically for this case, the contract would be that
> anything that requires a RNG gets it injected by the class that
> instantiates it instead of obtaining one through some method of its
> own.
>
> There's a great series of posts that describe the advantages to this
> approach when it comes to testability that's reachable from:
> http://misko.hevery.com/2008/09/10/where-have-all-the-new-operators-gone/
>



-- 
Ted Dunning, CTO
DeepDyve


Re: Unit test lag?

2010-01-17 Thread Ted Dunning
I prefer the injection method as well.

On Sun, Jan 17, 2010 at 7:51 PM, Drew Farris  wrote:

> > If we want to go in Drew's suggested direction, we have to decide what
> > to do about seeds. We either need to define an
> > 'RandomNumberGeneratorFactory' interface which takes seeds and return
> > generators, or we want to inject Random objects and expect the
> > injector to worry about constructing and dealing with seeds.
>
> I vote for the latter. Those Random objects could be created via a
> factory by whomever is injecting them.
>
> FWIW, RandomNumberGeneratorFactory pretty much exists today as
> RandomUtils, I suspect we would just want to get rid of the static
> boolean that determines whether a test seed or random seed is used for
> getRandom().




-- 
Ted Dunning, CTO
DeepDyve


Re: Unit test lag?

2010-01-17 Thread Drew Farris
On Sun, Jan 17, 2010 at 10:31 PM, Benson Margulies wrote:
> Have a look at the patch I posted to MAHOUT-260. It ducks the
> injection question for now.

This looks reasonable.

> However, what's perhaps most interesting is that it makes tests fail!
> Some tests get different answers with the stock JDK rng.

Which tests are failing? I'm having some issues with non-patched head ATM.

> If we want to go in Drew's suggested direction, we have to decide what
> to do about seeds. We either need to define an
> 'RandomNumberGeneratorFactory' interface which takes seeds and return
> generators, or we want to inject Random objects and expect the
> injector to worry about constructing and dealing with seeds.

I vote for the latter. Those Random objects could be created via a
factory by whomever is injecting them.

FWIW, RandomNumberGeneratorFactory pretty much exists today as
RandomUtils, I suspect we would just want to get rid of the static
boolean that determines whether a test seed or random seed is used for
getRandom().


Re: Unit test lag?

2010-01-17 Thread Benson Margulies
Have a look at the patch I posted to MAHOUT-260. It ducks the
injection question for now.

However, what's perhaps most interesting is that it makes tests fail!
Some tests get different answers with the stock JDK rng.

If we want to go in Drew's suggested direction, we have to decide what
to do about seeds. We either need to define an
'RandomNumberGeneratorFactory' interface which takes seeds and returns
generators, or we want to inject Random objects and expect the
injector to worry about constructing and dealing with seeds.
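
A small sketch of the two options side by side, under assumptions: the
interface shape, JdkRandomFactory and the Shuffler consumer are invented for
illustration and are not taken from the MAHOUT-260 patch.

```java
import java.util.Random;

// Option 1: a factory that turns seeds into generators (hypothetical shape).
interface RandomNumberGeneratorFactory {
  Random create(long seed);
}

// Straightforward implementation backed by the stock JDK generator.
class JdkRandomFactory implements RandomNumberGeneratorFactory {
  @Override
  public Random create(long seed) {
    return new Random(seed);
  }
}

// A hypothetical consumer written both ways for comparison.
class Shuffler {
  private final Random random;

  // Option 1: the consumer is given a factory plus a seed and builds its own RNG.
  Shuffler(RandomNumberGeneratorFactory factory, long seed) {
    this(factory.create(seed));
  }

  // Option 2: the consumer is given a ready-made Random; seeding is the
  // injector's concern.
  Shuffler(Random random) {
    this.random = random;
  }

  int nextIndex(int bound) {
    return random.nextInt(bound);
  }
}
```

With option 2 the consumer never sees a seed at all, and whoever injects the
Random is still free to obtain it from a factory.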


On Sun, Jan 17, 2010 at 9:59 PM, Drew Farris  wrote:
> I've used spring a great deal as well and generally look pretty
> favorably upon it, but readily admit there are definite cons to it too.
>
> However, we can support the concept of injection without having to
> commit to using one framework or another. Every class is instantiated
> somewhere, so manual injection can be performed sans framework at that
> point. Speaking specifically for this case, the contract would be that
> anything that requires a RNG gets it injected by the class that
> instantiates it instead of obtaining one through some method of its
> own.
>
> There's a great series of posts that describe the advantages to this
> approach when it comes to testability that's reachable from:
> http://misko.hevery.com/2008/09/10/where-have-all-the-new-operators-gone/
>
> This sort of injection strategy can be introduced in steps across the
> codebase using manual injection techniques and then as/if needed a
> dynamic injection framework can be folded in. It seems that plugging in
> RNG's might be a good place to start.
>
> Drew
>
> On Sun, Jan 17, 2010 at 9:35 PM, Benson Margulies wrote:
>> One moral equivalent of Spring is a String property with a
>> fully-qualified class name which RandomUtils instantiates to get its
>> RNG. Another is to actually inject the RNG object. Spring would get
>> really tempting here.
>>
>> I've had an extended immersion in Spring via CXF, so I have a low
>> threshold for introducing it.
>>
>>
>>
>> On Sun, Jan 17, 2010 at 9:24 PM, Drew Farris  wrote:
>>> On Sun, Jan 17, 2010 at 9:10 PM, Sean Owen  wrote:
>>>> There are already cases where code needs to control the seed (mostly
>>>> to serialize/deserialize the exact state of an object). I don't think
>>>> that's the issue per se? The issue is when an RNG lives beyond one
>>>> test, and there are legitimate reasons that may be so.
>>>
>>> Ahh, ok, I wasn't really considering this. Would it be sufficient to
>>> assign the RNG to a static field in the test class in this case? If it
>>> needed to live across multiple classes, it could be public.
>>> Nevertheless..
>>>
>>>> I don't see how a getTestRandom() method fixes something... I can't
>>>> call this in my non-test code, and then tests can't control those
>>>> RNGs. The non-test code can't make this decision which is why they
>>>> don't. I don't think this is the problem/solution but rather having a
>>>> way to globally reset all RNGs.
>>>
>>> I suspect I'm missing something here because I don't understand how
>>> randomness is used in the non-test code or specifically how the RNG's
>>> are managed. I was (falsely, likely) assuming that the non-test code
>>> didn't obtain the RNG itself but rather had it provided/injected by an
>>> external source. In the context of a test, something from
>>> getTestRandom() which uses a fixed seed could be injected, while in
>>> production code something else would be.
>>>
>>
>


Re: Unit test lag?

2010-01-17 Thread Drew Farris
I've used spring a great deal as well and generally look pretty
favorably upon it, but readily admit there are definite cons to it too.

However, we can support the concept of injection without having to
commit to using one framework or another. Every class is instantiated
somewhere, so manual injection can be performed sans framework at that
point. Speaking specifically for this case, the contract would be that
anything that requires a RNG gets it injected by the class that
instantiates it instead of obtaining one through some method of its
own.

There's a great series of posts that describe the advantages to this
approach when it comes to testability that's reachable from:
http://misko.hevery.com/2008/09/10/where-have-all-the-new-operators-gone/

This sort of injection strategy can be introduced in steps across the
codebase using manual injection techniques and then as/if needed a
dynamic injection framework can be folded in. It seems that plugging in
RNG's might be a good place to start.

Drew

On Sun, Jan 17, 2010 at 9:35 PM, Benson Margulies  wrote:
> One moral equivalent of Spring is a String property with a
> fully-qualified class name which RandomUtils instantiates to get its
> RNG. Another is to actually inject the RNG object. Spring would get
> really tempting here.
>
> I've had an extended immersion in Spring via CXF, so I have a low
> threshold for introducing it.
>
>
>
> On Sun, Jan 17, 2010 at 9:24 PM, Drew Farris  wrote:
>> On Sun, Jan 17, 2010 at 9:10 PM, Sean Owen  wrote:
>>> There are already cases where code needs to control the seed (mostly
>>> to serialize/deserialize the exact state of an object). I don't think
>>> that's the issue per se? The issue is when an RNG lives beyond one
>>> test, and there are legitimate reasons that may be so.
>>
>> Ahh, ok, I wasn't really considering this. Would it be sufficient to
>> assign the RNG to a static field in the test class in this case? If it
>> needed to live across multiple classes, it could be public.
>> Nevertheless..
>>
>>> I don't see how a getTestRandom() method fixes something... I can't
>>> call this in my non-test code, and then tests can't control those
>>> RNGs. The non-test code can't make this decision which is why they
>>> don't. I don't think this is the problem/solution but rather having a
>>> way to globally reset all RNGs.
>>
>> I suspect I'm missing something here because I don't understand how
>> randomness is used in the non-test code or specifically how the RNG's
>> are managed. I was (falsely, likely) assuming that the non-test code
>> didn't obtain the RNG itself but rather had it provided/injected by an
>> external source. In the context of a test, something from
>> getTestRandom() which uses a fixed seed could be injected, while in
>> production code something else would be.
>>
>


Re: Unit test lag?

2010-01-17 Thread Benson Margulies
OK, then the class name appeals to me. I'll propose a patch.

On Sun, Jan 17, 2010 at 9:49 PM, Ted Dunning  wrote:
> I have had too many unpleasant experiences using Spring to be enthused about
> jumping fully into it for this one use case.
>
> On Sun, Jan 17, 2010 at 6:35 PM, Benson Margulies wrote:
>
>> One moral equivalent of Spring is a String property with a
>> fully-qualified class name which RandomUtils instantiates to get its
>> RNG. Another is to actually inject the RNG object. Spring would get
>> really tempting here.
>>
>> I've had an extended immersion in Spring via CXF, so I have a low
>> threshold for introducing it.
>>
>>
>


Re: Unit test lag?

2010-01-17 Thread Ted Dunning
I have had too many unpleasant experiences using Spring to be enthused about
jumping fully into it for this one use case.

On Sun, Jan 17, 2010 at 6:35 PM, Benson Margulies wrote:

> One moral equivalent of Spring is a String property with a
> fully-qualified class name which RandomUtils instantiates to get its
> RNG. Another is to actually inject the RNG object. Spring would get
> really tempting here.
>
> I've had an extended immersion in Spring via CXF, so I have a low
> threshold for introducing it.
>
>


Re: Unit test lag?

2010-01-17 Thread Benson Margulies
One moral equivalent of Spring is a String property with a
fully-qualified class name which RandomUtils instantiates to get its
RNG. Another is to actually inject the RNG object. Spring would get
really tempting here.

I've had an extended immersion in Spring via CXF, so I have a low
threshold for introducing it.
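
A minimal sketch of the class-name idea, with assumptions called out: the
property name example.rng.class and the ConfigurableRandomFactory class are
hypothetical, and this is not the patch proposed for MAHOUT-260. It reads a
fully-qualified class name from a system property and instantiates it
reflectively, falling back to java.util.Random.

```java
import java.util.Random;

// Hypothetical factory that picks the RNG implementation from a property.
final class ConfigurableRandomFactory {
  // Assumed property name, chosen only for this example.
  static final String RNG_CLASS_PROPERTY = "example.rng.class";

  private ConfigurableRandomFactory() {}

  // Instantiates the configured Random subclass via its no-arg constructor.
  static Random newRandom() {
    String className = System.getProperty(RNG_CLASS_PROPERTY, "java.util.Random");
    try {
      Class<? extends Random> cls = Class.forName(className).asSubclass(Random.class);
      return cls.getDeclaredConstructor().newInstance();
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Cannot instantiate RNG class " + className, e);
    }
  }
}
```

A test run could then set the property to a fixed-seed subclass, for example
with -Dexample.rng.class=... on the command line, without touching the
callers.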



On Sun, Jan 17, 2010 at 9:24 PM, Drew Farris  wrote:
> On Sun, Jan 17, 2010 at 9:10 PM, Sean Owen  wrote:
>> There are already cases where code needs to control the seed (mostly
>> to serialize/deserialize the exact state of an object). I don't think
>> that's the issue per se? The issue is when an RNG lives beyond one
>> test, and there are legitimate reasons that may be so.
>
> Ahh, ok, I wasn't really considering this. Would it be sufficient to
> assign the RNG to a static field in the test class in this case? If it
> needed to live across multiple classes, it could be public.
> Nevertheless..
>
>> I don't see how a getTestRandom() method fixes something... I can't
>> call this in my non-test code, and then tests can't control those
>> RNGs. The non-test code can't make this decision which is why they
>> don't. I don't think this is the problem/solution but rather having a
>> way to globally reset all RNGs.
>
> I suspect I'm missing something here because I don't understand how
> randomness is used in the non-test code or specifically how the RNG's
> are managed. I was (falsely, likely) assuming that the non-test code
> didn't obtain the RNG itself but rather had it provided/injected by an
> external source. In the context of a test, something from
> getTestRandom() which uses a fixed seed could be injected, while in
> production code something else would be.
>


Re: Unit test lag?

2010-01-17 Thread Drew Farris
On Sun, Jan 17, 2010 at 9:10 PM, Sean Owen  wrote:
> There are already cases where code needs to control the seed (mostly
> to serialize/deserialize the exact state of an object). I don't think
> that's the issue per se? The issue is when an RNG lives beyond one
> test, and there are legitimate reasons that may be so.

Ahh, ok, I wasn't really considering this. Would it be sufficient to
assign the RNG to a static field in the test class in this case? If it
needed to live across multiple classes, it could be public.
Nevertheless..

> I don't see how a getTestRandom() method fixes something... I can't
> call this in my non-test code, and then tests can't control those
> RNGs. The non-test code can't make this decision which is why they
> don't. I don't think this is the problem/solution but rather having a
> way to globally reset all RNGs.

I suspect I'm missing something here because I don't understand how
randomness is used in the non-test code or specifically how the RNG's
are managed. I was (falsely, likely) assuming that the non-test code
didn't obtain the RNG itself but rather had it provided/injected by an
external source. In the context of a test, something from
getTestRandom() which uses a fixed seed could be injected, while in
production code something else would be.


Re: Unit test lag?

2010-01-17 Thread Ted Dunning
On Sun, Jan 17, 2010 at 6:10 PM, Sean Owen  wrote:

> There are already cases where code needs to control the seed (mostly
> to serialize/deserialize the exact state of an object).
>

That is an important case, but it should be deterministic and thus not a
problem for testing.  Really the RNG is being used more as a good hash
function in these cases.


> I don't think
> that's the issue per se?
>

I don't think that the serialization trick is a problem at all.


> The issue is when an RNG lives beyond one
> test, and there are legitimate reasons that may be so.
>

Hmm... I can't think of any off-hand.  You probably have something in mind.

Can you say what reasons there are for this?


> I don't see how a getTestRandom() method fixes something... I can't
> call this in my non-test code, and then tests can't control those
> RNGs. The non-test code can't make this decision which is why they
> don't. I don't think this is the problem/solution but rather having a
> way to globally reset all RNGs.
>

I think that the problem that we are talking around here is whether we
commit to having RNGs be injectable wherever they are used.  Half
measures are the problem here (IMHO).  Real injection would solve all the
questions by giving complete control to the test case.  I don't think that
we need Guice or Spring here, just a way to say "use this RNG, if you don't
mind".
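
A minimal sketch of what injection without a framework could look like.
Sampler and SamplerDemo are hypothetical names made up for illustration,
not Mahout classes; the only assumption is that the component takes the
java.util.Random it should use as a constructor argument, so a test can
pass a fixed-seed instance and production code can pass anything else.

```java
import java.util.Random;

// Hypothetical component, for illustration only; not a Mahout class.
class Sampler {
  private final Random random;

  // The caller decides which RNG is used: "use this RNG, if you don't mind".
  Sampler(Random random) {
    this.random = random;
  }

  // Returns a pseudo-random index in [0, size).
  int pickIndex(int size) {
    return random.nextInt(size);
  }
}

class SamplerDemo {
  public static void main(String[] args) {
    // Two samplers handed identically seeded generators produce the same sequence.
    Sampler a = new Sampler(new Random(42L));
    Sampler b = new Sampler(new Random(42L));
    for (int i = 0; i < 5; i++) {
      System.out.println(a.pickIndex(100) == b.pickIndex(100)); // prints true
    }
  }
}
```

With this shape the test case owns the generator completely, and no
static state has to be reset between tests.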


Re: Unit test lag?

2010-01-17 Thread Sean Owen
This could be my fault though my tests are passing. Let me look.

On Jan 18, 2010 2:15 AM, "Drew Farris"  wrote:

Spoke too soon of course, some tests fail strangely locally:

/u01/eclipse/eclipse-mahout-workspace/mahout-svn/core/src/test/java/org/apache/mahout/ga/watchmaker/EvalMapperTest.java:[48,25]
type parameter org.apache.hadoop.io.LongWritable is not within its
bound

/u01/eclipse/eclipse-mahout-workspace/mahout-svn/core/src/test/java/org/apache/mahout/ga/watchmaker/EvalMapperTest.java:[48,92]
type parameter org.apache.hadoop.io.LongWritable is not within its
bound

Looks like this was discussed way back in
http://issues.apache.org/jira/browse/MAHOUT-127, but for the life of
me I can't figure out why I'm running into it now.

On Sun, Jan 17, 2010 at 8:38 PM, Drew Farris  wrote:
> On Sun, Jan 17, 2010 ...


Re: Unit test lag?

2010-01-17 Thread Drew Farris
Spoke too soon of course, some tests fail strangely locally:

/u01/eclipse/eclipse-mahout-workspace/mahout-svn/core/src/test/java/org/apache/mahout/ga/watchmaker/EvalMapperTest.java:[48,25]
type parameter org.apache.hadoop.io.LongWritable is not within its
bound

/u01/eclipse/eclipse-mahout-workspace/mahout-svn/core/src/test/java/org/apache/mahout/ga/watchmaker/EvalMapperTest.java:[48,92]
type parameter org.apache.hadoop.io.LongWritable is not within its
bound

Looks like this was discussed way back in
http://issues.apache.org/jira/browse/MAHOUT-127, but for the life of
me I can't figure out why I'm running into it now.

On Sun, Jan 17, 2010 at 8:38 PM, Drew Farris  wrote:
> On Sun, Jan 17, 2010 at 2:55 PM, Sean Owen  wrote:
>> Am I right that running tests in 1 JVM instead of n JVMs helps
>> mitigate this? because I just committed that change.
>>
>
> I just updated to HEAD, and this seems to have fixed the problem. Unit
> tests are completing in times in-line with those reported by the tests
> themselves.
>
> Since this was happening at class loading time, running all of the
> tests in a single VM does mitigate this because fewer forks mean less
> entropy drain and there is more time to collect entropy between forks.
>


Re: Unit test lag?

2010-01-17 Thread Sean Owen
There are already cases where code needs to control the seed (mostly
to serialize/deserialize the exact state of an object). I don't think
that's the issue per se? The issue is when an RNG lives beyond one
test, and there are legitimate reasons that may be so.

I don't see how a getTestRandom() method fixes something... I can't
call this in my non-test code, and then tests can't control those
RNGs. The non-test code can't make this decision which is why they
don't. I don't think this is the problem/solution but rather having a
way to globally reset all RNGs.

On Mon, Jan 18, 2010 at 1:55 AM, Drew Farris  wrote:
> The potential issue I see is if any tests expect to run using a seed
> >other< than the test seed. Now that we are no longer forking, calling
> RandomUtils.useTestSeed() in test A will cause the test seed to be
> used in B, C, D, E etc. In this case it makes sense to avoid using
> stateful static classes like RandomUtils, probably to condense this
> down to RandomUtils.getTestRandom().
>
> RandomUtils.getRandom() will reset the seed in any case, to a default
> seed if useTestSeed() has ever been called, to something random if
> useTestSeed() has never been called.
>


Re: Unit test lag?

2010-01-17 Thread Drew Farris
Ted, It depends on the test implementation itself. Generally, I
believe the pattern that is followed is:

RandomUtils.useTestSeed();
Random r = RandomUtils.getRandom();

The potential issue I see is if any tests expect to run using a seed
>other< than the test seed. Now that we are no longer forking, calling
RandomUtils.useTestSeed() in test A will cause the test seed to be
used in B, C, D, E etc. In this case it makes sense to avoid using
stateful static classes like RandomUtils, probably to condense this
down to RandomUtils.getTestRandom().

RandomUtils.getRandom() will reset the seed in any case, to a default
seed if useTestSeed() has ever been called, to something random if
useTestSeed() has never been called.


On Sun, Jan 17, 2010 at 8:39 PM, Ted Dunning  wrote:
> Do the RandomUtils reset the seed for every test as desired?
>
> On Sun, Jan 17, 2010 at 5:38 PM, Drew Farris  wrote:
>
>> On Sun, Jan 17, 2010 at 2:55 PM, Sean Owen  wrote:
>> > Am I right that running tests in 1 JVM instead of n JVMs helps
>> > mitigate this? because I just committed that change.
>> >
>>
>> I just updated to HEAD, and this seems to have fixed the problem. Unit
>> tests are completing in times in-line with those reported by the tests
>> themselves.
>>
>> Since this was happening at class loading time, running all of the
>> tests in a single VM does mitigate this because fewer forks mean less
>> entropy drain and there is more time to collect entropy between forks.
>>
>
>
>
> --
> Ted Dunning, CTO
> DeepDyve
>


Re: Unit test lag?

2010-01-17 Thread Ted Dunning
I can imagine ways to nuke the problem as well.

On Sun, Jan 17, 2010 at 5:46 PM, Sean Owen  wrote:

> I can imagine some semi-elaborate ways to actually explicitly manage
> and address this with a wrapper class.
>



-- 
Ted Dunning, CTO
DeepDyve


Re: Unit test lag?

2010-01-17 Thread Sean Owen
Not quite, and you have a good point. Each instance of an RNG is
seeded identically when testing. But if something holds an RNG open
across tests, it won't be reset. I could imagine that happening if
there's a static RNG somewhere in a class, which would be reasonable.
(Or if a test isn't quite using setUp() properly vis-a-vis RNGs, but
that's fixable.)

I can imagine some semi-elaborate ways to actually explicitly manage
and address this with a wrapper class.
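
One possible shape for such a wrapper, as a rough sketch only.
ResettableRandom, TEST_SEED and resetAll() are hypothetical names, not
existing Mahout code. Every instance registers itself in a static list
of weak references, so a test's setUp() could re-seed all live
generators, including ones held in static fields.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Random;

// Hypothetical wrapper, for illustration only; not Mahout's RandomUtils.
class ResettableRandom extends Random {
  private static final long TEST_SEED = 42L; // arbitrary fixed seed (assumption)

  // Weak references so the registry does not keep otherwise-dead RNGs alive.
  private static final List<WeakReference<ResettableRandom>> INSTANCES =
      new ArrayList<>();

  ResettableRandom() {
    super(TEST_SEED);
    synchronized (INSTANCES) {
      INSTANCES.add(new WeakReference<>(this));
    }
  }

  // Re-seeds every live instance, for example from a test's setUp().
  static void resetAll() {
    synchronized (INSTANCES) {
      Iterator<WeakReference<ResettableRandom>> it = INSTANCES.iterator();
      while (it.hasNext()) {
        ResettableRandom r = it.next().get();
        if (r == null) {
          it.remove(); // instance was garbage collected
        } else {
          r.setSeed(TEST_SEED);
        }
      }
    }
  }
}
```

The cost is a global registry plus a lock on every construction, which
is why this counts as semi-elaborate.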

On Mon, Jan 18, 2010 at 1:39 AM, Ted Dunning  wrote:
> Do the RandomUtils reset the seed for every test as desired?


Re: Unit test lag?

2010-01-17 Thread Ted Dunning
Do the RandomUtils reset the seed for every test as desired?

On Sun, Jan 17, 2010 at 5:38 PM, Drew Farris  wrote:

> On Sun, Jan 17, 2010 at 2:55 PM, Sean Owen  wrote:
> > Am I right that running tests in 1 JVM instead of n JVMs helps
> > mitigate this? because I just committed that change.
> >
>
> I just updated to HEAD, and this seems to have fixed the problem. Unit
> tests are completing in times in-line with those reported by the tests
> themselves.
>
> Since this was happening at class loading time, running all of the
> tests in a single VM does mitigate this because fewer forks mean less
> entropy drain and there is more time to collect entropy between forks.
>



-- 
Ted Dunning, CTO
DeepDyve


Re: Unit test lag?

2010-01-17 Thread Ted Dunning
And I think that we need to be robust in the face of either behavior.  It
should be fine to initialize once.

On Sun, Jan 17, 2010 at 5:36 PM, Sean Owen  wrote:

> I think you are right in that JVMs are allowed to wait until first use
> to load a class, but the one time I checked the Sun JVM it didn't work
> that way. It actively loaded the class  (which is also allowed). I
> would bet dollars to donuts we'd find it doesn't wait.
>



-- 
Ted Dunning, CTO
DeepDyve


Re: Unit test lag?

2010-01-17 Thread Drew Farris
On Sun, Jan 17, 2010 at 2:55 PM, Sean Owen  wrote:
> Am I right that running tests in 1 JVM instead of n JVMs helps
> mitigate this? because I just committed that change.
>

I just updated to HEAD, and this seems to have fixed the problem. Unit
tests are completing in times in-line with those reported by the tests
themselves.

Since this was happening at class loading time, running all of the
tests in a single VM does mitigate this because fewer forks mean less
entropy drain and there is more time to collect entropy between forks.


Re: Unit test lag?

2010-01-17 Thread Sean Owen
I think you are right in that JVMs are allowed to wait until first use
to load a class, but the one time I checked the Sun JVM it didn't work
that way. It actively loaded the class  (which is also allowed). I
would bet dollars to donuts we'd find it doesn't wait.


On Mon, Jan 18, 2010 at 1:22 AM, Benson Margulies  wrote:
> Sean, that's not how class loaders work AFAIK. the mere presence of an
> import does not trigger the load. You have to touch it.
>
> HOWEVER, if I am wrong, I will (a) buy the beer, and (b) add the
> reflective code to get rid of the import.


Re: Unit test lag?

2010-01-17 Thread Ted Dunning
We should have a beer some time anyway and the beers we owe you for cleaning
up Colt more than cancel any potential beer on this issue so I will be happy
to buy (Sean, you are included for similar reasons if we ever see each
other).

Does the difference here matter?  If we have zero or one class load, we
should be fine relative to bits consumed.  The problem is n-class loads.

On Sun, Jan 17, 2010 at 5:22 PM, Benson Margulies wrote:

> Sean, that's not how class loaders work AFAIK. the mere presence of an
> import does not trigger the load. You have to touch it.
>
> HOWEVER, if I am wrong, I will (a) buy the beer, and (b) add the
> reflective code to get rid of the import.
>



-- 
Ted Dunning, CTO
DeepDyve


Re: Unit test lag?

2010-01-17 Thread Ted Dunning
No.  We won't.  The JDK RNG is fine for pretty much everything we do.  I
agree that we should use a better generator for production use, but for
deterministic tests, there isn't an issue.

And frankly, I try to use algorithms that are robust about the generator they
use.  Some applications are really good at exposing flaws and some are fine
with anything better than ROT-13(n++).  I think that all we have are the
latter kind so far.

On Sun, Jan 17, 2010 at 4:19 PM, Benson Margulies wrote:

> So the question to me is whether we lose any test quality by using the JDK
> RNG.




-- 
Ted Dunning, CTO
DeepDyve


Re: Unit test lag?

2010-01-17 Thread Benson Margulies
Sean, that's not how class loaders work AFAIK. the mere presence of an
import does not trigger the load. You have to touch it.

HOWEVER, if I am wrong, I will (a) buy the beer, and (b) add the
reflective code to get rid of the import.
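
One way to check this is a small experiment. The sketch below uses
hypothetical classes (InitLog, Costly and InitOrderDemo are made up for
illustration, not taken from Mahout or uncommons-maths). It relies on
the Java Language Specification rule that a static initializer runs
when a class is initialized, which happens on first active use such as
calling a static method or constructing an instance. A JVM may load the
class file earlier, but loading alone does not run static blocks.

```java
// Hypothetical classes, for illustration only.
class InitLog {
  static boolean costlyInitialized = false;
}

class Costly {
  static {
    // Stands in for an expensive static initializer, such as one that opens /dev/random.
    InitLog.costlyInitialized = true;
  }

  static int answer() {
    return 42;
  }
}

class InitOrderDemo {
  // Returns {initialized before first use, initialized after first use}.
  static boolean[] observe() {
    Costly unused = null; // merely naming the type does not initialize Costly
    boolean before = InitLog.costlyInitialized;
    Costly.answer(); // first active use: the static block runs here
    boolean after = InitLog.costlyInitialized;
    return new boolean[] {before, after};
  }
}
```

In a fresh JVM the first call to InitOrderDemo.observe() returns
{false, true}. An import statement is a compile-time convenience and
behaves like the unused declaration above.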

On Sun, Jan 17, 2010 at 7:26 PM, Sean Owen  wrote:
> Nope, since it imports MersenneTwisterRNG, that class will be
> initialized the moment RandomUtils is loaded.
>
> On Mon, Jan 18, 2010 at 12:19 AM, Benson Margulies
>  wrote:
>> That would make a difference. If the code in RandomUtils never new's
>> the Mersenne class, then its static blocks would never run. If
>> necessary, the Mersenne class could be loaded explicitly, but I don't
>> think we have to go that far.
>>
>> So the question to me is whether we lose any test quality by using the JDK 
>> RNG.
>>
>


Re: Unit test lag?

2010-01-17 Thread Sean Owen
Nope, since it imports MersenneTwisterRNG, that class will be
initialized the moment RandomUtils is loaded.

On Mon, Jan 18, 2010 at 12:19 AM, Benson Margulies
 wrote:
> That would make a difference. If the code in RandomUtils never new's
> the Mersenne class, then its static blocks would never run. If
> necessary, the Mersenne class could be loaded explicitly, but I don't
> think we have to go that far.
>
> So the question to me is whether we lose any test quality by using the JDK 
> RNG.
>


Re: Unit test lag?

2010-01-17 Thread Benson Margulies
That would make a difference. If the code in RandomUtils never new's
the Mersenne class, then its static blocks would never run. If
necessary, the Mersenne class could be loaded explicitly, but I don't
think we have to go that far.

So the question to me is whether we lose any test quality by using the JDK RNG.


On Sun, Jan 17, 2010 at 7:07 PM, Sean Owen  wrote:
> It sounds like the slow code gets triggered at class-loading time, so
> no I don't think this would make a difference. But with the change I
> committed we should only have one class loader in play, I think.
>
> On Mon, Jan 18, 2010 at 12:00 AM, Benson Margulies
>  wrote:
>> What if we used the plain old JDK rng when in test mode?
>


Re: Unit test lag?

2010-01-17 Thread Sean Owen
It sounds like the slow code gets triggered at class-loading time, so
no I don't think this would make a difference. But with the change I
committed we should only have one class loader in play, I think.

On Mon, Jan 18, 2010 at 12:00 AM, Benson Margulies
 wrote:
> What if we used the plain old JDK rng when in test mode?


Re: Unit test lag?

2010-01-17 Thread Benson Margulies
What if we used the plain old JDK rng when in test mode?

On Sun, Jan 17, 2010 at 3:16 PM, Olivier Grisel
 wrote:
> 2010/1/17 Sean Owen :
>> Am I right that running tests in 1 JVM instead of n JVMs helps
>> mitigate this? because I just committed that change.
>
> I have the feeling it helps yes. I haven't timed the tests though.
>
> --
> Olivier
> http://twitter.com/ogrisel - http://code.oliviergrisel.name
>


Re: Unit test lag?

2010-01-17 Thread Olivier Grisel
2010/1/17 Sean Owen :
> Am I right that running tests in 1 JVM instead of n JVMs helps
> mitigate this? because I just committed that change.

I have the feeling it helps yes. I haven't timed the tests though.

-- 
Olivier
http://twitter.com/ogrisel - http://code.oliviergrisel.name


Re: Unit test lag?

2010-01-17 Thread Ted Dunning
This is a way of saying "I don't know".

On Sun, Jan 17, 2010 at 12:02 PM, Ted Dunning  wrote:

> That might help if the random class is loaded only once.
>
> If the different tests each use a new class loader (seems unlikely) then
> the static stuff will be executed multiply and the problem will be retained.
>
>
>
> On Sun, Jan 17, 2010 at 11:55 AM, Sean Owen  wrote:
>
>> Am I right that running tests in 1 JVM instead of n JVMs helps
>> mitigate this? because I just committed that change.
>>
>> On Sun, Jan 17, 2010 at 7:49 PM, Ted Dunning 
>> wrote:
>> > It doesn't affect the random numbers being generated.
>> >
>> > But it does eat bits of entropy from /dev/random.  That can then get
>> starved
>> > and block until more entropy is derived.  Since the reading is done in a
>> > static block instead of on construction, the cost can't be avoided.
>> >
>>
>
>
>
> --
> Ted Dunning, CTO
> DeepDyve
>
>


-- 
Ted Dunning, CTO
DeepDyve


Re: Unit test lag?

2010-01-17 Thread Ted Dunning
That might help if the random class is loaded only once.

If the different tests each use a new class loader (seems unlikely) then the
static stuff will be executed multiply and the problem will be retained.


On Sun, Jan 17, 2010 at 11:55 AM, Sean Owen  wrote:

> Am I right that running tests in 1 JVM instead of n JVMs helps
> mitigate this? because I just committed that change.
>
> On Sun, Jan 17, 2010 at 7:49 PM, Ted Dunning 
> wrote:
> > It doesn't affect the random numbers being generated.
> >
> > But it does eat bits of entropy from /dev/random.  That can then get
> starved
> > and block until more entropy is derived.  Since the reading is done in a
> > static block instead of on construction, the cost can't be avoided.
> >
>



-- 
Ted Dunning, CTO
DeepDyve


Re: Unit test lag?

2010-01-17 Thread Sean Owen
Am I right that running tests in 1 JVM instead of n JVMs helps
mitigate this? because I just committed that change.

On Sun, Jan 17, 2010 at 7:49 PM, Ted Dunning  wrote:
> It doesn't affect the random numbers being generated.
>
> But it does eat bits of entropy from /dev/random.  That can then get starved
> and block until more entropy is derived.  Since the reading is done in a
> static block instead of on construction, the cost can't be avoided.
>


Re: Unit test lag?

2010-01-17 Thread Ted Dunning
It doesn't affect the random numbers being generated.

But it does eat bits of entropy from /dev/random.  That can then get starved
and block until more entropy is derived.  Since the reading is done in a
static block instead of on construction, the cost can't be avoided.
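
If the seed source were under our control, the read could be deferred
until it is actually needed. A minimal sketch of that idea, using the
initialization-on-demand holder idiom. LazySeedSource, its TEST_SEED
value, the testMode flag and the secureSourceCreated marker are
hypothetical names invented for illustration; this is not how
uncommons-maths or RandomUtils is written.

```java
import java.security.SecureRandom;
import java.util.Random;

// Hypothetical helper, for illustration only.
class LazySeedSource {
  static final long TEST_SEED = 42L; // arbitrary fixed seed (assumption)

  // Set once the SecureRandom has been constructed, so the laziness is observable.
  static volatile boolean secureSourceCreated = false;

  // The JVM initializes Holder, and so constructs the SecureRandom,
  // only when Holder.SECURE is first read.
  private static final class Holder {
    static final SecureRandom SECURE = new SecureRandom();
    static {
      secureSourceCreated = true;
    }
  }

  // Test mode never touches Holder, so no SecureRandom is created.
  static Random newRandom(boolean testMode) {
    if (testMode) {
      return new Random(TEST_SEED);
    }
    return new Random(Holder.SECURE.nextLong());
  }
}
```

Here the cost moves from class initialization of the outer class to
the first non-test request, which is the "on construction" behavior
described above.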

On Sun, Jan 17, 2010 at 4:31 AM, Sean Owen  wrote:

> But does that affect code which instantiates a MersenneTwisterRNG with
> its own seed?
>
> On Sun, Jan 17, 2010 at 12:24 PM, Benson Margulies
>  wrote:
> >> I don't know of any further issues with MersenneTwisterRNG though --
> >> what's the issue? Don't care what it does with /dev/random as long as
> >> in test mode we are seeding it with the same seed, and that's what
> >>
> >
> > Olivier and I found the Mersenne code touching the
> > SecureRandomNumberGenerator, which goes and talks to /dev/random, all
> > in static blocks before any seeds are used.
>



-- 
Ted Dunning, CTO
DeepDyve


Re: Unit test lag?

2010-01-17 Thread Olivier Grisel
2010/1/17 Drew Farris :
> Olivier,
>
> If you are still interested in trying to debug these, you could
> configure the surefire-plugin to use the options for opening up a port
> for remote debugging when it forks off the java process.
>
> see: 
> http://maven.apache.org/plugins/maven-surefire-plugin/examples/debugging.html
>
> The examples there will suspend the vm until you connect with the
> debugger. If you don't know this already, mvn can be convinced to run
> individual tests using the -Dtest=testname argument (sans package
> name, e.g: mvn test -Dtest=TransactionTreeTest)

Thanks for the hint, I did not know about that one.

However, I have added a log statement that dumps the stack trace in
RandomUtils whenever useSeed is false. It will be less tedious than
waiting for the breakpoints to fire in eclipse.

-- 
Olivier
http://twitter.com/ogrisel - http://code.oliviergrisel.name


Re: Unit test lag?

2010-01-17 Thread Drew Farris
Olivier,

If you are still interested in trying to debug these, you could
configure the surefire-plugin to use the options for opening up a port
for remote debugging when it forks off the java process.

see: 
http://maven.apache.org/plugins/maven-surefire-plugin/examples/debugging.html

The examples there will suspend the vm until you connect with the
debugger. If you don't know this already, mvn can be convinced to run
individual tests using the -Dtest=testname argument (sans package
name, e.g: mvn test -Dtest=TransactionTreeTest)

Hope this helps,

Drew

On Sun, Jan 17, 2010 at 8:49 AM, Olivier Grisel
 wrote:
> Ok I have found three non deterministic tests so far that actually
> consume entropy by calling generateSeed:
>
> TransactionTreeTest
> CacheTest
> AverageAbsoluteDifferenceRecommenderEvaluatorTest
>
> But using eclipse is not really helpful since I am forced to set the
> forkMode to "never" to make my debugger able to attach and then have
> to manually inspect what's happening. I'll try again with a log
> statement and setting the fork mode back to "always".
>
> --
> Olivier
> http://twitter.com/ogrisel - http://code.oliviergrisel.name
>


Re: Unit test lag?

2010-01-17 Thread Sean Owen
On Sun, Jan 17, 2010 at 1:36 PM, Drew Farris  wrote:
> Using a fixed seed doesn't solve the problem due to the way
> SecureRandomSeedGenerator is loaded by MersenneTwisterRNG

OK yeah I understand now. I thought this thread was addressing the
determinism issue, but you're talking about performance. My bad.
That's why I was confused.

Well I'm keen to solve the determinism issue too, I'll try that.


Re: Unit test lag?

2010-01-17 Thread Olivier Grisel
Ok I have found three non deterministic tests so far that actually
consume entropy by calling generateSeed:

TransactionTreeTest
CacheTest
AverageAbsoluteDifferenceRecommenderEvaluatorTest

But using eclipse is not really helpful since I am forced to set the
forkMode to "never" to make my debugger able to attach and then have
to manually inspect what's happening. I'll try again with a log
statement and setting the fork mode back to "always".

-- 
Olivier
http://twitter.com/ogrisel - http://code.oliviergrisel.name


Re: Unit test lag?

2010-01-17 Thread Drew Farris
The real problem I originally brought up was that the unit tests were
horribly slow due to blocking on /dev/random.

On Sun, Jan 17, 2010 at 8:21 AM, Sean Owen  wrote:
> I think I must be missing something --
>
> We don't use SecureRandom directly, so what would these effects have
> to do with slow unit tests in our project?

SecureRandom is referenced from the uncommons-maths class
SecureRandomSeedGenerator via a private static final, so when
SecureRandomSeedGenerator gets class loaded, we incur the penalty of
the first SecureRandom constructor's read of /dev/random. Since we
fork for each unit test, this happens rapidly and quickly consumes the
available entropy on the system, leading to the blocking behavior we're
seeing.

Using a fixed seed doesn't solve the problem due to the way
SecureRandomSeedGenerator is loaded by MersenneTwisterRNG.

Eliminating forking from the unit tests will probably be acceptable
because I believe that Olivier has shown that the read from
/dev/random only happens once either at SecureRandom class load time,
or upon first call to its ctor.

Drew


Re: Unit test lag?

2010-01-17 Thread Sean Owen
I'm sorry I really think I'm off on my own planet. What issue are you
trying to solve? Performance, or deterministic tests? I'm concerned
with the latter and still do not understand what this has to do with
it.

On Sun, Jan 17, 2010 at 1:31 PM, Olivier Grisel
 wrote:
> 2010/1/17 Sean Owen :
>> I think I must be missing something --
>>
>> We don't use SecureRandom directly, so what would these effects have
>> to do with slow unit tests in our project?
>
> Classloading MersenneTwisterRNG in turn class loads
> DefaultSeedGenerator which has the following static block:
>
>  private static final SeedGenerator[] GENERATORS = new SeedGenerator[]
>   {
>       new DevRandomSeedGenerator(),
>       new RandomDotOrgSeedGenerator(),
>       new SecureRandomSeedGenerator()
>   };
>
> These further rely upon an instance of java.security.SecureRandom for each fork.
>
> I am currently tracing a complete maven surefire run with eclipse to
> see if we actually call generateSeed in the tests. So far this is the
> case only in TransactionTreeTest which needs a fix to use the test
> seed.
>
> --
> Olivier
> http://twitter.com/ogrisel - http://code.oliviergrisel.name
>


Re: Unit test lag?

2010-01-17 Thread Olivier Grisel
2010/1/17 Sean Owen :
> I think I must be missing something --
>
> We don't use SecureRandom directly, so what would these effects have
> to do with slow unit tests in our project?

Classloading MersenneTwisterRNG in turn class loads
DefaultSeedGenerator which has the following static block:

  private static final SeedGenerator[] GENERATORS = new SeedGenerator[]
   {
   new DevRandomSeedGenerator(),
   new RandomDotOrgSeedGenerator(),
   new SecureRandomSeedGenerator()
   };

These further rely upon an instance of java.security.SecureRandom for each fork.

I am currently tracing a complete maven surefire run with eclipse to
see if we actually call generateSeed in the tests. So far this is the
case only in TransactionTreeTest which needs a fix to use the test
seed.

-- 
Olivier
http://twitter.com/ogrisel - http://code.oliviergrisel.name


Re: Unit test lag?

2010-01-17 Thread Sean Owen
I think I must be missing something --

We don't use SecureRandom directly, so what would these effects have
to do with slow unit tests in our project?

And also am I right that, if we use our own seed in
MersenneTwisterRNG, we still get deterministic behavior?

I'm going to change all our tests to make sure we use a fixed seed,
and I'm still not clear why this wouldn't address the randomness
issue?

I don't know about performance, why this would have a bearing or why
it's recently slowed. Is that the issue you guys are looking at?

On Sun, Jan 17, 2010 at 1:11 PM, Olivier Grisel
 wrote:
> 2010/1/17 Benson Margulies :
>> On Sun, Jan 17, 2010 at 7:31 AM, Sean Owen  wrote:
>>> But does that affect code which instantiates a MersenneTwisterRNG with
>>> its own seed?
>>
>> That's what it looked like to me, but I may have been depending on
>> Olivier's analysis.
>
> I confirm that the first call to the java.security.SecureRandom
> constructor (which is in the static part of uncommons math init) does
> two system calls to /dev/random:
>
> $ strace -o /tmp/clj.strace.out -F -f java $JAVA_OPTS \
>        -cp .:..:/usr/share/java/jline.jar:$LIBS \
>        jline.ConsoleRunner clojure.lang.Repl
>
> user=> (java.security.SecureRandom.)
> #
> user=> (java.security.SecureRandom.)
> #
>
> while in a separate console:
>
> $ tail -f /tmp/clj.strace.out | grep "/dev/random"
> 18354 stat64("/dev/random", {st_mode=S_IFCHR|0666, st_rdev=makedev(1,
> 8), ...}) = 0
> 18354 open("/dev/random", O_RDONLY|O_LARGEFILE) = 19
>
> Further calls to the constructor or the generateSeed reuse the same
> file descriptor (no further calls to open on /dev/random).
>
> I can instantiate many (10) SecureRandom instances without
> blocking the process while calling generateSeed actually consumes
> entropy as expected and blocks the app after a couple of hundred
> bytes.
>
> In our case it is possible that just the first call to the
> SecureRandom constructor in each forked test is enough to block
> or slow them all down, even if we don't call generateSeed.
>
> --
> Olivier
> http://twitter.com/ogrisel - http://code.oliviergrisel.name
>


Re: Unit test lag?

2010-01-17 Thread Olivier Grisel
2010/1/17 Benson Margulies :
> On Sun, Jan 17, 2010 at 7:31 AM, Sean Owen  wrote:
>> But does that affect code which instantiates a MersenneTwisterRNG with
>> its own seed?
>
> That's what it looked like to me, but I may have been depending on
> Olivier's analysis.

I confirm that the first call to the java.security.SecureRandom
constructor (which is in the static part of uncommons math init) does
two system calls to /dev/random:

$ strace -o /tmp/clj.strace.out -F -f java $JAVA_OPTS \
-cp .:..:/usr/share/java/jline.jar:$LIBS \
jline.ConsoleRunner clojure.lang.Repl

user=> (java.security.SecureRandom.)
#
user=> (java.security.SecureRandom.)
#

while in a separate console:

$ tail -f /tmp/clj.strace.out | grep "/dev/random"
18354 stat64("/dev/random", {st_mode=S_IFCHR|0666, st_rdev=makedev(1,
8), ...}) = 0
18354 open("/dev/random", O_RDONLY|O_LARGEFILE) = 19

Further calls to the constructor or the generateSeed reuse the same
file descriptor (no further calls to open on /dev/random).

I can instantiate many (10) SecureRandom instances without
blocking the process while calling generateSeed actually consumes
entropy as expected and blocks the app after a couple of hundred
bytes.

In our case it is possible that just the first call to the
SecureRandom constructor in each forked test is enough to block
or slow them all down, even if we don't call generateSeed.

-- 
Olivier
http://twitter.com/ogrisel - http://code.oliviergrisel.name


Re: Unit test lag?

2010-01-17 Thread Benson Margulies
On Sun, Jan 17, 2010 at 7:31 AM, Sean Owen  wrote:
> But does that affect code which instantiates a MersenneTwisterRNG with
> its own seed?

That's what it looked like to me, but I may have been depending on
Olivier's analysis.


>
> On Sun, Jan 17, 2010 at 12:24 PM, Benson Margulies
>  wrote:
>>> I don't know of any further issues with MersenneTwisterRNG though --
>>> what's the issue? Don't care what it does with /dev/random as long as
>>> in test mode we are seeding it with the same seed, and that's what
>>>
>>
>> Olivier and I found the Mersenne code touching the
>> SecureRandomNumberGenerator, which goes and talks to /dev/random, all
>> in static blocks before any seeds are used.
>>
>


Re: Unit test lag?

2010-01-17 Thread Olivier Grisel
2010/1/17 Benson Margulies :
>> I don't know of any further issues with MersenneTwisterRNG though --
>> what's the issue? Don't care what it does with /dev/random as long as
>> in test mode we are seeding it with the same seed, and that's what
>>
>
> Olivier and I found the Mersenne code touching the
> SecureRandomNumberGenerator, which goes and talks to /dev/random, all
> in static blocks before any seeds are used.
>

I am not 100% sure that the SecureRandom() constructor that is called in
the static part of the uncommons math package is actually doing any
blocking call to /dev/random on linux. I would have thought that this
is only the case when calling the generateSeed method called by the
MersenneTwisterRNG constructor when called by RandomUtils.getRandom()
throughout Mahout components if RandomUtils.useTestSeed() is not
called first.

Maybe this is not the case. I will try to connect the eclipse debugger
to investigate further.

-- 
Olivier
http://twitter.com/ogrisel - http://code.oliviergrisel.name


Re: Unit test lag?

2010-01-17 Thread Sean Owen
But does that affect code which instantiates a MersenneTwisterRNG with
its own seed?

On Sun, Jan 17, 2010 at 12:24 PM, Benson Margulies
 wrote:
>> I don't know of any further issues with MersenneTwisterRNG though --
>> what's the issue? Don't care what it does with /dev/random as long as
>> in test mode we are seeding it with the same seed, and that's what
>>
>
> Olivier and I found the Mersenne code touching the
> SecureRandomNumberGenerator, which goes and talks to /dev/random, all
> in static blocks before any seeds are used.
>


Re: Unit test lag?

2010-01-17 Thread Benson Margulies
> I don't know of any further issues with MersenneTwisterRNG though --
> what's the issue? Don't care what it does with /dev/random as long as
> in test mode we are seeding it with the same seed, and that's what
>

Olivier and I found the Mersenne code touching the
SecureRandomNumberGenerator, which goes and talks to /dev/random, all
in static blocks before any seeds are used.


Re: Unit test lag?

2010-01-17 Thread Sean Owen
Not sure what's going on or why that revision would have anything to
do with the slowdown... the only thing of substance it did was
actually let the SamplingIterator test run but it doesn't take long.

I agree with not forking a JVM per test, so will make that change.

Also, yes, we need tests to be deterministic. This is the theory
behind why all code should obtain a Random from RandomUtils, and, all
tests should configure RandomUtils to use a fixed seed in setUp().
This isn't 100% true. Mind if I indeed finally fix this?

I don't know of any further issues with MersenneTwisterRNG though --
what's the issue? Don't care what it does with /dev/random as long as
in test mode we are seeding it with the same seed, and that's what
RandomUtils does.


On Sun, Jan 17, 2010 at 7:29 AM, deneche abdelhakim  wrote:
> Removing the maven repository does not solve the problem, and neither
> does a fresh checkout of the trunk.
>
> but older revisions don't show any slowdown!!! I tried the following 
> revisions:
>
> Those old revisions seem Ok:
> 
> r896946 | srowen | 2010-01-07 19:02:41 +0100 (Thu, 07 Jan 2010) | 1 line
> MAHOUT-238
> 
> r897134 | robinanil | 2010-01-08 09:23:22 +0100 (Fri, 08 Jan 2010) | 1 line
> MAHOUT-221 Missed out two files while checking in FP-Bonsai
> 
> r897405 | adeneche | 2010-01-09 11:02:49 +0100 (Sat, 09 Jan 2010) | 1 line
> MAHOUT-216
>
>
> >>> The slowdowns start at this revision !!!
> 
> r897440 | srowen | 2010-01-09 13:53:25 +0100 (Sat, 09 Jan 2010) | 1 line
> Code style adjustments; enabled/fixed TestSamplingIterator
>


Re: Unit test lag?

2010-01-16 Thread deneche abdelhakim
Removing the maven repository does not solve the problem, and neither
does a fresh checkout of the trunk.

but older revisions don't show any slowdown!!! I tried the following revisions:

Those old revisions seem Ok:

r896946 | srowen | 2010-01-07 19:02:41 +0100 (Thu, 07 Jan 2010) | 1 line
MAHOUT-238

r897134 | robinanil | 2010-01-08 09:23:22 +0100 (Fri, 08 Jan 2010) | 1 line
MAHOUT-221 Missed out two files while checking in FP-Bonsai

r897405 | adeneche | 2010-01-09 11:02:49 +0100 (Sat, 09 Jan 2010) | 1 line
MAHOUT-216


>>> The slowdowns start at this revision !!!

r897440 | srowen | 2010-01-09 13:53:25 +0100 (Sat, 09 Jan 2010) | 1 line
Code style adjustments; enabled/fixed TestSamplingIterator



On Sun, Jan 17, 2010 at 5:47 AM, deneche abdelhakim  wrote:
> I'm getting similar slowdowns with my VirtualBox Ubuntu 9.04
>
> I'm suspecting that the problem is not -only- caused by RandomUtils because:
>
> 1. I'm familiar with MersenneTwisterRNG slowdowns (I use it a lot) but
> the test time used to be reported accurately by maven. Now maven
> reports that a test took less than a second but it actually took a lot
> more!
>
> 2. Most of my tests actually call RandomUtils.useTestSeed() in setup()
> (InMemInputSplitTest included) but the tests still take a lot of time,
> and again it's not reported accurately by maven
>
> 3. I generally launch a 'mvn clean install' every Thursday. I never
> got these slowdowns until last Thursday (did we change anything that
> could have caused these slowdowns?)
>
> On Sun, Jan 17, 2010 at 12:33 AM, Benson Margulies
>  wrote:

>>> Unit tests should generally be using a fixed seed and not need to load a
>>> secure seed from /dev/random.  I would say that RandomUtils is probably the
>>> problem here.  The secure seed should be loaded lazily only if the test seed
>>> is not in use.
>>
>> The problem, as I see it, is that the uncommons-math package starts
>> initializing a random seed as soon as you touch it, whether you need
>> it or not. RandomUtils can only avoid this by avoiding uncommons-math
>> in unit test mode.
>>
>>>
>>>
>>>
>>> --
>>> Ted Dunning, CTO
>>> DeepDyve
>>>
>>
>


Re: Unit test lag?

2010-01-16 Thread deneche abdelhakim
I'm getting similar slowdowns with my VirtualBox Ubuntu 9.04

I'm suspecting that the problem is not -only- caused by RandomUtils because:

1. I'm familiar with MersenneTwisterRNG slowdowns (I use it a lot) but
the test time used to be reported accurately by maven. Now maven
reports that a test took less than a second but it actually took a lot
more!

2. Most of my tests actually call RandomUtils.useTestSeed() in setup()
(InMemInputSplitTest included) but the tests still take a lot of time,
and again it's not reported accurately by maven

3. I generally launch a 'mvn clean install' every Thursday. I never
got these slowdowns until last Thursday (did we change anything that
could have caused these slowdowns?)

On Sun, Jan 17, 2010 at 12:33 AM, Benson Margulies
 wrote:
>>>
>> Unit tests should generally be using a fixed seed and not need to load a
>> secure seed from /dev/random.  I would say that RandomUtils is probably the
>> problem here.  The secure seed should be loaded lazily only if the test seed
>> is not in use.
>
> The problem, as I see it, is that the uncommons-math package starts
> initializing a random seed as soon as you touch it, whether you need
> it or not. RandomUtils can only avoid this by avoiding uncommons-math
> in unit test mode.
>
>>
>>
>>
>> --
>> Ted Dunning, CTO
>> DeepDyve
>>
>


Re: Unit test lag?

2010-01-16 Thread Benson Margulies
>>
> Unit tests should generally be using a fixed seed and not need to load a
> secure seed from /dev/random.  I would say that RandomUtils is probably the
> problem here.  The secure seed should be loaded lazily only if the test seed
> is not in use.

The problem, as I see it, is that the uncommons-math package starts
initializing a random seed as soon as you touch it, whether you need
it or not. RandomUtils can only avoid this by avoiding uncommons-math
in unit test mode.

>
>
>
> --
> Ted Dunning, CTO
> DeepDyve
>


Re: Unit test lag?

2010-01-16 Thread Ted Dunning
On Sat, Jan 16, 2010 at 1:40 PM, Drew Farris  wrote:

> Mahout does per-test forking, which means we're forking off a new JVM
> for each unit test execution; this adds overhead to tests that take
> 0.2s to complete. Is per-test forking strictly needed?
>

It shouldn't be.  I would count it a bug if it were.


>  ... wall time of 30s (!) or so. ... attempting to read from /dev/random.
>
>
Unit tests should generally be using a fixed seed and not need to load a
secure seed from /dev/random.  I would say that RandomUtils is probably the
problem here.  The secure seed should be loaded lazily only if the test seed
is not in use.



-- 
Ted Dunning, CTO
DeepDyve


Re: Unit test lag?

2010-01-16 Thread Olivier Grisel
Some tests are probably not calling:

RandomUtils.useTestSeed();

in a setUp() or static init. Maybe a common MahoutTestCase base class
with a default static init that calls it would do.

Otherwise, I confirm that setting forkMode to "once" in maven/pom.xml
solves the issue (and all tests pass).

-- 
Olivier
http://twitter.com/ogrisel - http://code.oliviergrisel.name
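
The base-class idea suggested above can be sketched roughly as follows.
This is only an illustration under stated assumptions: SeededRandoms is a
hypothetical stand-in for Mahout's RandomUtils built on java.util.Random,
and SeededTestCase, ExampleTest and the seed value 42 are made-up names
and values, not the project's actual code.

```java
import java.util.Random;

// Hypothetical stand-in for Mahout's RandomUtils; names and seed are assumptions.
final class SeededRandoms {
  private static final long TEST_SEED = 42L; // arbitrary fixed seed for illustration
  private static volatile boolean testSeed = false;

  private SeededRandoms() {}

  /** Switches to deterministic mode; meant to be called from test setup. */
  static void useTestSeed() {
    testSeed = true;
  }

  /** Fixed seed in test mode, default seeding otherwise. */
  static Random getRandom() {
    return testSeed ? new Random(TEST_SEED) : new Random();
  }
}

// Hypothetical base class: the static initializer runs once, before any
// subclass instance is created, so subclasses cannot forget the call.
abstract class SeededTestCase {
  static {
    SeededRandoms.useTestSeed();
  }
}

// Example subclass: every generator it obtains starts from the same seed.
class ExampleTest extends SeededTestCase {
  long firstDraw() {
    return SeededRandoms.getRandom().nextLong();
  }
}
```

With this arrangement a test only has to extend the base class; the
trade-off is that the fixed seed becomes global state for the whole JVM.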


Re: Unit test lag?

2010-01-16 Thread Benson Margulies
Oh, I see. We have to give up on the MersenneTwisterRNG in tests and
just use the JRE. Is that OK?

On Sat, Jan 16, 2010 at 5:44 PM, Olivier Grisel
 wrote:
> 2010/1/16 Drew Farris :
>> On Sat, Jan 16, 2010 at 4:42 PM, Benson Margulies  
>> wrote:
>>>> ... Running through strace showed
>>>> that something was attempting to read from /dev/random. Sometimes
>>>> it ran fine, but at least 25-30% of the time it ended up blocking until the
>>>> entropy pool was refilled. To test I moved /dev/random, and created a
>>>> link from /dev/urandom to /dev/random (the former doesn't block, but
>>>> isn't cryptographically secure). It looks as if this could be related
>>>> to the loading of the SecureRandomSeedGenerator class.
>>>
>>> Why not use a fixed random seed for unit tests? That would make them
>>> more repeatable and avoid this problem, no?
>>>
>>
>> It appears we are. In RandomUtils:
>>
>>  public static Random getRandom() {
>>    return testSeed
>>        ? new MersenneTwisterRNG(STANDARD_SEED)
>>        : new MersenneTwisterRNG();
>>  }
>>
>> But something somewhere is forcing SecureRandomSeedGenerator to get
>> loaded by the classloader which in turn does a 'new SecureRandom()' in
>> a private static final field assignment. Trying to track down what is
>> causing the generator to get loaded in the first place.
>>
>> But something is forcing the SecureRandomSeedGenerator class to get
>> loaded, which I suspect
>>
>
>
> MersenneTwisterRNG constructor calls:
>
>  this(DefaultSeedGenerator.getInstance().generateSeed(SEED_SIZE_BYTES));
>
> Which in turn calls:
>
>    private static final SeedGenerator[] GENERATORS = new SeedGenerator[]
>    {
>        new DevRandomSeedGenerator(),
>        new RandomDotOrgSeedGenerator(),
>        new SecureRandomSeedGenerator()
>    };
>
> In the definition of the class: DefaultSeedGenerator
>
> Unless per-test forking is disabled, I don't see how to prevent
> MersenneTwisterRNG from indirectly fetching entropy from /dev/random /
> SecureRandom.
> --
> Olivier
> http://twitter.com/ogrisel - http://code.oliviergrisel.name
>


Re: Unit test lag?

2010-01-16 Thread Benson Margulies
I see a way, but it involves loading this class explicitly with reflection.

I'll make a patch.


Re: Unit test lag?

2010-01-16 Thread Olivier Grisel
2010/1/16 Drew Farris :
> On Sat, Jan 16, 2010 at 4:42 PM, Benson Margulies  
> wrote:
>>> ... Running through strace showed
>>> that something was attempting to read from /dev/random. Sometimes
>>> it ran fine, but at least 25-30% of the time it ended up blocking until the
>>> entropy pool was refilled. To test I moved /dev/random, and created a
>>> link from /dev/urandom to /dev/random (the former doesn't block, but
>>> isn't cryptographically secure). It looks as if this could be related
>>> to the loading of the SecureRandomSeedGenerator class.
>>
>> Why not use a fixed random seed for unit tests? That would make them
>> more repeatable and avoid this problem, no?
>>
>
> It appears we are. In RandomUtils:
>
>  public static Random getRandom() {
>    return testSeed
>        ? new MersenneTwisterRNG(STANDARD_SEED)
>        : new MersenneTwisterRNG();
>  }
>
> But something somewhere is forcing SecureRandomSeedGenerator to get
> loaded by the classloader which in turn does a 'new SecureRandom()' in
> a private static final field assignment. Trying to track down what is
> causing the generator to get loaded in the first place.
>
> But something is forcing the SecureRandomSeedGenerator class to get
> loaded, which I suspect
>


The no-argument MersenneTwisterRNG constructor calls:

  this(DefaultSeedGenerator.getInstance().generateSeed(SEED_SIZE_BYTES));

Which in turn triggers this static initializer:

  private static final SeedGenerator[] GENERATORS = new SeedGenerator[]
  {
      new DevRandomSeedGenerator(),
      new RandomDotOrgSeedGenerator(),
      new SecureRandomSeedGenerator()
  };

In the definition of the class: DefaultSeedGenerator

Unless per-test forking is disabled, I don't see how to prevent
MersenneTwisterRNG from indirectly fetching entropy from /dev/random /
SecureRandom.
-- 
Olivier
http://twitter.com/ogrisel - http://code.oliviergrisel.name
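
For contrast with the eager static array above, here is a minimal sketch
of the "load the secure seed lazily" approach proposed elsewhere in this
thread, using Java's initialization-on-demand holder idiom. It is not the
uncommons-math or Mahout code: LazySeedRandoms, SecureSeedHolder,
SECURE_INITS and the seed value are hypothetical names chosen for the
illustration, and java.util.Random stands in for MersenneTwisterRNG.

```java
import java.security.SecureRandom;
import java.util.Random;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch; this is not the uncommons-math or Mahout code.
final class LazySeedRandoms {
  // Counts how often the secure seed source was built (for demonstration only).
  static final AtomicInteger SECURE_INITS = new AtomicInteger();

  private static final long TEST_SEED = 42L; // arbitrary fixed seed
  private static volatile boolean testSeed = false;

  private LazySeedRandoms() {}

  // Initialization-on-demand holder: the JVM initializes this nested class,
  // and therefore constructs the SecureRandom, only on first access to SOURCE.
  private static final class SecureSeedHolder {
    static final SecureRandom SOURCE = create();

    private static SecureRandom create() {
      SECURE_INITS.incrementAndGet();
      return new SecureRandom();
    }
  }

  static void useTestSeed() {
    testSeed = true;
  }

  static Random getRandom() {
    if (testSeed) {
      // Test mode: fixed seed, SecureSeedHolder is never touched.
      return new Random(TEST_SEED);
    }
    // Normal mode: the first call here triggers SecureSeedHolder initialization.
    return new Random(SecureSeedHolder.SOURCE.nextLong());
  }
}
```

The point of the holder class is that the cost of creating the secure
source is paid only on the code path that actually needs it, instead of
whenever the outer class happens to be loaded.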


Re: Unit test lag?

2010-01-16 Thread Benson Margulies
This is going to be a lot of fun. That class is in uncommons-math, and
the connection to it from Mahout is hardly obvious.

On Sat, Jan 16, 2010 at 5:34 PM, Benson Margulies  wrote:
>>>> It looks as if this could be related
>>>> to the loading of the SecureRandomSeedGenerator class.
>>>
>
> Let's fix that class to defer until there's a good reason to make a seed.
>


Re: Unit test lag?

2010-01-16 Thread Benson Margulies
>>> It looks as if this could be related
>>> to the loading of the SecureRandomSeedGenerator class.
>>

Let's fix that class to defer until there's a good reason to make a seed.


Re: Unit test lag?

2010-01-16 Thread Olivier Grisel
2010/1/16 Benson Margulies :
>> ... Running through strace showed
>> that something was attempting to read from /dev/random. Sometimes
>> it ran fine, but at least 25-30% of the time it ended up blocking until the
>> entropy pool was refilled. To test I moved /dev/random, and created a
>> link from /dev/urandom to /dev/random (the former doesn't block, but
>> isn't cryptographically secure). It looks as if this could be related
>> to the loading of the SecureRandomSeedGenerator class.
>>

I also experience the same slowdown Drew describes, on Ubuntu machines too.

> Why not use a fixed random seed for unit tests? That would make them
> more repeatable and avoid this problem, no?
>

+1 for the fixed seed (42 is my favorite seed).

-- 
Olivier
http://twitter.com/ogrisel - http://code.oliviergrisel.name


Re: Unit test lag?

2010-01-16 Thread Drew Farris
On Sat, Jan 16, 2010 at 4:42 PM, Benson Margulies  wrote:
>> ... Running through strace showed
>> that something was attempting to read from /dev/random. Sometimes
>> it ran fine, but at least 25-30% of the time it ended up blocking until the
>> entropy pool was refilled. To test I moved /dev/random, and created a
>> link from /dev/urandom to /dev/random (the former doesn't block, but
>> isn't cryptographically secure). It looks as if this could be related
>> to the loading of the SecureRandomSeedGenerator class.
>
> Why not use a fixed random seed for unit tests? That would make them
> more repeatable and avoid this problem, no?
>

It appears we are. In RandomUtils:

  public static Random getRandom() {
    return testSeed
        ? new MersenneTwisterRNG(STANDARD_SEED)
        : new MersenneTwisterRNG();
  }

But something somewhere is forcing SecureRandomSeedGenerator to get
loaded by the classloader which in turn does a 'new SecureRandom()' in
a private static final field assignment. Trying to track down what is
causing the generator to get loaded in the first place.

But something is forcing the SecureRandomSeedGenerator class to get
loaded, which I suspect


Re: Unit test lag?

2010-01-16 Thread Benson Margulies
> ... Running through strace showed
> that something was attempting to read from /dev/random. Sometimes
> it ran fine, but at least 25-30% of the time it ended up blocking until the
> entropy pool was refilled. To test I moved /dev/random, and created a
> link from /dev/urandom to /dev/random (the former doesn't block, but
> isn't cryptographically secure). It looks as if this could be related
> to the loading of the SecureRandomSeedGenerator class.
>

Why not use a fixed random seed for unit tests? That would make them
more repeatable and avoid this problem, no?


Unit test lag?

2010-01-16 Thread Drew Farris
Recently I've been noticing that Mahout's unit tests take a
considerably long time to run, generally longer than what is reported
in the individual test output. I took a look as to why this was the
case and found a couple of things:

Mahout does per-test forking, which means we're forking off a new JVM
for each unit test execution; this adds overhead to tests that take
0.2s to complete. Is per-test forking strictly needed?

I captured the command line used to execute one of the forked tests
(InMemInputSplitTest) by running mvn -X and executed it from the shell
repeatedly using time to see what was going on. In one of every few
invocations, the test in question would report completion in 3s, but
time reported a wall time of 30s (!) or so. Running through strace showed
that something was attempting to read from /dev/random. Sometimes
it ran fine, but at least 25-30% of the time it ended up blocking until the
entropy pool was refilled. To test this I moved /dev/random, and created a
link from /dev/urandom to /dev/random (the former doesn't block, but
isn't cryptographically secure). It looks as if this could be related
to the loading of the SecureRandomSeedGenerator class.

I'm running on Ubuntu 9.04, kernel 2.6.28-17-server with the latest patches.

Is anyone else experiencing similar slowness?

Drew