[ 
https://issues.apache.org/jira/browse/MAHOUT-1240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13674456#comment-13674456
 ] 

Dawid Weiss commented on MAHOUT-1240:
-------------------------------------

+1 for using randomized testing :) One remark --
{code}
+public final class VectorWritableTest extends RandomizedTest {
+  private static final int MAX_VECTOR_SIZE = 100;
+  private final Random r = RandomUtils.getRandom();
+
+  public void createRandom(Vector v) {
+    int size = r.nextInt(v.size());
...
{code}

If you're extending or using the randomized testing framework, it would 
probably be sensible to reuse its internal Random. That lets you use all the 
benefits of the framework, including changing seeds on Repeat(ed) methods etc. 
This, in general, is a problem:

{code}
+  private final Random r = RandomUtils.getRandom();
{code}

because class instantiation takes place outside of the suite runner (this is a 
JUnit limitation), so the field is initialized outside the framework's control. 
Anything with class scope should be initialized in a BeforeClass-annotated hook 
method (and cleaned up appropriately if resources need to be released).

I would remove the class-scoped "r" and replace calls to it with superclass 
utility functions (randomInt, randomDouble etc.).
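
To illustrate why the source of randomness matters, here is a minimal, 
framework-free sketch. It does not use the randomized testing API at all; it 
only simulates a runner with plain java.util.Random. The names 
SeededVectorSketch, randomVector and iterationSeeds are hypothetical and exist 
only for this example. The point is that every random value is drawn from a 
Random handed in by the runner, so a given iteration can be replayed from its 
seed:

```java
import java.util.Random;

// Illustrative only: these names are hypothetical and are not part of Mahout
// or of the randomized testing framework.
final class SeededVectorSketch {
  static final int MAX_VECTOR_SIZE = 100;

  // All randomness comes from the Random supplied by the caller (the "runner"),
  // never from a field initialized when the test class is constructed.
  static double[] randomVector(Random rnd) {
    int size = 1 + rnd.nextInt(MAX_VECTOR_SIZE);  // 1..MAX_VECTOR_SIZE
    double[] values = new double[size];
    int writes = rnd.nextInt(size + 1);           // 0..size random positions
    for (int i = 0; i < writes; i++) {
      values[rnd.nextInt(size)] = rnd.nextDouble();
    }
    return values;
  }

  // Stand-in for a runner repeating a test: each iteration gets its own seed,
  // derived deterministically from one master seed.
  static long[] iterationSeeds(long masterSeed, int iterations) {
    Random master = new Random(masterSeed);
    long[] seeds = new long[iterations];
    for (int i = 0; i < iterations; i++) {
      seeds[i] = master.nextLong();
    }
    return seeds;
  }

  public static void main(String[] args) {
    long masterSeed = 42L;
    for (long seed : iterationSeeds(masterSeed, 3)) {
      double[] v = randomVector(new Random(seed));
      System.out.println("seed=" + Long.toHexString(seed) + " size=" + v.length);
    }
  }
}
```

A Random created in a field initializer has no such link to the runner's seed, 
so a failing iteration could not be reproduced this way.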
                
> Randomized testing and Serialization of NonZeros
> ------------------------------------------------
>
>                 Key: MAHOUT-1240
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1240
>             Project: Mahout
>          Issue Type: Bug
>            Reporter: Robin Anil
>            Assignee: Robin Anil
>             Fix For: 0.8
>
>         Attachments: MAHOUT-1240.patch
>
>
> Currently the nonZero iterator does not guarantee nonZero iteration for 
> certain vectors (RASV, SASV) for performance reasons. However, the vector view 
> iterator adds a zero check. To be correct we have to either remove the check 
> or do correct non-zero serialization everywhere. However, this means going 
> over the vectors in two passes. Given that it is pretty fast already, I am 
> fixing the logic bug. We can tackle the speed-up for the next release.
> This also adds a randomized test for serialization that catches all such bugs.

