I don't recall.
------
Robin Anil

On Thu, Jun 21, 2012 at 2:54 PM, Dan Brickley <[email protected]> wrote:

> Robin,
>
> Do you remember if this test ran successfully to completion? If not,
> I'll submit a JIRA when I've a complete log of a failed run...
>
> Dan
>
> ---------- Forwarded message ----------
> From: Grant Ingersoll <[email protected]>
> Date: 21 June 2012 21:33
> Subject: Re: Spectral Kmeans wiki category data test - can you confirm
> if you ran it to completion?
> To: Dan Brickley <[email protected]>
> Cc: Shannon Quinn <[email protected]>
>
>
> I'd ask on dev@, as Robin was actually the one who ran it.
>
> On Jun 21, 2012, at 3:15 PM, Dan Brickley wrote:
>
> Hi
>
> With the patch https://issues.apache.org/jira/browse/MAHOUT-986 in
> 0.7, this doesn't die so quickly ... but I'm still not seeing it run
> to completion.
>
> Using the template commandline you suggested, 'bin/mahout
> spectralkmeans -k 20 -d 4192499 -x 7 -i path/to/csv/file/ -o
> your/output/path/'
>
> I've seen it fail with both -k 20 and -k 10.
>
> Unfortunately I was running this in a screen session without proper
> logging, and I want to double-check everything before reporting, so
> I'm re-running with -k 10 now and will file a bug if it fails. Meanwhile
> I wanted to check in with you to see if you'd had a successful run.
> I'm testing with the 0.7 distro.
>
> The failure was an IndexException; here's the -k 20 version:
>
> mahout  spectralkmeans -k 20 -d 4192499 -x 7 -i spectral/input/  -o
> spectral/output/
>
> 12/06/19 19:33:11 INFO lanczos.LanczosSolver: 20 passes through the
> corpus so far...
> Exception in thread "main" org.apache.mahout.math.IndexException:
> Index 20 is outside allowable range of [0,20)
>        at
> org.apache.mahout.math.AbstractMatrix.set(AbstractMatrix.java:479)
>        at
> org.apache.mahout.math.decomposer.lanczos.LanczosSolver.solve(LanczosSolver.java:132)
>        at
> org.apache.mahout.math.hadoop.decomposer.DistributedLanczosSolver.runJob(DistributedLanczosSolver.java:73)
>        at
> org.apache.mahout.clustering.spectral.kmeans.SpectralKMeansDriver.run(SpectralKMeansDriver.java:148)
>        at
> org.apache.mahout.clustering.spectral.kmeans.SpectralKMeansDriver.run(SpectralKMeansDriver.java:86)
>
> It's barfing out here:
>
>    // Next step: perform eigen-decomposition using LanczosSolver
>    // since some of the eigen-output is spurious and will be eliminated
>    // upon verification, we have to aim to overshoot and then discard
>    // unnecessary vectors later
>    int overshoot = (int) ((double) clusters * OVERSHOOT_MULTIPLIER);
>    DistributedLanczosSolver solver = new DistributedLanczosSolver();
>    LanczosState state = new LanczosState(L, overshoot,
> solver.getInitialVector(L));
>    Path lanczosSeqFiles = new Path(outputCalc, "eigenvectors-" +
> (System.nanoTime() & 0xFF));
>    solver.runJob(conf,
>                  state,
>                  overshoot,
>                  true,
>                  lanczosSeqFiles.toString());
>
> With -k 10 I got "12/06/20 20:51:15 INFO lanczos.LanczosSolver: 10
> passes through the corpus so far...
> Exception in thread "main" org.apache.mahout.math.IndexException:
> Index 10 is outside allowable range of [0,10)
>        at
> org.apache.mahout.math.AbstractMatrix.set(AbstractMatrix.java:479)".
>
> ...although the logs also showed "12/06/20 20:40:18 INFO
> lanczos.LanczosSolver: Finding 20 singular vectors of matrix with
> 4192499 rows, via Lanczos" which confused me until Shannon reminded me
> of the overshoot.
>
> I'm happy to +cc the mailing lists but for starters thought I'd check
> to see if the test run had succeeded for you; if so, maybe I've some
> local problem.
>
> Dan
>
>
> --------------------------------------------
> Grant Ingersoll
> http://www.lucidimagination.com
>
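
For anyone reading the trace above: "Index 20 is outside allowable range of [0,20)" describes a half-open range, so the valid indices are 0 through 19 and index 20 is one past the end. The snippet below is only a rough sketch of that kind of bounds check in plain Java. It is not Mahout's code; the class HalfOpenRangeDemo and its inRange and set helpers are hypothetical names made up for illustration, and it says nothing about where the extra index actually comes from in LanczosSolver.

```java
public class HalfOpenRangeDemo {

    // Hypothetical helper (not a Mahout API): true when index lies in [0, size).
    static boolean inRange(int index, int size) {
        return index >= 0 && index < size;
    }

    // Hypothetical stand-in for a bounds-checked setter; the message format
    // mimics the one quoted in the thread, purely for illustration.
    static void set(double[] values, int index, double value) {
        if (!inRange(index, values.length)) {
            throw new IndexOutOfBoundsException(
                "Index " + index + " is outside allowable range of [0,"
                    + values.length + ")");
        }
        values[index] = value;
    }

    public static void main(String[] args) {
        double[] values = new double[20];

        // Index 19 is the last valid slot of a 20-element structure.
        set(values, 19, 1.0);

        // Index 20 is one past the end, so the check rejects it.
        try {
            set(values, 20, 1.0);
        } catch (IndexOutOfBoundsException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If the same pattern holds in the real run, a loop that writes one entry per Lanczos pass would hit this on the pass whose index equals the allocated size, which would match the log line "20 passes through the corpus so far..." appearing just before the exception. That is a guess from the log alone, not something confirmed against the source.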
