So this is not a problem with the A'A computation -- the input is obviously
invalid.

The question is what you did before you got the A handle -- did you read it
from a file? parallelize it from an in-core matrix (drmParallelize)? obtain it
as the result of some other computation (if so, which one)? wrap it around a
manually crafted RDD (drmWrap)?
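For reference, a minimal sketch of those three construction paths in the Spark
bindings DSL (the context, paths and dimensions here are made up for
illustration):

import org.apache.mahout.math._
import org.apache.mahout.math.scalabindings._
import org.apache.mahout.math.drm._
import org.apache.mahout.math.drm.RLikeDrmOps._
import org.apache.mahout.sparkbindings._

// hypothetical local driver context; in a real job this comes from your setup
implicit val mc = mahoutSparkContext(masterUrl = "local[2]", appName = "drm-construction")

// (1) parallelize an in-core matrix
val drmA1 = drmParallelize(dense((1, 2, 3), (3, 4, 5)), numPartitions = 2)

// (2) read a DRM previously written to DFS (path is made up)
val drmA2 = drmDfsRead("/tmp/drmA")

// (3) wrap a manually crafted RDD of (row key, row vector); the declared
// geometry must agree with the actual cardinality of the vectors
// (mc converts to a SparkContext via the sparkbindings implicit view)
val rows: DrmRdd[Int] = mc.parallelize(Seq(
  0 -> (dvec(1.0, 2.0, 3.0): Vector),
  1 -> (dvec(3.0, 4.0, 5.0): Vector)))
val drmA3 = drmWrap(rows, nrow = 2, ncol = 3)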

I don't understand the question about non-contiguous ids. You are referring
to some context of your computation and assuming I am in that context (but
unfortunately I am not).
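
That said, one way to get exactly this IndexException is a DRM whose declared
ncol is wider than the cardinality of the vectors it actually holds. A rough
sketch (numbers are illustrative, reusing the imports and context from the
sketch above):

// the row vector really has cardinality 27789 ...
val row: Vector = new RandomAccessSparseVector(27789)

// ... but the wrapped DRM claims more columns than that
val bad = drmWrap(mc.parallelize(Seq(0 -> row)), nrow = 1, ncol = 27792)

// A'A slices each row vector into column ranges up to ncol; the viewPart
// beyond [0, 27789) then fails with org.apache.mahout.math.IndexException,
// as in the trace quoted below
val inCoreAtA = (bad.t %*% bad).collect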

On Mon, Nov 17, 2014 at 4:55 PM, Dmitriy Lyubimov <dlie...@gmail.com> wrote:

>
>
> On Mon, Nov 17, 2014 at 3:46 PM, Pat Ferrel <p...@occamsmachete.com> wrote:
>
>> A matrix with about 4600 rows and somewhere around 27790 columns when
>> executing the following line from AtA (not sure of the exact dimensions)
>>
>>      /** The version of A'A that does not use GraphX */
>>      def at_a_nongraph(op: OpAtA[_], srcRdd: DrmRdd[_]): DrmRdd[Int] = {
>>
>> a vector is created whose size causes the error. How could I have
>> constructed a drm that would cause this error? If the column IDs were
>> non-contiguous would that yield this error?
>>
>
> what did you do specifically to build matrix A?
>
>
>> ==================
>>
>> 14/11/12 17:56:03 ERROR executor.Executor: Exception in task 5.0 in stage
>> 18.0 (TID 66169)
>> org.apache.mahout.math.IndexException: Index 27792 is outside allowable
>> range of [0,27789)
>>         at
>> org.apache.mahout.math.AbstractVector.viewPart(AbstractVector.java:147)
>>         at
>> org.apache.mahout.math.scalabindings.VectorOps.apply(VectorOps.scala:37)
>>         at
>> org.apache.mahout.sparkbindings.blas.AtA$$anonfun$5$$anonfun$apply$6.apply(AtA.scala:152)
>>         at
>> org.apache.mahout.sparkbindings.blas.AtA$$anonfun$5$$anonfun$apply$6.apply(AtA.scala:149)
>>         at
>> scala.collection.immutable.Stream$$anonfun$map$1.apply(Stream.scala:376)
>>         at
>> scala.collection.immutable.Stream$$anonfun$map$1.apply(Stream.scala:376)
>>         at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1085)
>>         at scala.collection.immutable.Stream$Cons.tail(Stream.scala:1077)
>>         at
>> scala.collection.immutable.StreamIterator$$anonfun$next$1.apply(Stream.scala:980)
>>         at
>> scala.collection.immutable.StreamIterator$$anonfun$next$1.apply(Stream.scala:980)
>>         at
>> scala.collection.immutable.StreamIterator$LazyCell.v$lzycompute(Stream.scala:969)
>>         at
>> scala.collection.immutable.StreamIterator$LazyCell.v(Stream.scala:969)
>>         at
>> scala.collection.immutable.StreamIterator.hasNext(Stream.scala:974)
>>         at scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371)
>>         at
>> org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:137)
>>         at
>> org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:58)
>>         at
>> org.apache.spark.shuffle.hash.HashShuffleWriter.write(HashShuffleWriter.scala:55)
>>         at
>> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:68)
>>         at
>> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
>>         at org.apache.spark.scheduler.Task.run(Task.scala:54)
>>         at
>> org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:177)
>>         at
>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
>>         at
>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
>>         at java.lang.Thread.run(Thread.java:695)
>>
>>
>
