Great experience!

/Edward

On Fri, Sep 19, 2008 at 2:50 PM, Palleti, Pallavi
<[EMAIL PROTECTED]> wrote:
> Yeah, that was the problem. And Hama can surely be useful for large-scale
> matrix operations.
>
> But for this problem, I have modified the code to pass only the ID
> information and read the vector data only when it is needed, which in this
> case is only in the reducer phase. This way the job avoids the
> out-of-memory error and also runs faster now.
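>
> Roughly, that pattern looks like the sketch below (class names and the
> vector lookup are made up for illustration against the old
> org.apache.hadoop.mapred API, not taken from the actual job): the mapper
> emits only the ID, and the reducer fetches the full vector on demand.
>
>   import java.io.IOException;
>   import java.util.Iterator;
>
>   import org.apache.hadoop.io.LongWritable;
>   import org.apache.hadoop.io.Text;
>   import org.apache.hadoop.mapred.MapReduceBase;
>   import org.apache.hadoop.mapred.Mapper;
>   import org.apache.hadoop.mapred.OutputCollector;
>   import org.apache.hadoop.mapred.Reducer;
>   import org.apache.hadoop.mapred.Reporter;
>
>   public class IdOnlyJob {
>
>     // Mapper: emit only the vector ID; the 160k-dimensional vector never
>     // travels through the shuffle.
>     public static class IdOnlyMapper extends MapReduceBase
>         implements Mapper<LongWritable, Text, Text, Text> {
>       public void map(LongWritable offset, Text line,
>                       OutputCollector<Text, Text> out, Reporter reporter)
>           throws IOException {
>         // Assumes input records look like "C1:[...]"; keep just the ID part.
>         String id = line.toString().split(":", 2)[0];
>         out.collect(new Text(id), new Text(""));
>       }
>     }
>
>     // Reducer: fetch the full vector by ID only here, where it is needed.
>     public abstract static class LazyVectorReducer extends MapReduceBase
>         implements Reducer<Text, Text, Text, Text> {
>
>       // However the vectors are actually stored (MapFile on HDFS, external
>       // store, ...), look one up on demand instead of carrying it in the key.
>       protected abstract double[] lookupVector(String id) throws IOException;
>
>       public void reduce(Text id, Iterator<Text> values,
>                          OutputCollector<Text, Text> out, Reporter reporter)
>           throws IOException {
>         double[] vector = lookupVector(id.toString());
>         // ... the real computation on the vector goes here; emit a small result.
>         out.collect(id, new Text(Integer.toString(vector.length)));
>       }
>     }
>   }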
>
> Thanks
> Pallavi
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Edward J. Yoon
> Sent: Friday, September 19, 2008 10:35 AM
> To: core-user@hadoop.apache.org; [EMAIL PROTECTED]; [EMAIL PROTECTED]
> Subject: Re: OutOfMemory Error
>
>> The key is of the form "ID :DenseVector Representation in mahout with
>
> I guess the vector size is too large, so it will need a distributed vector
> architecture (or 2D partitioning strategies) for large-scale matrix
> operations. The Hama team is investigating these problem areas, so this
> should improve if Hama can be used for Mahout in the future.
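>
> As a rough illustration of the 2D idea (made-up types, not Hama's actual
> API): the matrix is split into fixed-size sub-blocks and records are keyed
> by block coordinates, so no single key or value ever carries a full
> 160k-wide row.
>
>   import java.io.DataInput;
>   import java.io.DataOutput;
>   import java.io.IOException;
>
>   import org.apache.hadoop.io.WritableComparable;
>
>   // Key for one sub-block of a large matrix, addressed by (blockRow, blockCol).
>   public class BlockKey implements WritableComparable {
>     private int blockRow;
>     private int blockCol;
>
>     public BlockKey() {}  // no-arg constructor needed for deserialization
>
>     public BlockKey(int blockRow, int blockCol) {
>       this.blockRow = blockRow;
>       this.blockCol = blockCol;
>     }
>
>     public void write(DataOutput out) throws IOException {
>       out.writeInt(blockRow);
>       out.writeInt(blockCol);
>     }
>
>     public void readFields(DataInput in) throws IOException {
>       blockRow = in.readInt();
>       blockCol = in.readInt();
>     }
>
>     // Order by row block, then column block.
>     public int compareTo(Object other) {
>       BlockKey o = (BlockKey) other;
>       if (blockRow != o.blockRow) {
>         return blockRow < o.blockRow ? -1 : 1;
>       }
>       return blockCol < o.blockCol ? -1 : (blockCol == o.blockCol ? 0 : 1);
>     }
>
>     public int hashCode() {  // keeps the default hash partitioner consistent
>       return blockRow * 524287 + blockCol;
>     }
>
>     public boolean equals(Object other) {
>       return other instanceof BlockKey && compareTo(other) == 0;
>     }
>   }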
>
> /Edward
>
> On Thu, Sep 18, 2008 at 12:28 PM, Pallavi Palleti <[EMAIL PROTECTED]> wrote:
>>
>> Hadoop version: 0.17.1
>> io.sort.factor = 10
>> The key is of the form "ID:DenseVector representation in Mahout" with
>> dimensionality = 160k.
>> For example: C1:[,0.00111111, 3.002, ...... 1.001,....]
>> So the typical size of a mapper output key can be 160K * 6 (assuming a
>> double represented as a string takes about 5 bytes, plus a separator) + 5
>> bytes (for "C1:[]") + whatever is required to store the object as a Text.
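>>
>> In round numbers that is 160,000 * 6 + 5, i.e. roughly 960 KB, so each
>> serialized key is close to 1 MB even before the Text overhead.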
>>
>> Thanks
>> Pallavi
>>
>>
>>
>> Devaraj Das wrote:
>>>
>>>
>>>
>>>
>>> On 9/17/08 6:06 PM, "Pallavi Palleti" <[EMAIL PROTECTED]> wrote:
>>>
>>>>
>>>> Hi all,
>>>>
>>>>    I am getting an OutOfMemoryError, as shown below, when I run map-red
>>>> on a huge amount of data:
>>>> java.lang.OutOfMemoryError: Java heap space
>>>> at org.apache.hadoop.io.DataOutputBuffer$Buffer.write(DataOutputBuffer.java:52)
>>>> at org.apache.hadoop.io.DataOutputBuffer.write(DataOutputBuffer.java:90)
>>>> at org.apache.hadoop.io.SequenceFile$Reader.nextRawKey(SequenceFile.java:1974)
>>>> at org.apache.hadoop.io.SequenceFile$Sorter$SegmentDescriptor.nextRawKey(SequenceFile.java:3002)
>>>> at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:2802)
>>>> at org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:2511)
>>>> at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1040)
>>>> at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:698)
>>>> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:220)
>>>> at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:2124)
>>>> The above error comes almost at the end of the map job. I have set the
>>>> heap size to 1 GB, but the problem is still persisting. Can someone
>>>> please help me avoid this error?
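>>>>
>>>> For what it's worth, the heap and merge settings involved here are plain
>>>> job configuration values; a minimal sketch of overriding them (the class
>>>> name is a placeholder and the property names are the 0.17-era ones, so
>>>> treat them as an assumption):
>>>>
>>>>   import org.apache.hadoop.mapred.JobConf;
>>>>
>>>>   public class HeapAndSortSettings {
>>>>     public static JobConf configure() {
>>>>       JobConf conf = new JobConf();
>>>>       // Heap for each map/reduce child JVM (the 1 GB mentioned above).
>>>>       conf.set("mapred.child.java.opts", "-Xmx1024m");
>>>>       // Streams merged at once; fewer streams means less memory per merge.
>>>>       conf.setInt("io.sort.factor", 10);
>>>>       // In-memory buffer used while sorting map output, in MB.
>>>>       conf.setInt("io.sort.mb", 100);
>>>>       return conf;
>>>>     }
>>>>   }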
>>> What is the typical size of your key? What is the value of io.sort.factor?
>>> Hadoop version?
>>>
>>>
>>>
>>>
>>
>>
>>
>
>
>
> --
> Best regards, Edward J. Yoon
> [EMAIL PROTECTED]
> http://blog.udanax.org
>



-- 
Best regards, Edward J. Yoon
[EMAIL PROTECTED]
http://blog.udanax.org
