[ https://issues.apache.org/jira/browse/MAHOUT-577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12977964#action_12977964 ]
Joris Geessels commented on MAHOUT-577:
---------------------------------------
Hi Maya,
I'm not sure this is what's going on, but Hadoop may be reporting the map
tasks as complete while they are in fact still executing. I've seen this
myself a few times. Consider letting your job run a little longer to see
whether that helps (assuming you haven't done so already). You can verify
that your job is progressing as follows:
  hadoop job -status <job-id>
You can get the job-id from:
  hadoop job -list
If the status information isn't changing, then your map task is indeed hanging.
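To avoid comparing the status output by eye, the check above can be scripted. The following is only a rough sketch: the function name watch_job, the HADOOP_CMD variable, and the poll count and interval are my own illustrative choices, not part of Hadoop. It assumes that identical text from two consecutive `hadoop job -status` calls means no progress was made in between.

```shell
#!/bin/sh
# Sketch: poll a job's status and report whether the output changes between polls.
# HADOOP_CMD and watch_job are hypothetical names introduced for this example.
HADOOP_CMD="${HADOOP_CMD:-hadoop}"

# Usage: watch_job <job-id> [polls] [interval-seconds]
# After every poll except the first, prints "changed" or "unchanged",
# comparing the current status text with the previous snapshot.
watch_job() {
    job_id="$1"
    polls="${2:-5}"
    interval="${3:-60}"
    prev=""
    i=0
    while [ "$i" -lt "$polls" ]; do
        # Capture the full status text; errors are discarded in this sketch.
        curr="$($HADOOP_CMD job -status "$job_id" 2>/dev/null)"
        if [ "$i" -gt 0 ]; then
            if [ "$curr" = "$prev" ]; then
                echo "unchanged"
            else
                echo "changed"
            fi
        fi
        prev="$curr"
        i=$((i + 1))
        # Wait before the next poll, but not after the last one.
        if [ "$i" -lt "$polls" ]; then
            sleep "$interval"
        fi
    done
    return 0
}

# Example (not run here): watch_job <job-id> 10 120
```

A long run of "unchanged" lines suggests the task really is stuck, while any "changed" line means the job is still doing work even if the map phase shows 100%.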
> RowSimilarityJob hangs during CooccurrencesMapper
> -------------------------------------------------
>
> Key: MAHOUT-577
> URL: https://issues.apache.org/jira/browse/MAHOUT-577
> Project: Mahout
> Issue Type: Bug
> Components: Collaborative Filtering
> Affects Versions: 0.4
> Environment: Linux Debian 5.0.5, 12GB Ram, Hadoop 20.3 installation
> Reporter: Maya Hristakeva
> Priority: Blocker
>
> Hello,
> When trying to run a RowSimilarityJob on a matrix ( 146682 x 138351 ), the
> job gets through the RowWeightMapper and WeightedOccurrencesPerColumnReducer,
> and hangs during the CooccurrencesMapper although it shows that the map tasks
> are 100% complete.
> The command I use to run the job is:
> hadoop jar mahout-core-0.4-job.jar
> org.apache.mahout.math.hadoop.similarity.RowSimilarityJob
> -Dmapred.input.dir=/user/maya.hristakeva/mahout/core4/tf/1/0.001/title/12_07_10/lda/5/lda-sim/ldaCompressedDocumentsMatrix
>
> -Dmapred.output.dir=/user/maya.hristakeva/mahout/core4/tf/1/0.001/title/12_07_10/lda/5/lda-sim/ldaDocumentSimilarityMatrix
> -Dmapred.reduce.tasks=8 -Dmapred.map.tasks=200
> -Dmapred.job.name=LDA_ROW_SIMILARITY_TEST --tempDir
> /user/maya.hristakeva/temp/lda/5 --numberOfColumns 138351
> --similarityClassname
> org.apache.mahout.math.hadoop.similarity.vector.DistributedEuclideanDistanceVectorSimilarity
> --maxSimilaritiesPerRow 10
> And the output of the mappers which are 100% complete, but hanging is:
> syslog logs
> 01-05 18:30:00,835 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> 29085149; bufend = 39038598; bufvoid = 99614720
> 2011-01-05 18:30:00,835 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 65461; kvend = 327605; length = 327680
> 2011-01-05 18:30:06,241 INFO org.apache.hadoop.mapred.MapTask: Finished spill
> 94
> 2011-01-05 18:30:09,208 INFO org.apache.hadoop.mapred.MapTask: Spilling map
> output: record full = true
> 2011-01-05 18:30:09,208 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> 39038598; bufend = 48983989; bufvoid = 99614720
> 2011-01-05 18:30:09,208 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 327605; kvend = 262068; length = 327680
> 2011-01-05 18:30:14,528 INFO org.apache.hadoop.mapred.MapTask: Finished spill
> 95
> 2011-01-05 18:30:17,328 INFO org.apache.hadoop.mapred.MapTask: Spilling map
> output: record full = true
> 2011-01-05 18:30:17,328 INFO org.apache.hadoop.mapred.MapTask: bufstart =
> 48983989; bufend = 58929384; bufvoid = 99614720
> 2011-01-05 18:30:17,328 INFO org.apache.hadoop.mapred.MapTask: kvstart =
> 262068; kvend = 196531; length = 327680
> 2011-01-05 18:30:22,615 INFO org.apache.hadoop.mapred.MapTask: Finished spill
> 96
> .
> .
> .
> This problem does not occur when I use a toy matrix of 100 x 100, but once I
> give it the original matrix of ..... the problem is always reproducible.
> Any ideas on what could be causing this?
> Thanks,
> Maya Hristakeva