[ https://issues.apache.org/jira/browse/MAHOUT-1408?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmitriy Lyubimov reassigned MAHOUT-1408:
----------------------------------------

    Assignee: Dmitriy Lyubimov

> Distributed cache file matching bug while running SSVD in broadcast mode
> ------------------------------------------------------------------------
>
>                 Key: MAHOUT-1408
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-1408
>             Project: Mahout
>          Issue Type: Bug
>          Components: Math
>    Affects Versions: 0.8
>            Reporter: Angad Singh
>            Assignee: Dmitriy Lyubimov
>            Priority: Minor
>         Attachments: BtJob.java.patch
>
>
> The error is:
> java.lang.IllegalArgumentException: Unexpected file name, unable to deduce partition #:file:/data/d1/mapred/local/taskTracker/distcache/434503979705629827_-1822139941_1047712745/nn.red.ua2.inmobi.com/user/rmcuser/oozie-oozi/0034272-140120102756143-oozie-oozi-W/inmobi-ssvd_mahout--java/java-launcher.jar
>     at org.apache.mahout.math.hadoop.stochasticsvd.SSVDHelper$1.compare(SSVDHelper.java:154)
>     at org.apache.mahout.math.hadoop.stochasticsvd.SSVDHelper$1.compare(SSVDHelper.java:1)
>     at java.util.Arrays.mergeSort(Arrays.java:1270)
>     at java.util.Arrays.mergeSort(Arrays.java:1281)
>     at java.util.Arrays.mergeSort(Arrays.java:1281)
>     at java.util.Arrays.sort(Arrays.java:1210)
>     at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator.init(SequenceFileDirValueIterator.java:112)
>     at org.apache.mahout.common.iterator.sequencefile.SequenceFileDirValueIterator.<init>(SequenceFileDirValueIterator.java:94)
>     at org.apache.mahout.math.hadoop.stochasticsvd.BtJob$BtMapper.setup(BtJob.java:220)
>     at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:142)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:647)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:323)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:266)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1278)
>     at org.apache.hadoop.mapred.Child.main(Child.java:260)
>
> The bug is @
> https://github.com/apache/mahout/blob/trunk/core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/BtJob.java,
> near line 220.
> and @
> https://github.com/apache/mahout/blob/trunk/core/src/main/java/org/apache/mahout/math/hadoop/stochasticsvd/SSVDHelper.java
> near line 144.
>
> SSVDHelper's PARTITION_COMPARATOR assumes all files in the distributed cache
> will have a particular pattern whereas we have jar files in our distributed
> cache which causes the above exception.

--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
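
The failure described above comes from sorting every distributed cache entry with a comparator that only understands partition file names. One possible way around it is to drop non-matching entries (such as java-launcher.jar) before sorting. The sketch below is illustrative only and is not the attached BtJob.java.patch: the class name, the method names, and the file name regex are assumptions, and paths are plain strings rather than Hadoop Path objects so that the example stays self-contained.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Mahout.
class PartitionFileFilter {

  // Assumed naming scheme for task output files, e.g. "part-m-00003"
  // or "part-r-00012.seq". The real pattern in SSVDHelper may differ.
  static final Pattern OUTPUT_FILE_PATTERN =
      Pattern.compile("(\\w+)-(m|r)-(\\d+)(\\.\\w+)?");

  // Returns the last path component (the file name).
  static String baseName(String path) {
    int slash = path.lastIndexOf('/');
    return slash >= 0 ? path.substring(slash + 1) : path;
  }

  // Returns the partition number encoded in the file name,
  // or -1 if the name does not follow the assumed pattern.
  static int partitionOf(String path) {
    Matcher m = OUTPUT_FILE_PATTERN.matcher(baseName(path));
    return m.matches() ? Integer.parseInt(m.group(3)) : -1;
  }

  // Keeps only partition files and orders them by partition number,
  // so unrelated cache entries never reach the comparator.
  static List<String> partitionFilesInOrder(List<String> cachePaths) {
    List<String> kept = new ArrayList<>();
    for (String p : cachePaths) {
      if (partitionOf(p) >= 0) {
        kept.add(p);
      }
    }
    kept.sort(Comparator.comparingInt(PartitionFileFilter::partitionOf));
    return kept;
  }
}
```

Filtering before the sort, rather than making the comparator tolerate unknown names, keeps the comparator strict, so a genuinely malformed partition file name would still be rejected by it.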