[
https://issues.apache.org/jira/browse/MAHOUT-985?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13399369#comment-13399369
]
Hudson commented on MAHOUT-985:
-------------------------------
Integrated in Mahout-Quality #1556 (See
[https://builds.apache.org/job/Mahout-Quality/1556/])
MAHOUT-985 ignore ARFF instance weights, handle ? correctly (Revision
1352857)
Result = SUCCESS
srowen : http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1352857
Files :
* /mahout/trunk/integration/src/main/java/org/apache/mahout/utils/vectors/arff/ARFFIterator.java
* /mahout/trunk/integration/src/main/java/org/apache/mahout/utils/vectors/arff/ARFFModel.java
* /mahout/trunk/integration/src/test/java/org/apache/mahout/utils/vectors/arff/ARFFVectorIterableTest.java
> MapBackedArffModel Unable To Parse ARFF Files Containing Instance Weights
> -------------------------------------------------------------------------
>
> Key: MAHOUT-985
> URL: https://issues.apache.org/jira/browse/MAHOUT-985
> Project: Mahout
> Issue Type: Bug
> Components: Integration
> Affects Versions: 0.5
> Reporter: Dave Kor
> Assignee: Sean Owen
> Priority: Minor
> Labels: Arff
> Fix For: 0.8
>
> Attachments: MAHOUT-985.patch
>
>
> When parsing an ARFF file that contains instance-specific weights,
> MapBackedARFFModel throws the following NPE. While I have only
> tested this in 0.5, I suspect this bug also occurs in 0.6.
> Exception in thread "main" java.lang.NullPointerException
>     at org.apache.mahout.utils.vectors.arff.MapBackedARFFModel.getValue(MapBackedARFFModel.java:87)
>     at org.apache.mahout.utils.vectors.arff.ARFFIterator.computeNext(ARFFIterator.java:75)
>     at org.apache.mahout.utils.vectors.arff.ARFFIterator.computeNext(ARFFIterator.java:30)
>     at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>     at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>     at org.apache.mahout.utils.vectors.io.SequenceFileVectorWriter.write(SequenceFileVectorWriter.java:43)
>     at org.apache.mahout.utils.vectors.arff.Driver.writeFile(Driver.java:159)
>     at org.apache.mahout.utils.vectors.arff.Driver.main(Driver.java:127)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>     at java.lang.reflect.Method.invoke(Method.java:597)
>     at org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>     at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>     at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:187)
> The code works properly when all instance weights are set to the default
> value of 1. However, when any instance has a non-default weight, such as in
> the sample ARFF file below, the NPE occurs when MapBackedARFFModel attempts
> to parse line 8.
> -----
> @relation 'Test Mahout'
> @attribute Attr0 numeric
> @attribute Label {True,False}
> @data
> 0,False
> 1,True,{2}
> -----
> The reason is that Weka assumes every data instance has a default weight
> of 1 and does not save this default weight in the ARFF file. However, when
> a data instance does not have the default weight of 1, the non-default
> instance weight is appended at the end of the line, surrounded by curly
> braces. When the MapBackedARFFModel.getValue method tries to parse this
> weight as an attribute, typeMap.get(idx) returns a null ARFFType because
> there is no such attribute, which results in an NPE.
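
The idea of ignoring a trailing instance weight, as described above, can be
sketched roughly as follows. This is only an illustration and not the code
committed in revision 1352857: the class ArffWeightSketch and its methods
stripInstanceWeight and instanceWeight are hypothetical names, and the sketch
assumes the weight is always the last field on the line, written as a numeric
value inside curly braces (for example ",{2}").

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration; not part of Mahout. */
final class ArffWeightSketch {

  // Assumption: a weight is a trailing ",{value}" suffix, e.g. ",{2}" or ", {0.5}".
  // A sparse data line such as "{0 1, 1 True}" does not match, because it has
  // no comma immediately followed by an opening brace.
  private static final Pattern TRAILING_WEIGHT =
      Pattern.compile(",\\s*\\{\\s*([^{}\\s,]+)\\s*\\}\\s*$");

  private ArffWeightSketch() {}

  /** Returns the data line with any trailing instance weight removed. */
  static String stripInstanceWeight(String line) {
    return TRAILING_WEIGHT.matcher(line).replaceFirst("");
  }

  /** Returns the instance weight, or the default of 1.0 when none is present. */
  static double instanceWeight(String line) {
    Matcher m = TRAILING_WEIGHT.matcher(line);
    return m.find() ? Double.parseDouble(m.group(1)) : 1.0;
  }

  public static void main(String[] args) {
    // The two data lines from the sample file in the report.
    for (String line : new String[] {"0,False", "1,True,{2}"}) {
      System.out.println(stripInstanceWeight(line) + " weight=" + instanceWeight(line));
    }
  }
}
```

With the weight stripped first, splitting the remaining line on commas yields
exactly one value per declared attribute, so a lookup such as
typeMap.get(idx) is never asked for an attribute index that does not exist.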