[ 
https://issues.apache.org/jira/browse/MAHOUT-556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12964772#action_12964772
 ] 

Ted Dunning commented on MAHOUT-556:
------------------------------------

Rohan,

Thanks for looking into this.

I have started building up serializers for the entire SGD framework 
because large models just aren't feasible to serialize
using JSON.

Let me check the state of those changes.  Last I looked, I think that they were 
very close to ready.
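
To illustrate the point about size, here is a minimal sketch of a binary 
round trip for a dense weight vector like the "beta" row in the model file 
quoted below. It is not the Mahout serializer code: DenseWeightsCodec and 
its methods are hypothetical names, and the format (an int length followed 
by raw 8-byte doubles) is an assumption chosen only for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;

// Hypothetical helper, not part of Mahout: stores a dense weight vector
// as a 4-byte length prefix followed by one 8-byte double per weight.
public class DenseWeightsCodec {

  static byte[] toBytes(double[] weights) {
    ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(buffer)) {
      out.writeInt(weights.length);
      for (double w : weights) {
        out.writeDouble(w);
      }
    } catch (IOException e) {
      // Not expected for an in-memory stream; rethrown unchecked.
      throw new UncheckedIOException(e);
    }
    return buffer.toByteArray();
  }

  static double[] fromBytes(byte[] bytes) {
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
      double[] weights = new double[in.readInt()];
      for (int i = 0; i < weights.length; i++) {
        weights[i] = in.readDouble();
      }
      return weights;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    // A few values taken from the quoted model file.
    double[] weights = {0.0, 6.741887291022263E-4, -53.6076187622054};
    byte[] bytes = toBytes(weights);
    double[] restored = fromBytes(bytes);
    System.out.println(bytes.length + " bytes, round trip ok: "
        + Arrays.equals(weights, restored));
  }
}
```

Each weight costs a fixed 8 bytes here and is restored bit for bit, whereas 
the JSON text form costs a variable number of characters per value and 
depends on the JSON library writing and parsing it correctly.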

> In the trainlogistic example the JSON model file which is created is missing 
> commas and making it unusable with runLogistic.
> ----------------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAHOUT-556
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-556
>             Project: Mahout
>          Issue Type: Bug
>          Components: Classification
>    Affects Versions: 0.5
>         Environment: Ubuntu 10.10, Hadoop-0.20.2 
>            Reporter: Rohan Anil
>            Priority: Minor
>
> Bug related to creation of the model when you run trainlogistic.
> It creates the JSON model file using the toJson function, as illustrated 
> below
> --------------------------------
> In,
>  LogisticModelParameters.java
> Function
> void saveTo(Writer out)
> {
> ...
> ..
> String savedForm = gson.toJson(this);
> ...
> }
> --------------------------------
> But this call is not working as expected: String savedForm = gson.toJson(this);
> For my experiment using a different dataset - 
> I get the following model file : 
> {"targetVariable":"customer","typeMap":{"feature2":"n","feature3":"n",
>     "feature1":"n"},"numFeatures":334,"useBias":true,"maxTargetCategories":
>   2,"targetCategories":["0","1"],"lambda":1.0E-4,"learningRate":0.001,"lr":{
>     "mu0":0.001,"decayFactor":0.999,"stepOffset":10,"forgettingExponent":
>     -0.5,"perTermAnnealingOffset":20,"beta":{"rows":1,"cols":334,"data":[[
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
>           0.0,6.741887291022263E-4,0.0,0.0,-53.6076187622054,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,5.031178185395536E-5,0.0,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.04383410529689268,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,
>           0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0]]},
>     "numCategories":2,"step":260951,"updateSteps":{}"updateCounts":{}
>     "lambda":1.0E-4,"prior":{}"sealed":true,"gradient":{}}}
> If you notice the last part,
>   "numCategories":2,"step":260951,"updateSteps":{}"updateCounts":{}
>     "lambda":1.0E-4,"prior":{}"sealed":true,"gradient":{}}}
> are missing commas between the updateSteps, updateCounts and sealed variables.
> Investigating further, 
> these come from AbstractOnlineLogisticRegression.java. The above 
> variables are not initialized, hence the wrong output from the toJson function. 
> This is a bug in the gson.toJson function. I am using gson-1.3, 
> and upgrading to 1.4 by modifying core/pom.xml fixes things. But runLogistic 
> then complains:
> 10/11/29 03:29:43 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found 
> in the classpath. Usage of hadoop-site.xml is deprecated. Instead use 
> core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of 
> core-default.xml, mapred-default.xml and hdfs-default.xml respectively
> Exception in thread "main" java.lang.RuntimeException: No-args constructor 
> for interface org.apache.mahout.math.Vector does not exist. Register an 
> InstanceCreator with Gson for this type to fix this problem.
>       at 
> com.google.gson.MappedObjectConstructor.constructWithNoArgConstructor(MappedObjectConstructor.java:64)
>       at 
> com.google.gson.MappedObjectConstructor.construct(MappedObjectConstructor.java:53)
>       at 
> com.google.gson.JsonObjectDeserializationVisitor.constructTarget(JsonObjectDeserializationVisitor.java:41)
>       at 
> com.google.gson.JsonDeserializationVisitor.getTarget(JsonDeserializationVisitor.java:56)
>       at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:101)
>       at 
> com.google.gson.JsonDeserializationVisitor.visitChild(JsonDeserializationVisitor.java:107)
>       at 
> com.google.gson.JsonDeserializationVisitor.visitChildAsObject(JsonDeserializationVisitor.java:95)
>       at 
> com.google.gson.JsonObjectDeserializationVisitor.visitObjectField(JsonObjectDeserializationVisitor.java:62)
>       at 
> com.google.gson.ObjectNavigator.navigateClassFields(ObjectNavigator.java:156)
>       at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:123)
>       at 
> com.google.gson.JsonDeserializationVisitor.visitChild(JsonDeserializationVisitor.java:107)
>       at 
> com.google.gson.JsonDeserializationVisitor.visitChildAsObject(JsonDeserializationVisitor.java:95)
>       at 
> com.google.gson.JsonObjectDeserializationVisitor.visitObjectField(JsonObjectDeserializationVisitor.java:62)
>       at 
> com.google.gson.ObjectNavigator.navigateClassFields(ObjectNavigator.java:156)
>       at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:123)
>       at 
> com.google.gson.JsonDeserializationContextDefault.fromJsonObject(JsonDeserializationContextDefault.java:73)
>       at 
> com.google.gson.JsonDeserializationContextDefault.deserialize(JsonDeserializationContextDefault.java:51)
>       at com.google.gson.Gson.fromJson(Gson.java:495)
>       at com.google.gson.Gson.fromJson(Gson.java:444)
>       at com.google.gson.Gson.fromJson(Gson.java:419)
>       at 
> org.apache.mahout.classifier.sgd.LogisticModelParameters.loadFrom(LogisticModelParameters.java:142)
>       at 
> org.apache.mahout.classifier.sgd.LogisticModelParameters.loadFrom(LogisticModelParameters.java:155)
>       at 
> org.apache.mahout.classifier.sgd.RunLogistic.main(RunLogistic.java:56)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:616)
>       at 
> org.apache.hadoop.util.ProgramDriver$ProgramDescription.invoke(ProgramDriver.java:68)
>       at org.apache.hadoop.util.ProgramDriver.driver(ProgramDriver.java:139)
>       at org.apache.mahout.driver.MahoutDriver.main(MahoutDriver.java:182)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
>       at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>       at java.lang.reflect.Method.invoke(Method.java:616)
>       at org.apache.hadoop.util.RunJar.main(RunJar.java:156)
> I haven't had the time to investigate this yet. Will post more results 
> tomorrow.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
