Hey Ted,

Sorry for the noise. I am looking around in the o.a.m.classifier.sgd.ModelSerializer and I only see methods for writeJson...

On Dec 29, 2010, at 4:01 PM, Ted Dunning wrote:

> Yes.
>
> That is evil. The problem is that GSON recurses on lists and that makes
> memory use crazy bad.
>
> Try serializing as binary. I committed a change to allow that a few weeks
> ago that added a method to ModelSerializer. The SGD models are also all
> Writable's now which should make rolling your own serialization very easy..
>
>
> On Wed, Dec 29, 2010 at 3:59 PM, Chris Schilling
> <chris.schill...@gmail.com> wrote:
>
>> Hi again,
>>
>> I notice that if I try to write the model for the 20 NG example, I am
>> running out of memory. I am running on a small ec2 instance, so I run with
>> the JVM with -Xmx1400m.
>>
>> So, I can train and dissect the model just fine. However, when I try to
>> write the weights:
>> ModelSerializer.writeJson("/tmp/sgd_adaptive.model", learningAlgorithm);
>>
>> My feature vector size is 10000.
>>
>> I get an OOM exception:
>>
>> Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>>   at org.apache.mahout.classifier.sgd.ModelSerializer$MatrixTypeAdapter.serialize(ModelSerializer.java:221)
>>   at org.apache.mahout.classifier.sgd.ModelSerializer$MatrixTypeAdapter.serialize(ModelSerializer.java:210)
>>   at com.google.gson.JsonSerializationVisitor.visitFieldUsingCustomHandler(JsonSerializationVisitor.java:148)
>>   at com.google.gson.ObjectNavigator.navigateClassFields(ObjectNavigator.java:141)
>>   at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:122)
>>   at com.google.gson.JsonSerializationContextDefault.serialize(JsonSerializationContextDefault.java:47)
>>   at com.google.gson.DefaultTypeAdapters$CollectionTypeAdapter.serialize(DefaultTypeAdapters.java:445)
>>   at com.google.gson.DefaultTypeAdapters$CollectionTypeAdapter.serialize(DefaultTypeAdapters.java:431)
>>   at com.google.gson.JsonSerializationVisitor.visitFieldUsingCustomHandler(JsonSerializationVisitor.java:148)
>>   at com.google.gson.ObjectNavigator.navigateClassFields(ObjectNavigator.java:141)
>>   at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:122)
>>   at com.google.gson.JsonSerializationVisitor.getJsonElementForChild(JsonSerializationVisitor.java:117)
>>   at com.google.gson.JsonSerializationVisitor.addAsChildOfObject(JsonSerializationVisitor.java:95)
>>   at com.google.gson.JsonSerializationVisitor.visitObjectField(JsonSerializationVisitor.java:90)
>>   at com.google.gson.ObjectNavigator.navigateClassFields(ObjectNavigator.java:147)
>>   at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:122)
>>   at com.google.gson.JsonSerializationContextDefault.serialize(JsonSerializationContextDefault.java:47)
>>   at com.google.gson.JsonSerializationContextDefault.serialize(JsonSerializationContextDefault.java:40)
>>   at org.apache.mahout.classifier.sgd.ModelSerializer$StateTypeAdapter.serialize(ModelSerializer.java:333)
>>   at org.apache.mahout.classifier.sgd.ModelSerializer$StateTypeAdapter.serialize(ModelSerializer.java:287)
>>   at com.google.gson.JsonSerializationVisitor.visitUsingCustomHandler(JsonSerializationVisitor.java:128)
>>   at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:96)
>>   at com.google.gson.JsonSerializationContextDefault.serialize(JsonSerializationContextDefault.java:47)
>>   at org.apache.mahout.classifier.sgd.ModelSerializer$EvolutionaryProcessTypeAdapter.serialize(ModelSerializer.java:375)
>>   at org.apache.mahout.classifier.sgd.ModelSerializer$EvolutionaryProcessTypeAdapter.serialize(ModelSerializer.java:339)
>>   at com.google.gson.JsonSerializationVisitor.visitUsingCustomHandler(JsonSerializationVisitor.java:128)
>>   at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:96)
>>   at com.google.gson.JsonSerializationContextDefault.serialize(JsonSerializationContextDefault.java:47)
>>   at org.apache.mahout.classifier.sgd.ModelSerializer$AdaptiveLogisticRegressionTypeAdapter.serialize(ModelSerializer.java:189)
>>   at org.apache.mahout.classifier.sgd.ModelSerializer$AdaptiveLogisticRegressionTypeAdapter.serialize(ModelSerializer.java:153)
>>   at com.google.gson.JsonSerializationVisitor.visitUsingCustomHandler(JsonSerializationVisitor.java:128)
>>   at com.google.gson.ObjectNavigator.accept(ObjectNavigator.java:96)
>>
>> Does this make sense? seems like too much memory to serialize.
>>
>> Thanks
>> Chris
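
The "serialize as binary" idea in the quoted reply can be sketched with plain java.io streams. This is only an illustration of the general approach, not Mahout's ModelSerializer code and not its Writable implementation: the class name BinaryWeights and all of its methods are hypothetical, and the weights are assumed to be a dense, rectangular double[][]. The point is that each value is streamed straight to the output as 8 bytes, so no intermediate JSON tree has to be held on the heap.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical helper (not part of Mahout): streams a dense weight matrix as raw binary. */
public class BinaryWeights {

    /** Writes row count, column count, then every value as an 8-byte double. Assumes a rectangular matrix. */
    public static void write(double[][] weights, DataOutputStream out) throws IOException {
        int rows = weights.length;
        int cols = rows == 0 ? 0 : weights[0].length;
        out.writeInt(rows);
        out.writeInt(cols);
        for (double[] row : weights) {
            for (double value : row) {
                out.writeDouble(value); // one value at a time, nothing buffered as objects
            }
        }
    }

    /** Reads a matrix back in the same layout that write() produced. */
    public static double[][] read(DataInputStream in) throws IOException {
        int rows = in.readInt();
        int cols = in.readInt();
        double[][] weights = new double[rows][cols];
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                weights[i][j] = in.readDouble();
            }
        }
        return weights;
    }

    /** Convenience wrapper for in-memory use; wraps IOException as unchecked. */
    public static byte[] toBytes(double[][] weights) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            write(weights, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Convenience wrapper: decodes bytes produced by toBytes(). */
    public static double[][] fromBytes(byte[] bytes) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes))) {
            return read(in);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        double[][] weights = {{0.5, -1.25, 0.0}, {2.0, 3.5, -4.0}};
        byte[] encoded = toBytes(weights);
        double[][] decoded = fromBytes(encoded);
        System.out.println("bytes written: " + encoded.length);
        System.out.println("round trip ok: " + java.util.Arrays.deepEquals(weights, decoded));
    }
}
```

To write to a file instead of memory, the same write() call can be handed a DataOutputStream wrapped around a FileOutputStream. The on-disk size is then 8 bytes of header plus 8 bytes per weight, which is why this route stays cheap even with a feature vector size of 10000.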