[
https://issues.apache.org/jira/browse/SOLR-1623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12786185#action_12786185
]
Laurent Chavet commented on SOLR-1623:
--------------------------------------
Yes, this definitely reproduces in 1.4.
Unfortunately I think I need a lot of fields; here is what I am trying to do:
I want to store news articles and extract many topics for each story with a
score for each topic for each story.
So, for example, a story might have a topic of Crime with a score of 20.
So what I am doing now is store:
Field: Topic, Value: Crime, indexed="true" stored="true" (needs to be
searched and retrieved)
Field: Weight_Topic_Crime, Value: 20, indexed="true" stored="true" (needs to be
sorted and retrieved)
Because the Topic field can take a lot of different values, this schema
ends up with a lot of fields starting with weight, one per distinct topic.
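To make the layout concrete, here is a minimal sketch of the field names one story would produce. It is plain Java with no SolrJ calls; the class name TopicWeightFields and its methods are hypothetical and only illustrate the naming scheme described above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not part of Solr or SolrJ): shows which fields
// a single story ends up with under the Topic / Weight_Topic_* scheme.
final class TopicWeightFields {

    // Name of the per-topic weight field, e.g. "Weight_Topic_Crime".
    static String weightFieldName(String topic) {
        return "Weight_Topic_" + topic;
    }

    // Maps field name to value for one story: a multi-valued "Topic"
    // field plus one weight field per topic.
    static Map<String, Object> fieldsFor(Map<String, Integer> topicScores) {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("Topic", new ArrayList<>(topicScores.keySet()));
        for (Map.Entry<String, Integer> e : topicScores.entrySet()) {
            fields.put(weightFieldName(e.getKey()), e.getValue());
        }
        return fields;
    }

    public static void main(String[] args) {
        Map<String, Integer> scores = new LinkedHashMap<>();
        scores.put("Crime", 20);
        scores.put("Politics", 5);
        // One extra field name appears for every distinct topic.
        System.out.println(fieldsFor(scores));
    }
}
```

Every new topic value adds one more distinct field name to the index, which is why the number of fields grows with the topic vocabulary.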
Any suggestions on how to achieve the same result in a different way?
Thanks,
Laurent
> Solr hangs (often throwing java.lang.OutOfMemoryError: PermGen space) when
> indexing many different field names
> --------------------------------------------------------------------------------------------------------------
>
> Key: SOLR-1623
> URL: https://issues.apache.org/jira/browse/SOLR-1623
> Project: Solr
> Issue Type: Bug
> Components: update
> Affects Versions: 1.3, 1.4
> Environment:
> Tomcat Version | JVM Version | JVM Vendor | OS Name | OS Version | OS Architecture
> Apache Tomcat/6.0 snapshot | 1.6.0_13-b03 | Sun Microsystems Inc. | Linux | 2.6.18-164.el5 | amd64
> and/or
> Tomcat Version | JVM Version | JVM Vendor | OS Name | OS Version | OS Architecture
> Apache Tomcat/6.0.18 | 1.6.0_12-b04 | Sun Microsystems Inc. | Windows 2003 | 5.2 | amd64
> Reporter: Laurent Chavet
> Priority: Critical
>
> With the following fields in schema.xml:
> <fields>
>   <field name="id" type="sint" indexed="true" stored="true" required="true"/>
>   <dynamicField name="weight_*" type="sint" indexed="true" stored="true"/>
> </fields>
> Run the following code:
> import java.util.ArrayList;
> import java.util.List;
> import org.apache.solr.client.solrj.SolrServer;
> import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
> import org.apache.solr.common.SolrInputDocument;
> public static void main(String[] args) throws Exception {
> SolrServer server;
> try {
> server = new CommonsHttpSolrServer(args[0]);
> } catch (Exception e) {
> System.err.println("can't create server using: " + args[0] + " " + e.getMessage());
> throw e;
> }
> for (int i = 0; i < 1000; i++) {
> List<SolrInputDocument> batchedDocs = new
> ArrayList<SolrInputDocument>();
> for (int j = 0; j < 1000; j++) {
> SolrInputDocument doc = new SolrInputDocument();
> doc.addField("id", i * 1000 + j);
> // hangs after 30 to 50 batches
>
> doc.addField("weight_aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa"
> + Integer.toString(i) + "_" + Integer.toString(j), i * 1000 + j);
> // hangs after about 200 batches
> //doc.addField("weight_" + Integer.toString(i) + "_" + Integer.toString(j), i * 1000 + j);
> batchedDocs.add(doc);
> }
> try {
> server.add(batchedDocs, true);
> System.err.println("Done with batch=" + i);
> // server.commit(); //doesn't change anything
> } catch (Exception e) {
> System.err.println("batchId=" + i + " bad batch: " +
> e.getMessage());
> throw e;
> }
> }
> }
> Soon the client (which sometimes throws) and Solr will freeze. Sometimes you can
> see java.lang.OutOfMemoryError: PermGen space in the server logs.
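The repro gives every document its own field name, so the number of distinct field names grows by 1000 with each batch. As a rough sketch of how many names exist at the reported hang points (30 to 50 batches for the long prefix, about 200 for the short one), the snippet below counts them for the short naming pattern. FieldNameCount is a hypothetical name; the code only counts names and does not talk to Solr.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper: counts the distinct field names the repro
// would have sent after a given number of batches.
final class FieldNameCount {

    // Same pattern as the short (commented-out) variant in the repro.
    static String shortFieldName(int batch, int doc) {
        return "weight_" + batch + "_" + doc;
    }

    // Generates every name for the given batches and counts unique ones.
    static int distinctNames(int batches, int docsPerBatch) {
        Set<String> names = new HashSet<String>();
        for (int i = 0; i < batches; i++) {
            for (int j = 0; j < docsPerBatch; j++) {
                names.add(shortFieldName(i, j));
            }
        }
        return names.size();
    }

    public static void main(String[] args) {
        System.out.println("after 50 batches:  " + distinctNames(50, 1000));
        System.out.println("after 200 batches: " + distinctNames(200, 1000));
    }
}
```

Since no name repeats, the full 1000-batch run would ask the index to hold 1,000,000 different fields.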