Hello O,

It seems to me (though it's better to check the heap histogram) that buffering sub-entities in SortedMapBackedCache is what exhausts the heap. I'm aware of two directions:

- use a file-based cache instead. I don't know exactly how it works; you can start from https://issues.apache.org/jira/browse/SOLR-2382 and check how to enable the BerkeleyDB cache;
- personally, I'm promoting merging result sets ordered by the RDBMS: https://issues.apache.org/jira/browse/SOLR-4799
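
To illustrate the second direction, here is a minimal Java sketch of the idea of merging two result sets that the database has already ordered by EntID. It is not the SOLR-4799 patch and not a DIH class: the names MergeJoinSketch, ChildRow and join are hypothetical, and the iterators stand in for JDBC result sets, assuming both queries end with ORDER BY EntID.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not the SOLR-4799 patch or any DIH class.
class MergeJoinSketch {

    /** One child row: the parent key it belongs to and its attribute value. */
    static final class ChildRow {
        final int entId;
        final String value;

        ChildRow(int entId, String value) {
            this.entId = entId;
            this.value = value;
        }
    }

    /**
     * Joins parent ids with child rows in a single pass.
     * Both inputs must be sorted ascending by EntID.
     * Only one pending child row is buffered; the output map is
     * collected here just for the demonstration.
     */
    static Map<Integer, List<String>> join(Iterator<Integer> parents,
                                           Iterator<ChildRow> children) {
        Map<Integer, List<String>> out = new LinkedHashMap<>();
        ChildRow pending = children.hasNext() ? children.next() : null;
        while (parents.hasNext()) {
            int parentId = parents.next();
            List<String> values = new ArrayList<>();
            // Skip child rows whose parent id is below the current parent.
            while (pending != null && pending.entId < parentId) {
                pending = children.hasNext() ? children.next() : null;
            }
            // Consume all child rows that match the current parent.
            while (pending != null && pending.entId == parentId) {
                values.add(pending.value);
                pending = children.hasNext() ? children.next() : null;
            }
            out.put(parentId, values);
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> parents = List.of(1, 2, 3);
        List<ChildRow> children = List.of(
                new ChildRow(1, "red"),
                new ChildRow(1, "large"),
                new ChildRow(3, "blue"));
        // Prints {1=[red, large], 2=[], 3=[blue]}
        System.out.println(join(parents.iterator(), children.iterator()));
    }
}
```

Because both streams advance together, nothing like a full per-attribute map has to sit on the heap; a real importer would emit each document as soon as its child rows are consumed instead of collecting them.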

On Fri, May 9, 2014 at 7:16 PM, O. Olson <olson_...@yahoo.it> wrote:

> I have a Data Schema which is Hierarchical i.e. I have an Entity and a number
> of attributes. For a small subset of the Data - about 300 MB, I can do the
> import with 3 GB memory. Now with the entire 4 GB Dataset, I find I cannot
> do the import with 9 GB of memory.
> I am using the SqlEntityProcessor as below:
>
> <dataConfig>
>   <dataSource driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
>     url="jdbc:sqlserver://localhost\MSSQLSERVER;databaseName=SolrDB;user=solrusr;password=solrusr;"/>
>   <document>
>     <entity name="Entity" query="SELECT EntID, Image FROM ENTITY_TABLE">
>       <field column="EntID" name="EntID" />
>       <field column="Image" name="Image" />
>
>       <entity name="EntityAttribute1"
>               query="SELECT AttributeValue, EntID FROM ATTR_TABLE WHERE AttributeID=1"
>               cacheKey="EntID"
>               cacheLookup="Entity.EntID"
>               processor="SqlEntityProcessor" cacheImpl="SortedMapBackedCache">
>         <field column="AttributeValue" name="EntityAttribute1" />
>       </entity>
>       <entity name="EntityAttribute2"
>               query="SELECT AttributeValue, EntID FROM ATTR_TABLE WHERE AttributeID=2"
>               cacheKey="EntID"
>               cacheLookup="Entity.EntID"
>               processor="SqlEntityProcessor" cacheImpl="SortedMapBackedCache">
>         <field column="AttributeValue" name="EntityAttribute2" />
>       </entity>
>
>     </entity>
>   </document>
> </dataConfig>
>
> What is the best way to import this data? Doing it without a cache, results
> in many SQL queries. With the cache, I run out of memory.
>
> I’m curious why 4GB of data cannot entirely fit in memory. One thing I need
> to mention is that I have about 400 to 500 attributes.
>
> Thanks in advance for any helpful advice.
> O. O.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/DataImport-using-SqlEntityProcessor-running-Out-of-Memory-tp4135080.html
> Sent from the Solr - User mailing list archive at Nabble.com.

--
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics

<http://www.griddynamics.com>
<mkhlud...@griddynamics.com>