Re: need help from hard core solr experts - out of memory error

2014-04-20 Thread Candygram For Mongo
We have tried using fetchSize and we still got the same out of memory
errors.


On Fri, Apr 18, 2014 at 9:39 PM, Shawn Heisey  wrote:

> On 4/18/2014 6:15 PM, Candygram For Mongo wrote:
> > We are getting Out Of Memory errors when we try to execute a full import
> > using the Data Import Handler.  This error originally occurred on a
> > production environment with a database containing 27 million records.
>  Heap
> > memory was configured for 6GB and the server had 32GB of physical memory.
> >  We have been able to replicate the error on a local system with 6
> million
> > records.  We set the memory heap size to 64MB to accelerate the error
> > replication.  The indexing process has been failing in different
> scenarios.
> >  We have 9 test cases documented.  In some of the test cases we increased
> > the heap size to 128MB.  In our first test case we set heap memory to
> 512MB
> > which also failed.
>
> One characteristic of a JDBC connection is that unless you tell it
> otherwise, it will try to retrieve the entire resultset into RAM before
> any results are delivered to the application.  It's not Solr doing this,
> it's JDBC.
>
> In this case, there are 27 million rows in the resultset.  It's highly
> unlikely that this much data (along with the rest of Solr's memory
> requirements) will fit in 6GB of heap.
>
> JDBC has a built-in way to deal with this.  It's called fetchSize.  By
> using the batchSize parameter on your JdbcDataSource config, you can set
> the JDBC fetchSize.  Set it to something small, between 100 and 1000,
> and you'll probably get rid of the OOM problem.
>
> http://wiki.apache.org/solr/DataImportHandler#Configuring_JdbcDataSource
>
> If you had been using MySQL, I would have recommended that you set
> batchSize to -1.  This sets fetchSize to Integer.MIN_VALUE, which tells
> the MySQL driver to stream results instead of trying to either batch
> them or return everything.  I'm pretty sure that the Oracle driver
> doesn't work this way -- you would have to modify the dataimport source
> code to use their streaming method.
>
> Thanks,
> Shawn
>
>
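
For reference, the batchSize knob Shawn describes goes on the JdbcDataSource
element of the DIH config. A minimal sketch, with placeholder connection
details (the config pasted later in this thread uses batchSize="100"):

<dataSource
    name="org_only"
    type="JdbcDataSource"
    driver="oracle.jdbc.OracleDriver"
    url="jdbc:oracle:thin:@{server name}:1521:{database name}"
    user="{username}"
    password="{password}"
    readOnly="false"
    batchSize="100" />   <!-- passed through to the JDBC fetchSize -->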


Re: need help from hard core solr experts - out of memory error

2014-04-18 Thread Candygram For Mongo
I have uploaded several files including the problem description with
graphics to this link on Google drive:

https://drive.google.com/folderview?id=0B7UpFqsS5lSjWEhxRE1NN2tMNTQ&usp=sharing

I shared it with this address "solr-user@lucene.apache.org" so I am hoping
it can be accessed by people in the group.


On Fri, Apr 18, 2014 at 5:15 PM, Candygram For Mongo <
candygram.for.mo...@gmail.com> wrote:

> I have lots of log files and other files to support this issue (sometimes
> referenced in the text below) but I am not sure the best way to submit.  I
> don't want to overwhelm and I am not sure if this email will accept graphs
> and charts.  Please provide direction and I will send them.
>
>
> *Issue Description*
>
>
>
> We are getting Out Of Memory errors when we try to execute a full import
> using the Data Import Handler.  This error originally occurred on a
> production environment with a database containing 27 million records.  Heap
> memory was configured for 6GB and the server had 32GB of physical memory.
>  We have been able to replicate the error on a local system with 6 million
> records.  We set the memory heap size to 64MB to accelerate the error
> replication.  The indexing process has been failing in different scenarios.
>  We have 9 test cases documented.  In some of the test cases we increased
> the heap size to 128MB.  In our first test case we set heap memory to 512MB
> which also failed.
>
>
>
>
>
> *Environment Values Used*
>
>
>
> *SOLR/Lucene version: *4.2.1*
>
> *JVM version:
>
> Java(TM) SE Runtime Environment (build 1.7.0_07-b11)
>
> Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)
>
> *Indexer startup command:
>
> set JVMARGS= -XX:MaxPermSize=364m -Xss256K -Xmx128m -Xms128m
>
> java " %JVMARGS% ^
>
> -Dcom.sun.management.jmxremote.port=1092 ^
>
> -Dcom.sun.management.jmxremote.ssl=false ^
>
> -Dcom.sun.management.jmxremote.authenticate=false ^
>
> -jar start.jar
>
> *SOLR indexing HTTP parameters request:
>
> webapp=/solr path=/dataimport
> params={clean=false&command=full-import&wt=javabin&version=2}
>
>
>
> The information we use for the database retrieve using the Data Import
> Handler is as follows:
>
>
>
> 
> name="org_only"
>
> type="JdbcDataSource"
>
> driver="oracle.jdbc.OracleDriver"
>
> url="jdbc:oracle:thin:@{server
> name}:1521:{database name}"
>
> user="{username}"
>
> password="{password}"
>
> readOnly="false"
>
> />
>
>
>
>
>
> *The Query (simple, single table)*
>
>
>
> *select*
>
>
>
> *NVL(cast(STU.ACCT_ADDRESS_ALL.R_ID as varchar2(100)), 'null')*
>
> *as SOLR_ID,*
>
>
>
> *'STU.ACCT_ADDRESS_ALL'*
>
> *as SOLR_CATEGORY,*
>
>
>
> *NVL(cast(STU.ACCT_ADDRESS_ALL.R_ID as varchar2(255)), ' ') as
> ADDRESSALLRID,*
>
> *NVL(cast(STU.ACCT_ADDRESS_ALL.ADDR_TYPE as varchar2(255)), ' ') as
> ADDRESSALLADDRTYPECD,*
>
> *NVL(cast(STU.ACCT_ADDRESS_ALL.LONGITUDE as varchar2(255)), ' ') as
> ADDRESSALLLONGITUDE,*
>
> *NVL(cast(STU.ACCT_ADDRESS_ALL.LATITUDE as varchar2(255)), ' ') as
> ADDRESSALLLATITUDE,*
>
> *NVL(cast(STU.ACCT_ADDRESS_ALL.ADDR_NAME as varchar2(255)), ' ') as
> ADDRESSALLADDRNAME,*
>
> *NVL(cast(STU.ACCT_ADDRESS_ALL.CITY as varchar2(255)), ' ') as
> ADDRESSALLCITY,*
>
> *NVL(cast(STU.ACCT_ADDRESS_ALL.STATE as varchar2(255)), ' ') as
> ADDRESSALLSTATE,*
>
> *NVL(cast(STU.ACCT_ADDRESS_ALL.EMAIL_ADDR as varchar2(255)), ' ') as
> ADDRESSALLEMAILADDR *
>
>
>
> *from STU.ACCT_ADDRESS_ALL*
>
>
>
> You can see this information in the database.xml file.
>
>
>
> Our main solrconfig.xml file contains the following differences compared
> to a new downloaded solrconfig.xml file (the original content).
>
>
>
> 
>
>  regex="solr-dataimporthandler-.*\.jar" />
>
> 
>
> 
>
> 
>
> 
>
>
>
>
> ${solr.abortOnConfigurationError:true}
>
>
>
>  class="org.apache.solr.core.StandardDirectoryFactory" />
>
>
>
>  class="org.apache.solr.handler.dataimport.DataImportHandler">
>
> 
>
> database.xml
>
> 
>
> 
>
> 
>
>
>
>
>
> *Custom Libraries*
>
>

is there any way to post images and attachments to this mailing list?

2014-04-18 Thread Candygram For Mongo



Re: need help from hard core solr experts - out of memory error

2014-04-18 Thread Candygram For Mongo
We consistently reproduce this problem on multiple systems configured with
6GB and 12GB of heap space.  To quickly reproduce many cases for
troubleshooting we reduced the heap space to 64, 128 and 512MB.  With 6 or
12GB configured it takes hours to see the error.


On Fri, Apr 18, 2014 at 5:54 PM, Walter Underwood wrote:

> I see heap size commands for 128 Meg and 512 Meg. That will certainly run
> out of memory. Why do you think you have 6G of heap with these settings?
>
> -Xmx128m -Xms128m
> -Xmx512m -Xms512m
>
> wunder
>
> On Apr 18, 2014, at 5:15 PM, Candygram For Mongo <
> candygram.for.mo...@gmail.com> wrote:
>
> > I have lots of log files and other files to support this issue (sometimes
> > referenced in the text below) but I am not sure the best way to submit.
>  I
> > don't want to overwhelm and I am not sure if this email will accept
> graphs
> > and charts.  Please provide direction and I will send them.
> >
> >
> > *Issue Description*
> >
> >
> >
> > We are getting Out Of Memory errors when we try to execute a full import
> > using the Data Import Handler.  This error originally occurred on a
> > production environment with a database containing 27 million records.
>  Heap
> > memory was configured for 6GB and the server had 32GB of physical memory.
> > We have been able to replicate the error on a local system with 6 million
> > records.  We set the memory heap size to 64MB to accelerate the error
> > replication.  The indexing process has been failing in different
> scenarios.
> > We have 9 test cases documented.  In some of the test cases we increased
> > the heap size to 128MB.  In our first test case we set heap memory to
> 512MB
> > which also failed.
> >
> >
> >
> >
> >
> > *Environment Values Used*
> >
> >
> >
> > *SOLR/Lucene version: *4.2.1*
> >
> > *JVM version:
> >
> > Java(TM) SE Runtime Environment (build 1.7.0_07-b11)
> >
> > Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)
> >
> > *Indexer startup command:
> >
> > set JVMARGS= -XX:MaxPermSize=364m -Xss256K -Xmx128m -Xms128m
> >
> > java " %JVMARGS% ^
> >
> > -Dcom.sun.management.jmxremote.port=1092 ^
> >
> > -Dcom.sun.management.jmxremote.ssl=false ^
> >
> > -Dcom.sun.management.jmxremote.authenticate=false ^
> >
> > -jar start.jar
> >
> > *SOLR indexing HTTP parameters request:
> >
> > webapp=/solr path=/dataimport
> > params={clean=false&command=full-import&wt=javabin&version=2}
> >
> >
> >
> > The information we use for the database retrieve using the Data Import
> > Handler is as follows:
> >
> >
> >
> >  >
> >name="org_only"
> >
> >type="JdbcDataSource"
> >
> >driver="oracle.jdbc.OracleDriver"
> >
> >url="jdbc:oracle:thin:@{server
> name}:1521:{database
> > name}"
> >
> >user="{username}"
> >
> >password="{password}"
> >
> >readOnly="false"
> >
> >/>
> >
> >
> >
> >
> >
> > *The Query (simple, single table)*
> >
> >
> >
> > *select*
> >
> >
> >
> > *NVL(cast(STU.ACCT_ADDRESS_ALL.R_ID as varchar2(100)), 'null')*
> >
> > *as SOLR_ID,*
> >
> >
> >
> > *'STU.ACCT_ADDRESS_ALL'*
> >
> > *as SOLR_CATEGORY,*
> >
> >
> >
> > *NVL(cast(STU.ACCT_ADDRESS_ALL.R_ID as varchar2(255)), ' ') as
> > ADDRESSALLRID,*
> >
> > *NVL(cast(STU.ACCT_ADDRESS_ALL.ADDR_TYPE as varchar2(255)), ' ') as
> > ADDRESSALLADDRTYPECD,*
> >
> > *NVL(cast(STU.ACCT_ADDRESS_ALL.LONGITUDE as varchar2(255)), ' ') as
> > ADDRESSALLLONGITUDE,*
> >
> > *NVL(cast(STU.ACCT_ADDRESS_ALL.LATITUDE as varchar2(255)), ' ') as
> > ADDRESSALLLATITUDE,*
> >
> > *NVL(cast(STU.ACCT_ADDRESS_ALL.ADDR_NAME as varchar2(255)), ' ') as
> > ADDRESSALLADDRNAME,*
> >
> > *NVL(cast(STU.ACCT_ADDRESS_ALL.CITY as varchar2(255)), ' ') as
> > ADDRESSALLCITY,*
> >
> > *NVL(cast(STU.ACCT_ADDRESS_ALL.STATE as varchar2(255)), ' ') as
> > ADDRESSALLSTATE,*
> >
> > *NVL(cast(STU.ACCT_ADDRESS_ALL.EMAIL_ADDR as varchar2(255)),

need help from hard core solr experts - out of memory error

2014-04-18 Thread Candygram For Mongo
I have lots of log files and other files to support this issue (sometimes
referenced in the text below) but I am not sure the best way to submit.  I
don't want to overwhelm and I am not sure if this email will accept graphs
and charts.  Please provide direction and I will send them.


*Issue Description*



We are getting Out Of Memory errors when we try to execute a full import
using the Data Import Handler.  This error originally occurred on a
production environment with a database containing 27 million records.  Heap
memory was configured for 6GB and the server had 32GB of physical memory.
 We have been able to replicate the error on a local system with 6 million
records.  We set the memory heap size to 64MB to accelerate the error
replication.  The indexing process has been failing in different scenarios.
 We have 9 test cases documented.  In some of the test cases we increased
the heap size to 128MB.  In our first test case we set heap memory to 512MB
which also failed.





*Environment Values Used*



*SOLR/Lucene version: *4.2.1*

*JVM version:

Java(TM) SE Runtime Environment (build 1.7.0_07-b11)

Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)

*Indexer startup command:

set JVMARGS= -XX:MaxPermSize=364m -Xss256K -Xmx128m -Xms128m

java " %JVMARGS% ^

-Dcom.sun.management.jmxremote.port=1092 ^

-Dcom.sun.management.jmxremote.ssl=false ^

-Dcom.sun.management.jmxremote.authenticate=false ^

-jar start.jar

*SOLR indexing HTTP parameters request:

webapp=/solr path=/dataimport
params={clean=false&command=full-import&wt=javabin&version=2}



The information we use for the database retrieve using the Data Import
Handler is as follows:
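
(The dataSource definition below was stripped of its markup by the list
archive; reconstructed from the copy quoted earlier in this digest, with the
connection details kept as placeholders:)

<dataSource
    name="org_only"
    type="JdbcDataSource"
    driver="oracle.jdbc.OracleDriver"
    url="jdbc:oracle:thin:@{server name}:1521:{database name}"
    user="{username}"
    password="{password}"
    readOnly="false" />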









*The Query (simple, single table)*



*select*



*NVL(cast(STU.ACCT_ADDRESS_ALL.R_ID as varchar2(100)), 'null')*

*as SOLR_ID,*



*'STU.ACCT_ADDRESS_ALL'*

*as SOLR_CATEGORY,*



*NVL(cast(STU.ACCT_ADDRESS_ALL.R_ID as varchar2(255)), ' ') as
ADDRESSALLRID,*

*NVL(cast(STU.ACCT_ADDRESS_ALL.ADDR_TYPE as varchar2(255)), ' ') as
ADDRESSALLADDRTYPECD,*

*NVL(cast(STU.ACCT_ADDRESS_ALL.LONGITUDE as varchar2(255)), ' ') as
ADDRESSALLLONGITUDE,*

*NVL(cast(STU.ACCT_ADDRESS_ALL.LATITUDE as varchar2(255)), ' ') as
ADDRESSALLLATITUDE,*

*NVL(cast(STU.ACCT_ADDRESS_ALL.ADDR_NAME as varchar2(255)), ' ') as
ADDRESSALLADDRNAME,*

*NVL(cast(STU.ACCT_ADDRESS_ALL.CITY as varchar2(255)), ' ') as
ADDRESSALLCITY,*

*NVL(cast(STU.ACCT_ADDRESS_ALL.STATE as varchar2(255)), ' ') as
ADDRESSALLSTATE,*

*NVL(cast(STU.ACCT_ADDRESS_ALL.EMAIL_ADDR as varchar2(255)), ' ') as
ADDRESSALLEMAILADDR *



*from STU.ACCT_ADDRESS_ALL*



You can see this information in the database.xml file.



Our main solrconfig.xml file contains the following differences compared to
a new downloaded solrconfig.xml
file(the
original content).

















${solr.abortOnConfigurationError:true}











database.xml
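
(Most of the markup above was stripped by the list archive; pieced together
from the copy quoted earlier in this digest, the differences amount to roughly
the following sketch; paths and exact attributes may differ:)

<lib dir="../../../dist/" regex="solr-dataimporthandler-.*\.jar" />
<!-- plus lib entries for the custom jars described under Custom Libraries below -->

<abortOnConfigurationError>${solr.abortOnConfigurationError:true}</abortOnConfigurationError>

<directoryFactory name="DirectoryFactory"
                  class="org.apache.solr.core.StandardDirectoryFactory" />

<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">database.xml</str>
  </lst>
</requestHandler>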











*Custom Libraries*



The common.jar contains customized TokenFilterFactory implementations that we
use for indexing.  They do some special processing of the fields read from the
database.  How those classes are used is described in the schema.xml file.  The
webapp.jar file contains other related classes.  The commons-pool-1.4.jar is an
Apache library used for instance reuse.



The logic used in the TokenFilterFactory classes is contained in the following
files:




ConcatFilterFactory.java


ConcatFilter.java


MDFilterSchemaFactory.java


MDFilter.java


MDFilterPoolObjectFactory.java


NullValueFilterFactory.java


NullValueFilter.java



How we use them is described in the schema.xml file.
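
As a rough illustration of how such factories are typically wired into
schema.xml (the package and class names here are hypothetical, not the actual
ones shipped in common.jar):

<fieldType name="text_custom" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- custom factories from common.jar; the package name is a guess -->
    <filter class="com.example.analysis.NullValueFilterFactory"/>
    <filter class="com.example.analysis.ConcatFilterFactory"/>
  </analyzer>
</fieldType>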



We have been experimenting with the following configuration values:



maxIndexingThreads

ramBufferSizeMB

maxBufferedDocs

mergePolicy

maxMergeAtOnce

segmentsPerTier

maxMergedSegmentMB

autoCommit

maxDocs

maxTime

autoSoftCommit

maxTime



Using numerous combinations of these values, the indexing fails.
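
For orientation, these settings live in the indexConfig and updateHandler
sections of solrconfig.xml. A minimal sketch with placeholder values (not the
exact values used in our tests):

<indexConfig>
  <maxIndexingThreads>8</maxIndexingThreads>
  <ramBufferSizeMB>100</ramBufferSizeMB>
  <maxBufferedDocs>1000</maxBufferedDocs>
  <mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
    <int name="maxMergeAtOnce">10</int>
    <int name="segmentsPerTier">10</int>
    <double name="maxMergedSegmentMB">5000</double>
  </mergePolicy>
</indexConfig>

<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>10000</maxDocs>
    <maxTime>15000</maxTime>
  </autoCommit>
  <autoSoftCommit>
    <maxTime>1000</maxTime>
  </autoSoftCommit>
</updateHandler>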





*IMPORTANT NOTE*



When we disable all of the copyfield tags contained in the schema.xml file,
or all but relatively few, the indexing completes successfully (see Test
Case 1).
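
For context, a copyField rule in schema.xml has the following shape (the field
names here are illustrative, not the exact ones from our schema):

<copyField source="ADDRESSALLCITY" dest="text"/>
<copyField source="ADDRESSALLSTATE" dest="text"/>

Each rule copies the source value into the destination field, which is then
analyzed and indexed as well, so many copyField rules increase the amount of
analysis and buffered data per document.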





*TEST CASES*



All of the test cases have been analyzed with the Visual VM tool.  All SOLR
configuration files and indexer log content are in the test case
directories included in a zip file.  We have included the most relevant
screenshots.  Test Case 2 is the only one that includes the thread dump.





*Test Case 1 *



JVM arguments = -XX:MaxPermSize=364m -Xss256K -Xmx512m -Xms512m



Results:

Indexing status: Completed

Time taken:   1:8:32.519

Error detail:   NO ERROR.

Index data directory size =  995 MB





*Test Case 2*



JVM arguments = -XX:

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-07 Thread Candygram For Mongo
.
> http://wiki.apache.org/solr/DataImportHandlerFaq#I.27m_using_DataImportHandler_with_a_MySQL_database._My_table_is_huge_and_DataImportHandler_is_going_out_of_memory._Why_does_DataImportHandler_bring_everything_to_memory.3F
>
> You have a lot of copyFields defined. There can be some gotchas when
> handling unusually many copyFields. I would really try the CSV option here,
> given that you only have the full-import SQL defined and it is not a complex
> one; it queries only one table. I believe Oracle has a tool to export a
> table to a CSV file efficiently.
>
> On Saturday, April 5, 2014 3:05 AM, Candygram For Mongo <
> candygram.for.mo...@gmail.com> wrote:
>
> Does this user list allow attachments?  I have four files attached
> (database.xml, error.txt, schema.xml, solrconfig.xml).  We just ran the
> process again using the parameters you suggested, but not to a csv file.
>  It errored out quickly.  We are working on the csv file run.
>
> Removed both  and
>  parts/definitions from solrconfig.xml
>
> Disabled tlog by removing
>
>name="dir">${solr.ulog.dir:}
> 
>
> from solrconfig.xml
>
> Used commit=true parameter.
> ?commit=true&command=full-import
>
>
>
>
> On Fri, Apr 4, 2014 at 3:29 PM, Ahmet Arslan  wrote:
>
> Hi,
> >
> >This may not solve your problem, but generally it is recommended to
> >disable auto commit and transaction logs for bulk indexing,
> >and issue one commit at the very end. Do you have tlogs enabled? I see "commit
> >failed" in the error message, which is why I am suggesting this.
> >
> >And regarding comma-separated values: with this approach you focus on
> >just the Solr importing process and separate out the data acquisition phase.
> >It is very fast to load even big CSV files:
> >http://wiki.apache.org/solr/UpdateCSV
> >I have never experienced OOM during indexing, so I suspect data acquisition
> >has a role in it.
> >
> >Ahmet
> >
> >
> >On Saturday, April 5, 2014 1:18 AM, Candygram For Mongo <
> candygram.for.mo...@gmail.com> wrote:
> >
> >We would be happy to try that.  That sounds counter intuitive for the
> high volume of records we have.  Can you help me understand how that might
> solve our problem?
> >
> >
> >
> >
> >On Fri, Apr 4, 2014 at 2:34 PM, Ahmet Arslan  wrote:
> >
> >Hi,
> >>
> >>Can you remove auto commit for bulk import. Commit at the very end?
> >>
> >>Ahmet
> >>
> >>
> >>
> >>
> >>On Saturday, April 5, 2014 12:16 AM, Candygram For Mongo <
> candygram.for.mo...@gmail.com> wrote:
> >>In case the attached database.xml file didn't show up, I have pasted in
> the
> >>contents below:
> >>
> >>
> >> >>name="org_only"
> >>type="JdbcDataSource"
> >>driver="oracle.jdbc.OracleDriver"
> >>url="jdbc:oracle:thin:@test2.abc.com:1521:ORCL"
> >>user="admin"
> >>password="admin"
> >>readOnly="false"
> >>batchSize="100"
> >>/>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >> >>name="ADDRESS_ACCT_ALL.ADDR_TYPE_CD_abc" />
> >> name="ADDRESS_ACCT_ALL.LONGITUDE_abc" />
> >> />
> >> />
> >>
> >>
> >> name="ADDRESS_ACCT_ALL.EMAIL_ADDR_abc"
> >>/>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>
> >>On Fri, Apr 4, 2014 at 11:55 AM, Candygram For Mongo <
> >>candygram.for.mo...@gmail.com> wrote:
> >>
> >>> In this case we are indexing an Oracle database.
> >>>
> >>> We do not include the data-config.xml in our distribution.  We store
> the
> >>> database information in the database.xml file.  I have attached the
> >>> database.xml file.
> >>>
> >>> When we use the default merge policy settings, we get the same results.
> >>>
> >>>
> >>>
> >>> We have not tried to dump the table to a comma separated file.  We
> think
> >>> that dumping this size table to disk will introduce other memory
> problems
> >>> with big file management. We have not tested that case.
> >>>
> >>>
> >>> On Fri, Apr 4, 2014 at 7:25 AM, Ahmet Arslan 
> wrote:
> >>>
> >>>

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-04 Thread Candygram For Mongo
ex.FreqProxTermsWriterPerField.writeProx(FreqProxTermsWriterPerField.java:145)
at
org.apache.lucene.index.FreqProxTermsWriterPerField.addTerm(FreqProxTermsWriterPerField.java:227)
at
org.apache.lucene.index.TermsHashPerField.add(TermsHashPerField.java:235)
at
org.apache.lucene.index.DocInverterPerField.processFields(DocInverterPerField.java:165)
at
org.apache.lucene.index.DocFieldProcessor.processDocument(DocFieldProcessor.java:254)
at
org.apache.lucene.index.DocumentsWriterPerThread.updateDocument(DocumentsWriterPerThread.java:256)
at
org.apache.lucene.index.DocumentsWriter.updateDocument(DocumentsWriter.java:376)
at
org.apache.lucene.index.IndexWriter.updateDocument(IndexWriter.java:1473)
at
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:201)
at
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:69)
at
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:51)
at
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:477)
at
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:346)
at
org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
at
org.apache.solr.handler.dataimport.SolrWriter.upload(SolrWriter.java:70)
at
org.apache.solr.handler.dataimport.DataImportHandler$1.upload(DataImportHandler.java:235)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:500)
... 6 more

Apr 04, 2014 3:50:26 PM org.apache.solr.update.DirectUpdateHandler2 rollback
INFO: start rollback{}
Apr 04, 2014 3:50:26 PM org.apache.solr.update.DefaultSolrCoreState
newIndexWriter
INFO: Creating new IndexWriter...
Apr 04, 2014 3:50:26 PM org.apache.solr.update.DefaultSolrCoreState
newIndexWriter
INFO: Waiting until IndexWriter is unused... core=core0
Apr 04, 2014 3:50:26 PM org.apache.solr.update.DefaultSolrCoreState
newIndexWriter
INFO: Rollback old IndexWriter... core=core0
Apr 04, 2014 3:50:26 PM org.apache.solr.core.SolrDeletionPolicy onInit
INFO: SolrDeletionPolicy.onInit: commits:num=1
commit{dir=D:\AbcData\V12\application
server\server\indexer\example\solr\core0\data\index,segFN=segments_1,generation=1,filenames=[segments_1]
Apr 04, 2014 3:50:26 PM org.apache.solr.core.SolrDeletionPolicy
updateCommits
INFO: newest commit = 1[segments_1]
Apr 04, 2014 3:50:26 PM org.apache.solr.update.DefaultSolrCoreState
newIndexWriter
INFO: New IndexWriter is ready to be used.
Apr 04, 2014 3:50:26 PM org.apache.solr.update.DirectUpdateHandler2 rollback
INFO: end_rollback



On Fri, Apr 4, 2014 at 4:59 PM, Candygram For Mongo <
candygram.for.mo...@gmail.com> wrote:

> Guessing that the attachments won't work, I am pasting one file in each of
> four separate emails.
>
> database.xml
>
>
> 
>   name="org_only"
> type="JdbcDataSource"
>  driver="oracle.jdbc.OracleDriver"
> url="jdbc:oracle:thin:@test.abcdata.com:1521:ORCL"
>  user="admin"
> password="admin"
> readOnly="false"
>  />
> 
>
>
> 
>
> 
> 
> 
> 
> 
> 
> 
> 
> 
> 
>
> 
>
>
>
> 
> 
> 
> 
>
>
>
> On Fri, Apr 4, 2014 at 4:57 PM, Candygram For Mongo <
> candygram.for.mo...@gmail.com> wrote:
>
>> Does this user list allow attachments?  I have four files attached
>> (database.xml, error.txt, schema.xml, solrconfig.xml).  We just ran the
>> process again using the parameters you suggested, but not to a csv file.
>>  It errored out quickly.  We are working on the csv file run.
>>
>> Removed both  and  parts/definitions from
>> solrconfig.xml
>>
>> Disabled tlog by removing
>>
>>
>>   ${solr.ulog.dir:}
>> 
>>
>> from solrconfig.xml
>>
>> Used commit=true parameter. ?commit=true&command=full-import
>>
>>
>> On Fri, Apr 4, 2014 at 3:29 PM, Ahmet Arslan  wrote:
>>
>>> Hi,
>>>
>>> This may not solve your problem, but generally it is recommended to
>>> disable auto commit and transaction logs for bulk indexing,
>>> and issue one commit at the very end. Do you have tlogs enabled? I see
>>> "commit failed" in the error message, which is why I am suggesting this.
>>>
>>> And regarding comma-separated values: with this approach you focus on
>>> just the Solr importing process and separate out the data acquisition phase.
>>> It is very fast to load even big CSV files:
>>> http://wiki.apache.org/solr/UpdateCSV
>>> I have never experienced OOM during indexing, so I suspect data acquisition
>>> has a role in it.

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-04 Thread Candygram For Mongo
Guessing that the attachments won't work, I am pasting one file in each of
four separate emails.

database.xml































On Fri, Apr 4, 2014 at 4:57 PM, Candygram For Mongo <
candygram.for.mo...@gmail.com> wrote:

> Does this user list allow attachments?  I have four files attached
> (database.xml, error.txt, schema.xml, solrconfig.xml).  We just ran the
> process again using the parameters you suggested, but not to a csv file.
>  It errored out quickly.  We are working on the csv file run.
>
> Removed both  and  parts/definitions from
> solrconfig.xml
>
> Disabled tlog by removing
>
>
>   ${solr.ulog.dir:}
> 
>
> from solrconfig.xml
>
> Used commit=true parameter. ?commit=true&command=full-import
>
>
> On Fri, Apr 4, 2014 at 3:29 PM, Ahmet Arslan  wrote:
>
>> Hi,
>>
>> This may not solve your problem, but generally it is recommended to
>> disable auto commit and transaction logs for bulk indexing,
>> and issue one commit at the very end. Do you have tlogs enabled? I see "commit
>> failed" in the error message, which is why I am suggesting this.
>>
>> And regarding comma-separated values: with this approach you focus on
>> just the Solr importing process and separate out the data acquisition phase.
>> It is very fast to load even big CSV files:
>> http://wiki.apache.org/solr/UpdateCSV
>> I have never experienced OOM during indexing, so I suspect data acquisition
>> has a role in it.
>>
>> Ahmet
>>
>> On Saturday, April 5, 2014 1:18 AM, Candygram For Mongo <
>> candygram.for.mo...@gmail.com> wrote:
>>
>> We would be happy to try that.  That sounds counter intuitive for the
>> high volume of records we have.  Can you help me understand how that might
>> solve our problem?
>>
>>
>>
>>
>> On Fri, Apr 4, 2014 at 2:34 PM, Ahmet Arslan  wrote:
>>
>> Hi,
>> >
>> >Can you remove auto commit for bulk import. Commit at the very end?
>> >
>> >Ahmet
>> >
>> >
>> >
>> >
>> >On Saturday, April 5, 2014 12:16 AM, Candygram For Mongo <
>> candygram.for.mo...@gmail.com> wrote:
>> >In case the attached database.xml file didn't show up, I have pasted in
>> the
>> >contents below:
>> >
>> >
>> >> >name="org_only"
>> >type="JdbcDataSource"
>> >driver="oracle.jdbc.OracleDriver"
>> >url="jdbc:oracle:thin:@test2.abc.com:1521:ORCL"
>> >user="admin"
>> >password="admin"
>> >readOnly="false"
>> >batchSize="100"
>> >/>
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >> >name="ADDRESS_ACCT_ALL.ADDR_TYPE_CD_abc" />
>> >> name="ADDRESS_ACCT_ALL.LONGITUDE_abc" />
>> >> />
>> >> />
>> >
>> >
>> >> name="ADDRESS_ACCT_ALL.EMAIL_ADDR_abc"
>> >/>
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >On Fri, Apr 4, 2014 at 11:55 AM, Candygram For Mongo <
>> >candygram.for.mo...@gmail.com> wrote:
>> >
>> >> In this case we are indexing an Oracle database.
>> >>
>> >> We do not include the data-config.xml in our distribution.  We store
>> the
>> >> database information in the database.xml file.  I have attached the
>> >> database.xml file.
>> >>
>> >> When we use the default merge policy settings, we get the same results.
>> >>
>> >>
>> >>
>> >> We have not tried to dump the table to a comma separated file.  We
>> think
>> >> that dumping this size table to disk will introduce other memory
>> problems
>> >> with big file management. We have not tested that case.
>> >>
>> >>
>> >> On Fri, Apr 4, 2014 at 7:25 AM, Ahmet Arslan 
>> wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> Which database are you using? Can you send us data-config.xml?
>> >>>
>> >>> What happens when you use default merge policy settings?
>> >>>
>> >>> What happens when you dump your table to Comma Separated File and fed
>> >>> that file to solr?
>> >>>
>> >>&

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-04 Thread Candygram For Mongo
I might have forgotten to mention that we are using the DataImportHandler.  I
think we know how to remove auto commit.  How would we force a commit at
the end?
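
(For reference, the approach tried elsewhere in this thread was to pass
commit=true on the full-import request itself, e.g. a request of the form
?command=full-import&clean=false&commit=true to the /dataimport handler, after
removing the autoCommit block and the update log from solrconfig.xml. The
update log element in question looks like this, reconstructed from the stripped
config quoted elsewhere in the thread:)

<updateLog>
  <str name="dir">${solr.ulog.dir:}</str>
</updateLog>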


On Fri, Apr 4, 2014 at 3:18 PM, Candygram For Mongo <
candygram.for.mo...@gmail.com> wrote:

> We would be happy to try that.  That sounds counter intuitive for the high
> volume of records we have.  Can you help me understand how that might solve
> our problem?
>
>
>
> On Fri, Apr 4, 2014 at 2:34 PM, Ahmet Arslan  wrote:
>
>> Hi,
>>
>> Can you remove auto commit for bulk import. Commit at the very end?
>>
>> Ahmet
>>
>>
>>
>> On Saturday, April 5, 2014 12:16 AM, Candygram For Mongo <
>> candygram.for.mo...@gmail.com> wrote:
>> In case the attached database.xml file didn't show up, I have pasted in
>> the
>> contents below:
>>
>> 
>> > name="org_only"
>> type="JdbcDataSource"
>> driver="oracle.jdbc.OracleDriver"
>> url="jdbc:oracle:thin:@test2.abc.com:1521:ORCL"
>> user="admin"
>> password="admin"
>> readOnly="false"
>> batchSize="100"
>> />
>> 
>>
>>
>> 
>>
>> 
>> 
>> 
>> > name="ADDRESS_ACCT_ALL.ADDR_TYPE_CD_abc" />
>> > />
>> 
>> > />
>> 
>> 
>> > />
>>
>> 
>>
>>
>>
>> 
>> 
>> 
>> 
>>
>>
>>
>>
>>
>>
>> On Fri, Apr 4, 2014 at 11:55 AM, Candygram For Mongo <
>> candygram.for.mo...@gmail.com> wrote:
>>
>> > In this case we are indexing an Oracle database.
>> >
>> > We do not include the data-config.xml in our distribution.  We store the
>> > database information in the database.xml file.  I have attached the
>> > database.xml file.
>> >
>> > When we use the default merge policy settings, we get the same results.
>> >
>> >
>> >
>> > We have not tried to dump the table to a comma separated file.  We think
>> > that dumping this size table to disk will introduce other memory
>> problems
>> > with big file management. We have not tested that case.
>> >
>> >
>> > On Fri, Apr 4, 2014 at 7:25 AM, Ahmet Arslan  wrote:
>> >
>> >> Hi,
>> >>
>> >> Which database are you using? Can you send us data-config.xml?
>> >>
>> >> What happens when you use default merge policy settings?
>> >>
>> >> What happens when you dump your table to Comma Separated File and fed
>> >> that file to solr?
>> >>
>> >> Ahmet
>> >>
>> >> On Friday, April 4, 2014 5:10 PM, Candygram For Mongo <
>> >> candygram.for.mo...@gmail.com> wrote:
>> >>
>> >> The ramBufferSizeMB was set to 6MB only on the test system to make the
>> >> system crash sooner.  In production that tag is commented out which
>> >> I believe forces the default value to be used.
>> >>
>> >>
>> >>
>> >>
>> >> On Thu, Apr 3, 2014 at 5:46 PM, Ahmet Arslan 
>> wrote:
>> >>
>> >> Hi,
>> >> >
>> >> >out of curiosity, why did you set ramBufferSizeMB to 6?
>> >> >
>> >> >Ahmet
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >
>> >> >On Friday, April 4, 2014 3:27 AM, Candygram For Mongo <
>> >> candygram.for.mo...@gmail.com> wrote:
>> >> >*Main issue: Full Indexing is Causing a Java Heap Out of Memory
>> Exception
>> >> >
>> >> >*SOLR/Lucene version: *4.2.1*
>> >> >
>> >> >
>> >> >*JVM version:
>> >> >
>> >> >Java(TM) SE Runtime Environment (build 1.7.0_07-b11)
>> >> >
>> >> >Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)
>> >> >
>> >> >
>> >> >
>> >> >*Indexer startup command:
>> >> >
>> >> >set JVMARGS=-XX:MaxPermSize=364m -Xss256K -Xmx6144m -Xms6144m
>> >> >
>> >> >
>> >> >
>> >> >java " %JVMARGS% ^
>> >> >
>> >> >-Dcom.sun.management.jmxremote.port=1092 ^
>> >> >
>&g

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-04 Thread Candygram For Mongo
We would be happy to try that.  That sounds counterintuitive for the high
volume of records we have.  Can you help me understand how that might solve
our problem?



On Fri, Apr 4, 2014 at 2:34 PM, Ahmet Arslan  wrote:

> Hi,
>
> Can you remove auto commit for bulk import. Commit at the very end?
>
> Ahmet
>
>
>
> On Saturday, April 5, 2014 12:16 AM, Candygram For Mongo <
> candygram.for.mo...@gmail.com> wrote:
> In case the attached database.xml file didn't show up, I have pasted in the
> contents below:
>
> 
>  name="org_only"
> type="JdbcDataSource"
> driver="oracle.jdbc.OracleDriver"
> url="jdbc:oracle:thin:@test2.abc.com:1521:ORCL"
> user="admin"
> password="admin"
> readOnly="false"
> batchSize="100"
> />
> 
>
>
> 
>
> 
> 
> 
>  name="ADDRESS_ACCT_ALL.ADDR_TYPE_CD_abc" />
>  />
> 
> 
> 
> 
>  />
>
> 
>
>
>
> 
> 
> 
> 
>
>
>
>
>
>
> On Fri, Apr 4, 2014 at 11:55 AM, Candygram For Mongo <
> candygram.for.mo...@gmail.com> wrote:
>
> > In this case we are indexing an Oracle database.
> >
> > We do not include the data-config.xml in our distribution.  We store the
> > database information in the database.xml file.  I have attached the
> > database.xml file.
> >
> > When we use the default merge policy settings, we get the same results.
> >
> >
> >
> > We have not tried to dump the table to a comma separated file.  We think
> > that dumping this size table to disk will introduce other memory problems
> > with big file management. We have not tested that case.
> >
> >
> > On Fri, Apr 4, 2014 at 7:25 AM, Ahmet Arslan  wrote:
> >
> >> Hi,
> >>
> >> Which database are you using? Can you send us data-config.xml?
> >>
> >> What happens when you use default merge policy settings?
> >>
> >> What happens when you dump your table to Comma Separated File and fed
> >> that file to solr?
> >>
> >> Ahmet
> >>
> >> On Friday, April 4, 2014 5:10 PM, Candygram For Mongo <
> >> candygram.for.mo...@gmail.com> wrote:
> >>
> >> The ramBufferSizeMB was set to 6MB only on the test system to make the
> >> system crash sooner.  In production that tag is commented out which
> >> I believe forces the default value to be used.
> >>
> >>
> >>
> >>
> >> On Thu, Apr 3, 2014 at 5:46 PM, Ahmet Arslan  wrote:
> >>
> >> Hi,
> >> >
> >> >out of curiosity, why did you set ramBufferSizeMB to 6?
> >> >
> >> >Ahmet
> >> >
> >> >
> >> >
> >> >
> >> >
> >> >On Friday, April 4, 2014 3:27 AM, Candygram For Mongo <
> >> candygram.for.mo...@gmail.com> wrote:
> >> >*Main issue: Full Indexing is Causing a Java Heap Out of Memory
> Exception
> >> >
> >> >*SOLR/Lucene version: *4.2.1*
> >> >
> >> >
> >> >*JVM version:
> >> >
> >> >Java(TM) SE Runtime Environment (build 1.7.0_07-b11)
> >> >
> >> >Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)
> >> >
> >> >
> >> >
> >> >*Indexer startup command:
> >> >
> >> >set JVMARGS=-XX:MaxPermSize=364m -Xss256K -Xmx6144m -Xms6144m
> >> >
> >> >
> >> >
> >> >java " %JVMARGS% ^
> >> >
> >> >-Dcom.sun.management.jmxremote.port=1092 ^
> >> >
> >> >-Dcom.sun.management.jmxremote.ssl=false ^
> >> >
> >> >-Dcom.sun.management.jmxremote.authenticate=false ^
> >> >
> >> >-jar start.jar
> >> >
> >> >
> >> >
> >> >*SOLR indexing HTTP parameters request:
> >> >
> >> >webapp=/solr path=/dataimport
> >> >params={clean=false&command=full-import&wt=javabin&version=2}
> >> >
> >> >
> >> >
> >> >We are getting a Java heap OOM exception when indexing (updating) 27
> >> >million records.  If we increase the Java heap memory settings the
> >> problem
> >> >goes away but we believe the problem has not been fixed and that we
> will
> >> >eventually get the same OOM exception.  We have other processes on the
> >> >server that also requir

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-04 Thread Candygram For Mongo
In case the attached database.xml file didn't show up, I have pasted in the
contents below:
































On Fri, Apr 4, 2014 at 11:55 AM, Candygram For Mongo <
candygram.for.mo...@gmail.com> wrote:

> In this case we are indexing an Oracle database.
>
> We do not include the data-config.xml in our distribution.  We store the
> database information in the database.xml file.  I have attached the
> database.xml file.
>
> When we use the default merge policy settings, we get the same results.
>
>
>
> We have not tried to dump the table to a comma separated file.  We think
> that dumping this size table to disk will introduce other memory problems
> with big file management. We have not tested that case.
>
>
> On Fri, Apr 4, 2014 at 7:25 AM, Ahmet Arslan  wrote:
>
>> Hi,
>>
>> Which database are you using? Can you send us data-config.xml?
>>
>> What happens when you use default merge policy settings?
>>
>> What happens when you dump your table to Comma Separated File and fed
>> that file to solr?
>>
>> Ahmet
>>
>> On Friday, April 4, 2014 5:10 PM, Candygram For Mongo <
>> candygram.for.mo...@gmail.com> wrote:
>>
>> The ramBufferSizeMB was set to 6MB only on the test system to make the
>> system crash sooner.  In production that tag is commented out which
>> I believe forces the default value to be used.
>>
>>
>>
>>
>> On Thu, Apr 3, 2014 at 5:46 PM, Ahmet Arslan  wrote:
>>
>> Hi,
>> >
>> >out of curiosity, why did you set ramBufferSizeMB to 6?
>> >
>> >Ahmet
>> >
>> >
>> >
>> >
>> >
>> >On Friday, April 4, 2014 3:27 AM, Candygram For Mongo <
>> candygram.for.mo...@gmail.com> wrote:
>> >*Main issue: Full Indexing is Causing a Java Heap Out of Memory Exception
>> >
>> >*SOLR/Lucene version: *4.2.1*
>> >
>> >
>> >*JVM version:
>> >
>> >Java(TM) SE Runtime Environment (build 1.7.0_07-b11)
>> >
>> >Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)
>> >
>> >
>> >
>> >*Indexer startup command:
>> >
>> >set JVMARGS=-XX:MaxPermSize=364m -Xss256K -Xmx6144m -Xms6144m
>> >
>> >
>> >
>> >java " %JVMARGS% ^
>> >
>> >-Dcom.sun.management.jmxremote.port=1092 ^
>> >
>> >-Dcom.sun.management.jmxremote.ssl=false ^
>> >
>> >-Dcom.sun.management.jmxremote.authenticate=false ^
>> >
>> >-jar start.jar
>> >
>> >
>> >
>> >*SOLR indexing HTTP parameters request:
>> >
>> >webapp=/solr path=/dataimport
>> >params={clean=false&command=full-import&wt=javabin&version=2}
>> >
>> >
>> >
>> >We are getting a Java heap OOM exception when indexing (updating) 27
>> >million records.  If we increase the Java heap memory settings the
>> problem
>> >goes away but we believe the problem has not been fixed and that we will
>> >eventually get the same OOM exception.  We have other processes on the
>> >server that also require resources so we cannot continually increase the
>> >memory settings to resolve the OOM issue.  We are trying to find a way to
>> >configure the SOLR instance to reduce or preferably eliminate the
>> >possibility of an OOM exception.
>> >
>> >
>> >
>> >We can reproduce the problem on a test machine.  We set the Java heap
>> >memory size to 64MB to accelerate the exception.  If we increase this
>> >setting the same problems occurs, just hours later.  In the test
>> >environment, we are using the following parameters:
>> >
>> >
>> >
>> >JVMARGS=-XX:MaxPermSize=64m -Xss256K -Xmx64m -Xms64m
>> >
>> >
>> >
>> >Normally we use the default solrconfig.xml file with only the following
>> jar
>> >file references added:
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >
>> >Using these values and trying to index 6 million records from the
>> database,
>> >the Java Heap Out of Memory exception is thrown very quickly.
>> >
>> >
>> >
>> >We were able to complete a successful indexing by further modifying the
>> >solrconfig.xml and removing all or all but one  tags from the
>> >schema.xml file.
>> >
>> >
>> >
>&g

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-04 Thread Candygram For Mongo
In this case we are indexing an Oracle database.

We do not include the data-config.xml in our distribution.  We store the
database information in the database.xml file.  I have attached the
database.xml file.

When we use the default merge policy settings, we get the same results.



We have not tried to dump the table to a comma separated file.  We think
that dumping this size table to disk will introduce other memory problems
with big file management. We have not tested that case.


On Fri, Apr 4, 2014 at 7:25 AM, Ahmet Arslan  wrote:

> Hi,
>
> Which database are you using? Can you send us data-config.xml?
>
> What happens when you use default merge policy settings?
>
> What happens when you dump your table to Comma Separated File and fed that
> file to solr?
>
> Ahmet
>
> On Friday, April 4, 2014 5:10 PM, Candygram For Mongo <
> candygram.for.mo...@gmail.com> wrote:
>
> The ramBufferSizeMB was set to 6MB only on the test system to make the
> system crash sooner.  In production that tag is commented out which
> I believe forces the default value to be used.
>
>
>
>
> On Thu, Apr 3, 2014 at 5:46 PM, Ahmet Arslan  wrote:
>
> Hi,
> >
> >out of curiosity, why did you set ramBufferSizeMB to 6?
> >
> >Ahmet
> >
> >
> >
> >
> >
> >On Friday, April 4, 2014 3:27 AM, Candygram For Mongo <
> candygram.for.mo...@gmail.com> wrote:
> >*Main issue: Full Indexing is Causing a Java Heap Out of Memory Exception
> >
> >*SOLR/Lucene version: *4.2.1*
> >
> >
> >*JVM version:
> >
> >Java(TM) SE Runtime Environment (build 1.7.0_07-b11)
> >
> >Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)
> >
> >
> >
> >*Indexer startup command:
> >
> >set JVMARGS=-XX:MaxPermSize=364m -Xss256K -Xmx6144m -Xms6144m
> >
> >
> >
> >java " %JVMARGS% ^
> >
> >-Dcom.sun.management.jmxremote.port=1092 ^
> >
> >-Dcom.sun.management.jmxremote.ssl=false ^
> >
> >-Dcom.sun.management.jmxremote.authenticate=false ^
> >
> >-jar start.jar
> >
> >
> >
> >*SOLR indexing HTTP parameters request:
> >
> >webapp=/solr path=/dataimport
> >params={clean=false&command=full-import&wt=javabin&version=2}
> >
> >
> >
> >We are getting a Java heap OOM exception when indexing (updating) 27
> >million records.  If we increase the Java heap memory settings the problem
> >goes away but we believe the problem has not been fixed and that we will
> >eventually get the same OOM exception.  We have other processes on the
> >server that also require resources so we cannot continually increase the
> >memory settings to resolve the OOM issue.  We are trying to find a way to
> >configure the SOLR instance to reduce or preferably eliminate the
> >possibility of an OOM exception.
> >
> >
> >
> >We can reproduce the problem on a test machine.  We set the Java heap
> >memory size to 64MB to accelerate the exception.  If we increase this
> >setting the same problems occurs, just hours later.  In the test
> >environment, we are using the following parameters:
> >
> >
> >
> >JVMARGS=-XX:MaxPermSize=64m -Xss256K -Xmx64m -Xms64m
> >
> >
> >
> >Normally we use the default solrconfig.xml file with only the following
> jar
> >file references added:
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >
> >Using these values and trying to index 6 million records from the
> database,
> >the Java Heap Out of Memory exception is thrown very quickly.
> >
> >
> >
> >We were able to complete a successful indexing by further modifying the
> >solrconfig.xml and removing all or all but one  tags from the
> >schema.xml file.
> >
> >
> >
> >The following solrconfig.xml values were modified:
> >
> >
> >
> >6
> >
> >
> >
> >
> >
> >2
> >
> >2
> >
> >10
> >
> >150
> >
> >
> >
> >
> >
> >
> >
> >15000  
> >
> >false
> >
> >
> >
> >
> >
> >Using our customized schema.xml file with two or more  tags,
> the
> >OOM exception is always thrown.  Based on the errors, the problem occurs
> >when the process was trying to do the merge.  The error is provided below:
> >
> >
> >
> >Exception in thread "Lucene Merge Thread #156"
> >org.apache.lucene.index.Mer

Re: Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-04 Thread Candygram For Mongo
The ramBufferSizeMB was set to 6MB only on the test system to make the
system crash sooner.  In production that tag is commented out which
I believe forces the default value to be used.


On Thu, Apr 3, 2014 at 5:46 PM, Ahmet Arslan  wrote:

> Hi,
>
> out of curiosity, why did you set ramBufferSizeMB to 6?
>
> Ahmet
>
>
>
>
> On Friday, April 4, 2014 3:27 AM, Candygram For Mongo <
> candygram.for.mo...@gmail.com> wrote:
> *Main issue: Full Indexing is Causing a Java Heap Out of Memory Exception
>
> *SOLR/Lucene version: *4.2.1*
>
> *JVM version:
>
> Java(TM) SE Runtime Environment (build 1.7.0_07-b11)
>
> Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)
>
>
>
> *Indexer startup command:
>
> set JVMARGS=-XX:MaxPermSize=364m -Xss256K -Xmx6144m -Xms6144m
>
>
>
> java " %JVMARGS% ^
>
> -Dcom.sun.management.jmxremote.port=1092 ^
>
> -Dcom.sun.management.jmxremote.ssl=false ^
>
> -Dcom.sun.management.jmxremote.authenticate=false ^
>
> -jar start.jar
>
>
>
> *SOLR indexing HTTP parameters request:
>
> webapp=/solr path=/dataimport
> params={clean=false&command=full-import&wt=javabin&version=2}
>
>
>
> We are getting a Java heap OOM exception when indexing (updating) 27
> million records.  If we increase the Java heap memory settings the problem
> goes away but we believe the problem has not been fixed and that we will
> eventually get the same OOM exception.  We have other processes on the
> server that also require resources so we cannot continually increase the
> memory settings to resolve the OOM issue.  We are trying to find a way to
> configure the SOLR instance to reduce or preferably eliminate the
> possibility of an OOM exception.
>
>
>
> We can reproduce the problem on a test machine.  We set the Java heap
> memory size to 64MB to accelerate the exception.  If we increase this
> setting the same problems occurs, just hours later.  In the test
> environment, we are using the following parameters:
>
>
>
> JVMARGS=-XX:MaxPermSize=64m -Xss256K -Xmx64m -Xms64m
>
>
>
> Normally we use the default solrconfig.xml file with only the following jar
> file references added:
>
>
>
> 
>
> 
>
> 
>
>
>
> Using these values and trying to index 6 million records from the database,
> the Java Heap Out of Memory exception is thrown very quickly.
>
>
>
> We were able to complete a successful indexing by further modifying the
> solrconfig.xml and removing all or all but one  tags from the
> schema.xml file.
>
>
>
> The following solrconfig.xml values were modified:
>
>
>
> 6
>
>
>
> 
>
> 2
>
> 2
>
> 10
>
> 150
>
> 
>
>
>
> 
>
> 15000  
>
> false
>
> 
>
>
>
> Using our customized schema.xml file with two or more  tags, the
> OOM exception is always thrown.  Based on the errors, the problem occurs
> when the process was trying to do the merge.  The error is provided below:
>
>
>
> Exception in thread "Lucene Merge Thread #156"
> org.apache.lucene.index.MergePolicy$MergeException:
> java.lang.OutOfMemoryError: Java heap space
>
> at
>
> org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:541)
>
> at
>
> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:514)
>
> Caused by: java.lang.OutOfMemoryError: Java heap space
>
> at
>
> org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.loadNumeric(Lucene42DocValuesProducer.java:180)
>
> at
>
> org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.getNumeric(Lucene42DocValuesProducer.java:146)
>
> at
>
> org.apache.lucene.index.SegmentCoreReaders.getNormValues(SegmentCoreReaders.java:301)
>
> at
> org.apache.lucene.index.SegmentReader.getNormValues(SegmentReader.java:259)
>
> at
> org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:233)
>
> at
> org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:137)
>
> at
> org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3693)
>
> at
> org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3296)
>
> at
>
> org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:401)
>
> at
>
> org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:478)
>
>

Re: Solr DataImport Hander

2014-04-03 Thread Candygram For Mongo
The ramBufferSizeMB was set to 6MB only on the test system to make the
system crash sooner.  In production that tag is commented out which
I believe forces the default value to be used.


On Thu, Apr 3, 2014 at 6:36 PM, Susheel Kumar <
susheel.ku...@thedigitalgroup.net> wrote:

> Hi Sanjay,
>
> This is how the output will come, since Solr documents are flat.  In your sub
> entity you queried emp names for a dept, and for deptno=10, for example,
> you had 3 employees, so they all came back in the ename field.
>
> You are getting the data; now it is your UI's job to present it in
> whatever format you want to display.  In this case, if you can
> rely on the ordering (the first ename {index 0} goes with the first job
> {index 0}), you are good.
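
Concretely, a hypothetical illustration (not sanjay's actual data): a dept
document comes back with parallel multi-valued fields, where the i-th ename
lines up with the i-th job, e.g.:

<doc>
  <str name="deptno">10</str>
  <arr name="ename"><str>SMITH</str><str>JONES</str><str>FORD</str></arr>
  <arr name="job"><str>CLERK</str><str>MANAGER</str><str>ANALYST</str></arr>
</doc>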
>
> Thanks,
> Susheel
>
> -Original Message-
> From: sanjay92 [mailto:sanja...@hotmail.com]
> Sent: Thursday, April 03, 2014 12:30 PM
> To: solr-user@lucene.apache.org
> Subject: Solr DataImport Hander
>
> Hi,
> I am writing a very simple Dept/Emp Solr DataImport Handler. It is working,
> but when I query using http://localhost:8983/solr/select?q=*:*
> I see the results in XML format. See the attached file:
> deptemp.xml
>
> The output from the inner query is not formatted the way I want: for each
> Dept there are a number of employees (I want to show emp name and job). I
> want to show the first ename/job pair as one row, but it is showing all the
> ename values first and after that another column for job.
> Do I need to make any changes in schema.xml for a proper display?
>
> -
>
>
>
> data-config.xml looks like this:
> 
>  driver="oracle.jdbc.driver.OracleDriver"
>   url="jdbc:oracle:thin:@mydb:1521:SID"
>   user="*"
>   password="*"  />
>   
> 
> 
> 
> 
> 
>
>
> 
> 
>   
> 
>
>
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Solr-DataImport-Hander-tp4128911.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Full Indexing is Causing a Java Heap Out of Memory Exception

2014-04-03 Thread Candygram For Mongo
*Main issue: Full Indexing is Causing a Java Heap Out of Memory Exception

*SOLR/Lucene version: *4.2.1*

*JVM version:

Java(TM) SE Runtime Environment (build 1.7.0_07-b11)

Java HotSpot(TM) 64-Bit Server VM (build 23.3-b01, mixed mode)



*Indexer startup command:

set JVMARGS=-XX:MaxPermSize=364m -Xss256K -Xmx6144m -Xms6144m



java " %JVMARGS% ^

-Dcom.sun.management.jmxremote.port=1092 ^

-Dcom.sun.management.jmxremote.ssl=false ^

-Dcom.sun.management.jmxremote.authenticate=false ^

-jar start.jar



*SOLR indexing HTTP parameters request:

webapp=/solr path=/dataimport
params={clean=false&command=full-import&wt=javabin&version=2}



We are getting a Java heap OOM exception when indexing (updating) 27
million records.  If we increase the Java heap memory settings the problem
goes away but we believe the problem has not been fixed and that we will
eventually get the same OOM exception.  We have other processes on the
server that also require resources so we cannot continually increase the
memory settings to resolve the OOM issue.  We are trying to find a way to
configure the SOLR instance to reduce or preferably eliminate the
possibility of an OOM exception.



We can reproduce the problem on a test machine.  We set the Java heap
memory size to 64MB to accelerate the exception.  If we increase this
setting the same problems occurs, just hours later.  In the test
environment, we are using the following parameters:



JVMARGS=-XX:MaxPermSize=64m -Xss256K -Xmx64m -Xms64m



Normally we use the default solrconfig.xml file with only the following jar
file references added:
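
(The lib entries were stripped by the list archive. They are approximately the
following; the paths are placeholders, and the custom jars are the ones
described in the "need help from hard core solr experts" messages earlier in
this digest:)

<lib dir="../../../dist/" regex="solr-dataimporthandler-.*\.jar" />
<lib dir="./lib/" regex="common\.jar" />
<lib dir="./lib/" regex="webapp\.jar" />
<lib dir="./lib/" regex="commons-pool-1\.4\.jar" />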











Using these values and trying to index 6 million records from the database,
the Java Heap Out of Memory exception is thrown very quickly.



We were able to complete a successful indexing run by further modifying the
solrconfig.xml and removing all, or all but one, of the copyField tags from the
schema.xml file.



The following solrconfig.xml values were modified:



6





2

2

10

150







15000  

false
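
Matching those stripped values against the element names listed in the "need
help from hard core solr experts" messages earlier in this digest, the modified
block appears to have been roughly the following (the mapping of values to
elements is partly an assumption):

<ramBufferSizeMB>6</ramBufferSizeMB>

<mergePolicy class="org.apache.lucene.index.TieredMergePolicy">
  <int name="maxMergeAtOnce">2</int>
  <int name="segmentsPerTier">2</int>
  <!-- the remaining values (10, 150) presumably belong to other merge/buffer settings -->
</mergePolicy>

<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>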





Using our customized schema.xml file with two or more copyField tags, the
OOM exception is always thrown.  Based on the errors, the problem occurs
when the process is trying to do the merge.  The error is provided below:



Exception in thread "Lucene Merge Thread #156"
org.apache.lucene.index.MergePolicy$MergeException:
java.lang.OutOfMemoryError: Java heap space

at
org.apache.lucene.index.ConcurrentMergeScheduler.handleMergeException(ConcurrentMergeScheduler.java:541)

at
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:514)

Caused by: java.lang.OutOfMemoryError: Java heap space

at
org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.loadNumeric(Lucene42DocValuesProducer.java:180)

at
org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer.getNumeric(Lucene42DocValuesProducer.java:146)

at
org.apache.lucene.index.SegmentCoreReaders.getNormValues(SegmentCoreReaders.java:301)

at
org.apache.lucene.index.SegmentReader.getNormValues(SegmentReader.java:259)

at
org.apache.lucene.index.SegmentMerger.mergeNorms(SegmentMerger.java:233)

at
org.apache.lucene.index.SegmentMerger.merge(SegmentMerger.java:137)

at
org.apache.lucene.index.IndexWriter.mergeMiddle(IndexWriter.java:3693)

at
org.apache.lucene.index.IndexWriter.merge(IndexWriter.java:3296)

at
org.apache.lucene.index.ConcurrentMergeScheduler.doMerge(ConcurrentMergeScheduler.java:401)

at
org.apache.lucene.index.ConcurrentMergeScheduler$MergeThread.run(ConcurrentMergeScheduler.java:478)

Mar 12, 2014 12:17:40 AM org.apache.solr.common.SolrException log

SEVERE: auto commit error...:java.lang.IllegalStateException: this writer
hit an OutOfMemoryError; cannot commit

at
org.apache.lucene.index.IndexWriter.startCommit(IndexWriter.java:3971)

at
org.apache.lucene.index.IndexWriter.prepareCommitInternal(IndexWriter.java:2744)

at
org.apache.lucene.index.IndexWriter.commitInternal(IndexWriter.java:2827)

at
org.apache.lucene.index.IndexWriter.commit(IndexWriter.java:2807)

at
org.apache.solr.update.DirectUpdateHandler2.commit(DirectUpdateHandler2.java:536)

at
org.apache.solr.update.CommitTracker.run(CommitTracker.java:216)

at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)

at
java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)

at java.util.concurrent.FutureTask.run(FutureTask.java:166)

at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)

at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)

at