[
https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12604276#action_12604276
]
patrick o'leary commented on SOLR-469:
--------------------------------------
Regarding arow: I noticed that when it is nulled, the CMS GC cleans items up
faster in eden space.
Without the nulling, full GC kicked in more frequently. This was while indexing
about 250 MB from MySQL.
If you don't have that much data, there isn't much to worry about; it's just a
small optimization that reduces the need
to increase your JVM's -Xmx and NewSize settings.
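To show what "nulling" means here, a minimal sketch follows. It is not the patch's code: the class RowNullingSketch, the method importRows and the indexer callback are hypothetical names, and whether dropping the reference early actually helps the collector will depend on the JVM, the JIT and the surrounding code.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical sketch; none of these names come from the SOLR-469 patch. */
class RowNullingSketch {

    /**
     * Copies each source row into a document, drops the row reference,
     * and only then hands the document to the (possibly slow) indexer.
     * Returns the number of rows processed.
     */
    static int importRows(Iterator<Map<String, Object>> rows,
                          Consumer<Map<String, Object>> indexer) {
        int count = 0;
        while (rows.hasNext()) {
            Map<String, Object> arow = rows.next();
            // Keep only what the document needs.
            Map<String, Object> doc = new HashMap<>(arow);
            // Assumption: the row is not needed past this point, so this
            // frame stops referencing it before the indexing work starts.
            arow = null;
            indexer.accept(doc);
            count++;
        }
        return count;
    }
}
```

The point is only that the row is released before the longer-running work begins, rather than staying reachable until the variable goes out of scope.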
Another thing I was looking at is the SolrWriter. Instead of calling an
updateHandler directly, I think it should go through
the UpdateRequestProcessorFactory and let the UpdateRequestProcessor chain
handle the following calls:
* processAdd
* processDelete
* processCommit
* finish
That allows for custom ChainedUpdateProcessorFactory implementations, which are
a fantastic but little-known feature.
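The idea above can be sketched without any Solr dependency. Everything in this block is a simplified, hypothetical stand-in (SketchProcessor, TrimProcessor, RecordingProcessor and SketchWriter are not Solr classes, and documents are plain strings); it only illustrates a writer that talks to the head of a processor chain instead of to the update handler.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for an update processor; each call is forwarded to the next link. */
abstract class SketchProcessor {
    protected final SketchProcessor next;

    SketchProcessor(SketchProcessor next) { this.next = next; }

    void processAdd(String doc)   { if (next != null) next.processAdd(doc); }
    void processDelete(String id) { if (next != null) next.processDelete(id); }
    void processCommit()          { if (next != null) next.processCommit(); }
    void finish()                 { if (next != null) next.finish(); }
}

/** A custom link: trims each document before passing it on. */
class TrimProcessor extends SketchProcessor {
    TrimProcessor(SketchProcessor next) { super(next); }

    @Override
    void processAdd(String doc) { super.processAdd(doc.trim()); }
}

/** Terminal link: records calls where a real chain would reach the update handler. */
class RecordingProcessor extends SketchProcessor {
    final List<String> log = new ArrayList<>();

    RecordingProcessor() { super(null); }

    @Override void processAdd(String doc)   { log.add("add:" + doc); }
    @Override void processDelete(String id) { log.add("delete:" + id); }
    @Override void processCommit()          { log.add("commit"); }
    @Override void finish()                 { log.add("finish"); }
}

/** Writer that only knows the head of the chain, never the update handler. */
class SketchWriter {
    private final SketchProcessor chain;

    SketchWriter(SketchProcessor chain) { this.chain = chain; }

    void upload(String doc) { chain.processAdd(doc); }
    void delete(String id)  { chain.processDelete(id); }
    void commit()           { chain.processCommit(); }
    void close()            { chain.finish(); }
}
```

With `new SketchWriter(new TrimProcessor(new RecordingProcessor()))`, every add passes through the custom link before it reaches the terminal one, so extra processing can be plugged in without touching the writer.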
> Data Import RequestHandler
> --------------------------
>
> Key: SOLR-469
> URL: https://issues.apache.org/jira/browse/SOLR-469
> Project: Solr
> Issue Type: New Feature
> Components: update
> Affects Versions: 1.3
> Reporter: Noble Paul
> Assignee: Grant Ingersoll
> Fix For: 1.3
>
> Attachments: SOLR-469-contrib.patch, SOLR-469-contrib.patch,
> SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch,
> SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch
>
>
> We need a RequestHandler which can import data from a DB or other dataSources
> into the Solr index. Think of it as an advanced form of the SqlUpload Plugin
> (SOLR-103).
> The way it works is as follows.
> * Provide a configuration file (xml) to the Handler which takes in the
> necessary SQL queries and mappings to a solr schema
> - It also takes in a properties file for the data source
> configuration
> * Given the configuration it can also generate the solr schema.xml
> * It is registered as a RequestHandler which can take two commands:
> do-full-import and do-delta-import
> - do-full-import - dumps all the data from the Database into the
> index (based on the SQL query in configuration)
> - do-delta-import - dumps all the data that has changed since last
> import. (We assume a modified-timestamp column in tables)
> * It provides an admin page
> - where we can schedule it to be run automatically at regular
> intervals
> - It shows the status of the Handler (idle, full-import,
> delta-import)