[jira] Commented: (SOLR-469) Data Import RequestHandler

Noble Paul (JIRA) Thu, 26 Jun 2008 06:48:10 -0700

    [ https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12608429#action_12608429 ]


Noble Paul commented on SOLR-469:
---------------------------------

bq. I'd suggest that, instead of relying on MySQL in TestJdbcDataSource, we 
use an embedded Derby or some sort of JDBC mock. I suggest Derby 
mainly b/c it's already ASF and I don't want to bother looking up licenses for 
HSQL or any of the others that might work.

We must remove TestJdbcDataSource if we cannot add Derby to the dev 
dependencies.

bq. Also, I notice several interfaces that have a number of methods on them. 
Have you thought about abstract base classes instead?

Yes and no. Many of the interfaces, such as Context and VariableResolver, are 
never implemented by users. They are kept as interfaces to keep the API simple.
The interfaces people need to implement are:
* EntityProcessor: We expect users to extend EntityProcessorBase.
* Transformer: The most commonly implemented interface. I am ambivalent 
about this one; I do not know if it will change.
* DataSource: This may be made an abstract class.
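
To make the interface versus abstract base class trade-off concrete, here is a 
rough sketch. All names in it (RowTransformer, RowTransformerBase, 
TrimTransformer) are hypothetical and are not the signatures used in the patch. 
It only illustrates that a base class can later gain a no-op hook without 
breaking existing subclasses, which is harder to do with a bare interface.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical names for illustration only; not the API in the patch.
interface RowTransformer {
    Map<String, Object> transformRow(Map<String, Object> row);
}

// A base class can grow new methods with default behaviour over time.
abstract class RowTransformerBase implements RowTransformer {
    // Hook that could be added later; existing subclasses keep compiling.
    public void init() { }
}

// Example user implementation: trims whitespace from String values.
class TrimTransformer extends RowTransformerBase {
    @Override
    public Map<String, Object> transformRow(Map<String, Object> row) {
        Map<String, Object> out = new HashMap<String, Object>();
        for (Map.Entry<String, Object> e : row.entrySet()) {
            Object v = e.getValue();
            out.put(e.getKey(), v instanceof String ? ((String) v).trim() : v);
        }
        return out;
    }
}
```

Users who extend the base class only override what they need, which is the 
same reason we expect EntityProcessor implementers to extend 
EntityProcessorBase.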

bq. What relation does the Context have to the HttpDataSource?

A DataSource is always created for an entity, and the Context is the easiest 
way to get information about that entity. The current DataSources do not use 
that information, but because it is readily available we simply pass it along.

bq. What if I wanted to slurp from a table on the fly?

CachedSqlEntityProcessor already does that. It slurps the table and caches the 
info.
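
The general idea of slurping a table once and serving later lookups from 
memory can be sketched as below. This is not the CachedSqlEntityProcessor code 
from the patch; RowCache and its methods are hypothetical names, and rows are 
assumed to arrive as column-name-to-value maps.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper; not the implementation in the patch.
class RowCache {
    private final Map<Object, List<Map<String, Object>>> rowsByKey =
            new HashMap<Object, List<Map<String, Object>>>();

    // Read every row once and index it by the value of keyColumn.
    RowCache(Iterable<Map<String, Object>> allRows, String keyColumn) {
        for (Map<String, Object> row : allRows) {
            Object key = row.get(keyColumn);
            List<Map<String, Object>> bucket = rowsByKey.get(key);
            if (bucket == null) {
                bucket = new ArrayList<Map<String, Object>>();
                rowsByKey.put(key, bucket);
            }
            bucket.add(row);
        }
    }

    // Later lookups are answered from memory; no further query is issued.
    List<Map<String, Object>> lookup(Object key) {
        List<Map<String, Object>> rows = rowsByKey.get(key);
        return rows != null ? rows : new ArrayList<Map<String, Object>>();
    }
}
```

The trade-off is memory: the whole table is held in the heap for the duration 
of the import.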

bq. Interactive mode has a bit of a chicken and the egg problem when it comes to 
JDBC, right, in that the Driver needs to be present in Solr/lib right?

I am not sure I understood the question. Interactive dev mode does not need the 
drivers.

bq. In the JDBCDataSource, not sure I follow the connection stuff. Can you 
explain a bit?

We create connections using DriverManager.getConnection(). One connection is 
created per entity and the same connection is used throughout the indexing, so 
no pooling is implemented.

Perhaps a PooledJdbcDataSource impl?
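
As a rough illustration of the one-connection-per-entity approach, assuming 
nothing about the actual JdbcDataSource code in the patch: the class below is 
hypothetical, and the only real API it relies on is 
java.sql.DriverManager.getConnection(String).

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

// Hypothetical class; names and structure are assumptions for illustration.
class EntityConnection {
    private final String url;
    private Connection conn;

    EntityConnection(String url) {
        this.url = url;
    }

    // Open the connection on first use and reuse it for every later query.
    synchronized Connection get() throws SQLException {
        if (conn == null || conn.isClosed()) {
            conn = DriverManager.getConnection(url);
        }
        return conn;
    }

    // Close once, when the entity has finished indexing.
    synchronized void close() throws SQLException {
        if (conn != null) {
            conn.close();
            conn = null;
        }
    }
}
```

Because each entity holds exactly one long-lived connection, a pool would add 
little here; a pooled variant would mainly help if many short-lived 
connections were opened.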




> Data Import RequestHandler
> --------------------------
>
>                 Key: SOLR-469
>                 URL: https://issues.apache.org/jira/browse/SOLR-469
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 1.3
>            Reporter: Noble Paul
>            Assignee: Grant Ingersoll
>             Fix For: 1.3
>
>         Attachments: SOLR-469-contrib.patch, SOLR-469-contrib.patch, 
> SOLR-469-contrib.patch, SOLR-469-contrib.patch, SOLR-469-contrib.patch, 
> SOLR-469-contrib.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, 
> SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, 
> SOLR-469.patch, SOLR-469.patch
>
>
> We need a RequestHandler which can import data from a DB or other dataSources 
> into the Solr index. Think of it as an advanced form of the SqlUpload Plugin 
> (SOLR-103).
> The way it works is as follows.
>     * Provide a configuration file (xml) to the Handler which takes in the 
> necessary SQL queries and mappings to a solr schema
>           - It also takes in a properties file for the data source 
> configuration
>     * Given the configuration it can also generate the solr schema.xml
>     * It is registered as a RequestHandler which can take two commands 
> do-full-import, do-delta-import
>           -  do-full-import - dumps all the data from the Database into the 
> index (based on the SQL query in configuration)
>           - do-delta-import - dumps all the data that has changed since last 
> import. (We assume a modified-timestamp column in tables)
>     * It provides an admin page
>           - where we can schedule it to be run automatically at regular 
> intervals
>           - It shows the status of the Handler (idle, full-import, 
> delta-import)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
