[jira] Issue Comment Edited: (SOLR-469) Data Import RequestHandler

Chris Moser (JIRA) Sun, 11 May 2008 10:48:21 -0700

    [ 
https://issues.apache.org/jira/browse/SOLR-469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12595928#action_12595928
 ]


cpmoser edited comment on SOLR-469 at 5/11/08 10:47 AM:
------------------------------------------------------------

Hi, 

Thanks for all of your work with the dataimporter.  It's made working with Solr 
much easier.

I think I found a small bug in SqlEntityProcessor.java starting on line 120:

{code:title=SqlEntityProcessor.java}
120:    boolean first = true;
121:    String[] primaryKeys = context.getEntityAttribute("pk").split(",");
122:    for (int i = 0; i < primaryKeys.length; i++) {
123:      if (!first) {
124:        sb.append(" and ");
125:        first = false;
126:      }
{code}

This causes problems in the generated SQL statement: the " and " separator is 
never added when more than one field is provided in the pk entity attribute, 
and the result is a SQL syntax error.

Because {{first}} is initialized to true, the body of the {{if}} statement on 
line 123 never executes (so first is never set to false).  It looks like it 
would be more appropriate to move line 125 after the {{if}} statement that 
starts on line 123.
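
A minimal sketch of that reordering, written as a standalone class rather than 
as a patch to SqlEntityProcessor.  The class name WhereClauseSketch, the method 
buildWhere and the Map standing in for the resolver are hypothetical, and value 
quoting is ignored:

```java
import java.util.HashMap;
import java.util.Map;

class WhereClauseSketch {
  // Hypothetical stand-in for the loop quoted above: emits one
  // "column = value" condition per primary key, joined with " and ".
  static String buildWhere(String pk, Map<String, Object> values) {
    StringBuilder sb = new StringBuilder();
    boolean first = true;
    String[] primaryKeys = pk.split(",");
    for (int i = 0; i < primaryKeys.length; i++) {
      if (!first) {
        sb.append(" and ");
      }
      // Cleared after the check, so every key except the first is
      // preceded by " and ".
      first = false;
      sb.append(primaryKeys[i]).append(" = ").append(values.get(primaryKeys[i]));
    }
    return sb.toString();
  }

  public static void main(String[] args) {
    Map<String, Object> values = new HashMap<String, Object>();
    values.put("id", 7);
    values.put("rev", 2);
    System.out.println(buildWhere("id,rev", values)); // prints "id = 7 and rev = 2"
  }
}
```

With a single key the loop produces no separator at all, so existing 
single-column pk declarations would not change.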

This leads me to another issue: how to specify the table of the primary key 
when the primary key is ambiguous.  If there's a join in the SQL statement of a 
deltaQuery, and any of the primary key columns are also present in the joined 
table, the key is ambiguous and will cause a SQL error.

Is there a way to specify the table for the primary key?  Perhaps an attribute 
"pkTable" could be added as an option for the entity declaration, e.g. in 
SqlEntityProcessor.java:

{code:title=SqlEntityProcessor.java}
127:      Object val = resolver.resolve(primaryKeys[i]);
-->       String pkTable = context.getEntityAttribute("pkTable");
-->       // skip the prefix when the attribute is absent or empty
-->       if (pkTable != null && pkTable.length() > 0)
-->             sb.append(pkTable).append(".");
128:      sb.append(primaryKeys[i]).append(" = ");
{code}

This removes any potential ambiguity issues with joins when pkTable is 
specified.
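
To illustrate the intended effect, here is a small self-contained sketch of the 
qualification step on its own.  QualifiedPkSketch and qualify are hypothetical 
names, and "pkTable" is only the attribute proposed in this comment, not an 
existing option:

```java
class QualifiedPkSketch {
  // Hypothetical helper: prefixes the column with the table name when a
  // non-empty pkTable value is given, otherwise leaves it unqualified.
  static String qualify(String pkTable, String column) {
    if (pkTable != null && pkTable.length() > 0) {
      return pkTable + "." + column;
    }
    return column;
  }

  public static void main(String[] args) {
    System.out.println(qualify("item", "id")); // prints "item.id"
    System.out.println(qualify(null, "id"));   // prints "id"
  }
}
```

Leaving the column unqualified when pkTable is missing keeps the current 
behaviour for entities that do not declare the attribute.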

> Data Import RequestHandler
> --------------------------
>
>                 Key: SOLR-469
>                 URL: https://issues.apache.org/jira/browse/SOLR-469
>             Project: Solr
>          Issue Type: New Feature
>          Components: update
>    Affects Versions: 1.3
>            Reporter: Noble Paul
>            Assignee: Grant Ingersoll
>             Fix For: 1.3
>
>         Attachments: SOLR-469-contrib.patch, SOLR-469.patch, SOLR-469.patch, 
> SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, SOLR-469.patch, 
> SOLR-469.patch, SOLR-469.patch
>
>
> We need a RequestHandler which can import data from a DB or other dataSources 
> into the Solr index. Think of it as an advanced form of the SqlUpload Plugin 
> (SOLR-103).
> The way it works is as follows.
>     * Provide a configuration file (xml) to the Handler which takes in the 
> necessary SQL queries and mappings to a solr schema
>           - It also takes in a properties file for the data source 
> configuration
>     * Given the configuration it can also generate the solr schema.xml
>     * It is registered as a RequestHandler which can take two commands 
> do-full-import, do-delta-import
>           -  do-full-import - dumps all the data from the Database into the 
> index (based on the SQL query in configuration)
>           - do-delta-import - dumps all the data that has changed since last 
> import. (We assume a modified-timestamp column in tables)
>     * It provides an admin page
>           - where we can schedule it to be run automatically at regular 
> intervals
>           - It shows the status of the Handler (idle, full-import, 
> delta-import)

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
