[ 
https://issues.apache.org/jira/browse/SOLR-1229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12731780#action_12731780
 ] 

Lance Norskog edited comment on SOLR-1229 at 7/16/09 3:35 PM:
--------------------------------------------------------------

Ok - these run. Thanks.

Just to make sure I understand. The 'pk' attribute declares 2 things:  
    1) that this column must exist for a document to be generated, and 
    2) that this entity is the level where documents are created. Is this true?

tmpid appears to be an unused name that exists only so that ${x.id} is sent 
into solr_id. Maybe name="" would be clearer for this purpose?

Lance





      was (Author: lancenorskog):
    Ok - these run. Thanks.

Just to make sure I understand. The 'pk' attribute declares 2 things:  
    1) that this column must exist for a document to be generated, and 
    2) that this entity is the level where documents are created. Is this true?

tmpid appears as an unused name merely so that ${x.id} is sent into solr_id. 
Maybe name="" would be more clear for this purpose?

Something is documented on the wiki but not used: multiple PKs in one entity.

On the wiki page, see the config file after "Writing a huge deltaQuery" - there 
is an attribute: 
{{{pk="ITEM_ID, CATEGORY_ID"}}}
There is code to parse this in DataImporter.InitEntity() and store the list in 
Entity.primaryKeys. But the list of PKs is never used. 

I think the use case for this is that the user requires more fields besides the 
uniqueKey for a document.  Is this right? This is definitely on my list of 
must-have features. The second field may or may not be declared "required" in 
the schema, so looking at the schema is not good enough. The field has to be 
declared "required" in the dataconfig.

Lance




  
> deletedPkQuery feature does not work when pk and uniqueKey field do not have 
> the same value
> -------------------------------------------------------------------------------------------
>
>                 Key: SOLR-1229
>                 URL: https://issues.apache.org/jira/browse/SOLR-1229
>             Project: Solr
>          Issue Type: Bug
>          Components: contrib - DataImportHandler
>    Affects Versions: 1.4
>            Reporter: Erik Hatcher
>            Assignee: Erik Hatcher
>             Fix For: 1.4
>
>         Attachments: SOLR-1229.patch, SOLR-1229.patch, SOLR-1229.patch, 
> SOLR-1229.patch, SOLR-1229.patch, SOLR-1229.patch, tests.patch
>
>
> A delta-import using deletedPkQuery does not remove from Solr the records 
> that are marked as "deleted" in the database.
> Here's a config I'm using against a mocked test database:
> {code:xml}
> <dataConfig>
>  <dataSource driver="com.mysql.jdbc.Driver" url="jdbc:mysql://localhost/db"/>
>  <document name="tests">
>    <entity name="test"
>            pk="board_id"
>            transformer="TemplateTransformer"
>            deletedPkQuery="select board_id from boards where deleted = 'Y'"
>            query="select * from boards where deleted = 'N'"
>            deltaImportQuery="select * from boards where deleted = 'N'"
>            deltaQuery="select * from boards where deleted = 'N'"
>            preImportDeleteQuery="datasource:board">
>      <field column="id" template="board-${test.board_id}"/>
>      <field column="datasource" template="board"/>
>      <field column="title" />
>    </entity>
>  </document>
> </dataConfig>
> {code}
> Note that the uniqueKey in Solr is the "id" field, and its value comes from 
> the template board-<PK>.
> I noticed that the javadoc comment in DocBuilder#collectDelta says "Note: In 
> our definition, unique key of Solr document is the primary key of the top 
> level entity".  This of course isn't really an appropriate assumption.
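
To make the mismatch in the description concrete, here is a minimal Python sketch. It is not DataImportHandler code: the dictionary standing in for the index and the helper names render_unique_key and delete_by_id are hypothetical, and it assumes that deletion looks documents up by the raw value returned from deletedPkQuery, as the issue title and the quoted javadoc note suggest.

```python
# Hypothetical illustration only; none of these names exist in Solr.


def render_unique_key(board_id):
    """Mimic the template "board-${test.board_id}" from the config above."""
    return f"board-{board_id}"


def delete_by_id(index, ids):
    """Remove documents whose uniqueKey is in ids; return how many were removed."""
    removed = 0
    for doc_id in ids:
        if index.pop(str(doc_id), None) is not None:
            removed += 1
    return removed


# Stand-in for the Solr index, keyed by the uniqueKey field ("id").
index = {render_unique_key(pk): {"title": f"Board {pk}"} for pk in (1, 2, 3)}

# Stand-in for the rows a deletedPkQuery would return: raw board_id values.
deleted_pks = [2]

# Assumed current behaviour: the raw pk is used as the document id.
removed_raw = delete_by_id(index, deleted_pks)

# Applying the same template to the pk before deleting.
removed_templated = delete_by_id(
    index, [render_unique_key(pk) for pk in deleted_pks]
)
```

With this sample data, deleting by the raw board_id removes nothing, because no document has the id "2". Deleting by the templated value removes the one matching document.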

