[ 
https://issues.apache.org/jira/browse/SOLR-1033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12675856#action_12675856
 ] 

Fergus McMenemie commented on SOLR-1033:
----------------------------------------

OK, here goes. My document contains references to embedded imagery. For each 
image there is the image itself along with a thumbnail and a caption. The source 
document contains:

  <mediaObject vurl="1043130" imageType="graphic"/>

I have a search application that searches only the captions associated with a 
given image. It would be nice to populate Solr fields with the correct relative 
path to each image and its thumbnail at index time. The problem is that, 
although the thumbnail is always named:

   s${e.vurl}.jpg

the name of the image itself starts with the first letter of the image's 
imageType attribute, which can be one of 'picture', 'graphic', 'lineDrawing' 
or 'map', i.e.:

   p${e.vurl}.jpg
   g${e.vurl}.jpg
   l${e.vurl}.jpg
   m${e.vurl}.jpg
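
To make the naming rule concrete, here is a minimal Java sketch of it. The class 
and method names (ImageNames, thumbnail, fullImage) are made up for illustration 
and are not part of DIH or of the patch; only the "s" prefix and the 
first-letter rule come from the description above.

```java
final class ImageNames {

    // Thumbnail: always "s" + vurl + ".jpg".
    static String thumbnail(String vurl) {
        return "s" + vurl + ".jpg";
    }

    // Full image: first letter of imageType + vurl + ".jpg".
    static String fullImage(String imageType, String vurl) {
        if (imageType == null || imageType.isEmpty()) {
            throw new IllegalArgumentException("imageType must not be empty");
        }
        return imageType.charAt(0) + vurl + ".jpg";
    }

    public static void main(String[] args) {
        // Values taken from the <mediaObject> example above.
        System.out.println(thumbnail("1043130"));            // s1043130.jpg
        System.out.println(fullImage("graphic", "1043130")); // g1043130.jpg
    }
}
```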

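My patch would allow the following sort of thing to be added to a data-config, 
where later fields reuse the values (${x.vurl}, ${x.imagetype}) produced by 
earlier fields of the same entity. I feel this considerably increases the power 
and usefulness of DIH.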

{{code}}
<entity name="x" .... transformer="TemplateTransformer,RegexTransformer">
  <field column="fileWebPath" template="${jc.fileAbsolutePath}" regex="${dataimporter.request.contentdir}(.*)" replaceWith="/ford$1" />
  <field column="vurl" xpath="/record/mediaBlock/mediaObject/@vurl" />
  <field column="imagetype" xpath="/record/mediaBlock/mediaObject/@imageType" regex="^(\w).*"/>
  <field column="imgWebPathICON" regex="(.*)/.*" replaceWith="$1/imagery/s${x.vurl}.jpg" sourceColName="fileWebPath"/>
  <field column="imgWebPathFULL" regex="(.*)/.*" replaceWith="$1/imagery/${x.imagetype}${x.vurl}.jpg" sourceColName="fileWebPath"/>
{{code}}
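
To show what the config above is meant to achieve, here is a rough standalone 
Java sketch. It is not the patch and does not use any DIH classes. It just 
models the idea that the variable map used to resolve ${...} placeholders is 
updated after each field, so later fields see earlier results. The class name, 
the resolve and buildRow helpers, and the sample path "/ford/docs/a.xml" are all 
invented for illustration, and I am assuming the imagetype regex is intended to 
keep only the first character of imageType.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ChainedFieldsSketch {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replace each ${name} with its value from vars; unknown names become "".
    static String resolve(String template, Map<String, String> vars) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = vars.getOrDefault(m.group(1), "");
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    // Process the fields in order, storing each result in vars before the
    // next field is resolved (the behaviour the issue asks for).
    static Map<String, String> buildRow(String fileWebPath, String vurl, String imageType) {
        Map<String, String> vars = new LinkedHashMap<>();
        vars.put("x.fileWebPath", fileWebPath);
        vars.put("x.vurl", vurl);

        // Assumed intent of regex="^(\w).*": keep only the first character.
        vars.put("x.imagetype", imageType.replaceFirst("^(\\w).*", "$1"));

        // imgWebPathICON: resolve the placeholders, then apply the regex.
        String icon = resolve("$1/imagery/s${x.vurl}.jpg", vars);
        vars.put("x.imgWebPathICON", fileWebPath.replaceAll("(.*)/.*", icon));

        // imgWebPathFULL reuses x.imagetype, which was computed above.
        String full = resolve("$1/imagery/${x.imagetype}${x.vurl}.jpg", vars);
        vars.put("x.imgWebPathFULL", fileWebPath.replaceAll("(.*)/.*", full));
        return vars;
    }

    public static void main(String[] args) {
        Map<String, String> row = buildRow("/ford/docs/a.xml", "1043130", "graphic");
        System.out.println(row.get("x.imgWebPathICON")); // /ford/docs/imagery/s1043130.jpg
        System.out.println(row.get("x.imgWebPathFULL")); // /ford/docs/imagery/g1043130.jpg
    }
}
```

The key point is only the ordering: each vars.put happens before the next 
resolve call, so ${x.imagetype} is already the single letter by the time 
imgWebPathFULL is built.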


> DIH transformers cannot reuse output from previous transformations
> ------------------------------------------------------------------
>
>                 Key: SOLR-1033
>                 URL: https://issues.apache.org/jira/browse/SOLR-1033
>             Project: Solr
>          Issue Type: Improvement
>          Components: contrib - DataImportHandler
>    Affects Versions: 1.4
>         Environment: All operating systems and software platforms
>            Reporter: Fergus McMenemie
>             Fix For: 1.4
>
>         Attachments: SOLR-1033.patch, SOLR-1033.patch
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> It can be very useful to reuse the output from a DIH template in other 
> templates and/or regex transformers. Currently this cannot be done. The 
> resolver is initialized at the start of the transformer run with whatever 
> values exist for each column name at that instant. As the transformer executes 
> it may define new values for column names. My change is intended to update 
> the hash used by the resolver after each successful transformation.
> This only applies to the template and regex transformers.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
