Hello,
I use Nutch 1.19 to crawl my website and to index the data into AWS CloudSearch.
For this, I use the CloudSearch Index writer.
Everything works fine.
Now I want to copy the content of the "content" field into a different field in 
CloudSearch.
I've created this field in CloudSearch with the name "raw" and the same 
settings (except for the analysis scheme) as the "content" field.
In the index-writers.xml configuration file, I used the following configuration 
in order to copy the content:


<writer id="indexer_cloud_search_1" 
class="org.apache.nutch.indexwriter.cloudsearch.CloudSearchIndexWriter">
  <parameters>
    <param name="endpoint" value="MyEndpointAddress"/>
    <param name="region" value="eu-west-1"/>
    <param name="batch.dump" value="false"/>
    <param name="batch.maxSize" value="-1"/>
  </parameters>
  <mapping>
    <copy source="title" target="raw"/>
    <rename />
    <remove />
  </mapping>
</writer>
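
For comparison: as far as I can tell, the index-writers.xml.template that ships with Nutch does not put attributes on the <copy> element itself. It nests <field> elements with "source" and "dest" attributes inside it. Assuming that layout also applies to the CloudSearch writer, the mapping for what I am trying to do (copying "content" into "raw") would presumably look like the sketch below. I have not verified this against the 1.19 schema.

```xml
<mapping>
  <copy>
    <!-- Assumption: "dest" names the target field; "content" is the field to duplicate -->
    <field source="content" dest="raw"/>
  </copy>
  <rename />
  <remove />
</mapping>
```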

Indexing runs without errors: the standard content is indexed into 
CloudSearch, but the "raw" field stays empty.
Does anyone have an idea why this happens?

Best regards,
Michael



Dr. Michael Fritsch
Technical Editor




Elevate Experience. Drive Impact.

E-Mail: [email protected]

Phone: +49 (0) 40 325 587 0
https://www.coremedia.com/


--------------------------------------------------------------------------------

CoreMedia GmbH

Rödingsmarkt 9, 20459 Hamburg, Germany

Managing Director: Sören Stamer

Commercial Register: Amtsgericht Hamburg, HRB 162480

--------------------------------------------------------------------------------



