Hello,
I use Nutch 1.19 to crawl my website and index the data into AWS CloudSearch,
using the CloudSearch index writer.
This works fine.
Now I want to copy the content of the "content" field into a different field in
CloudSearch.
I've created this field in CloudSearch with the name "raw" and the same
settings (except for the analysis scheme) as the "content" field.
In the index-writers.xml configuration file, I used the following configuration
in order to copy the content:
<writer id="indexer_cloud_search_1"
        class="org.apache.nutch.indexwriter.cloudsearch.CloudSearchIndexWriter">
  <parameters>
    <param name="endpoint" value="MyEndpointAddress"/>
    <param name="region" value="eu-west-1"/>
    <param name="batch.dump" value="false"/>
    <param name="batch.maxSize" value="-1"/>
  </parameters>
  <mapping>
    <copy source="title" target="raw"/>
    <rename />
    <remove />
  </mapping>
</writer>
The crawl and the indexing run without errors, and the standard content is
indexed into CloudSearch, but the "raw" field stays empty.
Does anyone have an idea why this happens?
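For comparison, here is a minimal sketch of the same mapping in the nested form used by the index-writers.xml template that ships with Nutch, where each rule is a <field> element with "source" and "dest" attributes inside <copy>, <rename>, or <remove>. The source "content" and the destination "raw" are assumptions taken from the description above, and the sketch has not been verified against CloudSearch:

```xml
<mapping>
  <!-- Assumption: copy the "content" field into the additional "raw" field. -->
  <copy>
    <field source="content" dest="raw"/>
  </copy>
  <!-- No rename or remove rules are needed for this case. -->
  <rename />
  <remove />
</mapping>
```

This fragment would replace the <mapping> element inside the <writer> definition shown above; the <parameters> section stays unchanged.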
Best regards,
Michael
Dr. Michael Fritsch
Technical Editor
Elevate Experience. Drive Impact.
E-Mail: [email protected]
Phone: +49 (0) 40 325 587 0
www.coremedia.com
--------------------------------------------------------------------------------
CoreMedia GmbH
Rödingsmarkt 9, 20459 Hamburg, Germany
Managing Director: Sören Stamer
Commercial Register: Amtsgericht Hamburg, HRB 162480
--------------------------------------------------------------------------------