This is not functionality that ManifoldCF supports out of the box.  The
extracted links are used for crawling, not as metadata.

I don't see a general use-case for this either, so I think you're on your
own modifying the web connector code to do what you want.  The
RepositoryDocument structure has arbitrary multi-valued fields; just put
what you want into one such field and you should see it in Elasticsearch.
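
To make the suggestion concrete, here is a minimal, self-contained sketch of
the idea: collect the links found while parsing one page, de-duplicate them,
and store them as one multi-valued field. Everything in it is an assumption
for illustration: the field name "sublinks", the LinkCollector helper, and
the FakeDocument class, which only stands in for RepositoryDocument so the
example runs without the ManifoldCF jars. In a modified web connector the
corresponding step would presumably be a call to RepositoryDocument's
addField with a String[] value; check the exact signature against your
ManifoldCF version.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

class SublinkFieldSketch {
  // Hypothetical metadata field name; pick whatever your index mapping expects.
  static final String FIELD_NAME = "sublinks";

  /** Collects the links discovered in one document, de-duplicated, in discovery order. */
  static final class LinkCollector {
    private final Set<String> links = new LinkedHashSet<>();

    /** Records one discovered link; null and empty values are ignored. */
    void noteLink(String url) {
      if (url != null && !url.isEmpty()) {
        links.add(url);
      }
    }

    /** Returns the collected links in the String[] shape a multi-valued field needs. */
    String[] toFieldValues() {
      return links.toArray(new String[0]);
    }
  }

  /** Stand-in for RepositoryDocument (assumption): just a map of multi-valued fields. */
  static final class FakeDocument {
    private final Map<String, String[]> fields = new LinkedHashMap<>();

    void addField(String name, String[] values) {
      fields.put(name, values);
    }

    String[] getField(String name) {
      return fields.get(name);
    }
  }

  public static void main(String[] args) {
    LinkCollector collector = new LinkCollector();
    // In a real connector these calls would come from the link-extraction handler.
    collector.noteLink("www.xyz.com/abc");
    collector.noteLink("www.xyz.com/pqr");
    collector.noteLink("www.xyz.com/abc"); // duplicate, kept only once

    FakeDocument doc = new FakeDocument();
    doc.addField(FIELD_NAME, collector.toFieldValues());

    System.out.println(FIELD_NAME + " = " + Arrays.toString(doc.getField(FIELD_NAME)));
  }
}
```

The links must be collected before the document is handed to the output
connector, so the collector has to be filled during the same processing pass
that builds the document.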

Karl


On Thu, Feb 13, 2020 at 1:57 AM ritika jain <ritikajain5...@gmail.com>
wrote:

> Hi All,
>
> I am using ManifoldCF 2.12, with the Web connector as the repository and
> Elasticsearch as the output. I now have a requirement to save all related
> sub-links of a particular document identifier (one at a time). For
> example, for the document identifier www.xyz.com, I would like to extract
> all related sub-links, such as www.xyz.com/abc and www.xyz.com/pqr, save
> them in a variable, and then pass them to Elasticsearch.
>
> I went through the Web repository connector code and thought that the
> function extractLinks
> (protected boolean extractLinks(String documentIdentifier,
> IProcessActivity activities, DocumentURLFilter filter)) could do this.
> Can the existing functionality of ManifoldCF do this extraction, or do we
> have to customize it? Any help would be appreciated.
>
>
> Thanks
> Ritika
>
