Hi Tim,

Thank you for your response. Yes, I am using the /rmeta/form endpoint, and I am getting the information for each embedded file separately. However, I am not getting any information about which parent an embedded file belongs to, so I cannot track the chain of multilevel embedded files. Is there a metadata property that tells us this?

On Sat, Oct 22, 2022, 16:06 Tim Allison <talli...@apache.org> wrote:

> 1) If you're using the /tika endpoint, embedded files are marked up as
> such in the xhtml output with div tags. If you want full info on embedded
> files, I'd strongly encourage using the /rmeta endpoint.
>
> 2) We don't offer content marked up with json, but we do offer a text
> option, which can be returned in the X-Tika-Content tag in the json output.
> See https://cwiki.apache.org/confluence/display/TIKA/TikaServer for
> details on how to request text.
>
> This might also be useful:
> https://cwiki.apache.org/confluence/display/TIKA/TikaServerEndpointsCompared
>
> On Fri, Oct 21, 2022 at 11:12 PM Chetan Bikire <chetab...@gmail.com>
> wrote:
>
>> 1) How does Tika server maintain the parent-child relationship between
>> the main document and its embedded documents (e.g. an email with
>> multiple attachments) after parsing? Is there any property or tag from
>> which we can determine these relationships?
>>
>> 2) After parsing any document we are getting all tags in JSON format
>> except the *X-Tika-Content* tag, which is in HTML format. Is there any
>> way to get this in JSON format?
>>
>> Please assist.
>> Thank you
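
For anyone following the thread, here is a minimal Python sketch of one way the parent chain could be rebuilt from /rmeta output. It is not an official recipe. It assumes the server is on localhost:9998, that the /rmeta/text variant is used to get plain-text content, and that each embedded entry carries an X-TIKA:embedded_resource_path key holding a slash-separated path such as /report.zip/chart.png. Please check the keys in your own output before relying on that name. The helper names fetch_rmeta and parent_map are made up for this example.

```python
import json
import posixpath
import urllib.request

# Assumed key name; verify it against the keys in your own /rmeta output.
PATH_KEY = "X-TIKA:embedded_resource_path"


def fetch_rmeta(file_path, base_url="http://localhost:9998"):
    """PUT a file to /rmeta/text and return the parsed JSON list.

    The host and port are assumptions; adjust them to your server.
    """
    with open(file_path, "rb") as fh:
        request = urllib.request.Request(
            base_url + "/rmeta/text",
            data=fh.read(),
            method="PUT",
            headers={"Accept": "application/json"},
        )
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


def parent_map(rmeta_entries):
    """Map each embedded resource path to the path of its parent.

    The first entry is treated as the container document, represented
    here by "/". Entries without the path key are skipped.
    """
    parents = {}
    for entry in rmeta_entries[1:]:
        path = entry.get(PATH_KEY)
        if not path:
            continue
        # Dropping the last path segment gives the parent,
        # e.g. "/report.zip/chart.png" -> "/report.zip".
        parents[path] = posixpath.dirname(path)
    return parents
```

Calling parent_map(fetch_rmeta("mail.msg")) would then give a dictionary from each embedded path to its parent path. This only works if attachment names do not themselves contain "/" characters, so treat it as a starting point.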