Thanks Josh, that's correct. /rmeta/text does let you control this, but it only returns one level of text (not documents embedded within others). When you use the recursive interface, /rmeta/all, it always returns content as HTML, and similarly /unpack/all returns the metadata as CSV.
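
In case it helps as a stopgap, here is a minimal client-side sketch that converts the CSV metadata to JSON after the fact. It assumes each row of the __META__ file has the form key,value[,value...]; I have not checked that against the server code, so treat that layout as an assumption. The function name meta_csv_to_json and the sample data are made up for illustration.

```python
import csv
import io
import json


def meta_csv_to_json(meta_csv: str) -> str:
    """Convert CSV rows of the assumed form key,value[,value...] to a JSON object."""
    metadata = {}
    for row in csv.reader(io.StringIO(meta_csv)):
        if not row:
            continue  # skip blank lines
        key, values = row[0], row[1:]
        # Accumulate values in case the same key appears on more than one row.
        metadata.setdefault(key, []).extend(values)
    # Collapse single-value lists to plain strings; keep multi-valued keys as lists.
    flattened = {k: (v[0] if len(v) == 1 else v) for k, v in metadata.items()}
    return json.dumps(flattened, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Hypothetical sample, not real server output.
    sample = '"Content-Type","application/pdf"\n"dc:creator","Alice","Bob"\n'
    print(meta_csv_to_json(sample))
```

You would read the __META__ entry out of the zip returned by /unpack/all and pass its decoded text to the function. Multi-valued keys come back as JSON arrays, single-valued ones as plain strings.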

On Thu, Mar 21, 2024 at 1:40 PM Josh Burchard <[email protected]> wrote:
> Samuel - Well, I use Tika server and I get my data back in JSON format
> because I use the /rmeta/text endpoint and send the HTTP header
> Accept:application/json. If you were to send Accept:text/plain would that
> work for you? I've only done that in the context of the /tika endpoint and
> that was long ago. Not sure how to do anything similar in the app because
> I never use that. By the way, in the context of using the server I find
> this table very helpful:
>
> https://cwiki.apache.org/confluence/display/TIKA/TikaServerEndpointsCompared
>
> From: "Zig Zag" <[email protected]>
> To: [email protected]
> Date: 03/21/2024 03:49 PM
> Subject: Re: Meta output format of tika server /unpack/all
> ------------------------------
>
> Similarly is it possible to have /rmeta/all format content/text as text
> instead of HTML?
>
> On Thu, Mar 21, 2024 at 9:50 AM Zig Zag <[email protected]> wrote:
> Hi All,
>
> Is there a way to get the __META__ output of /unpack/all in a JSON rather
> than CSV ?
>
> Thank you,
> Samuel
