[ 
https://issues.apache.org/jira/browse/TIKA-3703?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17677353#comment-17677353
 ] 

Tim Allison commented on TIKA-3703:
-----------------------------------

Users can compress the output with an "accepts" header, but you're 
right: why? We don't want to add an extra step, and a zip is simpler.  
Thank you, Nick!

> Consider adding a frictionless data package output format
> ---------------------------------------------------------
>
>                 Key: TIKA-3703
>                 URL: https://issues.apache.org/jira/browse/TIKA-3703
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Major
>
> For those who want more than just text and metadata, e.g. bytes for 
> thumbnails, or embedded images or embedded files or rendered pages, it would 
> be great to return that data in a standard format. Our current /unpack 
> endpoint uses a zip file but with our own "standard".
> I was thinking about heading down the pure JSON route by including these 
> byte streams as base64-encoded metadata values in our current metadata 
> object. I'm not sure which is the better way to go.
> I'm opening this issue to discuss options.
>  
> Reference: [https://frictionlessdata.io/standards/#standards-toolkit]
> We'd want to make this available as an endpoint on tika-server 
> ({{/v2/unpack}} or something else?) and as a command-line option in tika-app.
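
To make the two options in the description concrete, here is a minimal sketch of both packagings. It is illustrative only: the function names, the "embedded" key, the "metadata.json" entry, and the "files/" prefix are all hypothetical, and the sketch reflects neither what tika-server's /unpack endpoint currently emits nor the Frictionless Data Package layout.

```python
import base64
import io
import json
import zipfile


def pack_as_json(metadata, attachments):
    """Pure-JSON option: embed each byte stream as a base64 string.

    The "embedded" key is a made-up name for this sketch.
    """
    record = dict(metadata)
    record["embedded"] = {
        name: base64.b64encode(data).decode("ascii")
        for name, data in attachments.items()
    }
    return json.dumps(record)


def pack_as_zip(metadata, attachments):
    """Zip option: metadata as one JSON entry, raw bytes as separate entries.

    "metadata.json" and the "files/" prefix are made-up names for this sketch.
    """
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("metadata.json", json.dumps(metadata))
        for name, data in attachments.items():
            zf.writestr("files/" + name, data)
    return buf.getvalue()
```

One trade-off this makes visible: base64 inflates each byte stream by roughly a third before any compression, while the zip stores the raw bytes and needs no decoding step on the client.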



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
