Hi Andy,

Just a quick report, as I haven't been able to solve the problem so far.

This is working using curl as the client:
>
> curl -X PUT -T aa.pdf http://localhost:9998/tika
>
If I add '--header "Content-Type: application/pdf"', it works fine for me,
too. If I don't specify the content type, I get a "415: Unsupported Media
Type". Just a note for others.

If I run the following:

let
  $file:="some.pdf",
  $request :=
<http:request  method='PUT'>
 <http:body media-type="application/octet-stream">{
  fetch:binary($file)
 }</http:body>
</http:request>
return
 http:send-request($request, "http://localhost:9998/tika")

I get from BaseX (running in debug mode):

*java.lang.IllegalArgumentException: object is not an instance of declaring
class*

and (from Tika):

*INFO: tika (autodetecting type)*

It looks like something is already going wrong at the BaseX level. I still
get a response from Tika, but not the one I expected. If I change the
media-type to 'application/pdf', I no longer get the BaseX error, but I do
get a document processing error (500) from Tika. 'application/pdf' is also
the media type that 'fetch:content-type()' returns.
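
For reference, a minimal sketch of that second variant, with the media type
taken from fetch:content-type() instead of being hard-coded. The file name
"some.pdf" and the Tika URL are the same placeholders as in the query above.
This is not a fix: it is the request that gave me the 500 from Tika.

```xquery
(: Sketch only: look up the media type and reuse it in the request body. :)
let $file := "some.pdf"                  (: placeholder file name :)
let $type := fetch:content-type($file)   (: 'application/pdf' for this file :)
let $request :=
  <http:request method='PUT'>
    <http:body media-type="{ $type }">{
      fetch:binary($file)
    }</http:body>
  </http:request>
return
  http:send-request($request, "http://localhost:9998/tika")
```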

So if the type is not specified any further, Tika tries to guess the content
type but cannot find one. If it is specified, Tika returns a processing
error. As you said, it may be a problem with the content (the Content-Length
headers differ).
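
One way to narrow down the Content-Length difference might be to print the
sizes BaseX sees for the file and compare them with the header that arrives
at Tika. This is only a sketch: the idea that the body might be sent in its
base64 form is a guess on my part, not something I have confirmed.

```xquery
(: Sketch only: sizes to compare with the Content-Length seen by Tika. :)
let $file := "some.pdf"         (: placeholder file name :)
let $data := fetch:binary($file)
return (
  (: number of raw bytes in the file :)
  "raw bytes: " || bin:length($data),
  (: length of the base64 text form, in case the body is sent encoded :)
  "base64 characters: " || string-length(string($data))
)
```

If the header matches the second number rather than the first, that would
point to an encoding problem on the sending side.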

Sorry for not being of much help but maybe someone else has an idea?

Cheers,
Lukas
_______________________________________________
BaseX-Talk mailing list
BaseX-Talk@mailman.uni-konstanz.de
https://mailman.uni-konstanz.de/mailman/listinfo/basex-talk
