Re: problem with using uax_url_email

2015-02-19 Thread Marria
Hi, for anyone having the same problem as me, here is an answer I received from Pablo in the PT group: About your problem, I believe this is a constraint of Apache Tika [1], which is used by the mapper-attachment plugin. I believe that a search over Tika PDF limitations or a question on their

problem with using uax_url_email

2015-02-18 Thread Marria
Hi everybody, I want to extract URLs from my PDF files. I use the mapper-attachment plugin to index them. To be able to run regex queries and extract all the URLs present in a PDF file, I used uax_url_email: curl -X PUT localhost:9200/test -d '{ settings
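
For readers following along: the command above is cut off in the archive, so the poster's actual settings are not known. As a rough sketch only, index settings that attach the uax_url_email tokenizer to a custom analyzer might be built like this. The analyzer name "url_analyzer" is a made-up placeholder, and nothing here is taken from the original message.

```python
import json

# Hypothetical index settings: "url_analyzer" is an illustrative name,
# not something from the original post. "uax_url_email" is the built-in
# Elasticsearch tokenizer that keeps URLs and email addresses as single tokens.
settings = {
    "settings": {
        "analysis": {
            "analyzer": {
                "url_analyzer": {
                    "type": "custom",
                    "tokenizer": "uax_url_email",
                }
            }
        }
    }
}

# Serialize to the JSON body that would be sent when creating the index.
body = json.dumps(settings, indent=2)
print(body)
```

The printed JSON is the kind of body one would pass with -d when creating the index; the attachment field mapping would still need to reference the analyzer, which is not shown here.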