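It should work with the mapper attachments plugin. Remember that what you see in _source is not what gets indexed: _source holds the JSON you sent, including the base64-encoded file, while the extracted text is indexed separately.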
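To make the point about _source concrete, here is a minimal sketch in Python of the request bodies involved. It assumes the mapper attachments plugin is installed; the type name "doc" and the field name "file" are hypothetical, and the exact field path to query may differ between plugin versions. It only builds the JSON bodies and does not talk to a cluster. It also decodes the first 20 characters of the stored value quoted in this thread, to show that the value is base64 rather than searchable text.

```python
import base64
import json

# Hypothetical names, chosen only for illustration.
DOC_TYPE = "doc"
FIELD = "file"


def build_mapping():
    """Mapping body declaring FIELD as an attachment (mapper attachments plugin)."""
    return {DOC_TYPE: {"properties": {FIELD: {"type": "attachment"}}}}


def build_document(raw_bytes):
    """Document body: the file content is sent as a base64 string."""
    return {FIELD: base64.b64encode(raw_bytes).decode("ascii")}


def build_query(term):
    """Match query on the attachment field (assumed path; may vary by version)."""
    return {"query": {"match": {FIELD: term}}}


# First 20 characters of the stored value quoted in this thread.
SOURCE_PREFIX = "PGh0bWwgeG1sbnM6dj0i"
decoded_prefix = base64.b64decode(SOURCE_PREFIX)

if __name__ == "__main__":
    print(decoded_prefix)  # b'<html xmlns:v="'
    print(json.dumps(build_mapping()))
    print(json.dumps(build_document(b"4G LTE market")))
    print(json.dumps(build_query("LTE")))
```

The decoded prefix suggests that the value shown in _source is simply the original file, base64-encoded, which is what you would expect to see there; whether a search for "LTE" matches depends on the field being mapped as an attachment before the document is indexed.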
As for extracting and storing the text content, fsriver does that. See https://github.com/dadoonet/fsriver#generated-fields

--
David ;-)
Twitter: @dadoonet / @elasticsearchfr / @scrutmydocs

On 22 Apr 2014 at 08:44, Prashant Agrawal <prashant.agra...@paladion.net> wrote:

Hi Rafał Kuć,

I tried doing the same, but I didn't get the result I want. To explain the problem in detail:

1) I have a pdf file which has the text "There is already a big market for mid-range 4G LTE market, being pushed by telecom operators and device manufacturers."
2) I indexed this file in ES, and when I checked in ES the content present was in unicode, like "PGh0bWwgeG1sbnM6dj0idXJuOnNjaGVtYXMtbWljcm9zb2Z0LWNvbTp2bWwiDQp4bWxuczpvPSJ1cm46c2NoZW1hHAtZXF1aXY9Q"
3) So if I search for "LTE" it won't return any result, because the content stored in ES is in unicode format.

So my question is: is there any way, or any plugin, to store the pdf content in normal string format so that I can perform the search on top of that?

--
View this message in context: http://elasticsearch-users.115913.n3.nabble.com/Can-we-perform-the-text-search-present-in-the-images-or-pdf-files-through-elasticsearch-tp4054367p4054541.html
Sent from the ElasticSearch Users mailing list archive at Nabble.com.

--
You received this message because you are subscribed to the Google Groups "elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/1398149086168-4054541.post%40n3.nabble.com.
For more options, visit https://groups.google.com/d/optout.