Sorry for the confusion. I do want PDFs, but my concern is retrieving the image file when its OCR text is searched. I must be missing something. As shown below, I provide two fields, "text" and "content". In your second post you say I don't need the "content" field for images? If so, how does the search return the image to the requesting client (a web app, for instance) when a text match occurs on the image's OCR text? If I only include "text", then it will return only the text part of the image and not the image itself, correct?

    source(XContentFactory.jsonBuilder()
        .startObject()
            .field("text", ocrText) // extracted OCR text from image
            .field("file").startObject()
                .field("content", fileContents) // content is the Base64-encoded string of the image file? Is it needed?
                .field("_indexed_chars", -1)
            .endObject()
        .endObject()

On Thursday, February 27, 2014 1:16:36 PM UTC-5, Binh Ly wrote:
>
> Oh, the attachment part is for your PDF. If you don't need to index PDFs
> then just remove that part:
>
>     PutMappingResponse putMappingResponse = new PutMappingRequestBuilder(
>             client.admin().indices())
>         .setIndices(INDEX_NAME).setType(DOCUMENT_TYPE).setSource(
>             XContentFactory.jsonBuilder().startObject()
>                 .field(DOCUMENT_TYPE).startObject()
>                     .field("properties").startObject()
>                         .field("text").startObject()
>                             .field("type", "string")
>                         .endObject()
>                     .endObject()
>                 .endObject()
>             .endObject()
>         ).execute().actionGet();
>
> Indexing:
>
>     IndexResponse indexResponse = client.prepareIndex(INDEX_NAME, DOCUMENT_TYPE, "1")
>         .setSource(XContentFactory.jsonBuilder().startObject()
>             .field("text", ocrText)
>         .endObject()
>         ).execute().actionGet();
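
To make the question concrete, here is a minimal sketch of the behaviour being asked about. It assumes that a search hit hands back the document as it was stored (which is what Elasticsearch does by default through `_source`). It is a plain in-memory stand-in and not the Elasticsearch client API: the class `OcrImageStore` and its methods `index`, `search` and `imageOf` are hypothetical names made up for illustration. The match runs against "text" only, yet the hit still carries "content", so the client can decode the image from it.

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical in-memory stand-in for an index; NOT the Elasticsearch API.
public class OcrImageStore {
    // id -> stored document (plays the role of a document's stored source)
    private final Map<String, Map<String, String>> docs = new LinkedHashMap<>();

    /** Stores the OCR text and the Base64-encoded image side by side. */
    public void index(String id, String ocrText, byte[] imageBytes) {
        Map<String, String> doc = new HashMap<>();
        doc.put("text", ocrText);
        doc.put("content", Base64.getEncoder().encodeToString(imageBytes));
        docs.put(id, doc);
    }

    /** Matches on the "text" field only, but returns the whole stored document. */
    public List<Map<String, String>> search(String term) {
        String needle = term.toLowerCase(Locale.ROOT);
        List<Map<String, String>> hits = new ArrayList<>();
        for (Map<String, String> doc : docs.values()) {
            if (doc.get("text").toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(doc);
            }
        }
        return hits;
    }

    /** Decodes the image bytes carried by a hit. */
    public static byte[] imageOf(Map<String, String> hit) {
        return Base64.getDecoder().decode(hit.get("content"));
    }

    public static void main(String[] args) {
        OcrImageStore store = new OcrImageStore();
        byte[] fakeImage = {(byte) 0x89, 'P', 'N', 'G'}; // placeholder bytes, not a real image
        store.index("1", "Invoice total due", fakeImage);

        List<Map<String, String>> hits = store.search("invoice");
        System.out.println(hits.size() + " hit(s), image bytes: "
                + imageOf(hits.get(0)).length);
    }
}
```

If only "text" were stored, the hit would have nothing to decode, which is the concern raised above. A common alternative, also an assumption here, is to store a file path or URL next to the text instead of the Base64 payload and let the web app fetch the image from there.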