Binh, thanks. With your help I think I am closer to the answer. With the sample mapping you provided, I should be able to provide the base64 contents of the image file as the "content" field, and the OCR text as the "text" field. So, when the OCR text is searched, I can return the "content", which is the image. With the above mapping, I believe the image is saved in _source as well as in the stored field (for highlighting purposes). Can I prevent it from being stored in _source with something like this?
startObject("_source").field("enabled", "no").endObject()

On Thursday, February 27, 2014 8:29:25 AM UTC-5, Binh Ly wrote:
>
> You certainly can add a new field, and then just put the OCR text into
> that new field. So for example:
>
> Mapping:
>
> PutMappingResponse putMappingResponse = new PutMappingRequestBuilder(
>         client.admin().indices()).setIndices(INDEX_NAME).setType(DOCUMENT_TYPE).setSource(
>     XContentFactory.jsonBuilder().startObject()
>         .field(DOCUMENT_TYPE).startObject()
>             .field("properties").startObject()
>                 .field("text").startObject()
>                     .field("type", "string")
>                 .endObject()
>                 .field("file").startObject()
>                     .field("store", "yes")
>                     .field("type", "attachment")
>                     .field("fields").startObject()
>                         .field("file").startObject()
>                             .field("store", "yes")
>                         .endObject()
>                     .endObject()
>                 .endObject()
>             .endObject()
>         .endObject()
>     .endObject()
> ).execute().actionGet();
>
> Then put the OCR text into the "text" field:
>
> IndexResponse indexResponse = client.prepareIndex(INDEX_NAME, DOCUMENT_TYPE, "1")
>     .setSource(XContentFactory.jsonBuilder().startObject()
>         .field("text", ocrText)
>         .field("file").startObject()
>             .field("content", fileContents)
>             .field("_indexed_chars", -1)
>         .endObject()
>     .endObject()
> ).execute().actionGet();
>
> You probably don't need to index the image binary information - not sure
> what you would need it for.
>
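To make the _source question concrete, here is a rough sketch of what the quoted mapping might look like with _source disabled. It builds the mapping as a plain nested Map so it runs without an Elasticsearch dependency. The class name SourceDisabledMapping and the method buildMapping are made up for illustration, and storing the "text" field is my own assumption: with _source disabled, a field probably has to be stored for it to be returned or highlighted.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper (not from the thread): builds the quoted mapping
// plus a disabled _source as a nested Map.
public class SourceDisabledMapping {

    // Ordered map so the printed structure follows insertion order.
    private static Map<String, Object> obj() {
        return new LinkedHashMap<String, Object>();
    }

    public static Map<String, Object> buildMapping(String docType) {
        // "_source": {"enabled": false} is the part the question is about.
        Map<String, Object> source = obj();
        source.put("enabled", false);

        // Assumption: store "text" explicitly, since it can no longer be
        // loaded from _source for retrieval or highlighting.
        Map<String, Object> text = obj();
        text.put("type", "string");
        text.put("store", "yes");

        // The attachment field, as in the quoted mapping.
        Map<String, Object> innerFile = obj();
        innerFile.put("store", "yes");
        Map<String, Object> fields = obj();
        fields.put("file", innerFile);
        Map<String, Object> file = obj();
        file.put("store", "yes");
        file.put("type", "attachment");
        file.put("fields", fields);

        Map<String, Object> properties = obj();
        properties.put("text", text);
        properties.put("file", file);

        Map<String, Object> typeMapping = obj();
        typeMapping.put("_source", source);
        typeMapping.put("properties", properties);

        Map<String, Object> root = obj();
        root.put(docType, typeMapping);
        return root;
    }

    public static void main(String[] args) {
        // Print the structure for a quick visual check.
        System.out.println(buildMapping("doc"));
    }
}
```

If I am reading the Java API correctly, a map like this could be handed to setSource(...) on the PutMappingRequestBuilder in place of the XContentBuilder, but I have not verified that against a running cluster. Also keep in mind that without _source the original document cannot be fetched back as a whole, only the individually stored fields.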