Hi, every time I send a POST request to _update, any indexed information that I excluded from _source appears to go missing.
Obviously, it would be ideal if I didn't have to store, for example, the contents of a several-megabyte file in _source just to keep it in my record after calling _update on my index/mapping.

To start, here is the version info for Elasticsearch:

{
  "status" : 200,
  "name" : "Feron",
  "version" : {
    "number" : "1.3.1",
    "build_hash" : "2de6dc5268c32fb49b205233c138d93aaf772015",
    "build_timestamp" : "2014-07-28T14:45:15Z",
    "build_snapshot" : false,
    "lucene_version" : "4.9"
  },
  "tagline" : "You Know, for Search"
}

Here's my cluster health:

{
  "cluster_name" : "my-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 5,
  "active_shards" : 10,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}

A script for recreating the issue is attached. In it, I create a mapping and save a record using the attachment plugin. The record correctly matches searches on a field in _source, on a field excluded from _source, and on the content (attachment) field, which is also excluded from _source. As soon as I make the POST request to …/_update, searches against fields excluded from _source return 0 hits.

Is the only solution to store all fields in _source if I plan on calling _update on the record?
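In case it helps anyone reproducing this, the before/after comparison of the searches can be reduced to a single hit count per query. The following is only a rough sketch: hit_count is a hypothetical helper name (it is not part of the attached script), and it assumes the 1.x search response shape, where hits.total is a plain integer.

```shell
# hit_count: hypothetical helper, not part of the attached script.
# Reads a search response on stdin and prints hits.total.
# Assumption: the response looks like {"hits": {"total": N, ...}} as in 1.x.
hit_count() {
  # Strip whitespace so pretty-printed and compact responses look the same,
  # then append a newline so sed always sees a complete line.
  { tr -d ' \t\r\n'; echo; } |
    sed -n 's/.*"hits":{"total":\([0-9][0-9]*\).*/\1/p'
}

# Example with a canned response, so no server is needed:
sample='{ "took" : 2, "_shards" : { "total" : 5, "successful" : 5, "failed" : 0 }, "hits" : { "total" : 0, "max_score" : null, "hits" : [ ] } }'
printf '%s' "$sample" | hit_count   # prints 0
```

Against a live node, the same helper could be fed from curl -s, for example: curl -s "http://$indexserver/$indexname/losing_data/_search?q=giraffe" | hit_count, run once before and once after the _update call.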
indexserver="example.org:9200"
indexname="sample"

# Start from a clean mapping.
curl -XDELETE "http://$indexserver/$indexname/losing_data/_mapping?pretty=1"

# Create the mapping; "content" and "not_sourced" are excluded from _source.
curl -XPUT "http://$indexserver/$indexname/losing_data/_mapping?pretty=1" -d '
{
  "losing_data" : {
    "_source" : {
      "enabled" : true,
      "excludes" : [ "content", "not_sourced" ]
    },
    "properties" : {
      "record_counts" : {
        "type" : "nested",
        "include_in_parent" : true,
        "properties" : {
          "first_count" : { "type" : "long" },
          "second_count" : { "type" : "long" }
        }
      },
      "description" : { "type" : "string" },
      "not_sourced" : { "type" : "string" },
      "blarf" : { "type" : "string" },
      "content" : { "type" : "attachment" }
    }
  }
}
'

file_path='test-1.rtf' # test-1.rtf is an RTF file containing the phrase "This contains red"
file_content=`cat $file_path | perl -MMIME::Base64 -ne 'print encode_base64($_)'`

json='
{
  "content" : "'${file_content}'",
  "not_sourced" : "Giraffe",
  "description" : "This is not about bears",
  "record_counts" : {
    "first_count" : 1,
    "second_count" : 100
  }
}
'
echo "$json" > json.file

# Index the record.
curl -XPUT "http://$indexserver/$indexname/losing_data/1" -d @json.file

# Searches before the update.
curl "http://$indexserver/$indexname/losing_data/_search?q=red&pretty=1" # should return one hit (from the file)
curl "http://$indexserver/$indexname/losing_data/_search?q=giraffe&pretty=1" # should return one hit (from not_sourced field)
curl "http://$indexserver/$indexname/losing_data/_search?q=bears&pretty=1" # should return one hit (from description)
curl "http://$indexserver/$indexname/losing_data/1?pretty=1" # record_counts > first_count should be 1

# Scripted update of a field that is in _source.
curl -XPOST "http://$indexserver/$indexname/losing_data/1/_update" -d '{
  "script" : "ctx._source.record_counts.first_count += 1",
  "lang" : "groovy"
}'

# Searches after the update.
curl "http://$indexserver/$indexname/losing_data/1/?pretty=1" # record_counts > first_count should be 2
curl "http://$indexserver/$indexname/losing_data/_search?q=red&pretty=1" # I get 0 hits
curl "http://$indexserver/$indexname/losing_data/_search?q=giraffe&pretty=1" # I get 0 hits
curl "http://$indexserver/$indexname/losing_data/_search?q=bears&pretty=1" # description is in source, so I still get a hit