Hi, 

Every time I run a POST request using _update, I notice that any indexed 
information I didn't put in _source appears to go missing.

Obviously, it would be ideal if I didn't have to store, for example, the 
contents of a several-megabyte file in _source in order to keep it in my 
record after calling the _update method on my index/mapping. 

To start, here is the version info for Elasticsearch:

{
  "status" : 200,
  "name" : "Feron",
  "version" : {
    "number" : "1.3.1",
    "build_hash" : "2de6dc5268c32fb49b205233c138d93aaf772015",
    "build_timestamp" : "2014-07-28T14:45:15Z",
    "build_snapshot" : false,
    "lucene_version" : "4.9"
  },
  "tagline" : "You Know, for Search"
}


Here's my cluster health: 


{
  "cluster_name" : "my-cluster",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 2,
  "number_of_data_nodes" : 2,
  "active_primary_shards" : 5,
  "active_shards" : 10,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0
}


A script for recreating the issue is attached. In it, I create a mapping and 
index a record using the attachment plugin. The record correctly matches searches 
on a field in _source, on a field excluded from _source, and within the content 
(attachment) field (also excluded from _source).
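
For quick before/after comparisons, the hit count can be pulled out of each 
search response. This is only a minimal sketch: hit_count is a hypothetical 
helper that is not part of the attached script, and it assumes the 1.x response 
shape, where "total" is a plain number and the first key inside the top-level 
"hits" object.

```shell
# hit_count (hypothetical helper): read a search response on stdin and
# print hits.total. Assumes "total" is the first key inside the top-level
# "hits" object, as in 1.x responses.
hit_count() {
  tr -d '\n' |
    sed -n 's/.*"hits"[[:space:]]*:[[:space:]]*{[[:space:]]*"total"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p'
}

# Usage against a live node (not run here):
#   curl -s "http://$indexserver/$indexname/losing_data/_search?q=giraffe" | hit_count
```

Running it before and after the _update call should make the change in hit 
counts for each query easy to compare.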


As soon as I make the POST request to …/_update, searches against fields 
excluded from _source return 0 hits.


Is the only solution to this to store all fields in _source if I plan on 
calling _update on the record?

indexserver="example.org:9200"
indexname="sample"
curl -XDELETE "http://$indexserver/$indexname/losing_data/_mapping?pretty=1"
curl -XPUT "http://$indexserver/$indexname/losing_data/_mapping?pretty=1" -d '
{
      "losing_data" : {
        "_source" : {
          "enabled": true,
          "excludes" : [ "content", "not_sourced" ]
        },
        "properties" : {
          "record_counts" : {
            "type" : "nested",
            "include_in_parent": true,
            "properties" : {
              "first_count" : {
                "type" : "long"
              },
              "second_count" : {
                "type" : "long"
              }
            }
          },

          "description" : {
            "type" : "string"
          },
          "not_sourced" : {
            "type" : "string"
          },
          "blarf" : {
            "type": "string"
          },
          "content" : {
            "type" : "attachment"
          }

        }
      }
}
'




file_path='test-1.rtf' # test-1.rtf is an RTF file containing the phrase "This contains red"
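# Encode the whole file as one unwrapped base64 string so it stays valid inside the JSON value
file_content=$(perl -MMIME::Base64 -0777 -ne 'print encode_base64($_, "")' "$file_path")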
json='
{
   "content" : "'${file_content}'",
   "not_sourced": "Giraffe",
   "description": "This is not about bears",
   "record_counts": {
    "first_count": 1,
    "second_count": 100
   }
}
'
echo "$json" > json.file
curl -XPUT "http://$indexserver/$indexname/losing_data/1" -d @json.file

curl "http://$indexserver/$indexname/losing_data/_search?q=red&pretty=1"
# should return one hit (from the file)
curl "http://$indexserver/$indexname/losing_data/_search?q=giraffe&pretty=1"
# should return one hit (from not_sourced field)
curl "http://$indexserver/$indexname/losing_data/_search?q=bears&pretty=1"
# should return one hit (from description)
# should return one hit (from description)

curl "http://$indexserver/$indexname/losing_data/1?pretty=1"
# record_counts > first_count should be 1
curl -XPOST "http://$indexserver/$indexname/losing_data/1/_update" -d '{
    "script": "ctx._source.record_counts.first_count += 1",
    "lang": "groovy"
}'
curl "http://$indexserver/$indexname/losing_data/1/?pretty=1"
# record_counts > first_count should be 2

curl "http://$indexserver/$indexname/losing_data/_search?q=red&pretty=1"
# I get 0 hits
curl "http://$indexserver/$indexname/losing_data/_search?q=giraffe&pretty=1"
# I get 0 hits
curl "http://$indexserver/$indexname/losing_data/_search?q=bears&pretty=1"
# description is in source, so I still get a hit
