Hello,

I am using the highlight API for a simple query like this:
curl localhost:9200/company_52fb7b90c8318c4dc800006b/_search -d'{ "fields": [], "query": { "filtered": { "query": { "match": { "_all": "i do not" } } } }, "highlight": { "fields": { "metadatas.*": { "number_of_fragments" : 1, "fragment_size" : 20 } } } }'

This should return snippets whose size does not exceed 20 characters. Most of the time this works. However, I have one document, analyzed with the same mappings, that yields really long snippets: in fact, the snippet is not truncated at all and contains the whole text.

Here is a sample working as expected:

{"took":21,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":19,"max_score":0.24860834,"hits":[{"_index":"company_52fb7b90c8318c4dc800006b","_type":"document","_id":"5309c5949ba7daaa265ffdd8","_score":0.24860834,"highlight":{"metadatas.text":[", and <em>do</em> not hesitate"]}},{"_index":"company_52fb7b90c8318c4dc800006b","_type":"document","_id":"5309c5949ba7daaa265ffdd6","_score":0.14883985,"highlight":{"metadatas.text":[" take his child.\n<em>I</em> <em>do</em>"]}},{"_index":"company_52fb7b90c8318c4dc800006b","_type":"document","_id":"5309c57a9ba7daaa265ffdc8","_score":0.1365959,"highlight":{"metadatas.text":[" resident of DC, <em>I</em> am"]}}]}}

And here is the unruly one:

{"took":122,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":19,"max_score":0.24860834,"hits":[{"_index":"company_52fb7b90c8318c4dc800006b","_type":"document","_id":"5309c5949ba7daaa265ffdd8","_score":0.24860834,"highlight":{"metadatas.text":[", and <em>do</em> not hesitate"]}},{"_index":"company_52fb7b90c8318c4dc800006b","_type":"document","_id":"5309c5949ba7daaa265ffdd6","_score":0.14883985,"highlight":{"metadatas.text":[" take his child.\n<em>I</em> <em>do</em>"]}},{"_index":"company_52fb7b90c8318c4dc800006b","_type":"document","_id":"5309c57a9ba7daaa265ffdc8","_score":0.1365959,"highlight":{"metadatas.text":[" resident of DC, <em>I</em> 
am"]}},{"_index":"company_52fb7b90c8318c4dc800006b","_type":"document","_id":"5309c57a9ba7daaa265ffdc7","_score":0.13437755,"highlight":{"metadatas.text":[".\n<em>I</em> <em>do</em> not enlighten those who are not eager to learn, nor arouse\nthose who are not anxious to give an explanation themselves. If <em>I</em>\nhave presented one corner of the square and they cannot come\nback to me with the other three, <em>I</em> should not go over the points\nagain.\n― Confucius\nBesides explaining JavaScript, this book tries to be an introduction to the basic\nprinciples of programming. Programming, it turns out, is hard. The\nfundamental rules are, most of the time, simple and clear. But programs,\nwhile built on top of these basic rules, tend to become complex enough to\nintroduce their own rules, their own complexity. Because of this, programming\nis rarely simple or predictable. As Donald Knuth, who is something of a\nfounding father of the field, says, it is an art.\nTo get something out of this book, more than just passive reading is required.\nTry to stay sharp, make an effort to solve the exercises, and only continue on\nwhen you are reasonably sure you understand the material that came before.\nThe computer programmer is a creator of universes for which he\nalone is responsible. Universes of virtually unlimited complexity can\nbe created in the form of computer programs.\n― Joseph Weizenbaum, Computer Power and Human Reason\nA program is many things. It is a piece of text typed by a programmer, it is\nthe directing force that makes the computer <em>do</em> what it does, it is data in the\ncomputer's memory, yet it controls the actions performed on this same\nmemory. Analogies that try to compare programs to objects we are familiar\nwith tend to fall short, but a superficially fitting one is that of a machine. The\ngears of a mechanical watch fit together ingeniously, and if the watchmaker\nwas any good, it will accurately show the time for many years. 
The elements\nof a program fit together in a similar way, and if the programmer knows what\nhe is doing, the program will run without crashing.\nA computer is a machine built to act as a host for these immaterial machines.\nComputers themselves can only <em>do</em> stupidly straightforward things. The reason\nthey are so useful is that they <em>do</em> these things at an incredibly high speed. A\nprogram can, by ingeniously combining many of these simple actions, <em>do</em> very\ncomplicated things.\nTo some of us, writing computer programs is a fascinating game. A program\nis a building of thought. It is costless to build, weightless, growing easily under\nour typing hands. If we get carried away, its size and complexity will grow out\nof control, confusing even the one who created it. This is the main problem of\nprogramming. It is why so much of today's software tends to crash, fail,\nscrew up.\nWhen a program works, it is beautiful. The art of programming is the skill of\ncontrolling complexity. The great program is subdued, made simple in its\ncomplexity.\nToday, many programmers believe that this complexity is best managed by\nusing only a small set of well-understood techniques in their programs. They\nhave composed strict rules about the form programs should have, and the\nmore zealous among them will denounce those who break these rules as bad\nprogrammers.\nWhat hostility to the richness of programming! To try to reduce it to\nsomething straightforward and predictable, to place a taboo on all the weird\nand beautiful programs. The landscape of programming techniques is\nenormous, fascinating in its diversity, still largely unexplored. It is certainly\nlittered with traps and snares, luring the inexperienced programmer into all\nkinds of horrible mistakes, but that only means you should proceed with\ncaution, keep your wits about you. As you learn, there will always be new\nchallenges, new territory to explore. 
The programmer who refuses to keep\nexploring will surely stagnate, forget his joy, lose the will to program (and\nbecome a manager).\nAs far as <em>I</em> am concerned, the definite criterion for a program is whether it is\ncorrect. Efficiency, clarity, and size are also important, but how to balance\nthese against each other is always a matter of judgement, a judgement that\neach programmer must make for himself. Rules of thumb are useful, but one\nshould never be afraid to break them.\nIn the beginning, at the birth of computing, there were no programming\nlanguages. Programs looked something like this:\n00110001 00000000 00000000\n00110001 00000001 00000001\n00110011 00000001 00000010\n01010001 00001011 00000010\n00100010 00000010 00001000\n01000011 00000001 00000000\n01000001 00000001 00000001\n00010000 00000010 00000000\n01100010 00000000 00000000\nThat is a program to add the numbers from one to ten together, and print out\nthe result (1 + 2 + ... + 10 = 55). It could run on a very simple kind of\ncomputer. To program early computers, it was necessary to set large arrays\nof switches in the right position, or punch holes in strips of cardboard and\nfeed them to the computer. You can imagine how this was a tedious,\nerror-prone procedure. Even the writing of simple programs required much\ncleverness and discipline, complex ones were nearly inconceivable.\nOf course, manually entering these arcane patterns of bits (which is what the\n1s and 0s above are generally called) did give the programmer a profound\nsense of being a mighty wizard. And that has to be worth something, in terms\nof job satisfaction.\nEach line of the program contains a single instruction. 
It could be written in\nEnglish like this:\nStore the number 0 in memory location 01.\nStore the number 1 in memory location 12.\nStore the value of memory location 1 in memory location 23.\nSubtract the number 11 from the value in memory location 24.\nIf the value in memory location 2 is the number 0, continue with\ninstruction 9\n5.\nAdd the value of memory location 1 to memory location 06.\nAdd the number 1 to the value of memory location 17.\nContinue with instruction 38.\nOutput the value of memory location 09.\nWhile that is more readable than the binary soup, it is still rather unpleasant.\nIt might help to use names instead of numbers for the instructions and\nmemory locations:\nSet 'total' to 0\nSet 'count' to 1\n[loop]\nSet 'compare' to 'count'\nSubtract 11 from 'compare'\nIf 'compare' is zero, continue at [end]\nAdd 'count' to 'total'\nAdd 1 to 'count'\nContinue at [loop]\n[end]\nOutput 'total'\nAt this point it is not too hard to see how the program works. Can you? The\nfirst two lines give two memory locations their starting values: total will be\nused to build up the result of the program, and count keeps track of the\nnumber that we are currently looking at. The lines using compare are probably\nthe weirdest ones. What the program wants to <em>do</em> is see if count is equal to\n11, in order to decide whether it can stop yet. Because the machine is so\nprimitive, it can only test whether a number is zero, and make a decision\n(jump) based on that. So it uses the memory location labelled compare to\ncompute the value of count - 11, and makes a decision based on that value.\nThe next two lines add the value of count to the result, and increment count\nby one every time the program has decided that it is not 11 yet.\nHere is the same program in JavaScript:\nvar total = 0, count = 1;\nwhile (count <= 10) {\ntotal += count;\ncount += 1;\n}\nprint(total);\nThis gives us a few more improvements. 
Most importantly, there is no need\nto specify the way we want the program to jump back and forth anymore.\nThe magic word while takes care of that. It continues executing the lines\nbelow it as long as the condition it was given holds: count <= 10, which means\n'count is less than or equal to 10'. Apparently, there is no need anymore to\ncreate a temporary value and compare that to zero. This was a stupid little\ndetail, and the power of programming languages is that they take care of\nstupid little details for us.\nFinally, here is what the program could look like if we happened to have the\nconvenient operations range and sum available, which respectively create a\ncollection of numbers within a range and compute the sum of a collection of\nnumbers:\nprint(sum(range(1, 10)));\nThe moral of this story, then, is that the same program can be expressed in\nlong and short, unreadable and readable ways. The first version of the\nprogram was extremely obscure, while this last one is almost English: print\nthe sum of the range of numbers from 1 to 10. (We will see in later chapters\nhow to build things like sum and range.)\nA good programming language helps the programmer by providing a more\nabstract way to express himself. It hides uninteresting details, provides\nconvenient building blocks (such as the while construct), and, most of the\ntime, allows the programmer to add building blocks himself (such as the sum\nand range operations).\nJavaScript is the language that is, at the moment, mostly being used to <em>do</em> all\nki......[truncated]

Am I doing anything wrong? Over the course of 3 months, the problem was reported only twice (on two distinct documents); all other documents behaved correctly. Interestingly, changing the query to something more complex returns a valid, correctly truncated snippet.
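For anyone trying to reproduce this, here is a minimal sketch (not part of my actual setup) of how the affected documents could be spotted in a search response. It assumes the response has already been parsed from JSON and uses the default <em> highlight tags shown above. The function name `oversized_fragments` and the `slack` parameter are hypothetical. The slack exists because the three well-behaved fragments above are each 21 characters long once the tags are stripped, so a strict comparison against 20 would flag them too.

```python
import re

# Matches the default <em> / </em> highlight tags seen in the responses above.
TAG_RE = re.compile(r"</?em>")


def oversized_fragments(response, fragment_size, slack=5):
    """Yield (doc_id, field, length) for each highlight fragment that is
    longer than fragment_size + slack once highlight tags are removed.

    `slack` is an illustrative tolerance, not an Elasticsearch setting.
    """
    limit = fragment_size + slack
    for hit in response.get("hits", {}).get("hits", []):
        # Hits without a "highlight" section are simply skipped.
        for field, fragments in hit.get("highlight", {}).items():
            for fragment in fragments:
                length = len(TAG_RE.sub("", fragment))
                if length > limit:
                    yield hit["_id"], field, length
```

Calling `list(oversized_fragments(parsed_response, 20))` on the second response above would be expected to report only the document whose snippet contains the whole text.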