I also used Clint's example and tried to map it to a document and search 
the field, but I'm still getting HTML in the query results. Here is my 
code. I appreciate the help.

//Analyzer settings

PUT /foo/
{
 "settings": {
   "index" : {
      "analysis" : {
         "analyzer" : {
            "test_1" : {
               "char_filter" : [
                  "html_strip"
               ],
               "tokenizer" : "standard"
            }
         }
      }
   }
 }
}


//Mapping
PUT /foo/foo_type/_mapping
{
  "foo_type":{ 
         "properties" : {
                   "title": {
                         "type":"string",
                         "index": "analyzed", 
                         "analyzer":"test_1"
                         }
                       }
           }
}


GET /foo/foo_type/_mapping
{
   "foo": {
      "mappings": {
         "foo_type": {
            "properties": {
               "date": {
                  "type": "date",
                  "format": "dateOptionalTime"
               },
               "title": {
                  "type": "string",
                  "analyzer": "test_1"
               }
            }
         }
      }
   }
}


//Index
PUT /foo/foo_type/1
{
    "date" : "2009-11-15T14:12:12",
    "title" : "The quick & <b>brown</b> fox"
}


//Search
GET /foo/_search?pretty=true
{
   "fields": ["title"], 
    "query": {
        "query_string": {
            "query": "brown",
            "analyzer": "test_1"
        }
    }
}


//Results still showing HTML tags
"hits": [
         {
            "_index": "foo",
            "_type": "foo_type",
            "_id": "1",
            "_score": 0.076713204,
            "fields": {
               "title": [
                  "The quick & <b>brown</b> fox" 
               ]
            }
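
One way to check whether the analyzer itself strips the tags, independent 
of what the search response displays, is to run the sample text through 
the _analyze API. As far as I understand, the "fields" value in the hits 
is read back from the stored document, so it would show the original HTML 
either way. Below is a minimal Python sketch that only builds the request 
URL. The host and port (http://localhost:9200) and the helper name 
analyze_url are assumptions for illustration, and the analyzer/text query 
parameters follow the 1.x-style _analyze call.

```python
from urllib.parse import urlencode


def analyze_url(index, analyzer, text, host="http://localhost:9200"):
    """Build an _analyze URL for an analyzer defined on the given index.

    The host default is an assumption; adjust it for your cluster.
    """
    # urlencode escapes the HTML and the ampersand so the text survives
    # as a single query parameter.
    params = urlencode({"analyzer": analyzer, "text": text})
    return f"{host}/{index}/_analyze?{params}"


url = analyze_url("foo", "test_1", "The quick & <b>brown</b> fox")
print(url)
```

Requesting that URL (for example with curl) should list the tokens the 
analyzer emits. If the tags do not appear among the tokens, the char 
filter is probably doing its job and the HTML seen in the results is just 
the stored original.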



On Thursday, August 7, 2014 6:06:56 PM UTC-4, Jörg Prante wrote:
>
> Have you checked Clint's example?
>
> https://gist.github.com/clintongormley/780895
>
> Jörg
>
>
> On Thu, Aug 7, 2014 at 8:23 PM, IronMike <sabda...@gmail.com> wrote:
>
>> I would like to strip html tags for indexing. Here is a simple example I 
>> tried so far, but doesn't seem to strip html tags. Any ideas what's missing?
>>
>> //settings & Mappings
>> POST twitter
>> {
>>   "mappings": {
>>     "tweet" : {
>>       "properties" : {
>>         "message" : {
>>           "type" :    "string",
>>           "analyzer": "strip_html_analyzer"
>>         },
>>         "date" : {
>>           "type" :   "date"
>>         },
>>         "name" : {
>>           "type" :   "string"
>>         }
>>       }
>>     }
>>   },
>>   "settings": {
>>     "analysis": {
>>       "analyzer": {
>>         "strip_html_analyzer":{
>>             "type":"custom",
>>             "tokenizer":"standard",
>>             "filter":"standard",
>>             "char_filter":"my_html"
>>         }
>>       },
>>       "char_filter": {
>>           "my_html":{
>>               "type":"html_strip"
>>           }
>>       }
>>     }
>>   }
>> }
>>
>>
>> //Index a document
>> PUT /twitter/tweet/1
>> {
>>     "name" : "mike",
>>     "date" : "2009-11-15T14:12:12",
>>     "message" : "<html>trying out <b>Elasticsearch</b>, This is an html test</html>"
>> }
>>
>>
>> //query result for "html", I expect the query to return nothing since it 
>> is supposed to strip the tag?
>> "hits": {
>>       "total": 1,
>>       "max_score": 0.11626227,
>>       "hits": [
>>          {
>>             "_index": "twitter",
>>             "_type": "tweet",
>>             "_id": "1",
>>             "_score": 0.11626227,
>>             "fields": {
>>                "message": [
>>                   "<html>trying out <b>Elasticsearch</b>, This is an html test</html>"
>>                ]
>>             },
>>             "highlight": {
>>                "message": [
>>                   "<html>trying out <b>Elasticsearch</b>, This is an <em>html</em> test</html>"
>>                ]
>>             }
>>          }
>>       ]
>>    }
>
>

-- 
You received this message because you are subscribed to the Google Groups 
"elasticsearch" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to elasticsearch+unsubscr...@googlegroups.com.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/elasticsearch/a831f6f4-b47c-4c35-a40b-058e3c1b1043%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
