As we create a new index every day, we're not concerned with retroactively applying the fix to existing indices, so templates seem to be the way to go here.
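For the record, what we have in mind is an index template carrying a `dynamic_templates` entry with a `path_match` wildcard, PUT to `/_template/` (a sketch against the ES 1.x API — the template name, the `_default_` type, and the `not_analyzed` choice are our own assumptions, not anything prescribed):

```json
{
  "template": "logstash-*",
  "mappings": {
    "_default_": {
      "dynamic_templates": [
        {
          "params_as_strings": {
            "path_match": "apiservice.logstash.@fields.parameters.*",
            "mapping": {
              "type": "string",
              "index": "not_analyzed"
            }
          }
        }
      ]
    }
  }
}
```

Since we roll a new index every day, the next day's `logstash-*` index should pick this up automatically; existing indices keep their old mappings.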
On Friday, February 6, 2015 at 2:52:49 PM UTC, John Smith wrote:
>
> A template won't help you here. I mean, it's good to use them and you should. But once the schema is defined you can't change it. This is no different than any database.
>
> Your best bet here is to do a bit of data cleansing/normalizing.
>
> If you know that the field is a date field and the date format is sometimes different, then you have to try to convert it to a proper date format before inserting. Especially if you are trying to push it all into one field.
>
> Even if you use wildcards in templates as suggested above, you would have to know that the date is different to have it pushed to another field.
>
> On Friday, 6 February 2015 06:41:49 UTC-5, Itamar Syn-Hershko wrote:
>>
>> You mean something like dynamic templates?
>> http://code972.com/blog/2015/02/81-elasticsearch-one-tip-a-day-using-dynamic-templates-to-avoid-rigorous-mappings
>>
>> --
>> Itamar Syn-Hershko
>> http://code972.com | @synhershko <https://twitter.com/synhershko>
>> Freelance Developer & Consultant
>> Lucene.NET committer and PMC member
>>
>> On Fri, Feb 6, 2015 at 1:39 PM, Paul Kavanagh <pkav...@shopkeep.com> wrote:
>>
>>> Hi all,
>>> We're having a MapperParsingException problem with some field values we get when we use the JSON filter for Logstash to explode a JSON document out into Elasticsearch fields.
>>>
>>> In 99.9% of cases, certain of these fields are either blank or contain dates in the format yyyy-MM-dd. This allows ES to dynamically map the field to type dateOptionalTime.
>>> However, we occasionally see non-standard date formats in these fields, which our main service handles fine, but which throw a MapperParsingException in Elasticsearch — such as here:
>>>
>>> [2015-02-06 10:46:50,679][WARN ][cluster.action.shard ] [logging-production-elasticsearch-ip-xxx-xxx-xxx-148] [logstash-2015.02.06][2] received shard failed for [logstash-2015.02.06][2], node[GZpltBjAQUqGyp2B1SLz_g], [R], s[INITIALIZING], indexUUID [BEdTwj-QRuOZB713YAQwvA], reason [Failed to start shard, message [RecoveryFailedException[[logstash-2015.02.06][2]: Recovery failed from [logging-production-elasticsearch-ip-xxx-xxx-xxx-82][IALW-92RReiLffQjSL3I-g][logging-production-elasticsearch-ip-xxx-xxx-xxx-82][inet[ip-xxx-xxx-xxx-82.ec2.internal/xxx.xxx.xxx.82:9300]]{max_local_storage_nodes=1, aws_availability_zone=us-east-1e, aws_az=us-east-1e} into [logging-production-elasticsearch-ip-xxx-xxx-xxx-148][GZpltBjAQUqGyp2B1SLz_g][logging-production-elasticsearch-ip-xxx-xxx-xxx-148][inet[ip-xxx.xxx.xxx.148.ec2.internal/xxx.xxx.xxx.148:9300]]{max_local_storage_nodes=1, aws_availability_zone=us-east-1c, aws_az=us-east-1c}]; nested: RemoteTransportException[[logging-production-elasticsearch-ip-xxx-xxx-xxx-82][inet[/xxx.xxx.xxx.82:9300]][internal:index/shard/recovery/start_recovery]]; nested: RecoveryEngineException[[logstash-2015.02.06][2] Phase[2] Execution failed]; nested: RemoteTransportException[[logging-production-elasticsearch-ip-xxx-xxx-xxx-148][inet[/xxx.xxx.xxx.148:9300]][internal:index/shard/recovery/translog_ops]]; nested: MapperParsingException[failed to parse [apiservice.logstash.@fields.parameters.start_time]]; nested: MapperParsingException[failed to parse date field [Feb 5 2015 12:00 AM], tried both date format [dateOptionalTime], and timestamp number with locale []]; nested: IllegalArgumentException[Invalid format: "Feb 5 2015 12:00 AM"]; ]]
>>>
>>> [2015-02-06 10:46:53,685][WARN ][cluster.action.shard ] [logging-production-elasticsearch-ip-xxx-xxx-xxx-148] [logstash-2015.02.06][2] received shard failed for [logstash-2015.02.06][2], node[GZpltBjAQUqGyp2B1SLz_g], [R], s[INITIALIZING], indexUUID [BEdTwj-QRuOZB713YAQwvA], reason [master [logging-production-elasticsearch-ip-xxx-xxx-xxx-148][GZpltBjAQUqGyp2B1SLz_g][logging-production-elasticsearch-ip-xxx-xxx-xxx-148][inet[ip-xxx-xxx-xxx-148.ec2.internal/xxx.xxx.xxx.148:9300]]{max_local_storage_nodes=1, aws_availability_zone=us-east-1c, aws_az=us-east-1c} marked shard as initializing, but shard is marked as failed, resend shard failure]
>>>
>>> Our planned solution was to create a template for Logstash indices that sets these fields to string. But since the field above isn't the only culprit, and more may be added over time, it makes more sense to create a template that maps all fields under apiservice.logstash.@fields.parameters.* to string. (We never need to query on user-entered data, but it's great to have it logged for debugging.)
>>>
>>> Is it possible to do this with a template? I could not find a way to do it in the template documentation on the ES site.
>>>
>>> Any guidance would be great!
>>>
>>> Thanks,
>>> -Paul
>>>
>>> --
>>> You received this message because you are subscribed to the Google Groups "elasticsearch" group.
>>> To unsubscribe from this group and stop receiving emails from it, send an email to elasticsearc...@googlegroups.com.
>>> To view this discussion on the web visit https://groups.google.com/d/msgid/elasticsearch/6ca4030f-b6bb-4907-b2fc-e3166fa2a6af%40googlegroups.com.
>>> For more options, visit https://groups.google.com/d/optout.
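Separately, if we ever do need these values as real dates, the data cleansing John suggested could be done in Logstash with a date filter that normalizes the odd format before indexing. This is only a sketch — the event field path and the Joda-style pattern for values like "Feb 5 2015 12:00 AM" are guesses based on the log excerpt above:

```
filter {
  # Try the known formats; on success, the parsed timestamp is written
  # back into the field, so Elasticsearch sees a consistent value.
  date {
    match  => [ "[@fields][parameters][start_time]",
                "yyyy-MM-dd",
                "MMM d yyyy hh:mm a",
                "MMM  d yyyy hh:mm a" ]
    target => "[@fields][parameters][start_time]"
  }
}
```

If none of the patterns match, the date filter tags the event (_dateparsefailure by default) and leaves the original value alone, so this shouldn't break the 99.9% of documents that are already fine.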