I want to store a bunch of documents in Elasticsearch, each representing a hit to a website and including the user agent of the client that made the original HTTP request.
Since user agent strings vary a lot, and the useful parts (OS, browser, version etc.) need parsing out, I would like to be able to perform aggregations on those extracted features. The simplest way I can think of to do this would be to parse the user agent string before indexing the document. The downside to this approach is that as new/different user agent strings emerge (which is not unlikely), you would have to proactively update the parser. This may be impossible/undesirable for a number of reasons, so what I'd really like to do is index the raw user agent string and then perform the analysis/feature extraction post-hoc at query time.

Any ideas/pointers on how to do this? Aggregators? Custom analyzers? (How would you handle an update to the analyzer? Would you need to re-run it against all existing stored data?)
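For reference, the index-time approach I have in mind looks roughly like the sketch below. It is only an illustration: the regex rules are deliberately tiny, and the field names (`user_agent_raw`, `user_agent`, `ua_parser_version`) and the `PARSER_VERSION` constant are made up for this example, not anything Elasticsearch defines. The raw string is kept next to the extracted features, so documents parsed with an older rule set could in principle be found and re-parsed later.

```python
import re

# Hypothetical version tag: bump it whenever the rules below change, so that
# documents parsed with older rules can be identified later.
PARSER_VERSION = 1

# Order matters: Chrome user agents also contain "Safari", so Chrome is
# checked before Safari.
BROWSER_PATTERNS = [
    ("Firefox", re.compile(r"Firefox/(\d+)")),
    ("Chrome", re.compile(r"Chrome/(\d+)")),
    ("Safari", re.compile(r"Version/(\d+).*Safari/")),
]

# Order matters here too: Android agents contain "Linux", and iOS agents
# contain "like Mac OS X".
OS_PATTERNS = [
    ("Windows", re.compile(r"Windows NT")),
    ("Android", re.compile(r"Android")),
    ("iOS", re.compile(r"iPhone|iPad")),
    ("Mac OS X", re.compile(r"Mac OS X")),
    ("Linux", re.compile(r"Linux")),
]


def parse_user_agent(ua):
    """Extract a few coarse features from a raw user agent string."""
    browser, version = "Other", None
    for name, pattern in BROWSER_PATTERNS:
        match = pattern.search(ua)
        if match:
            browser, version = name, int(match.group(1))
            break

    os_name = "Other"
    for name, pattern in OS_PATTERNS:
        if pattern.search(ua):
            os_name = name
            break

    return {"browser": browser, "browser_version": version, "os": os_name}


def build_document(hit, ua):
    """Return the dict that would be indexed for one website hit."""
    doc = dict(hit)  # copy so the caller's dict is not modified
    doc["user_agent_raw"] = ua  # keep the original string for later re-parsing
    doc["user_agent"] = parse_user_agent(ua)
    doc["ua_parser_version"] = PARSER_VERSION
    return doc
```

With something like this, aggregations would run on the extracted `user_agent` fields, and unrecognised agents fall into an "Other" bucket until the rules are updated, which is exactly the maintenance burden I would like to avoid.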