Hi Ashish,

Thanks for the solution. I made the changes and I can see the JSON message now. There is a JIRA raised for the same issue:
https://issues.apache.org/jira/browse/FLUME-2126

When I load JSON data from Hive, it automatically splits the JSON fields into different columns. For some reason the ES sink doesn't load the data the same way (the whole document ends up as an escaped string in @message, as shown below). I am not sure if I am setting the correct type. In the Hive table there is a parameter, es.input.json, that I have to set to true. Is there a similar setting for the ES sink?

Here is the raw data I am getting in Kibana:

{
  "_index": "test-2014-05-08",
  "_type": "parsed_logs",
  "_id": "7qSBgRx-Q_GLaCDWARs_Cg",
  "_score": null,
  "_source": {
    "@message": "{\"action\":{\"id\":\"00001\"}}",
    "@timestamp": "2014-05-08T16:48:44.180Z",
    "@type": "application/json",
    "@fields": {
      "_attachment_mimetype": "application/json",
      "timestamp": "1399567724180",
      "_type": "application/json",
      "type": "application/json"
    }
  },
  "sort": [
    1399567724180
  ]
}

On Sun, Apr 13, 2014 at 4:56 PM, Ashish <[email protected]> wrote:

> little more on the issue
>
> builder.field(fieldName, tmp); calls the XContentBuilder API where class
> type is determined and appropriate method is called. Since tmp, which is
> instance of XContentBuilder, doesn't match any of the defined if conditions
> it goes to final else where the tmp.toString() is called, and field(String,
> String) method is called so we get object address in index.
>
> Replacing
> builder.field(fieldName, tmp);
> with
> builder.field(fieldName, tmp.string());
>
> shall make things work, but I am not sure if this would be the best way to
> use the API.
>
> Got the answer from ES user list :)
>
> http://elasticsearch-users.115913.n3.nabble.com/Issue-with-posting-json-data-to-elastic-search-via-Flume-td4054017.html
>
> Can ES experts comment on the best way forward?
>
>
> On Sun, Apr 13, 2014 at 8:10 PM, Ashish <[email protected]> wrote:
>
>> Have been able to reproduce the problem locally using the existing test
>> cases inside ES Sink. The problem does exist.
>>
>> Did some initial investigation, the framework is able to detect the JSON
>> content and tries to add it as complex field.
>> timestamp is added only if present in header.
>>
>> In the class org.apache.flume.sink.elasticsearch.ContentBuilderUtil
>>
>> public static void addComplexField(XContentBuilder builder, String fieldName,
>>     XContentType contentType, byte[] data) throws IOException {
>>   XContentParser parser = null;
>>   try {
>>     XContentBuilder tmp = jsonBuilder();
>>     parser = XContentFactory.xContent(contentType).createParser(data);
>>     parser.nextToken();
>>     tmp.copyCurrentStructure(parser);
>>     builder.field(fieldName, tmp);   <<<< This is where we might have
>> an issue (real action is happening inside this method call)
>>
>> Can someone familiar with this part look further into this? I shall debug
>> further as soon as I have free cycles.
>>
>> thanks
>> ashish
>>
>>
>> On Fri, Apr 11, 2014 at 5:24 PM, Deepak Subhramanian <
>> [email protected]> wrote:
>>
>>> Thanks Simon. I am also struggling with no luck. I tried using the
>>> latest Flume Elasticsearch sink jar built from 1.5SNAPSHOT, but still no
>>> luck. I will try to see if it is an issue with the Elasticsearch API. When I
>>> loaded JSON using Hive it loaded the JSON properly. But we have to pass a
>>> property es.input.json in Hive. Is there a way to pass the same in Flume?
>>>
>>> CREATE EXTERNAL TABLE json (data STRING)
>>> STORED BY 'org.elasticsearch.hadoop.hive.EsStorageHandler'
>>> TBLPROPERTIES('es.resource' = '...',
>>>               'es.input.json' = 'yes');
>>>
>>
>> --
>> thanks
>> ashish
>>
>> Blog: http://www.ashishpaliwal.com/blog
>> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>>
>
> --
> thanks
> ashish
>
> Blog: http://www.ashishpaliwal.com/blog
> My Photo Galleries: http://www.pbase.com/ashishpaliwal
>

--
Deepak Subhramanian
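
For anyone following the thread: the fall-through Ashish describes (a value whose type matches none of the checks ends up in the final else branch and is rendered via toString()) can be illustrated with a small stand-alone Java sketch. It does not use the Elasticsearch API at all. NestedJson, FieldDispatchSketch and field(Object) are hypothetical names made up for this illustration, and the real XContentBuilder dispatch may differ in detail.

```java
/** Hypothetical stand-in for a builder that holds already-serialized JSON. */
final class NestedJson {
    private final String json;

    NestedJson(String json) {
        this.json = json;
    }

    /** Plays the role of tmp.string() in the thread: returns the JSON text. */
    String string() {
        return json;
    }
    // toString() is deliberately not overridden, so Object.toString() applies.
}

/** Toy model of a field(name, Object) method that dispatches on runtime type. */
final class FieldDispatchSketch {

    /** Returns the text that would be stored for the given value. */
    static String field(Object value) {
        if (value instanceof String) {
            return (String) value;
        } else if (value instanceof Number) {
            return value.toString();
        } else {
            // Unrecognized type: fall back to toString(). For NestedJson this is
            // Object.toString(), i.e. class name + "@" + hex hash code.
            return value.toString();
        }
    }

    public static void main(String[] args) {
        NestedJson tmp = new NestedJson("{\"action\":{\"id\":\"00001\"}}");

        // Passing the object itself: prints the class name and a hash, not JSON.
        System.out.println(field(tmp));

        // Passing tmp.string(): prints the JSON text, but only as a plain string.
        System.out.println(field(tmp.string()));
    }
}
```

The second call mirrors the tmp.string() change: the content is stored, but as a plain string rather than a nested object, which would be consistent with the escaped "@message" value in the Kibana output above.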
