Github user justinleet commented on a diff in the pull request:

    https://github.com/apache/metron/pull/824#discussion_r150854643

    --- Diff: metron-platform/metron-indexing/src/main/java/org/apache/metron/indexing/dao/HBaseDao.java ---
    @@ -135,8 +138,9 @@ private Document getDocumentFromResult(Result result) throws IOException {
         Map.Entry<byte[], byte[]> entry= columns.lastEntry();
         Long ts = Bytes.toLong(entry.getKey());
         if(entry.getValue()!= null) {
    -      String json = new String(entry.getValue());
    -      return new Document(json, Bytes.toString(result.getRow()), null, ts);
    +      Map<String, Object> json = JSONUtils.INSTANCE.load(new String(entry.getValue()), new TypeReference<Map<String, Object>>() {
    +      });
    +      return new Document(json, Bytes.toString(result.getRow()), (String) json.get(SOURCE_TYPE), ts);
    --- End diff --

    I would prefer to see one of two things happen here. Either we keep the constant in the ES-specific classes (which is admittedly less than ideal, but it does limit the pollution of ES knowledge into HBase classes) and populate the source type from there (basically moving the loading and source type population there). Alternatively, we pass in a more general function that can be applied to the fields, and configure and handle it appropriately.

    I think the second one is probably more generally useful to be able to do, but given the state of the ES5 upgrade making this particular case obsolete, I'm amenable to doing the first option. At bare minimum we should replace the '.'s with ':'s only if present. Even if there's not a Solr implementation, I don't want HBaseDao tied to ES so directly.

    @cestella Do you have a preference on implementation? I know you'd had some comments earlier, but I don't want to put words in your mouth.
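    For reference, the second option (passing in a more general function that is applied to the fields) might look roughly like the sketch below. This is only an illustration under assumptions, not the actual Metron code: `SimpleDocument`, `GenericDao`, `esSourceType` and `ES_SOURCE_TYPE_KEY` are hypothetical names, the key value "source:type" is assumed, and the fields are taken as an already-parsed map so the example has no dependency on `JSONUtils` or HBase.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

public class SourceTypeSketch {

  // Hypothetical stand-in for Metron's Document class; not the real type.
  public static final class SimpleDocument {
    public final Map<String, Object> fields;
    public final String guid;
    public final String sourceType;
    public final long timestamp;

    public SimpleDocument(Map<String, Object> fields, String guid,
                          String sourceType, long timestamp) {
      this.fields = fields;
      this.guid = guid;
      this.sourceType = sourceType;
      this.timestamp = timestamp;
    }
  }

  // Hypothetical store-agnostic DAO: it knows nothing about ES field names
  // and delegates source type lookup to whatever function it was given.
  public static final class GenericDao {
    private final Function<Map<String, Object>, String> sourceTypeFn;

    public GenericDao(Function<Map<String, Object>, String> sourceTypeFn) {
      this.sourceTypeFn = sourceTypeFn;
    }

    public SimpleDocument toDocument(Map<String, Object> fields, String guid, long ts) {
      return new SimpleDocument(fields, guid, sourceTypeFn.apply(fields), ts);
    }
  }

  // In this sketch the key lives with the ES-specific code.
  // The value "source:type" is an assumption made for illustration.
  public static final String ES_SOURCE_TYPE_KEY = "source:type";

  // ES-specific extractor; returns null when the field is absent.
  public static String esSourceType(Map<String, Object> fields) {
    Object value = fields.get(ES_SOURCE_TYPE_KEY);
    return value == null ? null : value.toString();
  }

  public static void main(String[] args) {
    Map<String, Object> fields = new HashMap<>();
    fields.put(ES_SOURCE_TYPE_KEY, "bro");

    // The caller wires the ES-specific function into the generic DAO.
    GenericDao dao = new GenericDao(SourceTypeSketch::esSourceType);
    SimpleDocument doc = dao.toDocument(fields, "guid-1", 1L);
    System.out.println(doc.sourceType);
  }
}
```

    With this shape the DAO only depends on `java.util.function.Function`, so a different index implementation could supply its own extractor without the HBase side changing.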
---