Github user justinleet commented on a diff in the pull request:

    https://github.com/apache/metron/pull/824#discussion_r150854643
  
    --- Diff: metron-platform/metron-indexing/src/main/java/org/apache/metron/indexing/dao/HBaseDao.java ---
    @@ -135,8 +138,9 @@ private Document getDocumentFromResult(Result result) throws IOException {
         Map.Entry<byte[], byte[]> entry= columns.lastEntry();
         Long ts = Bytes.toLong(entry.getKey());
         if(entry.getValue()!= null) {
    -      String json = new String(entry.getValue());
    -      return new Document(json, Bytes.toString(result.getRow()), null, ts);
    +      Map<String, Object> json = JSONUtils.INSTANCE.load(new String(entry.getValue()), new TypeReference<Map<String, Object>>() {
    +      });
    +      return new Document(json, Bytes.toString(result.getRow()), (String) json.get(SOURCE_TYPE), ts);
    --- End diff --
    
    I would prefer to see one of two things happen here. Either we keep the 
constant in the ES-specific classes (admittedly less than ideal, but it limits 
the leakage of ES knowledge into the HBase classes) and populate the source 
type from there, which basically means moving the JSON loading and source type 
population there. Alternatively, we pass in a more general function that can be 
applied to the fields, and configure and handle it appropriately.
    
    I think the second option is probably more generally useful, but given the 
state of the ES5 upgrade, which makes this particular case obsolete, I'm 
amenable to doing the first option.
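
    As a rough sketch of the second option (not code from this PR), the DAO 
could accept a function that derives the source type from the loaded fields, so 
that it never references an ES constant itself. `SourceTypeSketch`, `Doc`, 
`toDoc`, and `esStyleExtractor` are hypothetical stand-ins, and the 
`source:type` field name is an assumption used only for illustration.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the DAO logic; this is NOT the real HBaseDao.
class SourceTypeSketch {

  // Hypothetical, minimal Document-like holder used only for this sketch.
  static final class Doc {
    final Map<String, Object> fields;
    final String guid;
    final String sourceType;
    final Long timestamp;

    Doc(Map<String, Object> fields, String guid, String sourceType, Long timestamp) {
      this.fields = fields;
      this.guid = guid;
      this.sourceType = sourceType;
      this.timestamp = timestamp;
    }
  }

  // Store-agnostic construction: the caller supplies how the source type
  // is derived from the fields, so no index-specific key appears here.
  static Doc toDoc(Map<String, Object> fields, String guid, Long ts,
                   Function<Map<String, Object>, String> sourceTypeExtractor) {
    return new Doc(fields, guid, sourceTypeExtractor.apply(fields), ts);
  }

  // Example extractor that an ES-specific class could supply.
  // The "source:type" key is an assumption made for this sketch.
  static Function<Map<String, Object>, String> esStyleExtractor() {
    return fields -> {
      Object value = fields.get("source:type");
      return value == null ? null : value.toString();
    };
  }
}
```

    With this shape, the ES-specific code owns the field name and HBaseDao only 
applies whatever function it is configured with.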
    
    At a bare minimum, we should replace the '.'s with ':'s only if they are 
present.  Even if there's not a Solr implementation, I don't want HBaseDao tied 
to ES so directly.
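
    One possible reading of the "only if present" fallback, as a sketch: look 
the field up under its given name first, and try the ':' variant only when the 
name actually contains a '.'. `FieldLookupSketch` and `lookup` are hypothetical 
names, and the field names in the usage are assumptions.

```java
import java.util.Map;

// Hypothetical helper for illustration; not part of Metron.
class FieldLookupSketch {

  // Returns the value stored under `name`. If that key is absent and the
  // name contains '.', retries with every '.' replaced by ':'.
  // Returns null when neither form of the key is present.
  static Object lookup(Map<String, Object> fields, String name) {
    if (fields.containsKey(name)) {
      return fields.get(name);
    }
    if (name.indexOf('.') >= 0) {
      return fields.get(name.replace('.', ':'));
    }
    return null;
  }
}
```

    For example, `lookup(fields, "source.type")` would find a value stored under 
either `source.type` or `source:type`, without HBaseDao hard-coding the 
ES-style key.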
    
    @cestella Do you have a preference on implementation?  I know you had 
some comments earlier, but I don't want to put words in your mouth.

