Github user nickwallen commented on a diff in the pull request:
https://github.com/apache/metron/pull/827#discussion_r149158814
--- Diff:
metron-platform/metron-indexing/src/test/java/org/apache/metron/indexing/dao/InMemoryDao.java
---
@@ -221,12 +222,24 @@ public void batchUpdate(Map<Document,
Optional<String>> updates) throws IOExcept
}
}
- public Map<String, Map<String, FieldType>>
getColumnMetadata(List<String> indices) throws IOException {
- Map<String, Map<String, FieldType>> columnMetadata = new HashMap<>();
+ @Override
+ public Map<String, FieldType> getColumnMetadata(List<String> indices)
throws IOException {
+ Map<String, FieldType> indexColumnMetadata = new HashMap<>();
for(String index: indices) {
- columnMetadata.put(index, new HashMap<>(COLUMN_METADATA.get(index)));
+ Map<String, FieldType> columnMetadata = COLUMN_METADATA.get(index);
+ for (Entry entry: columnMetadata.entrySet()) {
+ String field = (String) entry.getKey();
+ FieldType type = (FieldType) entry.getValue();
+ if (indexColumnMetadata.containsKey(field)) {
+ if (!type.equals(indexColumnMetadata.get(field))) {
+ indexColumnMetadata.remove(field);
--- End diff --
> I agree with you that it would be confusing but I'm not sure what the
correct behavior should be.
Right. I don't think there is an obviously correct answer here.
Whichever approach we do take, I think we should have a big fat,
`LOG.error` here logging the duplicate field names, the index names, and what
the impact of this is.
> Should we include the field but just set the type to OTHER? This is how
the ElasticsearchDao treats fields it doesn't have type information for but so
it might be better to explicitly state this in the column metadata endpoint
response.
Do you know how the UI behaves if we were to do this? I think your idea
definitely sounds reasonable, but I'd want to see how the UI behaves with the
logic defined like this.
---