[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user asfgit closed the pull request at: https://github.com/apache/metron/pull/1022 ---
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user nickwallen commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r191585931 --- Diff: metron-platform/metron-common/src/main/java/org/apache/metron/common/field/FieldNameConverters.java --- @@ -0,0 +1,98 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.metron.common.field; + +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; + +/** + * Enumerates a set of {@link FieldNameConverter} implementations. + * + * Provides shared instances of each {@link FieldNameConverter}. + * + * Allows the field name converter to be specified using a short-hand + * name, rather than the entire fully-qualified class name. + */ +public enum FieldNameConverters { --- End diff -- Ok. Check 'er out. ---
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user cestella commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r191574312 --- Diff: metron-platform/metron-common/src/main/java/org/apache/metron/common/field/FieldNameConverters.java --- @@ -0,0 +1,98 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package org.apache.metron.common.field; + +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; + +/** + * Enumerates a set of {@link FieldNameConverter} implementations. + * + * Provides shared instances of each {@link FieldNameConverter}. + * + * Allows the field name converter to be specified using a short-hand + * name, rather than the entire fully-qualified class name. + */ +public enum FieldNameConverters { --- End diff -- Minor nit, can we have this implement `FieldNameConverter` so the enum can be used directly without calling `get()`? ---
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user nickwallen commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190977652 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change --- End diff -- I can add the `create` to the `FieldNameConverters`. That makes sense here. Its going to be slightly different than what you have above, but is what you were going for. ---
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user mmiklavc commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190970420 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change --- End diff -- Ok, I see what you mean. Agreed, it would be good to have some tests around that `write` method at some point, but I think it's ok to keep that out of scope for this particular PR. We have a WriterBoltIntegrationTest, but it's only covering parsers right now. At a later time we could look at expanding it to cover multiple writers and topologies and provide hooks on those builders to enable one-time setup and teardown across a suite of tests. ---
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user cestella commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190966698 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/SharedFieldNameConverterFactory.java --- @@ -0,0 +1,72 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; + +/** + * A {@link FieldNameConverterFactory} that returns instances of {@link FieldNameConverter} + * that are shared and reused. + * + * The instances are created and managed by the {@link FieldNameConverters} class. + */ +public class SharedFieldNameConverterFactory implements FieldNameConverterFactory { + --- End diff -- This looks like it's moving in the right direction, but as @mmiklavc gave an example for, why would we not replace this class with a `static create(String sensorType, WriterConfig config)` method on the enum? ---
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user nickwallen commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190964321 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change --- End diff -- > I'm not sure I follow about the testing If you look closely at those tests, all that class tests is the response using `buildWriteReponse`. It does not actually test the class using the `write` method. When you try to do that, it blows up because of dependency issues. ---
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user mmiklavc commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190956913 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change --- End diff -- Yes, agreed on the possible need for handling other writers. Is there a reason we wouldn't put that logic in the enum factory rather than creating another new class for it? It wouldn't even need to depend on the WriterConfiguration, actually. ``` public enum FieldNameConverters implements FieldNameConverter { NOOP(new NoopFieldNameConverter()), DEDOT(new DeDotFieldNameConverter()); private FieldNameConverter converter; FieldNameConverters(FieldNameConverter converter) { this.converter = converter; } public FieldNameConverter create(String sensorType, Optional converterName) { // default to the 'DEDOT' field name converter to maintain backwards compatibility return FieldNameConverters.valueOf(converterName.orElse("DEDOT")); } @Override public String convert(String originalField) { return converter.convert(originalField); } } ``` I'm not sure I follow about the testing - do you mean integration testing? We have an ElasticsearchWriterTest class that @justinleet wrote - https://github.com/apache/metron/blob/master/metron-platform/metron-elasticsearch/src/test/java/org/apache/metron/elasticsearch/writer/ElasticsearchWriterTest.java ---
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user nickwallen commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190950314 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change --- End diff -- The prior commit 'b18c3bf' was just me fixing my original caching implementation, so at least I had a version of that with the bug fixed. The latest commit 'cee3994' removes the cache and relies on the `FactoryNameConverters` class to hand out shared instances. This is what we have discussed. ---
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user nickwallen commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190949109 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change --- End diff -- The primary reason the logic was extracted into a separate class is because the `ElasticsearchWriter` cannot be easily unit tested as it stands today. I ran into tons of library conflicts when trying to do that. Keeping it separate also will make it easier to push out similar logic to other writers in the future; HDFS, Solr, should that be needed. ---
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user mmiklavc commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190946427 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change --- End diff -- I actually liked what you had with your enum pattern. I was imaging a very minor change to your code, like below. The other change would be having the default converter value provided by the config class, or making it an Optional if you wanted the es writer to own setting the default: ``` public interface FieldNameConverter { String convert(String originalField); } public enum FieldNameConverters implements FieldNameConverter { NOOP(new NoopFieldNameConverter()), DEDOT(new DeDotFieldNameConverter()); private FieldNameConverter converter; FieldNameConverters(FieldNameConverter converter) { this.converter = converter; } @Override public String convert(String originalField) { return converter.convert(originalField); } } public class ElasticsearchWriter implements BulkMessageWriter, Serializable { write( ... ) { String converterType = config.getFieldNameConverter(sensorType); // or this -> String converterType = config.getFieldNameConverter(sensorType).orElse("DEDOT"); // fetch the field name converter for this sensor type FieldNameConverter fieldNameConverter = FieldNameConverters.valueOf(converterType); final String indexPostfix = dateFormat.format(new Date()); BulkRequestBuilder bulkRequest = client.prepareBulk(); for(JSONObject message: messages) { JSONObject esDoc = new JSONObject(); for(Object k : message.keySet()){ copyField(k.toString(), message, esDoc, fieldNameConverter); } ... } ... } private void copyField(... FieldNameConverter fieldNameConverter) { String destinationFieldName = fieldNameConverter.convert(sourceFieldName); ... } ... } ``` ---
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user nickwallen commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190926866 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change --- End diff -- I still need a place to house the (sourceType, WriterConfiguration) -> FieldNameConverter logic. I will remove the cache mechanism though. ---
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user mmiklavc commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190915687 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change --- End diff -- I definitely appreciate the thinking around state though I think we can safely leave that out for now. This isn't client-facing, so refactoring later would have less consequence than something we expose via config or the UI. I think you could probably use the enum as a factory directly and use FieldNameConverter converter = FieldNameConverters.valueOf(converterName).get() in ElasticsearchWriter line 80 directly. We do this a number of places as a combination strategy/factory pattern. https://github.com/apache/metron/blob/master/metron-platform/metron-common/src/main/java/org/apache/metron/common/utils/CompressionStrategies.java https://github.com/apache/metron/blob/master/metron-platform/metron-enrichment/src/main/java/org/apache/metron/enrichment/bolt/UnifiedEnrichmentBolt.java#L194 I definitely get wanting to provide a factory, and I think you effectively get that with the enum you already created. ---
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user nickwallen commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190742938 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change --- End diff -- So you're saying, hey why not just... 1. Take out the caching and just have something like a `SimpleFieldNameConverterFactory`. This would translate the (sourceType, WriterConfiguration) to the right `FieldNameConverter`. 2. The caching would be handled by `FieldNameConverters` in the sense that they are effectively singletons. I think that would work if all `FieldNameConverter` implementations are stateless. And they all are stateless today, but I tried not to make that a constraint. I actually tried to make this work whether they are stateless or not. But alas there is a bug in what I did because I need each call to FieldNameConverters.get() to return a new instance. I'd be open to removing the cache, if its not needed and we think `FieldNameConverter`s would always remain stateless. ---
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user cestella commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190738333 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change + * to this configuration is recognized once the old {@link FieldNameConverter} expires + * from the cache. + * + * Defining a shorter expiration interval allows config changes to be recognized more + * quickly, but also can impact performance negatively. + */ +public class CachedFieldNameConverterFactory implements FieldNameConverterFactory { + + private static final Logger LOG = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + /** + * A cache that contains a {@link FieldNameConverter} for each sensor type. + * + * A user can alter the {@link FieldNameConverter} for a given sensor at any time + * by altering the Indexing configuration. The actual {@link FieldNameConverter} + * in use for a given sensor will only change once the original converter has + * expired from the cache. + */ + private Cache fieldNameConverters; + + /** + * Creates a {@link CachedFieldNameConverterFactory}. + * + * @param expires The duration before {@link FieldNameConverter}s are expired. + * @param expiresUnits The units before {@link FieldNameConverter}s are expired. + */ + public CachedFieldNameConverterFactory(int expires, TimeUnit expiresUnits) { + +fieldNameConverters = createFieldNameConverterCache(expires, expiresUnits); + } + + /** + * Creates a {@link CachedFieldNameConverterFactory} where the cache expires after 5 minutes. + */ + public CachedFieldNameConverterFactory() { + +this(5, TimeUnit.MINUTES); + } + + /** + * Creates a {@link CachedFieldNameConverterFactory} using the given cache. This should only + * be used for testing. + * + * @param fieldNameConverters A {@link Cache} containing {@link FieldNameConverter}s. + */ + public CachedFieldNameConverterFactory(Cache fieldNameConverters) { + +this.fieldNameConverters = fieldNameConverters; + } + + /** + * Creates a cache of {@link FieldNameConverter}s, one for each source type. + * + * @return A cache of {@link FieldNameConverter}s. + */ + private Cache createFieldNameConverterCache(i
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user nickwallen commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190737770 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change + * to this configuration is recognized once the old {@link FieldNameConverter} expires + * from the cache. + * + * Defining a shorter expiration interval allows config changes to be recognized more + * quickly, but also can impact performance negatively. + */ +public class CachedFieldNameConverterFactory implements FieldNameConverterFactory { + + private static final Logger LOG = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + /** + * A cache that contains a {@link FieldNameConverter} for each sensor type. + * + * A user can alter the {@link FieldNameConverter} for a given sensor at any time + * by altering the Indexing configuration. The actual {@link FieldNameConverter} + * in use for a given sensor will only change once the original converter has + * expired from the cache. + */ + private Cache fieldNameConverters; + + /** + * Creates a {@link CachedFieldNameConverterFactory}. + * + * @param expires The duration before {@link FieldNameConverter}s are expired. + * @param expiresUnits The units before {@link FieldNameConverter}s are expired. + */ + public CachedFieldNameConverterFactory(int expires, TimeUnit expiresUnits) { + +fieldNameConverters = createFieldNameConverterCache(expires, expiresUnits); + } + + /** + * Creates a {@link CachedFieldNameConverterFactory} where the cache expires after 5 minutes. + */ + public CachedFieldNameConverterFactory() { + +this(5, TimeUnit.MINUTES); + } + + /** + * Creates a {@link CachedFieldNameConverterFactory} using the given cache. This should only + * be used for testing. + * + * @param fieldNameConverters A {@link Cache} containing {@link FieldNameConverter}s. + */ + public CachedFieldNameConverterFactory(Cache fieldNameConverters) { + +this.fieldNameConverters = fieldNameConverters; + } + + /** + * Creates a cache of {@link FieldNameConverter}s, one for each source type. + * + * @return A cache of {@link FieldNameConverter}s. + */ + private Cache createFieldNameConverterCache
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user cestella commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190730655 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change + * to this configuration is recognized once the old {@link FieldNameConverter} expires + * from the cache. + * + * Defining a shorter expiration interval allows config changes to be recognized more + * quickly, but also can impact performance negatively. + */ +public class CachedFieldNameConverterFactory implements FieldNameConverterFactory { + + private static final Logger LOG = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + /** + * A cache that contains a {@link FieldNameConverter} for each sensor type. + * + * A user can alter the {@link FieldNameConverter} for a given sensor at any time + * by altering the Indexing configuration. The actual {@link FieldNameConverter} + * in use for a given sensor will only change once the original converter has + * expired from the cache. + */ + private Cache fieldNameConverters; + + /** + * Creates a {@link CachedFieldNameConverterFactory}. + * + * @param expires The duration before {@link FieldNameConverter}s are expired. + * @param expiresUnits The units before {@link FieldNameConverter}s are expired. + */ + public CachedFieldNameConverterFactory(int expires, TimeUnit expiresUnits) { + +fieldNameConverters = createFieldNameConverterCache(expires, expiresUnits); + } + + /** + * Creates a {@link CachedFieldNameConverterFactory} where the cache expires after 5 minutes. + */ + public CachedFieldNameConverterFactory() { + +this(5, TimeUnit.MINUTES); + } + + /** + * Creates a {@link CachedFieldNameConverterFactory} using the given cache. This should only + * be used for testing. + * + * @param fieldNameConverters A {@link Cache} containing {@link FieldNameConverter}s. + */ + public CachedFieldNameConverterFactory(Cache fieldNameConverters) { + +this.fieldNameConverters = fieldNameConverters; + } + + /** + * Creates a cache of {@link FieldNameConverter}s, one for each source type. + * + * @return A cache of {@link FieldNameConverter}s. + */ + private Cache createFieldNameConverterCache(i
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user mmiklavc commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190727392 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change --- End diff -- I think I'm just not fully grokking why we need additional caching here rather than `FieldNameConverters.DEDOT.get();`. It seems like the enum handles caching by virtue of the constructors in the enum itself? `NOOP(new NoopFieldNameConverter())` ---
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user nickwallen commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190715353 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change --- End diff -- To help clarify, I could have implemented a different `FieldNameConverterFactory`; maybe one called `SimpleFieldNameConverterFactory`. This factory wouldn't even have to use a cache. Every time `ElasticsearchWriter` calls `create(String sensorType, WriterConfiguration config)`, it could just instantiate a new instance of the right `FieldNameConverter`. But of course, I didn't do that because this occurs for every field in every message that is written to Elasticsearch. The cache is there for performance. I offer this only as an example to help clarify the purpose of this class. ---
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user nickwallen commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190702283 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change --- End diff -- Maybe I am misunderstanding you, but the ConfigurationsUpdater ensures that our configuration classes stay in-sync with Zk. That's now what this class does. This maintains a `FieldNameConverter` instance for each source type that is then used by the `ElasticsearchWriter` to do the field name conversion. Are you proposing a different way to do this? ---
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user nickwallen commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190701068 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change + * to this configuration is recognized once the old {@link FieldNameConverter} expires + * from the cache. + * + * Defining a shorter expiration interval allows config changes to be recognized more + * quickly, but also can impact performance negatively. + */ +public class CachedFieldNameConverterFactory implements FieldNameConverterFactory { + + private static final Logger LOG = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + /** + * A cache that contains a {@link FieldNameConverter} for each sensor type. + * + * A user can alter the {@link FieldNameConverter} for a given sensor at any time + * by altering the Indexing configuration. The actual {@link FieldNameConverter} + * in use for a given sensor will only change once the original converter has + * expired from the cache. + */ + private Cache fieldNameConverters; + + /** + * Creates a {@link CachedFieldNameConverterFactory}. + * + * @param expires The duration before {@link FieldNameConverter}s are expired. + * @param expiresUnits The units before {@link FieldNameConverter}s are expired. + */ + public CachedFieldNameConverterFactory(int expires, TimeUnit expiresUnits) { + +fieldNameConverters = createFieldNameConverterCache(expires, expiresUnits); + } + + /** + * Creates a {@link CachedFieldNameConverterFactory} where the cache expires after 5 minutes. + */ + public CachedFieldNameConverterFactory() { + +this(5, TimeUnit.MINUTES); + } + + /** + * Creates a {@link CachedFieldNameConverterFactory} using the given cache. This should only + * be used for testing. + * + * @param fieldNameConverters A {@link Cache} containing {@link FieldNameConverter}s. + */ + public CachedFieldNameConverterFactory(Cache fieldNameConverters) { + +this.fieldNameConverters = fieldNameConverters; + } + + /** + * Creates a cache of {@link FieldNameConverter}s, one for each source type. + * + * @return A cache of {@link FieldNameConverter}s. + */ + private Cache createFieldNameConverterCache
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user mmiklavc commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190698390 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change + * to this configuration is recognized once the old {@link FieldNameConverter} expires + * from the cache. + * + * Defining a shorter expiration interval allows config changes to be recognized more + * quickly, but also can impact performance negatively. + */ +public class CachedFieldNameConverterFactory implements FieldNameConverterFactory { + + private static final Logger LOG = LoggerFactory.getLogger(MethodHandles.lookup().lookupClass()); + + /** + * A cache that contains a {@link FieldNameConverter} for each sensor type. + * + * A user can alter the {@link FieldNameConverter} for a given sensor at any time + * by altering the Indexing configuration. The actual {@link FieldNameConverter} + * in use for a given sensor will only change once the original converter has + * expired from the cache. + */ + private Cache fieldNameConverters; + + /** + * Creates a {@link CachedFieldNameConverterFactory}. + * + * @param expires The duration before {@link FieldNameConverter}s are expired. + * @param expiresUnits The units before {@link FieldNameConverter}s are expired. + */ + public CachedFieldNameConverterFactory(int expires, TimeUnit expiresUnits) { + +fieldNameConverters = createFieldNameConverterCache(expires, expiresUnits); + } + + /** + * Creates a {@link CachedFieldNameConverterFactory} where the cache expires after 5 minutes. + */ + public CachedFieldNameConverterFactory() { + +this(5, TimeUnit.MINUTES); + } + + /** + * Creates a {@link CachedFieldNameConverterFactory} using the given cache. This should only + * be used for testing. + * + * @param fieldNameConverters A {@link Cache} containing {@link FieldNameConverter}s. + */ + public CachedFieldNameConverterFactory(Cache fieldNameConverters) { + +this.fieldNameConverters = fieldNameConverters; + } + + /** + * Creates a cache of {@link FieldNameConverter}s, one for each source type. + * + * @return A cache of {@link FieldNameConverter}s. + */ + private Cache createFieldNameConverterCache(i
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
Github user mmiklavc commented on a diff in the pull request: https://github.com/apache/metron/pull/1022#discussion_r190700071 --- Diff: metron-platform/metron-elasticsearch/src/main/java/org/apache/metron/elasticsearch/writer/CachedFieldNameConverterFactory.java --- @@ -0,0 +1,155 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.metron.elasticsearch.writer; + +import com.github.benmanes.caffeine.cache.Cache; +import com.github.benmanes.caffeine.cache.Caffeine; +import org.apache.commons.lang.ClassUtils; +import org.apache.commons.lang3.StringUtils; +import org.apache.commons.lang3.exception.ExceptionUtils; +import org.apache.metron.common.configuration.writer.WriterConfiguration; +import org.apache.metron.common.field.DeDotFieldNameConverter; +import org.apache.metron.common.field.FieldNameConverter; +import org.apache.metron.common.field.FieldNameConverters; +import org.apache.metron.common.field.NoopFieldNameConverter; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.lang.invoke.MethodHandles; +import java.util.concurrent.TimeUnit; + +/** + * A {@link FieldNameConverterFactory} that is backed by a cache. + * + * Each sensor type can use a different {@link FieldNameConverter} implementation. + * + * The {@link WriterConfiguration} allows a user to define the {@link FieldNameConverter} + * that should be used for a given sensor type. + * + * The {@link FieldNameConverter}s are maintained in a cache for a fixed period of time + * after they are created. Once they expire, the {@link WriterConfiguration} is used to + * reload the {@link FieldNameConverter}. + * + * The user can change the {@link FieldNameConverter} in use at runtime. A change --- End diff -- I want to say that this is handled through the ConfigurationsUpdater classes already already. I believe the writer passes in an updated config on each call. https://github.com/apache/metron/blob/master/metron-platform/metron-common/src/main/java/org/apache/metron/common/zookeeper/configurations/ConfigurationsUpdater.java ---
[GitHub] metron pull request #1022: METRON-1569 Allow user to change field name conve...
GitHub user nickwallen opened a pull request: https://github.com/apache/metron/pull/1022 METRON-1569 Allow user to change field name conversion when indexing ⦠The `ElasticsearchWriter` has a mechanism to transform the field names of a message before it is written to Elasticsearch. Right now this mechanism is hard-coded to replace all '.' dots with ':' colons. This mechanism was needed for Elasticsearch 2.x which did not allow dots in field names. Now that Metron supports Elasticsearch 5.x this is no longer a problem. A user should be able to configure the field name transformation when writing to Elasticsearch, as needed. While it might have been simpler to just remove the de-dotting mechanism, this would break backwards compatibility. Taking this approach provides users with an upgrade path. ## Changes This change allows the user to configure the field name converter as part of the index writer configuration. Acceptable values include the following. * `DEDOT`: Replaces all '.' with ':' which is the default, backwards compatible behavior. * `NOOP`: No field name change. If no "fieldNameConverter" is defined, it defaults to using `DEDOT` which maintains backwards compatibility. A cache of `FieldNameConverter`s is maintained since the index writer configuration can be changed at run-time and each sensor has its own index writer configuration. An example configuration looks-like the following. ``` { "hdfs" : { "enabled" : false }, "elasticsearch" : { "index" : "bro", "batchSize" : 5, "enabled" : true, "fieldNameConverter": "NOOP" }, "solr" : { "enabled" : false } } ``` ## Code Changes * Added the `fieldNameConverter` parameter to the Index writer configuration. * Moved the `FieldNameConverter` implementations to a dedicated package in `metron-common`. * Renamed `ElasticsearchFieldNameConverter` to `DeDotFieldNameConverter`. * Implemented the `NoopFieldNameConverter` which does not modify the field name. * Created `FieldNameConverters` class that allows a user to specify either `DEDOT` or `NOOP` to choose the appropriate implementation. * Implemented a `CachedFieldNameConverterFactory` that encapsulates all the logic for choosing and instantiating the appropriate `FieldNameConverter`. * Updated `ElasticsearchWriter` to use the `CachedFieldNameConverterFactory`. * Updated the README to document the new configuration parameter. ## Manual Testing 1. Launch a development environment and login. ``` vagrant ssh sudo su - source /etc/default/metron ``` 1. Validate the environment by ensuring alerts are visible in the Alerts UI and that the Ambari Service Check completes successfully. This ensures that the change is backwards compatible. 1. Login to the Storm UI and enable DEBUG logging for `org.apache.metron.common` and `org.apache.metron.elasticsearch`. 1. The Storm worker logs in `/var/log/storm/worker-artifacts/random_access_indexing*/worker.log` should contain the following log statements, if you have enabled DEBUG logging correctly. This shows that the default `DEDOT` converter is in-use. ``` 2018-05-22 14:38:... [DEBUG] Renamed dotted field; original=source.type, new=source:type 2018-05-22 14:38:... [DEBUG] Renamed dotted field; original=adapter.geoadapter.end.ts, new=adapter:geoadapter:end:ts 2018-05-22 14:38:... [DEBUG] Renamed dotted field; original=threatintelsplitterbolt.splitter.end.ts, new=threatintelsplitterbolt:splitter:end:ts 2018-05-22 14:38:... [DEBUG] Renamed dotted field; original=adapter.threatinteladapter.begin.ts, new=adapter:threatinteladapter:begin:ts 2018-05-22 14:38:... [DEBUG] Renamed dotted field; original=enrichments.geo.ip_dst_addr.location_point, new=enrichments:geo:ip_dst_addr:location_point 2018-05-22 14:38:... [DEBUG] Renamed dotted field; original=adapter.threatinteladapter.end.ts, new=adapter:threatinteladapter:end:ts 2018-05-22 14:38:... [DEBUG] Renamed dotted field; original=enrichmentsplitterbolt.splitter.end.ts, new=enrichmentsplitterbolt:splitter:end:ts ``` 1. Launch the REPL. ``` ./bin/stellar -z $ZOOKEEPER ``` 1. Change the field name converter to NOOP. ``` [Stellar]>>> conf := SHELL_EDIT() { "hdfs" : { "enabled" : false }, "elasticsearch" : { "index" : "bro", "batchSize" : 5, "enabled" : true, "fieldNameConverter": "NOOP" }, "solr" : {