[
https://issues.apache.org/jira/browse/DRILL-8289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17598720#comment-17598720
]
ASF GitHub Bot commented on DRILL-8289:
---------------------------------------
cgivre commented on code in PR #2634:
URL: https://github.com/apache/drill/pull/2634#discussion_r960184271
##########
contrib/udfs/src/main/java/org/apache/drill/exec/udfs/ThreatHuntingFunctions.java:
##########
@@ -0,0 +1,179 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.udfs;
+
+import io.netty.buffer.DrillBuf;
+import org.apache.drill.exec.expr.DrillSimpleFunc;
+import org.apache.drill.exec.expr.annotations.FunctionTemplate;
+import org.apache.drill.exec.expr.annotations.Output;
+import org.apache.drill.exec.expr.annotations.Param;
+import org.apache.drill.exec.expr.holders.Float8Holder;
+import org.apache.drill.exec.expr.holders.VarCharHolder;
+
+import javax.inject.Inject;
+
+public class ThreatHuntingFunctions {
+ /**
+ * Punctuation pattern is useful for comparing log entries. It extracts all the punctuation and returns
+ * that pattern. Spaces are replaced with an underscore.
+ * <p>
+ * Usage: SELECT punctuation_pattern( string ) FROM...
+ */
+ @FunctionTemplate(names = {"punctuation_pattern", "punctuationPattern"},
+ scope = FunctionTemplate.FunctionScope.SIMPLE,
+ nulls = FunctionTemplate.NullHandling.NULL_IF_NULL)
+ public static class PunctuationPatternFunction implements DrillSimpleFunc {
+
+ @Param
+ VarCharHolder rawInput;
+
+ @Output
+ VarCharHolder out;
+
+ @Inject
+ DrillBuf buffer;
+
+ @Override
+ public void setup() {
+ }
+
+ @Override
+ public void eval() {
+
+ String input = org.apache.drill.exec.expr.fn.impl.StringFunctionHelpers.toStringFromUTF8(rawInput.start, rawInput.end, rawInput.buffer);
+
+ String punctuationPattern = input.replaceAll("[a-zA-Z0-9]", "");
+ punctuationPattern = punctuationPattern.replaceAll(" ", "_");
+
+ out.buffer = buffer;
+ out.start = 0;
+ out.end = punctuationPattern.getBytes().length;
Review Comment:
Fixed
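
For readers without the full diff, the string handling in the hunk above can be tried outside Drill. The following is a minimal sketch, not the PR's code: the class name PunctuationPatternSketch and the helper utf8Length are hypothetical, and only the two replaceAll calls are copied from the hunk. Note that String.getBytes() with no argument uses the JVM's default charset; the sketch passes UTF-8 explicitly on the assumption that UTF-8 is the intended encoding, which is not a statement about what the "Fixed" comment refers to.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical stand-alone class; not part of the Drill code base.
final class PunctuationPatternSketch {

  // Same two steps as the hunk: drop ASCII letters and digits,
  // then replace each remaining space with an underscore.
  static String punctuationPattern(String input) {
    String pattern = input.replaceAll("[a-zA-Z0-9]", "");
    return pattern.replaceAll(" ", "_");
  }

  // Byte length with an explicit charset (assumption: UTF-8 is wanted),
  // so the result does not depend on the JVM's default charset.
  static int utf8Length(String s) {
    return s.getBytes(StandardCharsets.UTF_8).length;
  }

  public static void main(String[] args) {
    String pattern = punctuationPattern("GET /index.html HTTP/1.1");
    System.out.println(pattern + " (" + utf8Length(pattern) + " bytes)");
  }
}
```

Because only ASCII letters and digits are removed, non-ASCII characters stay in the pattern, which is where the byte length and the character count can differ.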
> Add Threat Hunting Functions
> ----------------------------
>
> Key: DRILL-8289
> URL: https://issues.apache.org/jira/browse/DRILL-8289
> Project: Apache Drill
> Issue Type: New Feature
> Components: Functions - Drill
> Affects Versions: 2.0.0
> Reporter: Charles Givre
> Assignee: Charles Givre
> Priority: Major
> Fix For: 2.0.0
>
>
> # Threat Hunting Functions
> These functions are useful for threat hunting with Apache Drill. They
> were inspired by huntlib.[1]
> The functions are:
> * `punctuation_pattern(<string>)`: Extracts the pattern of punctuation in
> text.
> * `entropy(<string>)`: Calculates the Shannon entropy of a given string of
> text.
> * `entropyPerByte(<string>)`: Calculates the Shannon entropy of a given
> string of text, normalized by the string length.
> [1]: https://github.com/target/huntlib
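
The two entropy functions listed above can be illustrated with a short stand-alone sketch. It rests on assumptions the issue text does not confirm: a base-2 logarithm, frequencies counted over Java chars (UTF-16 code units) rather than bytes, and "normalized by the string length" read as dividing the entropy by the number of characters. The class name EntropySketch is hypothetical, and this is not the Drill implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-alone class; not part of the Drill code base.
final class EntropySketch {

  // Shannon entropy H = -sum(p * log2(p)) over character frequencies.
  // Assumption: base-2 logarithm, counted per Java char.
  static double entropy(String s) {
    if (s.isEmpty()) {
      return 0.0;
    }
    Map<Character, Integer> counts = new HashMap<>();
    for (char c : s.toCharArray()) {
      counts.merge(c, 1, Integer::sum);
    }
    double n = s.length();
    double h = 0.0;
    for (int count : counts.values()) {
      double p = count / n;
      h -= p * (Math.log(p) / Math.log(2));
    }
    return h;
  }

  // Assumption: normalization means dividing by the character count.
  static double entropyPerByte(String s) {
    return s.isEmpty() ? 0.0 : entropy(s) / s.length();
  }

  public static void main(String[] args) {
    for (String s : new String[] {"aaaa", "abcd", "x9$kQ2!z"}) {
      System.out.println(s + " " + entropy(s) + " " + entropyPerByte(s));
    }
  }
}
```

Under these assumptions a string of one repeated character scores 0, and a string of k distinct characters, each appearing once, scores log2(k).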