[jira] [Commented] (DRILL-8376) Add Distribution UDFs

2022-12-24 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17651851#comment-17651851
 ] 

ASF GitHub Bot commented on DRILL-8376:
---

cgivre opened a new pull request, #2729:
URL: https://github.com/apache/drill/pull/2729

   # [DRILL-8376](https://issues.apache.org/jira/browse/DRILL-8376): Add 
Distribution UDFs
   
   ## Description
   This PR adds several new UDFs to help with statistical analysis. The first is 
`width_bucket`, which mirrors the functionality of the PostgreSQL function of the 
same name 
(https://www.oreilly.com/library/view/sql-in-a/9780596155322/re91.html). This 
function is useful for building histograms of data.
   
   This PR also adds `kendall_correlation` and `pearson_correlation`, two 
functions for calculating correlation coefficients between two columns.
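   For reference, the math behind these three functions can be sketched in plain 
Java. This is only an illustration of the standard formulas, with hypothetical 
class and method names, not Drill's implementation:
   
   ```java
   // Illustrative only: the formulas behind width_bucket and the two
   // correlation UDFs. Names are invented for the example.
   public class DistributionSketch {
   
     // width_bucket(v, min, max, n): 1-based bucket index for an equi-width
     // histogram; 0 for values below the range, n + 1 at or above max.
     static int widthBucket(double v, double min, double max, int buckets) {
       if (v < min) {
         return 0;
       }
       if (v >= max) {
         return buckets + 1;
       }
       return (int) Math.floor((v - min) / (max - min) * buckets) + 1;
     }
   
     // Pearson's r: covariance of x and y normalized by their std deviations.
     static double pearson(double[] x, double[] y) {
       int n = x.length;
       double sx = 0, sy = 0, sxx = 0, syy = 0, sxy = 0;
       for (int i = 0; i < n; i++) {
         sx += x[i];
         sy += y[i];
         sxx += x[i] * x[i];
         syy += y[i] * y[i];
         sxy += x[i] * y[i];
       }
       return (sxy - sx * sy / n)
           / Math.sqrt((sxx - sx * sx / n) * (syy - sy * sy / n));
     }
   
     // Kendall's tau-a: (concordant - discordant pairs) / total pairs.
     static double kendall(double[] x, double[] y) {
       int n = x.length;
       int concordant = 0, discordant = 0;
       for (int i = 0; i < n; i++) {
         for (int j = i + 1; j < n; j++) {
           double s = Math.signum(x[i] - x[j]) * Math.signum(y[i] - y[j]);
           if (s > 0) {
             concordant++;
           } else if (s < 0) {
             discordant++;
           }
         }
       }
       return (concordant - discordant) / (0.5 * n * (n - 1));
     }
   }
   ```
   
   For example, `widthBucket(5.35, 0.024, 10.06, 5)` places 5.35 into bucket 3 
of a five-bucket histogram over [0.024, 10.06).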
   
   ## Documentation
   Updated README.
   
   ## Testing
   Added unit tests.




> Add Distribution UDFs
> -
>
> Key: DRILL-8376
> URL: https://issues.apache.org/jira/browse/DRILL-8376
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.21
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
>
> Add `width_bucket`, `pearson_correlation` and `kendall_correlation` to Drill



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8372) Unfreed buffers when running a LIMIT 0 query over delimited text

2022-12-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17651306#comment-17651306
 ] 

ASF GitHub Bot commented on DRILL-8372:
---

jnturton opened a new pull request, #2728:
URL: https://github.com/apache/drill/pull/2728

   # [DRILL-8372](https://issues.apache.org/jira/browse/DRILL-8372): Unfreed 
buffers when running a LIMIT 0 query over delimited text
   
   ## Description
   
   TODO
   
   ## Documentation
   N/A
   
   ## Testing
   TODO
   




> Unfreed buffers when running a LIMIT 0 query over delimited text
> 
>
> Key: DRILL-8372
> URL: https://issues.apache.org/jira/browse/DRILL-8372
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Text & CSV
>Affects Versions: 1.21.0
>Reporter: James Turton
>Assignee: James Turton
>Priority: Major
> Fix For: 1.21.0
>
>
> With the following data layout
>  
> {code:java}
> /tmp/foo/bar:
> large_csv.csvh
> /tmp/foo/boo:
> large_csv.csvh
> {code}
> a LIMIT 0 query over it results in unfreed buffer errors.
> {code:java}
> apache drill (dfs.tmp)> select * from `foo` limit 0;
> Error: SYSTEM ERROR: IllegalStateException: Allocator[op:0:0:4:EasySubScan] 
> closed with outstanding buffers allocated (3).
> Allocator(op:0:0:4:EasySubScan) 100/299008/3182592/100 
> (res/actual/peak/limit)
>   child allocators: 0
>   ledgers: 3
>     ledger[113] allocator: op:0:0:4:EasySubScan), isOwning: true, size: 
> 262144, references: 1, life: 277785186322881..0, allocatorManager: [109, 
> life: 277785186258906..0] holds 1 buffers.
>         DrillBuf[142], udle: [110 0..262144]
>     ledger[114] allocator: op:0:0:4:EasySubScan), isOwning: true, size: 
> 32768, references: 1, life: 277785186463824..0, allocatorManager: [110, life: 
> 277785186414654..0] holds 1 buffers.
>         DrillBuf[143], udle: [111 0..32768]
>     ledger[112] allocator: op:0:0:4:EasySubScan), isOwning: true, size: 4096, 
> references: 1, life: 277785186046095..0, allocatorManager: [108, life: 
> 277785185921147..0] holds 1 buffers.
>         DrillBuf[141], udle: [109 0..4096]
>   reservations: 0 {code}
>  





[jira] [Commented] (DRILL-8374) Set the Drill development version to 1.21.0-SNAPSHOT

2022-12-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17651280#comment-17651280
 ] 

ASF GitHub Bot commented on DRILL-8374:
---

jnturton opened a new pull request, #2727:
URL: https://github.com/apache/drill/pull/2727

   # [DRILL-8374](https://issues.apache.org/jira/browse/DRILL-8374): Set the 
Drill development version to 1.21.0-SNAPSHOT
   
   ## Description
   Changes the Maven version numbers in the Drill master branch from 2.0.0 to 
1.21.0. Discussion on the Drill mailing list established that the project would 
prefer to do a release in the near future rather than wait to build up a 
changeset for which a version jump to 2.0 would be appropriate.
   
   ## Documentation
   N/A
   
   ## Testing
   N/A
   




> Set the Drill development version to 1.21.0-SNAPSHOT
> 
>
> Key: DRILL-8374
> URL: https://issues.apache.org/jira/browse/DRILL-8374
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.21.0
>Reporter: James Turton
>Assignee: James Turton
>Priority: Trivial
> Fix For: 1.21.0
>
>
> Changes the Maven version numbers in the Drill master branch from 2.0.0 to 
> 1.21.0. Discussion on the Drill mailing list established that the project 
> would prefer to do a release in the near future rather than wait to build up 
> a changeset for which a version jump to 2.0 would be appropriate.





[jira] [Commented] (DRILL-8371) Add Write/Append Capability to Splunk Plugin

2022-12-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17651260#comment-17651260
 ] 

ASF GitHub Bot commented on DRILL-8371:
---

cgivre merged PR #2722:
URL: https://github.com/apache/drill/pull/2722




> Add Write/Append Capability to Splunk Plugin
> 
>
> Key: DRILL-8371
> URL: https://issues.apache.org/jira/browse/DRILL-8371
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Splunk
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> While Drill can currently read from Splunk indexes, it cannot write to them 
> or create them.  This proposed PR adds support for CTAS queries for Splunk as 
> well as INSERT and DROP TABLE. 





[jira] [Commented] (DRILL-8371) Add Write/Append Capability to Splunk Plugin

2022-12-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17651130#comment-17651130
 ] 

ASF GitHub Bot commented on DRILL-8371:
---

jnturton commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1055192926


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -98,27 +100,69 @@ public void updateSchema(VectorAccessible batch) {
   @Override
   public void startRecord() {
 logger.debug("Starting record");
-// Ensure that the new record is empty. This is not strictly necessary, 
but it is a belt and suspenders approach.
-splunkEvent.clear();
+// Ensure that the new record is empty.
+splunkEvent = new JSONObject();
   }
 
   @Override
-  public void endRecord() throws IOException {
+  public void endRecord() {
 logger.debug("Ending record");
+recordCount++;
+
+// Put event in buffer
+eventBuffer.add(splunkEvent);
+
 // Write the event to the Splunk index
-destinationIndex.submit(eventArgs, splunkEvent.toJSONString());
-// Clear out the splunk event.
-splunkEvent.clear();
+if (recordCount >= config.getPluginConfig().getWriterBatchSize()) {
+  try {
+writeEvents();
+  } catch (IOException e) {
+throw  UserException.dataWriteError(e)
+.message("Error writing data to Splunk: " + e.getMessage())
+.build(logger);
+  }
+
+  // Reset record count
+  recordCount = 0;
+}
   }
 
+
+  /*
+  args – Optional arguments for this stream. Valid parameters are: "host", 
"host_regex", "source", and "sourcetype".
+   */
   @Override
   public void abort() {
+logger.debug("Aborting writing records to Splunk.");
 // No op
   }
 
   @Override
   public void cleanup() {
-// No op
+try {
+  writeEvents();
+} catch (IOException e) {
+  throw  UserException.dataWriteError(e)
+  .message("Error writing data to Splunk: " + e.getMessage())
+  .build(logger);
+}
+  }
+
+  private void writeEvents() throws IOException {
+// Open the socket and stream, set up a timestamp
+destinationIndex.attachWith(new ReceiverBehavior() {

Review Comment:
   This results in a dedicated TCP socket being opened and closed for every 
writer batch.
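
   The hunk above switches from submitting each event individually to buffering 
events and flushing once a configured batch size is reached, with `cleanup()` 
draining any final partial batch. A minimal, hypothetical sketch of that 
flush-on-threshold pattern (the names are invented and the in-memory sink stands 
in for the Splunk socket; this is not the Splunk SDK API):
   
   ```java
   import java.util.ArrayList;
   import java.util.List;
   
   // Hypothetical illustration of the buffer-and-flush pattern; the List
   // sink stands in for the per-batch Splunk connection.
   public class BufferedEventWriter {
     private final int batchSize;
     private final List<String> eventBuffer = new ArrayList<>();
     private final List<String> sink = new ArrayList<>();
   
     public BufferedEventWriter(int batchSize) {
       this.batchSize = batchSize;
     }
   
     // Called once per record: buffer it, and flush a full batch eagerly.
     public void endRecord(String event) {
       eventBuffer.add(event);
       if (eventBuffer.size() >= batchSize) {
         flush();
       }
     }
   
     // Called once at the end: drain whatever is left in the buffer.
     public void cleanup() {
       flush();
     }
   
     private void flush() {
       sink.addAll(eventBuffer);  // one "connection" per batch, not per record
       eventBuffer.clear();
     }
   
     public int written() {
       return sink.size();
     }
   }
   ```
   
   In this shape each flush still costs one connection, which is the trade-off 
the comment above points out.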



##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -98,27 +100,69 @@ public void updateSchema(VectorAccessible batch) {
   @Override
   public void startRecord() {
 logger.debug("Starting record");
-// Ensure that the new record is empty. This is not strictly necessary, 
but it is a belt and suspenders approach.
-splunkEvent.clear();
+// Ensure that the new record is empty.
+splunkEvent = new JSONObject();
   }
 
   @Override
-  public void endRecord() throws IOException {
+  public void endRecord() {
 logger.debug("Ending record");
+recordCount++;
+
+// Put event in buffer
+eventBuffer.add(splunkEvent);
+
 // Write the event to the Splunk index
-destinationIndex.submit(eventArgs, splunkEvent.toJSONString());
-// Clear out the splunk event.
-splunkEvent.clear();
+if (recordCount >= config.getPluginConfig().getWriterBatchSize()) {
+  try {
+writeEvents();
+  } catch (IOException e) {
+throw  UserException.dataWriteError(e)
+.message("Error writing data to Splunk: " + e.getMessage())
+.build(logger);
+  }
+
+  // Reset record count
+  recordCount = 0;
+}
   }
 
+
+  /*
+  args – Optional arguments for this stream. Valid parameters are: "host", 
"host_regex", "source", and "sourcetype".
+   */
   @Override
   public void abort() {
+logger.debug("Aborting writing records to Splunk.");

Review Comment:
   Would there be any use in clearing eventBuffer here?





> Add Write/Append Capability to Splunk Plugin
> 
>
> Key: DRILL-8371
> URL: https://issues.apache.org/jira/browse/DRILL-8371
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Splunk
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> While Drill can currently read from Splunk indexes, it cannot write to them 
> or create them.  This proposed PR adds support for CTAS queries for Splunk as 
> well as INSERT and DROP TABLE. 





[jira] [Commented] (DRILL-8371) Add Write/Append Capability to Splunk Plugin

2022-12-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17650079#comment-17650079
 ] 

ASF GitHub Bot commented on DRILL-8371:
---

cgivre commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1053981884


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -0,0 +1,309 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+
+import com.splunk.Args;
+import com.splunk.Index;
+import com.splunk.IndexCollection;
+import com.splunk.Service;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.proto.UserBitShared.UserCredentials;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.store.AbstractRecordWriter;
+import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+public class SplunkBatchWriter extends AbstractRecordWriter {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(SplunkBatchWriter.class);
+  private static final String DEFAULT_SOURCETYPE = "drill";
+  private final UserCredentials userCredentials;
+  private final List<String> tableIdentifier;
+  private final SplunkWriter config;
+  private final Args eventArgs;
+  protected final Service splunkService;
+  private final JSONObject splunkEvent;
+  protected Index destinationIndex;
+
+
+  public SplunkBatchWriter(UserCredentials userCredentials, List<String> 
tableIdentifier, SplunkWriter config) {
+this.config = config;
+this.tableIdentifier = tableIdentifier;
+this.userCredentials = userCredentials;
+this.splunkEvent = new JSONObject();
+SplunkConnection connection = new 
SplunkConnection(config.getPluginConfig(), userCredentials.getUserName());
+this.splunkService = connection.connect();
+
+// Populate event arguments
+this.eventArgs = new Args();
+eventArgs.put("sourcetype", DEFAULT_SOURCETYPE);
+  }
+
+  @Override
+  public void init(Map<String, String> writerOptions) throws IOException {
+// No op
+  }
+
+  /**
+   * Update the schema in RecordWriter. Called before starting writing the 
records. In this case,
+   * we add the index to Splunk here. Splunk's API is a little sparse and 
doesn't really do much in the way
+   * of error checking or providing feedback if the operation fails.
+   *
+   * @param batch {@link VectorAccessible} The incoming batch
+   */
+  @Override
+  public void updateSchema(VectorAccessible batch) {
+logger.debug("Updating schema for Splunk");
+
+//Get the collection of indexes
+IndexCollection indexes = splunkService.getIndexes();
+try {
+  String indexName = tableIdentifier.get(0);
+  indexes.create(indexName);
+  destinationIndex = splunkService.getIndexes().get(indexName);
+} catch (Exception e) {
+  // We have to catch a generic exception here, as Splunk's SDK does not 
really provide any kind of
+  // failure messaging.
+  throw UserException.systemError(e)
+.message("Error creating new index in Splunk plugin: " + 
e.getMessage())
+.build(logger);
+}
+  }
+
+
+  @Override
+  public void startRecord() {
+logger.debug("Starting record");
+// Ensure that the new record is empty. This is not strictly necessary, 
but it is a belt and suspenders approach.
+splunkEvent.clear();

Review Comment:
   I removed this from the `endRecord` method.





> Add Write/Append Capability to Splunk Plugin
> 
>
> Key: DRILL-8371
> URL: https://issues.apache.org/jira/browse/DRILL-8371
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Splunk
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
>  

[jira] [Commented] (DRILL-8371) Add Write/Append Capability to Splunk Plugin

2022-12-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17650077#comment-17650077
 ] 

ASF GitHub Bot commented on DRILL-8371:
---

cgivre commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1053979483


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -0,0 +1,308 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+
+import com.splunk.Args;
+import com.splunk.Index;
+import com.splunk.IndexCollection;
+import com.splunk.Service;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.proto.UserBitShared.UserCredentials;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.store.AbstractRecordWriter;
+import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+public class SplunkBatchWriter extends AbstractRecordWriter {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(SplunkBatchWriter.class);
+  private static final String DEFAULT_SOURCETYPE = "drill";
+  private final UserCredentials userCredentials;
+  private final List<String> tableIdentifier;
+  private final SplunkWriter config;
+  private final Args eventArgs;
+  protected final Service splunkService;
+  private JSONObject splunkEvent;
+  protected Index destinationIndex;
+
+
+  public SplunkBatchWriter(UserCredentials userCredentials, List<String> 
tableIdentifier, SplunkWriter config) {
+this.config = config;
+this.tableIdentifier = tableIdentifier;
+this.userCredentials = userCredentials;
+
+SplunkConnection connection = new 
SplunkConnection(config.getPluginConfig(), userCredentials.getUserName());
+this.splunkService = connection.connect();
+
+// Populate event arguments
+this.eventArgs = new Args();
+eventArgs.put("sourcetype", DEFAULT_SOURCETYPE);
+  }
+
+  @Override
+  public void init(Map<String, String> writerOptions) throws IOException {
+// No op
+  }
+
+  /**
+   * Update the schema in RecordWriter. Called at least once before starting 
writing the records. In this case,
+   * we add the index to Splunk here. Splunk's API is a little sparse and 
doesn't really do much in the way
+   * of error checking or providing feedback if the operation fails.
+   *
+   * @param batch {@link VectorAccessible} The incoming batch
+   */
+  @Override
+  public void updateSchema(VectorAccessible batch) {
+logger.debug("Updating schema for Splunk");
+
+//Get the collection of indexes
+IndexCollection indexes = splunkService.getIndexes();
+try {
+  String indexName = tableIdentifier.get(0);
+  indexes.create(indexName);
+  destinationIndex = splunkService.getIndexes().get(indexName);
+} catch (Exception e) {
+  // We have to catch a generic exception here, as Splunk's SDK does not 
really provide any kind of
+  // failure messaging.
+  throw UserException.systemError(e)
+.message("Error creating new index in Splunk plugin: " + 
e.getMessage())
+.build(logger);
+}
+  }
+
+
+  @Override
+  public void startRecord() {
+logger.debug("Starting record");
+splunkEvent = new JSONObject();
+  }
+
+  @Override
+  public void endRecord() throws IOException {
+logger.debug("Ending record");
+// Write the event to the Splunk index
+destinationIndex.submit(eventArgs, splunkEvent.toJSONString());

Review Comment:
   @jnturton I figured this out. Using Splunk's sample code from their SDK 
documentation resulted in Splunk not parsing the fields correctly, which broke 
all the unit tests. I did some experiments and found that removing the date 
actually solved the issue.
   
   Splunk's SDK provides a method for writing to a socket which does all the 
error handling.  I used that because that was what the docs 

[jira] [Commented] (DRILL-8179) Convert LTSV Format Plugin to EVF2

2022-12-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17650045#comment-17650045
 ] 

ASF GitHub Bot commented on DRILL-8179:
---

cgivre merged PR #2725:
URL: https://github.com/apache/drill/pull/2725




> Convert LTSV Format Plugin to EVF2
> --
>
> Key: DRILL-8179
> URL: https://issues.apache.org/jira/browse/DRILL-8179
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.20.1
>Reporter: Jingchuan Hu
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> Get authorized by Charles, continue the conversion from LTSV to EVF2 directly.





[jira] [Commented] (DRILL-8179) Convert LTSV Format Plugin to EVF2

2022-12-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649870#comment-17649870
 ] 

ASF GitHub Bot commented on DRILL-8179:
---

cgivre commented on code in PR #2725:
URL: https://github.com/apache/drill/pull/2725#discussion_r1053498604


##
contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVBatchReader.java:
##
@@ -0,0 +1,264 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.ltsv;
+
+import com.github.lolo.ltsv.LtsvParser;
+import com.github.lolo.ltsv.LtsvParser.Builder;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.drill.common.AutoCloseables;
+import org.apache.drill.common.exceptions.CustomErrorContext;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.physical.impl.scan.v3.ManagedReader;
+import org.apache.drill.exec.physical.impl.scan.v3.file.FileDescrip;
+import org.apache.drill.exec.physical.impl.scan.v3.file.FileSchemaNegotiator;
+import org.apache.drill.exec.physical.resultSet.ResultSetLoader;
+import org.apache.drill.exec.physical.resultSet.RowSetLoader;
+import org.apache.drill.exec.record.metadata.ColumnMetadata;
+import org.apache.drill.exec.record.metadata.MetadataUtils;
+import org.apache.drill.exec.record.metadata.TupleMetadata;
+import org.apache.drill.exec.vector.accessor.ScalarWriter;
+import org.apache.drill.shaded.guava.com.google.common.base.Strings;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.text.ParseException;
+import java.text.SimpleDateFormat;
+import java.time.Instant;
+import java.time.LocalDate;
+import java.time.LocalTime;
+import java.time.format.DateTimeFormatter;
+import java.util.Date;
+import java.util.Iterator;
+import java.util.Map;
+
+public class LTSVBatchReader implements ManagedReader {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(LTSVBatchReader.class);
+  private final LTSVFormatPluginConfig config;
+  private final FileDescrip file;
+  private final CustomErrorContext errorContext;
+  private final LtsvParser ltsvParser;
+  private final RowSetLoader rowWriter;
+  private final FileSchemaNegotiator negotiator;
+  private InputStream fsStream;
+  private Iterator> rowIterator;
+
+
+  public LTSVBatchReader(LTSVFormatPluginConfig config, FileSchemaNegotiator 
negotiator) {
+this.config = config;
+this.negotiator = negotiator;
+file = negotiator.file();
+errorContext = negotiator.parentErrorContext();
+ltsvParser = buildParser();
+
+openFile();
+
+// If there is a provided schema, import it
+if (negotiator.providedSchema() != null) {
+  TupleMetadata schema = negotiator.providedSchema();
+  negotiator.tableSchema(schema, false);
+}
+ResultSetLoader loader = negotiator.build();
+rowWriter = loader.writer();
+
+  }
+
+  private void openFile() {
+try {
+  fsStream = 
file.fileSystem().openPossiblyCompressedStream(file.split().getPath());
+} catch (IOException e) {
+  throw UserException
+  .dataReadError(e)
+  .message("Unable to open LTSV File %s", file.split().getPath() + " " 
+ e.getMessage())
+  .addContext(errorContext)
+  .build(logger);
+}
+rowIterator = ltsvParser.parse(fsStream);
+  }
+
+  @Override
+  public boolean next() {
+while (!rowWriter.isFull()) {
+  if (!processNextRow()) {
+return false;
+  }
+}
+return true;
+  }
+
+  private LtsvParser buildParser() {
+Builder builder = LtsvParser.builder();
+builder.trimKeys();
+builder.trimValues();
+builder.skipNullValues();
+
+if (config.getParseMode().contentEquals("strict")) {
+  builder.strict();
+} else {
+  builder.lenient();
+}
+
+if (StringUtils.isNotEmpty(config.getEscapeCharacter())) {
+  builder.withEscapeChar(config.getEscapeCharacter().charAt(0));
+}
+
+if 

[jira] [Commented] (DRILL-8179) Convert LTSV Format Plugin to EVF2

2022-12-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649852#comment-17649852
 ] 

ASF GitHub Bot commented on DRILL-8179:
---

jnturton commented on code in PR #2725:
URL: https://github.com/apache/drill/pull/2725#discussion_r1053465233


##
contrib/format-ltsv/src/test/java/org/apache/drill/exec/store/ltsv/TestLTSVRecordReader.java:
##
@@ -37,34 +42,77 @@ public static void setup() throws Exception {
 
   @Test
   public void testWildcard() throws Exception {
-testBuilder()
-  .sqlQuery("SELECT * FROM cp.`simple.ltsv`")
-  .unOrdered()
-  .baselineColumns("host", "forwardedfor", "req", "status", "size", 
"referer", "ua", "reqtime", "apptime", "vhost")
-  .baselineValues("xxx.xxx.xxx.xxx", "-", "GET /v1/xxx HTTP/1.1", "200", 
"4968", "-", "Java/1.8.0_131", "2.532", "2.532", "api.example.com")
-  .baselineValues("xxx.xxx.xxx.xxx", "-", "GET /v1/yyy HTTP/1.1", "200", 
"412", "-", "Java/1.8.0_201", "3.580", "3.580", "api.example.com")
-  .go();
+String sql = "SELECT * FROM cp.`simple.ltsv`";

Review Comment:
   Let's rename this class TestLTSVQueries or similar now that LTSVRecordReader 
is gone?



##
contrib/format-ltsv/src/main/java/org/apache/drill/exec/store/ltsv/LTSVBatchReader.java:
##
@@ -0,0 +1,264 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.ltsv;
+
+import com.github.lolo.ltsv.LtsvParser;
+import com.github.lolo.ltsv.LtsvParser.Builder;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.drill.common.AutoCloseables;
+import org.apache.drill.common.exceptions.CustomErrorContext;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.physical.impl.scan.v3.ManagedReader;
+import org.apache.drill.exec.physical.impl.scan.v3.file.FileDescrip;
+import org.apache.drill.exec.physical.impl.scan.v3.file.FileSchemaNegotiator;
+import org.apache.drill.exec.physical.resultSet.ResultSetLoader;
+import org.apache.drill.exec.physical.resultSet.RowSetLoader;
+import org.apache.drill.exec.record.metadata.ColumnMetadata;
+import org.apache.drill.exec.record.metadata.MetadataUtils;
+import org.apache.drill.exec.record.metadata.TupleMetadata;
+import org.apache.drill.exec.vector.accessor.ScalarWriter;
+import org.apache.drill.shaded.guava.com.google.common.base.Strings;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.io.InputStream;
+import java.text.ParseException;
+import java.text.SimpleDateFormat;
+import java.time.Instant;
+import java.time.LocalDate;
+import java.time.LocalTime;
+import java.time.format.DateTimeFormatter;
+import java.util.Date;
+import java.util.Iterator;
+import java.util.Map;
+
+public class LTSVBatchReader implements ManagedReader {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(LTSVBatchReader.class);
+  private final LTSVFormatPluginConfig config;
+  private final FileDescrip file;
+  private final CustomErrorContext errorContext;
+  private final LtsvParser ltsvParser;
+  private final RowSetLoader rowWriter;
+  private final FileSchemaNegotiator negotiator;
+  private InputStream fsStream;
+  private Iterator> rowIterator;
+
+
+  public LTSVBatchReader(LTSVFormatPluginConfig config, FileSchemaNegotiator 
negotiator) {
+this.config = config;
+this.negotiator = negotiator;
+file = negotiator.file();
+errorContext = negotiator.parentErrorContext();
+ltsvParser = buildParser();
+
+openFile();
+
+// If there is a provided schema, import it
+if (negotiator.providedSchema() != null) {
+  TupleMetadata schema = negotiator.providedSchema();
+  negotiator.tableSchema(schema, false);
+}
+ResultSetLoader loader = negotiator.build();
+rowWriter = loader.writer();
+
+  }
+
+  private void openFile() {
+try {
+  fsStream = 
file.fileSystem().openPossiblyCompressedStream(file.split().getPath());
+} catch (IOException e) {
+  throw 

[jira] [Commented] (DRILL-8328) HTTP UDF Not Resolving Storage Aliases

2022-12-20 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8328?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649823#comment-17649823
 ] 

ASF GitHub Bot commented on DRILL-8328:
---

jnturton commented on PR #2668:
URL: https://github.com/apache/drill/pull/2668#issuecomment-1359539411

   I've just removed the backport-to-stable tag since these UDFs arrived after 
Drill 1.20. Thanks to @kingswanwho for spotting this.




> HTTP UDF Not Resolving Storage Aliases
> --
>
> Key: DRILL-8328
> URL: https://issues.apache.org/jira/browse/DRILL-8328
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HTTP
>Affects Versions: 1.20.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Blocker
> Fix For: 1.20.3
>
>
> The http_request function currently does not resolve plugin aliases 
> correctly.  This PR fixes that issue. 





[jira] [Commented] (DRILL-8179) Convert LTSV Format Plugin to EVF2

2022-12-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8179?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649534#comment-17649534
 ] 

ASF GitHub Bot commented on DRILL-8179:
---

cgivre opened a new pull request, #2725:
URL: https://github.com/apache/drill/pull/2725

   # [DRILL-8179](https://issues.apache.org/jira/browse/DRILL-8179): Convert 
LTSV Format Plugin to EVF2
   
   ## Description
   With this PR, all format plugins are now using the EVF readers.   This is 
part of [DRILL-8132](https://issues.apache.org/jira/browse/DRILL-8312).  
   
   ## Documentation
   In addition to refactoring the plugin to use EVF V2, this code replaces the 
homegrown LTSV reader with a module that parses the data, and introduces new 
configuration variables. These variables are all noted in the updated README. 
However, they are all optional, so the user is unlikely to notice any real 
difference. 
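   For readers unfamiliar with the format: LTSV (Labeled Tab-Separated Values) 
encodes each record as one line of tab-separated `label:value` pairs. A minimal 
stand-alone sketch of that parsing, with hypothetical field names (the actual 
plugin delegates to a parser module and EVF row writers, not this code):

   ```java
import java.util.LinkedHashMap;
import java.util.Map;

public class LtsvSketch {
  // Parse one LTSV line: tab-separated "label:value" pairs.
  static Map<String, String> parseLine(String line) {
    Map<String, String> row = new LinkedHashMap<>();
    for (String field : line.split("\t")) {
      int sep = field.indexOf(':');
      if (sep < 0) {
        continue; // skip malformed fields; the plugin makes this tolerance configurable
      }
      row.put(field.substring(0, sep), field.substring(sep + 1));
    }
    return row;
  }

  public static void main(String[] args) {
    String line = "host:127.0.0.1\tident:-\tuser:alice\tstatus:200";
    Map<String, String> row = parseLine(line);
    System.out.println(row.get("host") + " " + row.get("status"));
  }
}
   ```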
   
   One exception is the variable which controls error tolerance.  
   
   ## Testing
   Ran existing unit tests.




> Convert LTSV Format Plugin to EVF2
> --
>
> Key: DRILL-8179
> URL: https://issues.apache.org/jira/browse/DRILL-8179
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.20.1
>Reporter: Jingchuan Hu
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> Get authorized by Charles, continue the conversion from LTSV to EVF2 directly.





[jira] [Commented] (DRILL-8371) Add Write/Append Capability to Splunk Plugin

2022-12-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649397#comment-17649397
 ] 

ASF GitHub Bot commented on DRILL-8371:
---

jnturton commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1052404918


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -0,0 +1,308 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+
+import com.splunk.Args;
+import com.splunk.Index;
+import com.splunk.IndexCollection;
+import com.splunk.Service;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.proto.UserBitShared.UserCredentials;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.store.AbstractRecordWriter;
+import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+public class SplunkBatchWriter extends AbstractRecordWriter {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(SplunkBatchWriter.class);
+  private static final String DEFAULT_SOURCETYPE = "drill";
+  private final UserCredentials userCredentials;
+  private final List<String> tableIdentifier;
+  private final SplunkWriter config;
+  private final Args eventArgs;
+  protected final Service splunkService;
+  private JSONObject splunkEvent;
+  protected Index destinationIndex;
+
+
+  public SplunkBatchWriter(UserCredentials userCredentials, List<String> 
tableIdentifier, SplunkWriter config) {
+this.config = config;
+this.tableIdentifier = tableIdentifier;
+this.userCredentials = userCredentials;
+
+SplunkConnection connection = new 
SplunkConnection(config.getPluginConfig(), userCredentials.getUserName());
+this.splunkService = connection.connect();
+
+// Populate event arguments
+this.eventArgs = new Args();
+eventArgs.put("sourcetype", DEFAULT_SOURCETYPE);
+  }
+
+  @Override
+  public void init(Map<String, String> writerOptions) throws IOException {
+// No op
+  }
+
+  /**
+   * Update the schema in RecordWriter. Called at least once before starting 
writing the records. In this case,
+   * we add the index to Splunk here. Splunk's API is a little sparse and 
doesn't really do much in the way
+   * of error checking or providing feedback if the operation fails.
+   *
+   * @param batch {@link VectorAccessible} The incoming batch
+   */
+  @Override
+  public void updateSchema(VectorAccessible batch) {
+logger.debug("Updating schema for Splunk");
+
+//Get the collection of indexes
+IndexCollection indexes = splunkService.getIndexes();
+try {
+  String indexName = tableIdentifier.get(0);
+  indexes.create(indexName);
+  destinationIndex = splunkService.getIndexes().get(indexName);
+} catch (Exception e) {
+  // We have to catch a generic exception here, as Splunk's SDK does not 
really provide any kind of
+  // failure messaging.
+  throw UserException.systemError(e)
+.message("Error creating new index in Splunk plugin: " + 
e.getMessage())
+.build(logger);
+}
+  }
+
+
+  @Override
+  public void startRecord() {
+logger.debug("Starting record");
+splunkEvent = new JSONObject();
+  }
+
+  @Override
+  public void endRecord() throws IOException {
+logger.debug("Ending record");
+// Write the event to the Splunk index
+destinationIndex.submit(eventArgs, splunkEvent.toJSONString());

Review Comment:
   @cgivre can we leave a comment explaining this to readers then?



##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -0,0 +1,309 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright 

[jira] [Commented] (DRILL-8371) Add Write/Append Capability to Splunk Plugin

2022-12-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649386#comment-17649386
 ] 

ASF GitHub Bot commented on DRILL-8371:
---

cgivre commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1052374395


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -0,0 +1,308 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+
+import com.splunk.Args;
+import com.splunk.Index;
+import com.splunk.IndexCollection;
+import com.splunk.Service;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.proto.UserBitShared.UserCredentials;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.store.AbstractRecordWriter;
+import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+public class SplunkBatchWriter extends AbstractRecordWriter {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(SplunkBatchWriter.class);
+  private static final String DEFAULT_SOURCETYPE = "drill";
+  private final UserCredentials userCredentials;
+  private final List<String> tableIdentifier;
+  private final SplunkWriter config;
+  private final Args eventArgs;
+  protected final Service splunkService;
+  private JSONObject splunkEvent;
+  protected Index destinationIndex;
+
+
+  public SplunkBatchWriter(UserCredentials userCredentials, List<String> 
tableIdentifier, SplunkWriter config) {
+this.config = config;
+this.tableIdentifier = tableIdentifier;
+this.userCredentials = userCredentials;
+
+SplunkConnection connection = new 
SplunkConnection(config.getPluginConfig(), userCredentials.getUserName());
+this.splunkService = connection.connect();
+
+// Populate event arguments
+this.eventArgs = new Args();
+eventArgs.put("sourcetype", DEFAULT_SOURCETYPE);
+  }
+
+  @Override
+  public void init(Map<String, String> writerOptions) throws IOException {
+// No op
+  }
+
+  /**
+   * Update the schema in RecordWriter. Called at least once before starting 
writing the records. In this case,
+   * we add the index to Splunk here. Splunk's API is a little sparse and 
doesn't really do much in the way
+   * of error checking or providing feedback if the operation fails.
+   *
+   * @param batch {@link VectorAccessible} The incoming batch
+   */
+  @Override
+  public void updateSchema(VectorAccessible batch) {
+logger.debug("Updating schema for Splunk");
+
+//Get the collection of indexes
+IndexCollection indexes = splunkService.getIndexes();
+try {
+  String indexName = tableIdentifier.get(0);
+  indexes.create(indexName);
+  destinationIndex = splunkService.getIndexes().get(indexName);
+} catch (Exception e) {
+  // We have to catch a generic exception here, as Splunk's SDK does not 
really provide any kind of
+  // failure messaging.
+  throw UserException.systemError(e)
+.message("Error creating new index in Splunk plugin: " + 
e.getMessage())
+.build(logger);
+}
+  }
+
+
+  @Override
+  public void startRecord() {
+logger.debug("Starting record");
+splunkEvent = new JSONObject();
+  }
+
+  @Override
+  public void endRecord() throws IOException {
+logger.debug("Ending record");
+// Write the event to the Splunk index
+destinationIndex.submit(eventArgs, splunkEvent.toJSONString());
+// Clear out the splunk event.
+splunkEvent = new JSONObject();

Review Comment:
   Yes. This line clears out the event so that we start fresh on every row. I 
discovered there is a `clear` method, so I called that rather than creating a 
new object every time. 
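   The per-record lifecycle discussed here (start a fresh event, accumulate 
field/value pairs, serialize at record end) can be sketched without the Splunk 
SDK. This is an illustrative stand-in, not the PR's code: the real writer 
renders via json-simple's `JSONObject` and submits the string to the index.

   ```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class EventWriterSketch {
  private final Map<String, Object> event = new LinkedHashMap<>();

  void startRecord() {
    event.clear(); // reuse the map rather than allocating a new one per row
  }

  void field(String name, Object value) {
    event.put(name, value);
  }

  String endRecord() {
    // Minimal JSON rendering for flat string/number values (illustrative only).
    StringJoiner json = new StringJoiner(",", "{", "}");
    for (Map.Entry<String, Object> e : event.entrySet()) {
      Object v = e.getValue();
      String rendered = v instanceof Number ? v.toString() : "\"" + v + "\"";
      json.add("\"" + e.getKey() + "\":" + rendered);
    }
    return json.toString(); // a real writer would submit this string to the index
  }

  public static void main(String[] args) {
    EventWriterSketch w = new EventWriterSketch();
    w.startRecord();
    w.field("user", "alice");
    w.field("status", 200);
    System.out.println(w.endRecord());
  }
}
   ```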





> Add Write/Append Capability to Splunk Plugin
> 
>
> Key: DRILL-8371
> URL: 

[jira] [Commented] (DRILL-8371) Add Write/Append Capability to Splunk Plugin

2022-12-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649384#comment-17649384
 ] 

ASF GitHub Bot commented on DRILL-8371:
---

cgivre commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1052367809


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkInsertWriter.java:
##
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+
+import java.util.List;
+
+public class SplunkInsertWriter extends SplunkWriter {
+  public static final String OPERATOR_TYPE = "SPLUNK_INSERT_WRITER";
+
+  private final SplunkStoragePlugin plugin;
+  private final List<String> tableIdentifier;
+
+  @JsonCreator
+  public SplunkInsertWriter(
+  @JsonProperty("child") PhysicalOperator child,
+  @JsonProperty("tableIdentifier") List<String> tableIdentifier,
+  @JsonProperty("storage") SplunkPluginConfig storageConfig,
+  @JacksonInject StoragePluginRegistry engineRegistry) {
+super(child, tableIdentifier, engineRegistry.resolve(storageConfig, 
SplunkStoragePlugin.class));

Review Comment:
   Fixed.





> Add Write/Append Capability to Splunk Plugin
> 
>
> Key: DRILL-8371
> URL: https://issues.apache.org/jira/browse/DRILL-8371
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Splunk
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> While Drill can currently read from Splunk indexes, it cannot write to them 
> or create them.  This proposed PR adds support for CTAS queries for Splunk as 
> well as INSERT and DROP TABLE. 





[jira] [Commented] (DRILL-8371) Add Write/Append Capability to Splunk Plugin

2022-12-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649385#comment-17649385
 ] 

ASF GitHub Bot commented on DRILL-8371:
---

cgivre commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1052368638


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -0,0 +1,308 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+
+import com.splunk.Args;
+import com.splunk.Index;
+import com.splunk.IndexCollection;
+import com.splunk.Service;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.proto.UserBitShared.UserCredentials;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.store.AbstractRecordWriter;
+import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+public class SplunkBatchWriter extends AbstractRecordWriter {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(SplunkBatchWriter.class);
+  private static final String DEFAULT_SOURCETYPE = "drill";
+  private final UserCredentials userCredentials;
+  private final List<String> tableIdentifier;
+  private final SplunkWriter config;
+  private final Args eventArgs;
+  protected final Service splunkService;
+  private JSONObject splunkEvent;
+  protected Index destinationIndex;
+
+
+  public SplunkBatchWriter(UserCredentials userCredentials, List<String> 
tableIdentifier, SplunkWriter config) {
+this.config = config;
+this.tableIdentifier = tableIdentifier;
+this.userCredentials = userCredentials;
+
+SplunkConnection connection = new 
SplunkConnection(config.getPluginConfig(), userCredentials.getUserName());
+this.splunkService = connection.connect();
+
+// Populate event arguments
+this.eventArgs = new Args();
+eventArgs.put("sourcetype", DEFAULT_SOURCETYPE);
+  }
+
+  @Override
+  public void init(Map<String, String> writerOptions) throws IOException {
+// No op
+  }
+
+  /**
+   * Update the schema in RecordWriter. Called at least once before starting 
writing the records. In this case,

Review Comment:
   Sorry..  I clarified the comment.  This is called once before the records 
are written.





> Add Write/Append Capability to Splunk Plugin
> 
>
> Key: DRILL-8371
> URL: https://issues.apache.org/jira/browse/DRILL-8371
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Splunk
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> While Drill can currently read from Splunk indexes, it cannot write to them 
> or create them.  This proposed PR adds support for CTAS queries for Splunk as 
> well as INSERT and DROP TABLE. 





[jira] [Commented] (DRILL-8371) Add Write/Append Capability to Splunk Plugin

2022-12-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649382#comment-17649382
 ] 

ASF GitHub Bot commented on DRILL-8371:
---

cgivre commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1052363847


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -0,0 +1,308 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+
+import com.splunk.Args;
+import com.splunk.Index;
+import com.splunk.IndexCollection;
+import com.splunk.Service;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.proto.UserBitShared.UserCredentials;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.store.AbstractRecordWriter;
+import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+public class SplunkBatchWriter extends AbstractRecordWriter {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(SplunkBatchWriter.class);
+  private static final String DEFAULT_SOURCETYPE = "drill";
+  private final UserCredentials userCredentials;
+  private final List<String> tableIdentifier;
+  private final SplunkWriter config;
+  private final Args eventArgs;
+  protected final Service splunkService;
+  private JSONObject splunkEvent;
+  protected Index destinationIndex;
+
+
+  public SplunkBatchWriter(UserCredentials userCredentials, List<String> 
tableIdentifier, SplunkWriter config) {
+this.config = config;
+this.tableIdentifier = tableIdentifier;
+this.userCredentials = userCredentials;
+
+SplunkConnection connection = new 
SplunkConnection(config.getPluginConfig(), userCredentials.getUserName());
+this.splunkService = connection.connect();
+
+// Populate event arguments
+this.eventArgs = new Args();
+eventArgs.put("sourcetype", DEFAULT_SOURCETYPE);
+  }
+
+  @Override
+  public void init(Map<String, String> writerOptions) throws IOException {
+// No op
+  }
+
+  /**
+   * Update the schema in RecordWriter. Called at least once before starting 
writing the records. In this case,
+   * we add the index to Splunk here. Splunk's API is a little sparse and 
doesn't really do much in the way
+   * of error checking or providing feedback if the operation fails.
+   *
+   * @param batch {@link VectorAccessible} The incoming batch
+   */
+  @Override
+  public void updateSchema(VectorAccessible batch) {
+logger.debug("Updating schema for Splunk");
+
+//Get the collection of indexes
+IndexCollection indexes = splunkService.getIndexes();
+try {
+  String indexName = tableIdentifier.get(0);
+  indexes.create(indexName);
+  destinationIndex = splunkService.getIndexes().get(indexName);
+} catch (Exception e) {
+  // We have to catch a generic exception here, as Splunk's SDK does not 
really provide any kind of
+  // failure messaging.
+  throw UserException.systemError(e)
+.message("Error creating new index in Splunk plugin: " + 
e.getMessage())
+.build(logger);
+}
+  }
+
+
+  @Override
+  public void startRecord() {
+logger.debug("Starting record");
+splunkEvent = new JSONObject();
+  }
+
+  @Override
+  public void endRecord() throws IOException {
+logger.debug("Ending record");
+// Write the event to the Splunk index
+destinationIndex.submit(eventArgs, splunkEvent.toJSONString());

Review Comment:
   I think there may be some bug in the Splunk SDK.  





> Add Write/Append Capability to Splunk Plugin
> 
>
> Key: DRILL-8371
> URL: https://issues.apache.org/jira/browse/DRILL-8371
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Splunk
>Affects Versions: 1.20.2
>Reporter: 

[jira] [Commented] (DRILL-8371) Add Write/Append Capability to Splunk Plugin

2022-12-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649378#comment-17649378
 ] 

ASF GitHub Bot commented on DRILL-8371:
---

cgivre commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1052356461


##
contrib/storage-splunk/src/test/java/org/apache/drill/exec/store/splunk/SplunkWriterTest.java:
##
@@ -0,0 +1,191 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+import org.apache.drill.categories.SlowTest;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.common.types.TypeProtos.MinorType;
+import org.apache.drill.exec.physical.rowSet.DirectRowSet;
+import org.apache.drill.exec.physical.rowSet.RowSet;
+import org.apache.drill.exec.physical.rowSet.RowSetBuilder;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.exec.record.metadata.TupleMetadata;
+import org.apache.drill.test.QueryBuilder.QuerySummary;
+import org.apache.drill.test.rowSet.RowSetUtilities;
+import org.junit.FixMethodOrder;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.runners.MethodSorters;
+
+
+import static org.junit.Assert.assertEquals;
+import static org.junit.Assert.assertTrue;
+
+@FixMethodOrder(MethodSorters.JVM)
+@Category({SlowTest.class})
+public class SplunkWriterTest extends SplunkBaseTest {
+
+  @Test
+  public void testBasicCTAS() throws Exception {
+
+// Verify that there is no index called t1 in Splunk
+String sql = "SELECT * FROM INFORMATION_SCHEMA.`TABLES` WHERE TABLE_SCHEMA 
= 'splunk' AND TABLE_NAME LIKE 't1'";
+RowSet results = client.queryBuilder().sql(sql).rowSet();
+assertEquals(0, results.rowCount());
+results.clear();
+
+// Now create the table
+sql = "CREATE TABLE `splunk`.`t1` AS SELECT * FROM cp.`test_data.csvh`";
+QuerySummary summary = client.queryBuilder().sql(sql).run();
+assertTrue(summary.succeeded());
+
+// Verify that an index was created called t1 in Splunk
+sql = "SELECT * FROM INFORMATION_SCHEMA.`TABLES` WHERE TABLE_SCHEMA = 
'splunk' AND TABLE_NAME LIKE 't1'";
+results = client.queryBuilder().sql(sql).rowSet();
+assertEquals(1, results.rowCount());
+results.clear();
+
+// There seems to be some delay between the Drill query writing the data 
and the data being made
+// accessible.
+Thread.sleep(3);

Review Comment:
   Yeah.. There seems to be a processing delay between inserting data and it 
actually being queryable.   I don't think this is a Drill issue. 
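   One common way to make such tests robust against an indexing delay (a hedged 
sketch, not code from the PR) is to poll for the expected condition with a 
timeout instead of a fixed `Thread.sleep`:

   ```java
import java.util.concurrent.TimeUnit;
import java.util.function.BooleanSupplier;

public class AwaitSketch {
  // Poll a condition until it holds or the timeout elapses.
  static boolean await(BooleanSupplier condition, long timeoutMs, long pollMs)
      throws InterruptedException {
    long deadline = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    while (System.nanoTime() < deadline) {
      if (condition.getAsBoolean()) {
        return true;
      }
      Thread.sleep(pollMs);
    }
    return condition.getAsBoolean();
  }

  public static void main(String[] args) throws InterruptedException {
    long start = System.currentTimeMillis();
    // Stand-in condition that becomes true after ~50 ms; a test would instead
    // re-run the verification query until it returns the expected row count.
    boolean ok = await(() -> System.currentTimeMillis() - start > 50, 2000, 10);
    System.out.println(ok);
  }
}
   ```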





> Add Write/Append Capability to Splunk Plugin
> 
>
> Key: DRILL-8371
> URL: https://issues.apache.org/jira/browse/DRILL-8371
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Splunk
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> While Drill can currently read from Splunk indexes, it cannot write to them 
> or create them.  This proposed PR adds support for CTAS queries for Splunk as 
> well as INSERT and DROP TABLE. 





[jira] [Commented] (DRILL-8371) Add Write/Append Capability to Splunk Plugin

2022-12-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649376#comment-17649376
 ] 

ASF GitHub Bot commented on DRILL-8371:
---

cgivre commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1052354949


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -0,0 +1,308 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+
+import com.splunk.Args;
+import com.splunk.Index;
+import com.splunk.IndexCollection;
+import com.splunk.Service;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.proto.UserBitShared.UserCredentials;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.store.AbstractRecordWriter;
+import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+public class SplunkBatchWriter extends AbstractRecordWriter {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(SplunkBatchWriter.class);
+  private static final String DEFAULT_SOURCETYPE = "drill";
+  private final UserCredentials userCredentials;
+  private final List<String> tableIdentifier;
+  private final SplunkWriter config;
+  private final Args eventArgs;
+  protected final Service splunkService;
+  private JSONObject splunkEvent;
+  protected Index destinationIndex;
+
+
+  public SplunkBatchWriter(UserCredentials userCredentials, List<String> 
tableIdentifier, SplunkWriter config) {
+this.config = config;
+this.tableIdentifier = tableIdentifier;
+this.userCredentials = userCredentials;
+
+SplunkConnection connection = new 
SplunkConnection(config.getPluginConfig(), userCredentials.getUserName());
+this.splunkService = connection.connect();
+
+// Populate event arguments
+this.eventArgs = new Args();
+eventArgs.put("sourcetype", DEFAULT_SOURCETYPE);
+  }
+
+  @Override
+  public void init(Map<String, String> writerOptions) throws IOException {
+// No op
+  }
+
+  /**
+   * Update the schema in RecordWriter. Called at least once before starting 
writing the records. In this case,
+   * we add the index to Splunk here. Splunk's API is a little sparse and 
doesn't really do much in the way
+   * of error checking or providing feedback if the operation fails.
+   *
+   * @param batch {@link VectorAccessible} The incoming batch
+   */
+  @Override
+  public void updateSchema(VectorAccessible batch) {
+logger.debug("Updating schema for Splunk");
+
+//Get the collection of indexes
+IndexCollection indexes = splunkService.getIndexes();
+try {
+  String indexName = tableIdentifier.get(0);
+  indexes.create(indexName);
+  destinationIndex = splunkService.getIndexes().get(indexName);
+} catch (Exception e) {
+  // We have to catch a generic exception here, as Splunk's SDK does not 
really provide any kind of
+  // failure messaging.
+  throw UserException.systemError(e)
+.message("Error creating new index in Splunk plugin: " + 
e.getMessage())
+.build(logger);
+}
+  }
+
+
+  @Override
+  public void startRecord() {
+logger.debug("Starting record");
+splunkEvent = new JSONObject();
+  }
+
+  @Override
+  public void endRecord() throws IOException {
+logger.debug("Ending record");
+// Write the event to the Splunk index
+destinationIndex.submit(eventArgs, splunkEvent.toJSONString());

Review Comment:
   @jnturton 
   I actually tried this first and couldn't get Splunk to write any data. I 
literally cut/pasted their code into Drill to no avail. 





> Add Write/Append Capability to Splunk Plugin
> 
>
> Key: DRILL-8371
> URL: https://issues.apache.org/jira/browse/DRILL-8371
> Project: Apache Drill
>  Issue Type: 

[jira] [Commented] (DRILL-8371) Add Write/Append Capability to Splunk Plugin

2022-12-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649375#comment-17649375
 ] 

ASF GitHub Bot commented on DRILL-8371:
---

cgivre commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1052352513


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -0,0 +1,308 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+
+import com.splunk.Args;
+import com.splunk.Index;
+import com.splunk.IndexCollection;
+import com.splunk.Service;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.proto.UserBitShared.UserCredentials;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.store.AbstractRecordWriter;
+import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+public class SplunkBatchWriter extends AbstractRecordWriter {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(SplunkBatchWriter.class);
+  private static final String DEFAULT_SOURCETYPE = "drill";
+  private final UserCredentials userCredentials;
+  private final List<String> tableIdentifier;
+  private final SplunkWriter config;
+  private final Args eventArgs;
+  protected final Service splunkService;
+  private JSONObject splunkEvent;
+  protected Index destinationIndex;
+
+
+  public SplunkBatchWriter(UserCredentials userCredentials, List<String>
tableIdentifier, SplunkWriter config) {
+this.config = config;
+this.tableIdentifier = tableIdentifier;
+this.userCredentials = userCredentials;
+
+SplunkConnection connection = new 
SplunkConnection(config.getPluginConfig(), userCredentials.getUserName());
+this.splunkService = connection.connect();
+
+// Populate event arguments
+this.eventArgs = new Args();
+eventArgs.put("sourcetype", DEFAULT_SOURCETYPE);
+  }
+
+  @Override
+  public void init(Map<String, String> writerOptions) throws IOException {
+// No op
+  }
+
+  /**
+   * Update the schema in RecordWriter. Called at least once before starting
to write records. In this case,
+   * we add the index to Splunk here. Splunk's API is a little sparse and 
doesn't really do much in the way
+   * of error checking or providing feedback if the operation fails.
+   *
+   * @param batch {@link VectorAccessible} The incoming batch
+   */
+  @Override
+  public void updateSchema(VectorAccessible batch) {
+logger.debug("Updating schema for Splunk");
+
+//Get the collection of indexes
+IndexCollection indexes = splunkService.getIndexes();
+try {
+  String indexName = tableIdentifier.get(0);
+  indexes.create(indexName);
+  destinationIndex = splunkService.getIndexes().get(indexName);
+} catch (Exception e) {
+  // We have to catch a generic exception here, as Splunk's SDK does not 
really provide any kind of
+  // failure messaging.
+  throw UserException.systemError(e)
+.message("Error creating new index in Splunk plugin: " + 
e.getMessage())
+.build(logger);
+}
+  }
+
+
+  @Override
+  public void startRecord() {
+logger.debug("Starting record");
+splunkEvent = new JSONObject();
+  }
+
+  @Override
+  public void endRecord() throws IOException {
+logger.debug("Ending record");
+// Write the event to the Splunk index
+destinationIndex.submit(eventArgs, splunkEvent.toJSONString());
+// Clear out the splunk event.
+splunkEvent = new JSONObject();
+  }
+
+  @Override
+  public void abort() {
+// No op
+  }
+
+  @Override
+  public void cleanup() {
+// No op
+  }
+
+
+  @Override
+  public FieldConverter getNewNullableIntConverter(int fieldId, String 
fieldName, FieldReader reader) {
+return new ScalarSplunkConverter(fieldId, fieldName, reader);
+  }
+
+  @Override
+  public FieldConverter 

[jira] [Commented] (DRILL-8371) Add Write/Append Capability to Splunk Plugin

2022-12-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649374#comment-17649374
 ] 

ASF GitHub Bot commented on DRILL-8371:
---

jnturton commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1052337380


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkBatchWriter.java:
##
@@ -0,0 +1,308 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+
+import com.splunk.Args;
+import com.splunk.Index;
+import com.splunk.IndexCollection;
+import com.splunk.Service;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.exec.proto.UserBitShared.UserCredentials;
+import org.apache.drill.exec.record.VectorAccessible;
+import org.apache.drill.exec.store.AbstractRecordWriter;
+import org.apache.drill.exec.store.EventBasedRecordWriter.FieldConverter;
+import org.apache.drill.exec.vector.complex.reader.FieldReader;
+import org.json.simple.JSONObject;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.IOException;
+import java.util.List;
+import java.util.Map;
+
+public class SplunkBatchWriter extends AbstractRecordWriter {
+
+  private static final Logger logger = 
LoggerFactory.getLogger(SplunkBatchWriter.class);
+  private static final String DEFAULT_SOURCETYPE = "drill";
+  private final UserCredentials userCredentials;
+  private final List<String> tableIdentifier;
+  private final SplunkWriter config;
+  private final Args eventArgs;
+  protected final Service splunkService;
+  private JSONObject splunkEvent;
+  protected Index destinationIndex;
+
+
+  public SplunkBatchWriter(UserCredentials userCredentials, List<String>
tableIdentifier, SplunkWriter config) {
+this.config = config;
+this.tableIdentifier = tableIdentifier;
+this.userCredentials = userCredentials;
+
+SplunkConnection connection = new 
SplunkConnection(config.getPluginConfig(), userCredentials.getUserName());
+this.splunkService = connection.connect();
+
+// Populate event arguments
+this.eventArgs = new Args();
+eventArgs.put("sourcetype", DEFAULT_SOURCETYPE);
+  }
+
+  @Override
+  public void init(Map<String, String> writerOptions) throws IOException {
+// No op
+  }
+
+  /**
+   * Update the schema in RecordWriter. Called at least once before starting
to write records. In this case,
+   * we add the index to Splunk here. Splunk's API is a little sparse and 
doesn't really do much in the way
+   * of error checking or providing feedback if the operation fails.
+   *
+   * @param batch {@link VectorAccessible} The incoming batch
+   */
+  @Override
+  public void updateSchema(VectorAccessible batch) {
+logger.debug("Updating schema for Splunk");
+
+//Get the collection of indexes
+IndexCollection indexes = splunkService.getIndexes();
+try {
+  String indexName = tableIdentifier.get(0);
+  indexes.create(indexName);
+  destinationIndex = splunkService.getIndexes().get(indexName);
+} catch (Exception e) {
+  // We have to catch a generic exception here, as Splunk's SDK does not 
really provide any kind of
+  // failure messaging.
+  throw UserException.systemError(e)
+.message("Error creating new index in Splunk plugin: " + 
e.getMessage())
+.build(logger);
+}
+  }
+
+
+  @Override
+  public void startRecord() {
+logger.debug("Starting record");
+splunkEvent = new JSONObject();
+  }
+
+  @Override
+  public void endRecord() throws IOException {
+logger.debug("Ending record");
+// Write the event to the Splunk index
+destinationIndex.submit(eventArgs, splunkEvent.toJSONString());
+// Clear out the splunk event.
+splunkEvent = new JSONObject();
+  }
+
+  @Override
+  public void abort() {
+// No op
+  }
+
+  @Override
+  public void cleanup() {
+// No op
+  }
+
+
+  @Override
+  public FieldConverter getNewNullableIntConverter(int fieldId, String 
fieldName, FieldReader reader) {
+return new ScalarSplunkConverter(fieldId, fieldName, reader);
+  }
+
+  @Override
+  public FieldConverter 

[jira] [Commented] (DRILL-8371) Add Write/Append Capability to Splunk Plugin

2022-12-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17649369#comment-17649369
 ] 

ASF GitHub Bot commented on DRILL-8371:
---

jnturton commented on code in PR #2722:
URL: https://github.com/apache/drill/pull/2722#discussion_r1052328977


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkInsertWriter.java:
##
@@ -0,0 +1,72 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.splunk;
+
+import com.fasterxml.jackson.annotation.JacksonInject;
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+
+import java.util.List;
+
+public class SplunkInsertWriter extends SplunkWriter {
+  public static final String OPERATOR_TYPE = "SPLUNK_INSERT_WRITER";
+
+  private final SplunkStoragePlugin plugin;
+  private final List<String> tableIdentifier;
+
+  @JsonCreator
+  public SplunkInsertWriter(
+  @JsonProperty("child") PhysicalOperator child,
+  @JsonProperty("tableIdentifier") List<String> tableIdentifier,
+  @JsonProperty("storage") SplunkPluginConfig storageConfig,
+  @JacksonInject StoragePluginRegistry engineRegistry) {
+super(child, tableIdentifier, engineRegistry.resolve(storageConfig, 
SplunkStoragePlugin.class));

Review Comment:
   Did you mean to name this engineRegistry rather than, say, pluginRegistry?





> Add Write/Append Capability to Splunk Plugin
> 
>
> Key: DRILL-8371
> URL: https://issues.apache.org/jira/browse/DRILL-8371
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Splunk
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> While Drill can currently read from Splunk indexes, it cannot write to them 
> or create them.  This proposed PR adds support for CTAS queries for Splunk as 
> well as INSERT and DROP TABLE. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8371) Add Write/Append Capability to Splunk Plugin

2022-12-16 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648794#comment-17648794
 ] 

ASF GitHub Bot commented on DRILL-8371:
---

cgivre opened a new pull request, #2722:
URL: https://github.com/apache/drill/pull/2722

   # [DRILL-8371](https://issues.apache.org/jira/browse/DRILL-8371): Add 
Write/Append Capability to Splunk Plugin
   
   ## Description
   Adds the ability to create/delete Splunk indexes via CTAS and DROP TABLE
queries.  Also adds the ability to INSERT into a pre-existing index.
   
   ## Documentation
   Updated README.
   
   ## Testing
   Added unit tests and tested manually. 




> Add Write/Append Capability to Splunk Plugin
> 
>
> Key: DRILL-8371
> URL: https://issues.apache.org/jira/browse/DRILL-8371
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Splunk
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> While Drill can currently read from Splunk indexes, it cannot write to them 
> or create them.  This proposed PR adds support for CTAS queries for Splunk as 
> well as INSERT and DROP TABLE. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8359) Add mount and unmount command support to the filesystem plugin

2022-12-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648135#comment-17648135
 ] 

ASF GitHub Bot commented on DRILL-8359:
---

jnturton merged PR #2713:
URL: https://github.com/apache/drill/pull/2713




> Add mount and unmount command support to the filesystem plugin
> --
>
> Key: DRILL-8359
> URL: https://issues.apache.org/jira/browse/DRILL-8359
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 2.0.0
>
>
> This Jira proposes optional mount and unmount commands in the filesystem 
> plugin with the goal of enabling the dynamic definition of filesystem mounts 
> in the storage configuration. It is mainly anticipated that network and cloud 
> filesystems that have FUSE drivers (sshfs, davfs, rclone, ...) will be used 
> in this way but local device mounts and image/loop device mounts (ISO, IMG, 
> squashfs, etc.) might also be of interest. Filesystems that can be mounted in 
> this way become queryable by the Drill cluster without the burden of dedicated 
> storage plugin development.
> The provided commands are executed in their own processes by the host OS and 
> run under the OS user that is running the Drill JVM. The mount command will 
> be executed when an enabled plugin is initialised (something that is done 
> lazily) and whenever it transitions from disabled to enabled. The provided 
> unmount command will be executed whenever a plugin transitions from enabled 
> to disabled and when the Drillbit shuts down while the plugin has been 
> initialised and is enabled.
> Example using udisks on Linux to mount and unmount an image of an ext4 
> filesystem.
> {code:java}
> {
>   "type" : "file",
>   "connection" : "file:///",
>   "mountCommand" : [ "sh", "-c", "udisksctl loop-setup -f /tmp/test.img && 
> udisksctl mount -b /dev/loop0" ],
>   "unmountCommand" : [ "sh", "-c", "udisksctl unmount -b /dev/loop0 && 
> udisksctl loop-delete -b /dev/loop0" ],
>   "workspaces" : {
>  ...{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8359) Add mount and unmount command support to the filesystem plugin

2022-12-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648056#comment-17648056
 ] 

ASF GitHub Bot commented on DRILL-8359:
---

jnturton commented on PR #2713:
URL: https://github.com/apache/drill/pull/2713#issuecomment-1353117449

   > > @cgivre I've added a boot option that disables mount commands by 
default. So to make your Drill servers vulnerable to malicious Drill admins you 
have to set that in drill-override first. I can also add a message saying 
"think hard about the OS privileges that your Drill process user has before 
switching this on" to the docs for this feature and that's about all I can 
think to do for security here...
   > 
   > @jnturton Did you add some sort of warning for this?
   
   @cgivre the only place I can think to add a warning is in the docs on the 
website, which I'll only add once this gets merged. Because it's a boot option 
it doesn't enjoy an accompanying description field that gets shown to users 
like the system options do.




> Add mount and unmount command support to the filesystem plugin
> --
>
> Key: DRILL-8359
> URL: https://issues.apache.org/jira/browse/DRILL-8359
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 2.0.0
>
>
> This Jira proposes optional mount and unmount commands in the filesystem 
> plugin with the goal of enabling the dynamic definition of filesystem mounts 
> in the storage configuration. It is mainly anticipated that network and cloud 
> filesystems that have FUSE drivers (sshfs, davfs, rclone, ...) will be used 
> in this way but local device mounts and image/loop device mounts (ISO, IMG, 
> squashfs, etc.) might also be of interest. Filesystems that can be mounted in 
> this way become queryable by the Drill cluster without the burden of dedicated 
> storage plugin development.
> The provided commands are executed in their own processes by the host OS and 
> run under the OS user that is running the Drill JVM. The mount command will 
> be executed when an enabled plugin is initialised (something that is done 
> lazily) and whenever it transitions from disabled to enabled. The provided 
> unmount command will be executed whenever a plugin transitions from enabled 
> to disabled and when the Drillbit shuts down while the plugin has been 
> initialised and is enabled.
> Example using udisks on Linux to mount and unmount an image of an ext4 
> filesystem.
> {code:java}
> {
>   "type" : "file",
>   "connection" : "file:///",
>   "mountCommand" : [ "sh", "-c", "udisksctl loop-setup -f /tmp/test.img && 
> udisksctl mount -b /dev/loop0" ],
>   "unmountCommand" : [ "sh", "-c", "udisksctl unmount -b /dev/loop0 && 
> udisksctl loop-delete -b /dev/loop0" ],
>   "workspaces" : {
>  ...{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8359) Add mount and unmount command support to the filesystem plugin

2022-12-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17648034#comment-17648034
 ] 

ASF GitHub Bot commented on DRILL-8359:
---

cgivre commented on PR #2713:
URL: https://github.com/apache/drill/pull/2713#issuecomment-1353007032

   > @cgivre I've added a boot option that disables mount commands by default. 
So to make your Drill servers vulnerable to malicious Drill admins you have to 
set that in drill-override first. I can also add a message saying "think hard 
about the OS privileges that your Drill process user has before switching this 
on" to the docs for this feature and that's about all I can think to do for 
security here...
   
   @jnturton Did you add some sort of warning for this? 




> Add mount and unmount command support to the filesystem plugin
> --
>
> Key: DRILL-8359
> URL: https://issues.apache.org/jira/browse/DRILL-8359
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 2.0.0
>
>
> This Jira proposes optional mount and unmount commands in the filesystem 
> plugin with the goal of enabling the dynamic definition of filesystem mounts 
> in the storage configuration. It is mainly anticipated that network and cloud 
> filesystems that have FUSE drivers (sshfs, davfs, rclone, ...) will be used 
> in this way but local device mounts and image/loop device mounts (ISO, IMG, 
> squashfs, etc.) might also be of interest. Filesystems that can be mounted in 
> this way become queryable by the Drill cluster without the burden of dedicated 
> storage plugin development.
> The provided commands are executed in their own processes by the host OS and 
> run under the OS user that is running the Drill JVM. The mount command will 
> be executed when an enabled plugin is initialised (something that is done 
> lazily) and whenever it transitions from disabled to enabled. The provided 
> unmount command will be executed whenever a plugin transitions from enabled 
> to disabled and when the Drillbit shuts down while the plugin has been 
> initialised and is enabled.
> Example using udisks on Linux to mount and unmount an image of an ext4 
> filesystem.
> {code:java}
> {
>   "type" : "file",
>   "connection" : "file:///",
>   "mountCommand" : [ "sh", "-c", "udisksctl loop-setup -f /tmp/test.img && 
> udisksctl mount -b /dev/loop0" ],
>   "unmountCommand" : [ "sh", "-c", "udisksctl unmount -b /dev/loop0 && 
> udisksctl loop-delete -b /dev/loop0" ],
>   "workspaces" : {
>  ...{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8359) Add mount and unmount command support to the filesystem plugin

2022-12-15 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17647904#comment-17647904
 ] 

ASF GitHub Bot commented on DRILL-8359:
---

jnturton commented on code in PR #2713:
URL: https://github.com/apache/drill/pull/2713#discussion_r1049318599


##
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemConfig.java:
##
@@ -53,18 +53,23 @@ public class FileSystemConfig extends StoragePluginConfig {
   public static final String NAME = "file";
 
   private final String connection;
+  private final String[] mountCommand, unmountCommand;

Review Comment:
   After a little more thought, your suggestion won me over and I've converted 
them to Lists.





> Add mount and unmount command support to the filesystem plugin
> --
>
> Key: DRILL-8359
> URL: https://issues.apache.org/jira/browse/DRILL-8359
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 2.0.0
>
>
> This Jira proposes optional mount and unmount commands in the filesystem 
> plugin with the goal of enabling the dynamic definition of filesystem mounts 
> in the storage configuration. It is mainly anticipated that network and cloud 
> filesystems that have FUSE drivers (sshfs, davfs, rclone, ...) will be used 
> in this way but local device mounts and image/loop device mounts (ISO, IMG, 
> squashfs, etc.) might also be of interest. Filesystems that can be mounted in 
> this way become queryable by the Drill cluster without the burden of dedicated 
> storage plugin development.
> The provided commands are executed in their own processes by the host OS and 
> run under the OS user that is running the Drill JVM. The mount command will 
> be executed when an enabled plugin is initialised (something that is done 
> lazily) and whenever it transitions from disabled to enabled. The provided 
> unmount command will be executed whenever a plugin transitions from enabled 
> to disabled and when the Drillbit shuts down while the plugin has been 
> initialised and is enabled.
> Example using udisks on Linux to mount and unmount an image of an ext4 
> filesystem.
> {code:java}
> {
>   "type" : "file",
>   "connection" : "file:///",
>   "mountCommand" : [ "sh", "-c", "udisksctl loop-setup -f /tmp/test.img && 
> udisksctl mount -b /dev/loop0" ],
>   "unmountCommand" : [ "sh", "-c", "udisksctl unmount -b /dev/loop0 && 
> udisksctl loop-delete -b /dev/loop0" ],
>   "workspaces" : {
>  ...{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8359) Add mount and unmount command support to the filesystem plugin

2022-12-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17647873#comment-17647873
 ] 

ASF GitHub Bot commented on DRILL-8359:
---

jnturton commented on code in PR #2713:
URL: https://github.com/apache/drill/pull/2713#discussion_r1049256392


##
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemConfig.java:
##
@@ -53,18 +53,23 @@ public class FileSystemConfig extends StoragePluginConfig {
   public static final String NAME = "file";
 
   private final String connection;
+  private final String[] mountCommand, unmountCommand;

Review Comment:
   @cgivre apologies, I missed this. The motivation was runtime efficiency 
since I don't need the (admittedly very small) overhead of dynamic collections 
here. The Runtime.exec method that I hand off to is also based on String[] so 
by matching that I don't need to do a List to array conversion. Are Lists 
preferable when working with arrays from Jackson, though? If so, I can 
certainly change this.
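(Editor's note: for context on this exchange, `ProcessBuilder`, the usual alternative to `Runtime.exec`, accepts a `List<String>` command directly, so storing the commands as Lists need not cost a conversion at launch time. A minimal sketch; the class name and command values below are illustrative, not from the plugin:)

```java
import java.util.Arrays;
import java.util.List;

public class MountCommandSketch {
  public static void main(String[] args) {
    // A mount command stored as a List<String>, as the plugin config
    // does after the change discussed above.
    List<String> mountCommand =
        Arrays.asList("sh", "-c", "udisksctl loop-setup -f /tmp/test.img");

    // ProcessBuilder takes the List<String> as-is; no array conversion
    // is needed before calling pb.start().
    ProcessBuilder pb = new ProcessBuilder(mountCommand);
    System.out.println(pb.command().size()); // prints "3"

    // If an API really does require String[] (e.g. Runtime.exec),
    // the conversion is a single call.
    String[] asArray = mountCommand.toArray(new String[0]);
    System.out.println(asArray[0]); // prints "sh"
  }
}
```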





> Add mount and unmount command support to the filesystem plugin
> --
>
> Key: DRILL-8359
> URL: https://issues.apache.org/jira/browse/DRILL-8359
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 2.0.0
>
>
> This Jira proposes optional mount and unmount commands in the filesystem 
> plugin with the goal of enabling the dynamic definition of filesystem mounts 
> in the storage configuration. It is mainly anticipated that network and cloud 
> filesystems that have FUSE drivers (sshfs, davfs, rclone, ...) will be used 
> in this way but local device mounts and image/loop device mounts (ISO, IMG, 
> squashfs, etc.) might also be of interest. Filesystems that can be mounted in 
> this way become queryable by the Drill cluster without the burden of dedicated 
> storage plugin development.
> The provided commands are executed in their own processes by the host OS and 
> run under the OS user that is running the Drill JVM. The mount command will 
> be executed when an enabled plugin is initialised (something that is done 
> lazily) and whenever it transitions from disabled to enabled. The provided 
> unmount command will be executed whenever a plugin transitions from enabled 
> to disabled and when the Drillbit shuts down while the plugin has been 
> initialised and is enabled.
> Example using udisks on Linux to mount and unmount an image of an ext4 
> filesystem.
> {code:java}
> {
>   "type" : "file",
>   "connection" : "file:///",
>   "mountCommand" : [ "sh", "-c", "udisksctl loop-setup -f /tmp/test.img && 
> udisksctl mount -b /dev/loop0" ],
>   "unmountCommand" : [ "sh", "-c", "udisksctl unmount -b /dev/loop0 && 
> udisksctl loop-delete -b /dev/loop0" ],
>   "workspaces" : {
>  ...{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8359) Add mount and unmount command support to the filesystem plugin

2022-12-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17647872#comment-17647872
 ] 

ASF GitHub Bot commented on DRILL-8359:
---

jnturton commented on code in PR #2713:
URL: https://github.com/apache/drill/pull/2713#discussion_r1049256392


##
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemConfig.java:
##
@@ -53,18 +53,23 @@ public class FileSystemConfig extends StoragePluginConfig {
   public static final String NAME = "file";
 
   private final String connection;
+  private final String[] mountCommand, unmountCommand;

Review Comment:
   @cgivre apologies, I missed this. The motivation was runtime efficiency 
since I don't need the overhead of dynamic collections here. The Runtime.exec 
method that I hand off to is also based on String[] so by matching that I don't 
need to do a List to array conversion. Are Lists preferable when working with 
arrays from Jackson, though? If so, I can certainly change this.





> Add mount and unmount command support to the filesystem plugin
> --
>
> Key: DRILL-8359
> URL: https://issues.apache.org/jira/browse/DRILL-8359
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 2.0.0
>
>
> This Jira proposes optional mount and unmount commands in the filesystem 
> plugin with the goal of enabling the dynamic definition of filesystem mounts 
> in the storage configuration. It is mainly anticipated that network and cloud 
> filesystems that have FUSE drivers (sshfs, davfs, rclone, ...) will be used 
> in this way but local device mounts and image/loop device mounts (ISO, IMG, 
> squashfs, etc.) might also be of interest. Filesystems that can be mounted in 
> this way become queryable by the Drill cluster without the burden of dedicated 
> storage plugin development.
> The provided commands are executed in their own processes by the host OS and 
> run under the OS user that is running the Drill JVM. The mount command will 
> be executed when an enabled plugin is initialised (something that is done 
> lazily) and whenever it transitions from disabled to enabled. The provided 
> unmount command will be executed whenever a plugin transitions from enabled 
> to disabled and when the Drillbit shuts down while the plugin has been 
> initialised and is enabled.
> Example using udisks on Linux to mount and unmount an image of an ext4 
> filesystem.
> {code:java}
> {
>   "type" : "file",
>   "connection" : "file:///",
>   "mountCommand" : [ "sh", "-c", "udisksctl loop-setup -f /tmp/test.img && 
> udisksctl mount -b /dev/loop0" ],
>   "unmountCommand" : [ "sh", "-c", "udisksctl unmount -b /dev/loop0 && 
> udisksctl loop-delete -b /dev/loop0" ],
>   "workspaces" : {
>  ...{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8369) Add support for querying DeltaLake snapshots by version

2022-12-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17647812#comment-17647812
 ] 

ASF GitHub Bot commented on DRILL-8369:
---

cgivre merged PR #2718:
URL: https://github.com/apache/drill/pull/2718




> Add support for querying DeltaLake snapshots by version
> ---
>
> Key: DRILL-8369
> URL: https://issues.apache.org/jira/browse/DRILL-8369
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8359) Add mount and unmount command support to the filesystem plugin

2022-12-14 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17647053#comment-17647053
 ] 

ASF GitHub Bot commented on DRILL-8359:
---

jnturton commented on PR #2713:
URL: https://github.com/apache/drill/pull/2713#issuecomment-1351106936

   @cgivre I've added a boot option that disables mount commands by default. So 
to make your Drill servers vulnerable to malicious Drill admins you have to set 
that in drill-override first. I can also add a message saying "think hard about 
the OS privileges that your Drill process user has before switching this on" to 
the docs for this feature and that's about all I can think to do for security 
here...




> Add mount and unmount command support to the filesystem plugin
> --
>
> Key: DRILL-8359
> URL: https://issues.apache.org/jira/browse/DRILL-8359
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 2.0.0
>
>
> This Jira proposes optional mount and unmount commands in the filesystem 
> plugin with the goal of enabling the dynamic definition of filesystem mounts 
> in the storage configuration. It is mainly anticipated that network and cloud 
> filesystems that have FUSE drivers (sshfs, davfs, rclone, ...) will be used 
> in this way but local device mounts and image/loop device mounts (ISO, IMG, 
> squashfs, etc.) might also be of interest. Filesystems that can be mounted in 
> this way become queryable by a Drill cluster without the burden of dedicated 
> storage plugin development.
> The provided commands are executed in their own processes by the host OS and 
> run under the OS user that is running the Drill JVM. The mount command will 
> be executed when an enabled plugin is initialised (something that is done 
> lazily) and whenever it transitions from disabled to enabled. The provided 
> unmount command will be executed whenever a plugin transitions from enabled 
> to disabled and when the Drillbit shuts down while the plugin has been 
> initialised and is enabled.
> Example using udisks on Linux to mount and unmount an image of an ext4 
> filesystem.
> {code:java}
> {
>   "type" : "file",
>   "connection" : "file:///",
>   "mountCommand" : [ "sh", "-c", "udisksctl loop-setup -f /tmp/test.img && udisksctl mount -b /dev/loop0" ],
>   "unmountCommand" : [ "sh", "-c", "udisksctl unmount -b /dev/loop0 && udisksctl loop-delete -b /dev/loop0" ],
>   "workspaces" : {
>  ...{code}
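For illustration, a minimal sketch of how a configured command list such as `["sh", "-c", "..."]` can be launched in its own OS process under the user running the JVM, as the description above explains. This is an assumption about the mechanism, not the plugin's actual code; `MountRunner` and `run` are hypothetical names:

```java
import java.io.IOException;
import java.util.List;

public class MountRunner {
  // Runs the command list in a child process under the same OS user as the
  // JVM, inheriting stdout/stderr, and returns the process exit code.
  static int run(List<String> command) {
    try {
      return new ProcessBuilder(command).inheritIO().start().waitFor();
    } catch (IOException | InterruptedException e) {
      throw new RuntimeException(e);
    }
  }

  public static void main(String[] args) {
    // "true" stands in for the udisksctl mount command so the sketch is
    // runnable anywhere; a real mountCommand would be substituted here.
    int exit = run(List.of("sh", "-c", "true"));
    System.out.println("exit code: " + exit);
  }
}
```

A nonzero exit code from the child process would be the natural signal that mounting failed and the plugin should not be enabled.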



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8369) Add support for querying DeltaLake snapshots by version

2022-12-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646699#comment-17646699
 ] 

ASF GitHub Bot commented on DRILL-8369:
---

cgivre commented on code in PR #2718:
URL: https://github.com/apache/drill/pull/2718#discussion_r1047310929


##
contrib/format-deltalake/src/main/java/org/apache/drill/exec/store/delta/format/DeltaFormatPluginConfig.java:
##
@@ -18,15 +18,55 @@
 package org.apache.drill.exec.store.delta.format;
 
 import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
 import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.PlanStringBuilder;
 import org.apache.drill.common.logical.FormatPluginConfig;
 
+import java.util.Objects;
+
 @JsonTypeName(DeltaFormatPluginConfig.NAME)
 public class DeltaFormatPluginConfig implements FormatPluginConfig {
 
   public static final String NAME = "delta";
 
+  private final Long version;
+  private final Long timestamp;
+
   @JsonCreator
-  public DeltaFormatPluginConfig() {
+  public DeltaFormatPluginConfig(@JsonProperty("version") Long version,

Review Comment:
   I'm fine with leaving it as is.  I was just asking.  
   LGTM +1





> Add support for querying DeltaLake snapshots by version
> ---
>
> Key: DRILL-8369
> URL: https://issues.apache.org/jira/browse/DRILL-8369
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8369) Add support for querying DeltaLake snapshots by version

2022-12-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646698#comment-17646698
 ] 

ASF GitHub Bot commented on DRILL-8369:
---

vvysotskyi commented on code in PR #2718:
URL: https://github.com/apache/drill/pull/2718#discussion_r1047307275


##
contrib/format-deltalake/src/main/java/org/apache/drill/exec/store/delta/format/DeltaFormatPluginConfig.java:
##
@@ -18,15 +18,55 @@
 package org.apache.drill.exec.store.delta.format;
 
 import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
 import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.PlanStringBuilder;
 import org.apache.drill.common.logical.FormatPluginConfig;
 
+import java.util.Objects;
+
 @JsonTypeName(DeltaFormatPluginConfig.NAME)
 public class DeltaFormatPluginConfig implements FormatPluginConfig {
 
   public static final String NAME = "delta";
 
+  private final Long version;
+  private final Long timestamp;
+
   @JsonCreator
-  public DeltaFormatPluginConfig() {
+  public DeltaFormatPluginConfig(@JsonProperty("version") Long version,

Review Comment:
   I propose to leave it as is since it is required to be able to 
serialize/deserialize these values, and Drill uses the same code for reading 
plugin configs from the query plan and from UI, where the user sets them.





> Add support for querying DeltaLake snapshots by version
> ---
>
> Key: DRILL-8369
> URL: https://issues.apache.org/jira/browse/DRILL-8369
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8369) Add support for querying DeltaLake snapshots by version

2022-12-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646689#comment-17646689
 ] 

ASF GitHub Bot commented on DRILL-8369:
---

pjfanning commented on code in PR #2718:
URL: https://github.com/apache/drill/pull/2718#discussion_r1047285109


##
contrib/format-deltalake/src/main/java/org/apache/drill/exec/store/delta/format/DeltaFormatPluginConfig.java:
##
@@ -18,15 +18,55 @@
 package org.apache.drill.exec.store.delta.format;
 
 import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
 import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.PlanStringBuilder;
 import org.apache.drill.common.logical.FormatPluginConfig;
 
+import java.util.Objects;
+
 @JsonTypeName(DeltaFormatPluginConfig.NAME)
 public class DeltaFormatPluginConfig implements FormatPluginConfig {
 
   public static final String NAME = "delta";
 
+  private final Long version;
+  private final Long timestamp;
+
   @JsonCreator
-  public DeltaFormatPluginConfig() {
+  public DeltaFormatPluginConfig(@JsonProperty("version") Long version,

Review Comment:
   I dislike the need for all the annotations, but Jackson insists on them when you go down certain paths.
   
   If you put back the empty constructor as well as having the 2 param 
constructor, you will possibly find that you don't need the JsonCreator and 
JsonProperty annotations any more.
   
   Jackson seems to prefer to use the no params constructor and to use 
reflection to set the values on the fields.
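A stdlib-only sketch of why Jackson needs the annotations on a property-based creator: constructor parameter names are not retained in bytecode by default, so there is nothing to match JSON keys against. The `Config` class below is a hypothetical stand-in for `DeltaFormatPluginConfig`:

```java
import java.lang.reflect.Constructor;
import java.lang.reflect.Parameter;

public class ParamNames {
  // Hypothetical stand-in for DeltaFormatPluginConfig's two-arg constructor.
  static class Config {
    final Long version;
    final Long timestamp;

    Config(Long version, Long timestamp) {
      this.version = version;
      this.timestamp = timestamp;
    }
  }

  public static void main(String[] args) {
    Constructor<?> ctor = Config.class.getDeclaredConstructors()[0];
    for (Parameter p : ctor.getParameters()) {
      // Unless the class was compiled with javac -parameters, isNamePresent()
      // is false and the JVM synthesizes names like arg0, arg1 - so a
      // property-based @JsonCreator has no way to map JSON keys to
      // parameters without explicit @JsonProperty("...") annotations.
      System.out.println(p.getName() + " namePresent=" + p.isNamePresent());
    }
  }
}
```

This also explains the fallback to a no-args constructor plus field reflection: field names, unlike parameter names, are always available at runtime.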





> Add support for querying DeltaLake snapshots by version
> ---
>
> Key: DRILL-8369
> URL: https://issues.apache.org/jira/browse/DRILL-8369
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8370) Upgrade splunk-sdk-java to 1.9.3

2022-12-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646654#comment-17646654
 ] 

ASF GitHub Bot commented on DRILL-8370:
---

jnturton merged PR #2719:
URL: https://github.com/apache/drill/pull/2719




> Upgrade splunk-sdk-java to 1.9.3
> 
>
> Key: DRILL-8370
> URL: https://issues.apache.org/jira/browse/DRILL-8370
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Other
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 1.20.3
>
>
> Changes in splunk-sdk-java since 1.9.1.
> {quote}
> h3. Minor Changes
>  * Re-fetch logic for instancetype and version fields if not set within 
> Service instance to avoid NPE (GitHub PR 
> [#202|https://github.com/splunk/splunk-sdk-java/pull/202])
>  * Check for local IP as alternative to _localhost_ within HostnameVerifier, 
> addressing issue with certain local workflows
>  * Added null check for child to handle error when no value is passed for a 
> parameter in modular-inputs (Ref issue 
> [#198|https://github.com/splunk/splunk-sdk-java/issues/198] & GitHub PR 
> [#199|https://github.com/splunk/splunk-sdk-java/pull/199])
> h3. New Features and APIs
>  * Added feature that allows updating ACL properties of an entity 
> (GitHub PR [#196|https://github.com/splunk/splunk-sdk-java/pull/196]){quote}
>  
> Also removes the execution order dependence in the Splunk unit tests created 
> by their expecting a certain number of records in the _audit index. I believe 
> this order dependence is behind recent, apparently random CI run failures in 
> the Splunk unit tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8370) Upgrade splunk-sdk-java to 1.9.3

2022-12-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646652#comment-17646652
 ] 

ASF GitHub Bot commented on DRILL-8370:
---

jnturton commented on PR #2719:
URL: https://github.com/apache/drill/pull/2719#issuecomment-1348668638

   @cgivre okay the last commit may have gotten rid of those sporadic CI failures.




> Upgrade splunk-sdk-java to 1.9.3
> 
>
> Key: DRILL-8370
> URL: https://issues.apache.org/jira/browse/DRILL-8370
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Other
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 1.20.3
>
>
> Changes in splunk-sdk-java since 1.9.1.
> {quote}
> h3. Minor Changes
>  * Re-fetch logic for instancetype and version fields if not set within 
> Service instance to avoid NPE (GitHub PR 
> [#202|https://github.com/splunk/splunk-sdk-java/pull/202])
>  * Check for local IP as alternative to _localhost_ within HostnameVerifier, 
> addressing issue with certain local workflows
>  * Added null check for child to handle error when no value is passed for a 
> parameter in modular-inputs (Ref issue 
> [#198|https://github.com/splunk/splunk-sdk-java/issues/198] & GitHub PR 
> [#199|https://github.com/splunk/splunk-sdk-java/pull/199])
> h3. New Features and APIs
>  * Added feature that allows updating ACL properties of an entity 
> (GitHub PR [#196|https://github.com/splunk/splunk-sdk-java/pull/196]){quote}
>  
> Also removes the execution order dependence in the Splunk unit tests created 
> by their expecting a certain number of records in the _audit index. I believe 
> this order dependence is behind recent, apparently random CI run failures in 
> the Splunk unit tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8369) Add support for querying DeltaLake snapshots by version

2022-12-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646512#comment-17646512
 ] 

ASF GitHub Bot commented on DRILL-8369:
---

vvysotskyi commented on code in PR #2718:
URL: https://github.com/apache/drill/pull/2718#discussion_r1046827000


##
contrib/format-deltalake/src/main/java/org/apache/drill/exec/store/delta/format/DeltaFormatPluginConfig.java:
##
@@ -18,15 +18,55 @@
 package org.apache.drill.exec.store.delta.format;
 
 import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
 import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.PlanStringBuilder;
 import org.apache.drill.common.logical.FormatPluginConfig;
 
+import java.util.Objects;
+
 @JsonTypeName(DeltaFormatPluginConfig.NAME)
 public class DeltaFormatPluginConfig implements FormatPluginConfig {
 
   public static final String NAME = "delta";
 
+  private final Long version;
+  private final Long timestamp;
+
   @JsonCreator
-  public DeltaFormatPluginConfig() {
+  public DeltaFormatPluginConfig(@JsonProperty("version") Long version,

Review Comment:
   It will fail with the following error:
   ```
   Invalid type definition for type 
`org.apache.drill.exec.store.delta.format.DeltaFormatPluginConfig`: Argument #0 
of constructor [constructor for 
`org.apache.drill.exec.store.delta.format.DeltaFormatPluginConfig` (2 args), 
annotations: {interface 
com.fasterxml.jackson.annotation.JsonCreator=@com.fasterxml.jackson.annotation.JsonCreator(mode=DEFAULT)}
 has no property name (and is not Injectable): can not use as property-based 
Creator
   ```





> Add support for querying DeltaLake snapshots by version
> ---
>
> Key: DRILL-8369
> URL: https://issues.apache.org/jira/browse/DRILL-8369
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8369) Add support for querying DeltaLake snapshots by version

2022-12-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646502#comment-17646502
 ] 

ASF GitHub Bot commented on DRILL-8369:
---

pjfanning commented on code in PR #2718:
URL: https://github.com/apache/drill/pull/2718#discussion_r1046793510


##
contrib/format-deltalake/src/main/java/org/apache/drill/exec/store/delta/format/DeltaFormatPluginConfig.java:
##
@@ -18,15 +18,55 @@
 package org.apache.drill.exec.store.delta.format;
 
 import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
 import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.PlanStringBuilder;
 import org.apache.drill.common.logical.FormatPluginConfig;
 
+import java.util.Objects;
+
 @JsonTypeName(DeltaFormatPluginConfig.NAME)
 public class DeltaFormatPluginConfig implements FormatPluginConfig {
 
   public static final String NAME = "delta";
 
+  private final Long version;
+  private final Long timestamp;
+
   @JsonCreator
-  public DeltaFormatPluginConfig() {
+  public DeltaFormatPluginConfig(@JsonProperty("version") Long version,

Review Comment:
   I'm a Jackson contributor - but admit that it's not always obvious when and why annotations are needed.
   These JsonProperty annotations do seem unnecessary - what happens if you 
leave them out but keep the rest of the new code?





> Add support for querying DeltaLake snapshots by version
> ---
>
> Key: DRILL-8369
> URL: https://issues.apache.org/jira/browse/DRILL-8369
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8369) Add support for querying DeltaLake snapshots by version

2022-12-13 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646485#comment-17646485
 ] 

ASF GitHub Bot commented on DRILL-8369:
---

vvysotskyi commented on code in PR #2718:
URL: https://github.com/apache/drill/pull/2718#discussion_r1046773655


##
contrib/format-deltalake/src/main/java/org/apache/drill/exec/store/delta/format/DeltaFormatPluginConfig.java:
##
@@ -18,15 +18,55 @@
 package org.apache.drill.exec.store.delta.format;
 
 import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
 import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.PlanStringBuilder;
 import org.apache.drill.common.logical.FormatPluginConfig;
 
+import java.util.Objects;
+
 @JsonTypeName(DeltaFormatPluginConfig.NAME)
 public class DeltaFormatPluginConfig implements FormatPluginConfig {
 
   public static final String NAME = "delta";
 
+  private final Long version;
+  private final Long timestamp;
+
   @JsonCreator
-  public DeltaFormatPluginConfig() {
+  public DeltaFormatPluginConfig(@JsonProperty("version") Long version,

Review Comment:
   I think hiding it could cause issues when serializing / deserializing query 
plans. But people will not know about this property if it is not specified in 
the docs for this storage plugin configs.





> Add support for querying DeltaLake snapshots by version
> ---
>
> Key: DRILL-8369
> URL: https://issues.apache.org/jira/browse/DRILL-8369
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8369) Add support for querying DeltaLake snapshots by version

2022-12-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646423#comment-17646423
 ] 

ASF GitHub Bot commented on DRILL-8369:
---

cgivre commented on code in PR #2718:
URL: https://github.com/apache/drill/pull/2718#discussion_r1046647612


##
contrib/format-deltalake/src/main/java/org/apache/drill/exec/store/delta/format/DeltaFormatPluginConfig.java:
##
@@ -18,15 +18,55 @@
 package org.apache.drill.exec.store.delta.format;
 
 import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
 import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.PlanStringBuilder;
 import org.apache.drill.common.logical.FormatPluginConfig;
 
+import java.util.Objects;
+
 @JsonTypeName(DeltaFormatPluginConfig.NAME)
 public class DeltaFormatPluginConfig implements FormatPluginConfig {
 
   public static final String NAME = "delta";
 
+  private final Long version;
+  private final Long timestamp;
+
   @JsonCreator
-  public DeltaFormatPluginConfig() {
+  public DeltaFormatPluginConfig(@JsonProperty("version") Long version,

Review Comment:
   I don't remember the annotation, but do you think it would be a good idea to 
hide these in the config so people don't set them accidentally?





> Add support for querying DeltaLake snapshots by version
> ---
>
> Key: DRILL-8369
> URL: https://issues.apache.org/jira/browse/DRILL-8369
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8358) Storage plugin for querying other Apache Drill clusters

2022-12-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646331#comment-17646331
 ] 

ASF GitHub Bot commented on DRILL-8358:
---

vvysotskyi commented on code in PR #2709:
URL: https://github.com/apache/drill/pull/2709#discussion_r1046390848


##
contrib/storage-drill/src/test/java/org/apache/drill/exec/store/drill/plugin/DrillPluginQueriesTest.java:
##
@@ -0,0 +1,312 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.drill.plugin;
+
+import org.apache.drill.common.AutoCloseables;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.physical.rowSet.RowSet;
+import org.apache.drill.exec.physical.rowSet.RowSetBuilder;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.exec.record.metadata.TupleMetadata;
+import org.apache.drill.test.ClientFixture;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.AfterClass;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.util.Properties;
+
+import static org.junit.Assert.assertEquals;
+
+public class DrillPluginQueriesTest extends ClusterTest {
+
+  private static final String TABLE_NAME = "dfs.tmp.test_table";
+
+  private static ClusterFixture drill;
+  private static ClientFixture drillClient;
+
+  @BeforeClass
+  public static void setUpBeforeClass() throws Exception {
+initPlugin();
+  }
+
+  @AfterClass
+  public static void shutdown() throws Exception {
+AutoCloseables.close(drill, drillClient);
+  }
+
+  private static void initPlugin() throws Exception {
+startCluster(ClusterFixture.builder(dirTestWatcher));
+drill = ClusterFixture.builder(dirTestWatcher).build();
+
+DrillStoragePluginConfig config = new DrillStoragePluginConfig(
+  "jdbc:drill:drillbit=localhost:" + drill.drillbit().getUserPort(),
+  new Properties(), null);
+config.setEnabled(true);
+cluster.defineStoragePlugin("drill", config);
+cluster.defineStoragePlugin("drill2", config);
+drillClient = drill.clientFixture();
+
+drillClient.queryBuilder()
  .sql("create table %s as select * from cp.`tpch/nation.parquet`", TABLE_NAME)
+  .run();
+  }
+
+  @Test
+  public void testSerDe() throws Exception {
String plan = queryBuilder().sql("select * from drill.%s", TABLE_NAME).explainJson();
+long count = queryBuilder().physical(plan).run().recordCount();
+assertEquals(25, count);
+  }
+
+  @Test
+  public void testShowDatabases() throws Exception {
+testBuilder()
+  .sqlQuery("show databases where SCHEMA_NAME='drill.dfs.tmp'")
+  .unOrdered()
+  .baselineColumns("SCHEMA_NAME")
+  .baselineValues("drill.dfs.tmp")
+  .go();
+  }
+
+  @Test
+  public void testShowTables() throws Exception {
+testBuilder()
+  .sqlQuery("show tables IN drill.INFORMATION_SCHEMA")
+  .unOrdered()
+  .baselineColumns("TABLE_SCHEMA", "TABLE_NAME")
+  .baselineValues("drill.information_schema", "VIEWS")
+  .baselineValues("drill.information_schema", "CATALOGS")
+  .baselineValues("drill.information_schema", "COLUMNS")
+  .baselineValues("drill.information_schema", "PARTITIONS")
+  .baselineValues("drill.information_schema", "FILES")
+  .baselineValues("drill.information_schema", "SCHEMATA")
+  .baselineValues("drill.information_schema", "TABLES")
+  .go();
+  }
+
+  @Test
+  public void testProjectPushDown() throws Exception {
+String query = "select n_nationkey, n_regionkey, n_name from drill.%s";
+
+queryBuilder()
+.sql(query, TABLE_NAME)
+.planMatcher()
+.include("query=\"SELECT `n_nationkey`, `n_regionkey`, `n_name`")
+.exclude("\\*")
+.match();
+
+RowSet sets = queryBuilder()
+  .sql(query, TABLE_NAME)
+  .rowSet();
+
+TupleMetadata schema = new SchemaBuilder()
+  .add("n_nationkey", TypeProtos.MinorType.INT)
+  .add("n_regionkey", TypeProtos.MinorType.INT)
+  .add("n_name", TypeProtos.MinorType.VARCHAR)
+  

[jira] [Commented] (DRILL-8358) Storage plugin for querying other Apache Drill clusters

2022-12-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646234#comment-17646234
 ] 

ASF GitHub Bot commented on DRILL-8358:
---

cgivre merged PR #2709:
URL: https://github.com/apache/drill/pull/2709




> Storage plugin for querying other Apache Drill clusters
> ---
>
> Key: DRILL-8358
> URL: https://issues.apache.org/jira/browse/DRILL-8358
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8370) Upgrade splunk-sdk-java to 1.9.3

2022-12-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646196#comment-17646196
 ] 

ASF GitHub Bot commented on DRILL-8370:
---

github-code-scanning[bot] commented on code in PR #2719:
URL: https://github.com/apache/drill/pull/2719#discussion_r1046056001


##
contrib/storage-splunk/src/main/java/org/apache/drill/exec/store/splunk/SplunkConnection.java:
##
@@ -152,4 +172,45 @@
   public EntityCollection getIndexes() {
 return service.getIndexes();
   }
+
+  /**
+   * As of version 1.8, Splunk's SDK introduced a boolean parameter which
+   * is supposed to control whether the SDK will validate SSL certificates
+   * or not.  Unfortunately the parameter does not actually seem to have
+   * any effect and the end result is that when making Splunk calls,
+   * Splunk will always attempt to verify the SSL certificates, even when
+   * the parameter is set to false.  This method does what the parameter
+   * is supposed to do in the SDK and adds an all-trusting SSL Socket
+   * Factory to the HTTP client in Splunk's SDK.  In the event Splunk
+   * fixes this issue, we can remove this method.
+   *
+   * @return A {@link SSLSocketFactory} which trusts any SSL certificate,
+   *   even ones from Splunk
+   * @throws KeyManagementException Thrown if the SSL context cannot be initialized
+   */
+  private SSLSocketFactory createAllTrustingSSLFactory() throws KeyManagementException {
+SSLContext context;
+try {
+  context = SSLContext.getInstance("TLS");
+} catch (NoSuchAlgorithmException e) {
+  throw UserException.validationError(e)
+.message("Error establishing SSL connection: Invalid scheme: " + e.getMessage())
+.build(logger);
+}
+TrustManager[] trustAll = new TrustManager[]{
+new X509TrustManager() {
+  public X509Certificate[] getAcceptedIssuers() {
+return null;
+  }
+  public void checkClientTrusted(X509Certificate[] certs, String authType) {
+// No op
+  }
+  public void checkServerTrusted(X509Certificate[] certs, String authType) {
+// No op
+  }
+}
+};
+context.init(null, trustAll, null);

Review Comment:
   ## `TrustManager` that accepts all certificates
   
   This uses [TrustManager](1), which is defined in [SplunkConnection$](2) and 
trusts any certificate.
   
   [Show more details](https://github.com/apache/drill/security/code-scanning/42)





> Upgrade splunk-sdk-java to 1.9.3
> 
>
> Key: DRILL-8370
> URL: https://issues.apache.org/jira/browse/DRILL-8370
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Other
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 1.20.3
>
>
> Changes in splunk-sdk-java since 1.9.1.
> {quote}
> h3. Minor Changes
>  * Re-fetch logic for instancetype and version fields if not set within 
> Service instance to avoid NPE (GitHub PR 
> [#202|https://github.com/splunk/splunk-sdk-java/pull/202])
>  * Check for local IP as alternative to _localhost_ within HostnameVerifier, 
> addressing issue with certain local workflows
>  * Added null check for child to handle error when no value is passed for a 
> parameter in modular-inputs (Ref issue 
> [#198|https://github.com/splunk/splunk-sdk-java/issues/198] & GitHub PR 
> [#199|https://github.com/splunk/splunk-sdk-java/pull/199])
> h3. New Features and APIs
>  * Added feature that allows updating ACL properties of an entity 
> (GitHub PR [#196|https://github.com/splunk/splunk-sdk-java/pull/196]){quote}
>  
> Also removes the execution order dependence in the Splunk unit tests created 
> by their expecting a certain number of records in the _audit index. I believe 
> this order dependence is behind recent, apparently random CI run failures in 
> the Splunk unit tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8370) Upgrade splunk-sdk-java to 1.9.3

2022-12-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646163#comment-17646163
 ] 

ASF GitHub Bot commented on DRILL-8370:
---

jnturton commented on PR #2719:
URL: https://github.com/apache/drill/pull/2719#issuecomment-1346704255

   The latest commit adds a patch for Splunk's broken validateCertificates 
toggle by @cgivre.




> Upgrade splunk-sdk-java to 1.9.3
> 
>
> Key: DRILL-8370
> URL: https://issues.apache.org/jira/browse/DRILL-8370
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Other
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 1.20.3
>
>
> Changes in splunk-sdk-java since 1.9.1.
> {quote}
> h3. Minor Changes
>  * Re-fetch logic for instancetype and version fields if not set within 
> Service instance to avoid NPE (GitHub PR 
> [#202|https://github.com/splunk/splunk-sdk-java/pull/202])
>  * Check for local IP as alternative to _localhost_ within HostnameVerifier, 
> addressing issue with certain local workflows
>  * Added null check for child to handle error when no value is passed for a 
> parameter in modular-inputs (Ref issue 
> [#198|https://github.com/splunk/splunk-sdk-java/issues/198] & GitHub PR 
> [#199|https://github.com/splunk/splunk-sdk-java/pull/199])
> h3. New Features and APIs
>  * Added feature that allows updating ACL properties of an entity 
> (GitHub PR [#196|https://github.com/splunk/splunk-sdk-java/pull/196]){quote}
>  
> Also removes the execution order dependence in the Splunk unit tests created 
> by their expecting a certain number of records in the _audit index. I believe 
> this order dependence is behind recent, apparently random CI run failures in 
> the Splunk unit tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8370) Upgrade splunk-sdk-java to 1.9.3

2022-12-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646164#comment-17646164
 ] 

ASF GitHub Bot commented on DRILL-8370:
---

cgivre commented on PR #2719:
URL: https://github.com/apache/drill/pull/2719#issuecomment-1346705339

   Just noting for future reference: here is the Splunk SDK issue that tracks the SSL certificate validation problem:
   https://github.com/splunk/splunk-sdk-java/issues/204




> Upgrade splunk-sdk-java to 1.9.3
> 
>
> Key: DRILL-8370
> URL: https://issues.apache.org/jira/browse/DRILL-8370
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Other
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 1.20.3
>
>
> Changes in splunk-sdk-java since 1.9.1.
> {quote}
> h3. Minor Changes
>  * Re-fetch logic for instancetype and version fields if not set within 
> Service instance to avoid NPE (GitHub PR 
> [#202|https://github.com/splunk/splunk-sdk-java/pull/202])
>  * Check for local IP as alternative to _localhost_ within HostnameVerifier, 
> addressing issue with certain local workflows
>  * Added null check for child to handle error when no value is passed for a 
> parameter in modular-inputs (Ref issue 
> [#198|https://github.com/splunk/splunk-sdk-java/issues/198] & GitHub PR 
> [#199|https://github.com/splunk/splunk-sdk-java/pull/199])
> h3. New Features and APIs
>  * Added feature that allows updating the ACL properties of an entity 
> (GitHub PR [#196|https://github.com/splunk/splunk-sdk-java/pull/196])
> {quote}
>  
> Also removes the execution order dependence in the Splunk unit tests created 
> by their expecting a certain number of records in the _audit index. I believe 
> this order dependence is behind recent, apparently random CI run failures in 
> the Splunk unit tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8358) Storage plugin for querying other Apache Drill clusters

2022-12-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646104#comment-17646104
 ] 

ASF GitHub Bot commented on DRILL-8358:
---

jnturton commented on code in PR #2709:
URL: https://github.com/apache/drill/pull/2709#discussion_r1039337680


##
contrib/storage-drill/src/test/java/org/apache/drill/exec/store/drill/plugin/DrillPluginQueriesTest.java:
##
@@ -0,0 +1,312 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.drill.plugin;
+
+import org.apache.drill.common.AutoCloseables;
+import org.apache.drill.common.types.TypeProtos;
+import org.apache.drill.exec.physical.rowSet.RowSet;
+import org.apache.drill.exec.physical.rowSet.RowSetBuilder;
+import org.apache.drill.exec.record.metadata.SchemaBuilder;
+import org.apache.drill.exec.record.metadata.TupleMetadata;
+import org.apache.drill.test.ClientFixture;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.apache.drill.test.rowSet.RowSetComparison;
+import org.junit.AfterClass;
+import org.junit.BeforeClass;
+import org.junit.Test;
+
+import java.util.Properties;
+
+import static org.junit.Assert.assertEquals;
+
+public class DrillPluginQueriesTest extends ClusterTest {
+
+  private static final String TABLE_NAME = "dfs.tmp.test_table";
+
+  private static ClusterFixture drill;
+  private static ClientFixture drillClient;
+
+  @BeforeClass
+  public static void setUpBeforeClass() throws Exception {
+initPlugin();
+  }
+
+  @AfterClass
+  public static void shutdown() throws Exception {
+AutoCloseables.close(drill, drillClient);
+  }
+
+  private static void initPlugin() throws Exception {
+startCluster(ClusterFixture.builder(dirTestWatcher));
+drill = ClusterFixture.builder(dirTestWatcher).build();
+
+DrillStoragePluginConfig config = new DrillStoragePluginConfig(
+  "jdbc:drill:drillbit=localhost:" + drill.drillbit().getUserPort(),
+  new Properties(), null);
+config.setEnabled(true);
+cluster.defineStoragePlugin("drill", config);
+cluster.defineStoragePlugin("drill2", config);
+drillClient = drill.clientFixture();
+
+drillClient.queryBuilder()
+  .sql("create table %s as select * from cp.`tpch/nation.parquet`", 
TABLE_NAME)
+  .run();
+  }
+
+  @Test
+  public void testSerDe() throws Exception {
+String plan = queryBuilder().sql("select * from drill.%s", 
TABLE_NAME).explainJson();
+long count = queryBuilder().physical(plan).run().recordCount();
+assertEquals(25, count);
+  }
+
+  @Test
+  public void testShowDatabases() throws Exception {
+testBuilder()
+  .sqlQuery("show databases where SCHEMA_NAME='drill.dfs.tmp'")
+  .unOrdered()
+  .baselineColumns("SCHEMA_NAME")
+  .baselineValues("drill.dfs.tmp")
+  .go();
+  }
+
+  @Test
+  public void testShowTables() throws Exception {
+testBuilder()
+  .sqlQuery("show tables IN drill.INFORMATION_SCHEMA")
+  .unOrdered()
+  .baselineColumns("TABLE_SCHEMA", "TABLE_NAME")
+  .baselineValues("drill.information_schema", "VIEWS")
+  .baselineValues("drill.information_schema", "CATALOGS")
+  .baselineValues("drill.information_schema", "COLUMNS")
+  .baselineValues("drill.information_schema", "PARTITIONS")
+  .baselineValues("drill.information_schema", "FILES")
+  .baselineValues("drill.information_schema", "SCHEMATA")
+  .baselineValues("drill.information_schema", "TABLES")
+  .go();
+  }
+
+  @Test
+  public void testProjectPushDown() throws Exception {
+String query = "select n_nationkey, n_regionkey, n_name from drill.%s";
+
+queryBuilder()
+.sql(query, TABLE_NAME)
+.planMatcher()
+.include("query=\"SELECT `n_nationkey`, `n_regionkey`, `n_name`")
+.exclude("\\*")
+.match();
+
+RowSet sets = queryBuilder()
+  .sql(query, TABLE_NAME)
+  .rowSet();
+
+TupleMetadata schema = new SchemaBuilder()
+  .add("n_nationkey", TypeProtos.MinorType.INT)
+  .add("n_regionkey", TypeProtos.MinorType.INT)
+  .add("n_name", TypeProtos.MinorType.VARCHAR)
+

[jira] [Commented] (DRILL-8370) Upgrade splunk-sdk-java to 1.9.3

2022-12-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8370?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646088#comment-17646088
 ] 

ASF GitHub Bot commented on DRILL-8370:
---

jnturton opened a new pull request, #2719:
URL: https://github.com/apache/drill/pull/2719

   # [DRILL-8370](https://issues.apache.org/jira/browse/DRILL-8370): Upgrade 
splunk-sdk-java to 1.9.3
   
   ## Description
   
   Changes in splunk-sdk-java since 1.9.1.
   
   >  Minor Changes
   > 
   > - Re-fetch logic for instancetype and version fields if not set within 
Service instance to avoid NPE (GitHub PR 
[#202](https://github.com/splunk/splunk-sdk-java/pull/202))
   > - Check for local IP as alternative to localhost within HostnameVerifier, 
addressing issue with certain local workflows
   > - Added null check for child to handle error when no value is passed for a 
parameter in modular-inputs (Ref issue 
[#198](https://github.com/splunk/splunk-sdk-java/issues/198) & GitHub PR 
[#199](https://github.com/splunk/splunk-sdk-java/pull/199))
   > 
   > New Features and APIs
   > 
   > - Added feature that allows updating the ACL properties of an entity (GitHub 
PR [#196](https://github.com/splunk/splunk-sdk-java/pull/196))
   
   Also removes the execution order dependence in the Splunk unit tests created 
by their expecting a certain number of records in the _audit index. I believe 
this order dependence is behind recent, apparently random CI run failures in 
the Splunk unit tests.
   
   ## Documentation
   N/A
   
   ## Testing
   Splunk unit tests.
   




> Upgrade splunk-sdk-java to 1.9.3
> 
>
> Key: DRILL-8370
> URL: https://issues.apache.org/jira/browse/DRILL-8370
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Other
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 1.20.3
>
>
> Changes in splunk-sdk-java since 1.9.1.
> {quote}
> h3. Minor Changes
>  * Re-fetch logic for instancetype and version fields if not set within 
> Service instance to avoid NPE (GitHub PR 
> [#202|https://github.com/splunk/splunk-sdk-java/pull/202])
>  * Check for local IP as alternative to _localhost_ within HostnameVerifier, 
> addressing issue with certain local workflows
>  * Added null check for child to handle error when no value is passed for a 
> parameter in modular-inputs (Ref issue 
> [#198|https://github.com/splunk/splunk-sdk-java/issues/198] & GitHub PR 
> [#199|https://github.com/splunk/splunk-sdk-java/pull/199])
> h3. New Features and APIs
>  * Added feature that allows updating the ACL properties of an entity 
> (GitHub PR [#196|https://github.com/splunk/splunk-sdk-java/pull/196])
> {quote}
>  
> Also removes the execution order dependence in the Splunk unit tests created 
> by their expecting a certain number of records in the _audit index. I believe 
> this order dependence is behind recent, apparently random CI run failures in 
> the Splunk unit tests.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8357) Add new config options to the Splunk storage plugin

2022-12-12 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17646082#comment-17646082
 ] 

ASF GitHub Bot commented on DRILL-8357:
---

jnturton commented on PR #2705:
URL: https://github.com/apache/drill/pull/2705#issuecomment-1346461462

   @kingswanwho please note the backport label I just added to this merged PR 
and also to #2706.




> Add new config options to the Splunk storage plugin
> ---
>
> Key: DRILL-8357
> URL: https://issues.apache.org/jira/browse/DRILL-8357
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Other
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 2.0.0
>
>
> The following five new options can be added to the Splunk storage config.
> {code:java}
>   // Whether the Splunk client validates the server's SSL cert.
>   private final boolean validateCertificates;
>   // The application context of the service.
>   private final String app;
>   // The owner context of the service.
>   private final String owner;
>   // A Splunk authentication token to use for the session.
>   private final String token;
>   // A valid login cookie.
>   private final String cookie;{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8364) Add Support for OAuth Enabled File Systems

2022-12-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17645723#comment-17645723
 ] 

ASF GitHub Bot commented on DRILL-8364:
---

cgivre merged PR #2714:
URL: https://github.com/apache/drill/pull/2714




> Add Support for OAuth Enabled File Systems
> --
>
> Key: DRILL-8364
> URL: https://issues.apache.org/jira/browse/DRILL-8364
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> Currently Drill supports reading from file systems such as HDFS, S3 and 
> others that use token based authentication.  This PR extends Drill's plugin 
> architecture so that Drill can connect with other file systems which use 
> OAuth 2.0 for authentication.
> This PR also adds support for Drill to query Box. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8364) Add Support for OAuth Enabled File Systems

2022-12-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17645699#comment-17645699
 ] 

ASF GitHub Bot commented on DRILL-8364:
---

cgivre commented on code in PR #2714:
URL: https://github.com/apache/drill/pull/2714#discussion_r1045149074


##
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemSchemaFactory.java:
##
@@ -82,13 +83,21 @@ public class FileSystemSchema extends AbstractSchema {
 public FileSystemSchema(String name, SchemaConfig schemaConfig) throws 
IOException {
   super(Collections.emptyList(), name);
   final DrillFileSystem fs = 
ImpersonationUtil.createFileSystem(schemaConfig.getUserName(), 
plugin.getFsConf());
+  // Set OAuth Information
+  OAuthConfig oAuthConfig = plugin.getConfig().oAuthConfig();
+  if (oAuthConfig != null) {
+OAuthEnabledFileSystem underlyingFileSystem = (OAuthEnabledFileSystem) 
fs.getUnderlyingFs();

Review Comment:
   @jnturton Good question.  I think that may be possible, but I don't fully 
understand the file system creation process, and from following the flow, I do 
think it would involve a lot of refactoring. 
   
   On an unrelated note, Hadoop seems to ship with other classes which extend 
`FileSystem` such as FTP, SFTP and a few others. It may be possible for Drill 
to query those by simply adding a few import statements.





> Add Support for OAuth Enabled File Systems
> --
>
> Key: DRILL-8364
> URL: https://issues.apache.org/jira/browse/DRILL-8364
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> Currently Drill supports reading from file systems such as HDFS, S3 and 
> others that use token based authentication.  This PR extends Drill's plugin 
> architecture so that Drill can connect with other file systems which use 
> OAuth 2.0 for authentication.
> This PR also adds support for Drill to query Box. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8369) Add support for querying DeltaLake snapshots by version

2022-12-10 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8369?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17645644#comment-17645644
 ] 

ASF GitHub Bot commented on DRILL-8369:
---

vvysotskyi opened a new pull request, #2718:
URL: https://github.com/apache/drill/pull/2718

   # [DRILL-8369](https://issues.apache.org/jira/browse/DRILL-8369): Add 
support for querying DeltaLake snapshots by version
   
   ## Description
   Added functionality for querying specific Delta Lake data versions.
   
   ## Documentation
   See README.md
   
   ## Testing
   Added UT.
   




> Add support for querying DeltaLake snapshots by version
> ---
>
> Key: DRILL-8369
> URL: https://issues.apache.org/jira/browse/DRILL-8369
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8364) Add Support for OAuth Enabled File Systems

2022-12-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=1764#comment-1764
 ] 

ASF GitHub Bot commented on DRILL-8364:
---

jnturton commented on code in PR #2714:
URL: https://github.com/apache/drill/pull/2714#discussion_r1044992012


##
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemSchemaFactory.java:
##
@@ -82,13 +83,21 @@ public class FileSystemSchema extends AbstractSchema {
 public FileSystemSchema(String name, SchemaConfig schemaConfig) throws 
IOException {
   super(Collections.emptyList(), name);
   final DrillFileSystem fs = 
ImpersonationUtil.createFileSystem(schemaConfig.getUserName(), 
plugin.getFsConf());
+  // Set OAuth Information
+  OAuthConfig oAuthConfig = plugin.getConfig().oAuthConfig();
+  if (oAuthConfig != null) {
+OAuthEnabledFileSystem underlyingFileSystem = (OAuthEnabledFileSystem) 
fs.getUnderlyingFs();

Review Comment:
   Last question from me - would it work out cleaner to make 
OAuthEnabledFileSystem inherit from DrillFileSystem? In particular, could that 
eliminate this new getUnderlyingFs() method? Or would it cause trouble elsewhere?





> Add Support for OAuth Enabled File Systems
> --
>
> Key: DRILL-8364
> URL: https://issues.apache.org/jira/browse/DRILL-8364
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> Currently Drill supports reading from file systems such as HDFS, S3 and 
> others that use token based authentication.  This PR extends Drill's plugin 
> architecture so that Drill can connect with other file systems which use 
> OAuth 2.0 for authentication.
> This PR also adds support for Drill to query Box. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8364) Add Support for OAuth Enabled File Systems

2022-12-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17645553#comment-17645553
 ] 

ASF GitHub Bot commented on DRILL-8364:
---

jnturton commented on code in PR #2714:
URL: https://github.com/apache/drill/pull/2714#discussion_r1044990682


##
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/SeekableByteArrayInputStream.java:
##
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.dfs;
+
+import org.apache.hadoop.fs.PositionedReadable;
+import org.apache.hadoop.fs.Seekable;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+
+public class SeekableByteArrayInputStream extends ByteArrayInputStream 
implements Seekable, PositionedReadable {

Review Comment:
   Okay let's do it that way.





> Add Support for OAuth Enabled File Systems
> --
>
> Key: DRILL-8364
> URL: https://issues.apache.org/jira/browse/DRILL-8364
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> Currently Drill supports reading from file systems such as HDFS, S3 and 
> others that use token based authentication.  This PR extends Drill's plugin 
> architecture so that Drill can connect with other file systems which use 
> OAuth 2.0 for authentication.
> This PR also adds support for Drill to query Box. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8358) Storage plugin for querying other Apache Drill clusters

2022-12-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17645193#comment-17645193
 ] 

ASF GitHub Bot commented on DRILL-8358:
---

vvysotskyi commented on PR #2709:
URL: https://github.com/apache/drill/pull/2709#issuecomment-1344035229

   @cgivre, yes, you can create a plugin in drill1 with the name drill2, and 
query all plugins that drill2 has configured from drill1. So if drill2 has a 
file system plugin called dfs2, the query from drill1 would be the following:
   ```sql
   SELECT *
   FROM drill2.dfs2.ws.`file`
   ```




> Storage plugin for querying other Apache Drill clusters
> ---
>
> Key: DRILL-8358
> URL: https://issues.apache.org/jira/browse/DRILL-8358
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8358) Storage plugin for querying other Apache Drill clusters

2022-12-09 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17645188#comment-17645188
 ] 

ASF GitHub Bot commented on DRILL-8358:
---

vvysotskyi commented on code in PR #2709:
URL: https://github.com/apache/drill/pull/2709#discussion_r1044224231


##
contrib/storage-drill/src/main/java/org/apache/drill/exec/store/drill/plugin/DrillSubScan.java:
##
@@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.drill.plugin;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.PhysicalVisitor;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+
+import java.util.Collections;
+import java.util.Iterator;
+import java.util.List;
+
+@JsonTypeName("drill-read")
+public class DrillSubScan extends AbstractBase implements SubScan {
+
+  public static final String OPERATOR_TYPE = "DRILL_SUB_SCAN";
+
+  private final String query;
+
+  @JsonProperty
+  private final DrillStoragePluginConfig pluginConfig;
+
+  @JsonCreator
+  public DrillSubScan(
+  @JsonProperty("userName") String userName,
+  @JsonProperty("mongoPluginConfig") StoragePluginConfig pluginConfig,

Review Comment:
   No, it isn't, thanks, fixed it.





> Storage plugin for querying other Apache Drill clusters
> ---
>
> Key: DRILL-8358
> URL: https://issues.apache.org/jira/browse/DRILL-8358
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8358) Storage plugin for querying other Apache Drill clusters

2022-12-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644996#comment-17644996
 ] 

ASF GitHub Bot commented on DRILL-8358:
---

cgivre commented on code in PR #2709:
URL: https://github.com/apache/drill/pull/2709#discussion_r1036638344


##
contrib/storage-drill/src/main/java/org/apache/drill/exec/store/drill/plugin/DrillSubScan.java:
##
@@ -0,0 +1,91 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.drill.plugin;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.exec.physical.base.AbstractBase;
+import org.apache.drill.exec.physical.base.PhysicalOperator;
+import org.apache.drill.exec.physical.base.PhysicalVisitor;
+import org.apache.drill.exec.physical.base.SubScan;
+import org.apache.drill.shaded.guava.com.google.common.base.Preconditions;
+
+import java.util.Collections;
+import java.util.Iterator;
+import java.util.List;
+
+@JsonTypeName("drill-read")
+public class DrillSubScan extends AbstractBase implements SubScan {
+
+  public static final String OPERATOR_TYPE = "DRILL_SUB_SCAN";
+
+  private final String query;
+
+  @JsonProperty
+  private final DrillStoragePluginConfig pluginConfig;
+
+  @JsonCreator
+  public DrillSubScan(
+  @JsonProperty("userName") String userName,
+  @JsonProperty("mongoPluginConfig") StoragePluginConfig pluginConfig,

Review Comment:
   Is this supposed to be `mongoPluginConfig`?





> Storage plugin for querying other Apache Drill clusters
> ---
>
> Key: DRILL-8358
> URL: https://issues.apache.org/jira/browse/DRILL-8358
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8358) Storage plugin for querying other Apache Drill clusters

2022-12-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644957#comment-17644957
 ] 

ASF GitHub Bot commented on DRILL-8358:
---

vvysotskyi commented on code in PR #2709:
URL: https://github.com/apache/drill/pull/2709#discussion_r1043755276


##
contrib/storage-drill/src/main/java/org/apache/drill/exec/store/drill/plugin/DrillStoragePluginConfig.java:
##
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.drill.plugin;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.calcite.avatica.ConnectStringParser;
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.common.config.DrillProperties;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.common.logical.security.CredentialsProvider;
+import org.apache.drill.common.logical.security.PlainCredentialsProvider;
+import org.apache.drill.exec.client.DrillClient;
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.rpc.RpcException;
+
+import java.sql.SQLException;
+import java.util.Objects;
+import java.util.Optional;
+import java.util.Properties;
+
+@JsonTypeName(DrillStoragePluginConfig.NAME)
+public class DrillStoragePluginConfig extends StoragePluginConfig {
+  public static final String NAME = "drill";
+  public static final String CONNECTION_STRING_PREFIX = "jdbc:drill:";
+
+  private static final String DEFAULT_QUOTING_IDENTIFIER = "`";
+
+  private final String connection;
+  private final Properties properties;
+
+  @JsonCreator
+  public DrillStoragePluginConfig(
+  @JsonProperty("connection") String connection,
+  @JsonProperty("properties") Properties properties,
+  @JsonProperty("credentialsProvider") CredentialsProvider 
credentialsProvider) {
+super(getCredentialsProvider(credentialsProvider), credentialsProvider == 
null);
+this.connection = connection;
+this.properties = Optional.ofNullable(properties).orElse(new Properties());
+  }
+
+  private DrillStoragePluginConfig(DrillStoragePluginConfig that,
+CredentialsProvider credentialsProvider) {
+super(getCredentialsProvider(credentialsProvider),
+  credentialsProvider == null, that.authMode);
+this.connection = that.connection;
+this.properties = that.properties;
+  }
+
+  @JsonProperty("connection")
+  public String getConnection() {
+return connection;
+  }
+
+  @JsonProperty("properties")
+  public Properties getProperties() {
+return properties;
+  }
+
+  private static CredentialsProvider 
getCredentialsProvider(CredentialsProvider credentialsProvider) {
+return credentialsProvider != null ? credentialsProvider : 
PlainCredentialsProvider.EMPTY_CREDENTIALS_PROVIDER;
+  }
+
+  @JsonIgnore
+  public String getIdentifierQuoteString() {
+return properties.getProperty(DrillProperties.QUOTING_IDENTIFIERS, 
DEFAULT_QUOTING_IDENTIFIER);
+  }
+
+  @Override
+  public DrillStoragePluginConfig updateCredentialProvider(CredentialsProvider 
credentialsProvider) {
+return new DrillStoragePluginConfig(this, credentialsProvider);
+  }
+
+  @JsonIgnore
+  public DrillClient getDrillClient(String userName, BufferAllocator 
allocator) {
+try {
+  String urlSuffix = 
connection.substring(CONNECTION_STRING_PREFIX.length());
+  Properties props = ConnectStringParser.parse(urlSuffix, properties);
+  props.putAll(credentialsProvider.getUserCredentials(userName));

Review Comment:
   Thanks, fixed.
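   Judging from the `@JsonTypeName` and `@JsonProperty` annotations in the 
`DrillStoragePluginConfig` class in the diff above, a storage configuration for 
this plugin would presumably look something like the following sketch (the host, 
port and `enabled` flag here are illustrative values, not taken from the PR):

```json
{
  "type": "drill",
  "connection": "jdbc:drill:drillbit=localhost:31010",
  "properties": {},
  "enabled": true
}
```

   The `connection` value must start with the `jdbc:drill:` prefix, since 
`getDrillClient` strips `CONNECTION_STRING_PREFIX` before handing the remainder 
to `ConnectStringParser.parse`.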





> Storage plugin for querying other Apache Drill clusters
> ---
>
> Key: DRILL-8358
> URL: https://issues.apache.org/jira/browse/DRILL-8358
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>




--
This message was sent by 

[jira] [Commented] (DRILL-8364) Add Support for OAuth Enabled File Systems

2022-12-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644955#comment-17644955
 ] 

ASF GitHub Bot commented on DRILL-8364:
---

cgivre commented on code in PR #2714:
URL: https://github.com/apache/drill/pull/2714#discussion_r1043748748


##
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/SeekableByteArrayInputStream.java:
##
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.dfs;
+
+import org.apache.hadoop.fs.PositionedReadable;
+import org.apache.hadoop.fs.Seekable;
+
+import java.io.ByteArrayInputStream;
+import java.io.IOException;
+
+public class SeekableByteArrayInputStream extends ByteArrayInputStream implements Seekable, PositionedReadable {

Review Comment:
   @jnturton 
   I was able to replace this in the Dropbox reader; however, the Box reader did not work. Since additional work is planned in DRILL-8367, is it OK to leave this as is and fix it in the context of DRILL-8367?
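The diff above shows only the class declaration of `SeekableByteArrayInputStream`, so here is a self-contained sketch of the underlying idea: a byte-array-backed stream that supports absolute seeks. The Hadoop `Seekable` and `PositionedReadable` interfaces are deliberately omitted so the sketch compiles without Hadoop on the classpath; the real class implements them.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;

// Sketch: a byte-array stream with absolute positioning, mirroring the
// contract of Hadoop's Seekable interface. ByteArrayInputStream exposes
// its cursor (pos) and limit (count) as protected fields, which is what
// makes this adapter trivial to write.
class SeekableBytes extends ByteArrayInputStream {
  SeekableBytes(byte[] data) {
    super(data);
  }

  // Seekable#seek: position the next read at an absolute offset.
  public void seek(long target) throws IOException {
    if (target < 0 || target > count) {
      throw new IOException("Seek out of range: " + target);
    }
    this.pos = (int) target;
  }

  // Seekable#getPos: current absolute read position.
  public long getPos() {
    return pos;
  }
}

public class Main {
  public static void main(String[] args) throws IOException {
    SeekableBytes in = new SeekableBytes(new byte[] {10, 20, 30, 40});
    in.seek(2);
    if (in.read() != 30) throw new AssertionError("expected byte 30");
    if (in.getPos() != 3) throw new AssertionError("expected pos 3");
    System.out.println("ok");
  }
}
```

The design cost discussed elsewhere in this thread applies here too: the whole file must already be in memory before the stream is constructed.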





> Add Support for OAuth Enabled File Systems
> --
>
> Key: DRILL-8364
> URL: https://issues.apache.org/jira/browse/DRILL-8364
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> Currently Drill supports reading from file systems such as HDFS, S3 and 
> others that use token based authentication.  This PR extends Drill's plugin 
> architecture so that Drill can connect with other file systems which use 
> OAuth 2.0 for authentication.
> This PR also adds support for Drill to query Box. 





[jira] [Commented] (DRILL-8368) Update Yauaa to 7.9.0

2022-12-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644953#comment-17644953
 ] 

ASF GitHub Bot commented on DRILL-8368:
---

cgivre merged PR #2717:
URL: https://github.com/apache/drill/pull/2717




> Update Yauaa to 7.9.0
> -
>
> Key: DRILL-8368
> URL: https://issues.apache.org/jira/browse/DRILL-8368
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Niels Basjes
>Assignee: Niels Basjes
>Priority: Major
>






[jira] [Commented] (DRILL-8364) Add Support for OAuth Enabled File Systems

2022-12-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644896#comment-17644896
 ] 

ASF GitHub Bot commented on DRILL-8364:
---

cgivre commented on code in PR #2714:
URL: https://github.com/apache/drill/pull/2714#discussion_r1043592053


##
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/BoxFileSystem.java:
##
@@ -0,0 +1,459 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.dfs;
+
+import com.box.sdk.BoxAPIConnection;
+import com.box.sdk.BoxFile;
+import com.box.sdk.BoxFolder;
+import com.box.sdk.BoxFolder.Info;
+import com.box.sdk.BoxItem;
+import com.box.sdk.BoxSearch;
+import com.box.sdk.BoxSearchParameters;
+import com.box.sdk.PartialCollection;
+import org.apache.commons.io.FilenameUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.logical.security.CredentialsProvider;
+import org.apache.drill.exec.oauth.PersistentTokenTable;
+import org.apache.drill.exec.store.security.oauth.OAuthTokenCredentials;
+import org.apache.drill.exec.store.security.oauth.OAuthTokenCredentials.Builder;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.permission.FsPermission;
+import org.apache.hadoop.util.Progressable;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.ByteArrayOutputStream;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.Date;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+
+public class BoxFileSystem extends OAuthEnabledFileSystem {
+
+  private static final Logger logger = LoggerFactory.getLogger(BoxFileSystem.class);
+  private static final String TIMEOUT_DEFAULT = "5000";
+  private static final List<String> SEARCH_CONTENT_TYPES = new ArrayList<>(Collections.singletonList("name"));
+  private Path workingDirectory;
+  private BoxAPIConnection client;
+  private String workingDirectoryID;
+  private BoxFolder rootFolder;
+  private boolean usesDeveloperToken;
+  private final List<String> ancestorFolderIDs = new ArrayList<>();
+  private final Map<String, BoxItem> itemCache = new HashMap<>();
+
+  /**
+   * Returns a URI which identifies this FileSystem.
+   *
+   * @return the URI of this filesystem.
+   */
+  @Override
+  public URI getUri() {
+try {
+  return new URI("box:///");
+} catch (URISyntaxException e) {
+  throw new RuntimeException(e);
+}
+  }
+
+  /**
+   * Opens an FSDataInputStream at the indicated Path.
+   *
+   * @param inputPath the file name to open
+   * @param bufferSize the size of the buffer to be used.
+   * @throws IOException IO failure
+   */
+  @Override
+  public FSDataInputStream open(Path inputPath, int bufferSize) throws IOException {
+client = getClient();
+ByteArrayOutputStream out = new ByteArrayOutputStream();
+
+BoxItem item = getItem(inputPath);
+if (item instanceof BoxFile) {
+  BoxFile file = (BoxFile) getItem(inputPath);

Review Comment:
   Fixed





> Add Support for OAuth Enabled File Systems
> --
>
> Key: DRILL-8364
> URL: https://issues.apache.org/jira/browse/DRILL-8364
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> Currently Drill supports reading from file systems such as HDFS, S3 and 
> others that use token based authentication.  This PR extends Drill's plugin 
> architecture so that Drill can connect with other file systems which use 
> OAuth 2.0 for authentication.
> This PR also adds support for Drill to query Box. 




[jira] [Commented] (DRILL-8364) Add Support for OAuth Enabled File Systems

2022-12-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644834#comment-17644834
 ] 

ASF GitHub Bot commented on DRILL-8364:
---

jnturton commented on code in PR #2714:
URL: https://github.com/apache/drill/pull/2714#discussion_r1042270417


##
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/BoxFileSystem.java:
##
@@ -0,0 +1,459 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.dfs;
+
+import com.box.sdk.BoxAPIConnection;
+import com.box.sdk.BoxFile;
+import com.box.sdk.BoxFolder;
+import com.box.sdk.BoxFolder.Info;
+import com.box.sdk.BoxItem;
+import com.box.sdk.BoxSearch;
+import com.box.sdk.BoxSearchParameters;
+import com.box.sdk.PartialCollection;
+import org.apache.commons.io.FilenameUtils;
+import org.apache.commons.lang3.StringUtils;
+import org.apache.drill.common.exceptions.UserException;
+import org.apache.drill.common.logical.security.CredentialsProvider;
+import org.apache.drill.exec.oauth.PersistentTokenTable;
+import org.apache.drill.exec.store.security.oauth.OAuthTokenCredentials;
+import org.apache.drill.exec.store.security.oauth.OAuthTokenCredentials.Builder;
+import org.apache.hadoop.fs.FSDataInputStream;
+import org.apache.hadoop.fs.FSDataOutputStream;
+import org.apache.hadoop.fs.FileStatus;
+import org.apache.hadoop.fs.Path;
+import org.apache.hadoop.fs.permission.FsPermission;
+import org.apache.hadoop.util.Progressable;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.ByteArrayOutputStream;
+import java.io.FileNotFoundException;
+import java.io.IOException;
+import java.net.URI;
+import java.net.URISyntaxException;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.Collections;
+import java.util.Date;
+import java.util.HashMap;
+import java.util.List;
+import java.util.Map;
+
+
+public class BoxFileSystem extends OAuthEnabledFileSystem {
+
+  private static final Logger logger = LoggerFactory.getLogger(BoxFileSystem.class);
+  private static final String TIMEOUT_DEFAULT = "5000";
+  private static final List<String> SEARCH_CONTENT_TYPES = new ArrayList<>(Collections.singletonList("name"));
+  private Path workingDirectory;
+  private BoxAPIConnection client;
+  private String workingDirectoryID;
+  private BoxFolder rootFolder;
+  private boolean usesDeveloperToken;
+  private final List<String> ancestorFolderIDs = new ArrayList<>();
+  private final Map<String, BoxItem> itemCache = new HashMap<>();
+
+  /**
+   * Returns a URI which identifies this FileSystem.
+   *
+   * @return the URI of this filesystem.
+   */
+  @Override
+  public URI getUri() {
+try {
+  return new URI("box:///");
+} catch (URISyntaxException e) {
+  throw new RuntimeException(e);
+}
+  }
+
+  /**
+   * Opens an FSDataInputStream at the indicated Path.
+   *
+   * @param inputPath the file name to open
+   * @param bufferSize the size of the buffer to be used.
+   * @throws IOException IO failure
+   */
+  @Override
+  public FSDataInputStream open(Path inputPath, int bufferSize) throws IOException {
+client = getClient();
+ByteArrayOutputStream out = new ByteArrayOutputStream();
+
+BoxItem item = getItem(inputPath);
+if (item instanceof BoxFile) {
+  BoxFile file = (BoxFile) getItem(inputPath);
+  updateTokens();
+
+  file.download(out);
+  updateTokens();
+
+      FSDataInputStream fsDataInputStream = new FSDataInputStream(new SeekableByteArrayInputStream(out.toByteArray()));

Review Comment:
   We're buffering query data into heap memory here, something we don't want to do, but I've just created DRILL-8367 so that we can work through all of the places where this is done in a separate exercise.



##
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/SeekableByteArrayInputStream.java:
##
@@ -0,0 +1,84 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the 

[jira] [Commented] (DRILL-8368) Update Yauaa to 7.9.0

2022-12-08 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644723#comment-17644723
 ] 

ASF GitHub Bot commented on DRILL-8368:
---

nielsbasjes opened a new pull request, #2717:
URL: https://github.com/apache/drill/pull/2717

   # [DRILL-8368](https://issues.apache.org/jira/browse/DRILL-8368): Updating 
Yauaa to 7.9.0
   
   ## Description
   
   Updating dependency
   




> Update Yauaa to 7.9.0
> -
>
> Key: DRILL-8368
> URL: https://issues.apache.org/jira/browse/DRILL-8368
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Niels Basjes
>Assignee: Niels Basjes
>Priority: Major
>






[jira] [Commented] (DRILL-8364) Add Support for OAuth Enabled File Systems

2022-12-07 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644345#comment-17644345
 ] 

ASF GitHub Bot commented on DRILL-8364:
---

jnturton commented on PR #2714:
URL: https://github.com/apache/drill/pull/2714#issuecomment-1341006745

   > @jnturton I don't know. Unlike the regular storage plugins, which are 
pretty self-contained, to use a file system, you have to "register" it in the 
`FileSystemPlugin` class.
   
   Okay, let's leave that refactoring for a separate PR.
   




> Add Support for OAuth Enabled File Systems
> --
>
> Key: DRILL-8364
> URL: https://issues.apache.org/jira/browse/DRILL-8364
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> Currently Drill supports reading from file systems such as HDFS, S3 and 
> others that use token based authentication.  This PR extends Drill's plugin 
> architecture so that Drill can connect with other file systems which use 
> OAuth 2.0 for authentication.
> This PR also adds support for Drill to query Box. 





[jira] [Commented] (DRILL-8364) Add Support for OAuth Enabled File Systems

2022-12-07 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644326#comment-17644326
 ] 

ASF GitHub Bot commented on DRILL-8364:
---

cgivre commented on PR #2714:
URL: https://github.com/apache/drill/pull/2714#issuecomment-1340938156

   > 
   
   @jnturton 
   I don't know.  Unlike the regular storage plugins, which are pretty 
self-contained, to use a file system, you have to "register" it in the 
`FileSystemPlugin` class. (See below).   I assumed, perhaps incorrectly, that 
new file systems had to be in the `java-exec` package.  I'm not a packaging 
expert, so do you know what we'd have to do in order to move these to `contrib`?
   
   
https://github.com/apache/drill/blob/53e4227650755607db961b758b7966b1d6d4582f/exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemPlugin.java#L93-L96
   




> Add Support for OAuth Enabled File Systems
> --
>
> Key: DRILL-8364
> URL: https://issues.apache.org/jira/browse/DRILL-8364
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> Currently Drill supports reading from file systems such as HDFS, S3 and 
> others that use token based authentication.  This PR extends Drill's plugin 
> architecture so that Drill can connect with other file systems which use 
> OAuth 2.0 for authentication.
> This PR also adds support for Drill to query Box. 





[jira] [Commented] (DRILL-8364) Add Support for OAuth Enabled File Systems

2022-12-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644154#comment-17644154
 ] 

ASF GitHub Bot commented on DRILL-8364:
---

jnturton commented on PR #2714:
URL: https://github.com/apache/drill/pull/2714#issuecomment-1340417592

   Would it be possible, for the sake of structuring the code base, to move the Box and Dropbox filesystem implementations out to contrib/storage-box and contrib/storage-dropbox modules (while keeping the current Java package organisation)?




> Add Support for OAuth Enabled File Systems
> --
>
> Key: DRILL-8364
> URL: https://issues.apache.org/jira/browse/DRILL-8364
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> Currently Drill supports reading from file systems such as HDFS, S3 and 
> others that use token based authentication.  This PR extends Drill's plugin 
> architecture so that Drill can connect with other file systems which use 
> OAuth 2.0 for authentication.
> This PR also adds support for Drill to query Box. 





[jira] [Commented] (DRILL-8366) Late release of compressor memory in the Parquet writer

2022-12-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17644054#comment-17644054
 ] 

ASF GitHub Bot commented on DRILL-8366:
---

cgivre merged PR #2716:
URL: https://github.com/apache/drill/pull/2716




> Late release of compressor memory in the Parquet writer
> ---
>
> Key: DRILL-8366
> URL: https://issues.apache.org/jira/browse/DRILL-8366
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 1.20.3
>
>
> The Parquet writer waits until the end of the entire write before releasing
> its compression codec factory. The factory in turn releases compressors which
> release direct memory buffers used during compression. This deferred release
> leads to a build-up of direct memory use and can cause large write jobs to fail.
> The Parquet writer can instead release these resources each time that a
> file/row group is flushed.





[jira] [Commented] (DRILL-8359) Add mount and unmount command support to the filesystem plugin

2022-12-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17643997#comment-17643997
 ] 

ASF GitHub Bot commented on DRILL-8359:
---

cgivre commented on PR #2713:
URL: https://github.com/apache/drill/pull/2713#issuecomment-1339798220

   > @cgivre
   > 
   > > I guess my first question is whose permissions will these commands run 
under?
   > > Another thing to think about is making sure that users can't arbitrarily 
add this code somehow to a query.
   > 
   > They'll run under the Drill process user. That user doesn't need much 
access to the OS but it generally will have lots of access, possibly including 
write, to data storage.
   
   My biggest concerns would be that a user could execute malicious commands 
and escalate privileges or access things that they don't have access to.  
However, in order to enable/disable plugins, the user has to be an admin 
anyway, so I think it will be ok.   I'd say we should be ok as long as we 
provide a boot level option to disable it. 
   
   > 
   > > Another thing to think about is making sure that users can't arbitrarily 
add this code somehow to a query.
   > 
   > Can't table functions set format config options but not storage config 
options?
   
   You are correct. 




> Add mount and unmount command support to the filesystem plugin
> --
>
> Key: DRILL-8359
> URL: https://issues.apache.org/jira/browse/DRILL-8359
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 2.0.0
>
>
> This Jira proposes optional mount and unmount commands in the filesystem 
> plugin with the goal of enabling the dynamic definition of filesystem mounts 
> in the storage configuration. It is mainly anticipated that network and cloud
> filesystems that have FUSE drivers (sshfs, davfs, rclone, ...) will be used
> in this way but local device mounts and image/loop device mounts (ISO, IMG,
> squashfs, etc.) might also be of interest. Filesystems that can be mounted in
> this way become queryable by the Drill cluster without the burden of dedicated
> storage plugin development.
> The provided commands are executed in their own processes by the host OS and 
> run under the OS user that is running the Drill JVM. The mount command will 
> be executed when an enabled plugin is initialised (something that is done 
> lazily) and whenever it transitions from disabled to enabled. The provided 
> unmount command will be executed whenever a plugin transitions from enabled 
> to disabled and when the Drillbit shuts down while the plugin has been 
> initialised and is enabled.
> Example using udisks on Linux to mount and unmount an image of an ext4 
> filesystem.
> {code:java}
> {
>   "type" : "file",
>   "connection" : "file:///",
>   "mountCommand" : [ "sh", "-c", "udisksctl loop-setup -f /tmp/test.img && 
> udisksctl mount -b /dev/loop0" ],
>   "unmountCommand" : [ "sh", "-c", "udisksctl unmount -b /dev/loop0 && 
> udisksctl loop-delete -b /dev/loop0" ],
>   "workspaces" : {
>  ...{code}
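The description above explains that the configured mount command is executed in its own OS process under the Drill process user. A minimal sketch of that mechanism (illustrative only, not Drill's actual implementation; the `mountCommand` list here is a harmless stand-in for the udisksctl example):

```java
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class Main {
  public static void main(String[] args) throws IOException, InterruptedException {
    // The storage config supplies the command as an argv list, e.g.
    // ["sh", "-c", "udisksctl loop-setup -f /tmp/test.img && ..."].
    // Here a harmless echo stands in for the real mount command.
    List<String> mountCommand = Arrays.asList("sh", "-c", "echo mounted");

    Process p = new ProcessBuilder(mountCommand)
        .inheritIO()          // the child runs as the Drill process user
        .start();
    int exit = p.waitFor();   // block until the mount command completes

    if (exit != 0) {
      throw new AssertionError("mount command failed with exit code " + exit);
    }
    System.out.println("ok");
  }
}
```

Because the argv list is run verbatim, whoever can edit the storage configuration can run arbitrary commands as the Drill process user, which is exactly the security concern debated later in this thread.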





[jira] [Commented] (DRILL-8366) Late release of compressor memory in the Parquet writer

2022-12-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17643926#comment-17643926
 ] 

ASF GitHub Bot commented on DRILL-8366:
---

jnturton opened a new pull request, #2716:
URL: https://github.com/apache/drill/pull/2716

   # [DRILL-8366](https://issues.apache.org/jira/browse/DRILL-8366): Late 
release of compressor memory in the Parquet writer
   
   ## Description
   
   The Parquet writer waits until the end of the entire write before releasing its compression codec factory. The factory in turn releases compressors which release direct memory buffers used during compression. This deferred release leads to a build-up of direct memory use and can cause large write jobs to fail. The Parquet writer can instead release these resources each time that a file/row group is flushed.
   
   ## Documentation
   N/A
   
   ## Testing
   
   Manually confirm the release of allocated compression buffers after each 
flush in the debug log output.
   Manually monitor memory usage during a big Parquet write job.
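The shape of the fix described in this PR body can be sketched as scoping the codec factory to a single row group rather than to the whole write. The names below are illustrative, not Drill's or Parquet's API; a counter stands in for direct memory buffers.

```java
// Sketch: release compressor resources at each row-group flush instead of
// holding them until the writer closes. CodecFactory here is a stand-in
// whose "buffers" are modeled by a static counter.
class CodecFactory implements AutoCloseable {
  static int liveBuffers = 0;

  CodecFactory() {
    liveBuffers++;            // stand-in for allocating direct buffers
  }

  @Override
  public void close() {
    liveBuffers--;            // stand-in for freeing those buffers
  }
}

public class Main {
  public static void main(String[] args) {
    int rowGroups = 3;
    // Before the fix: one factory held open across all row groups, so
    // buffers accumulate. After: a fresh factory per row group, released
    // when that row group is flushed.
    for (int rg = 0; rg < rowGroups; rg++) {
      try (CodecFactory factory = new CodecFactory()) {
        // ... compress and flush one file/row group ...
      }
      if (CodecFactory.liveBuffers != 0) {
        throw new AssertionError("buffers leaked after flush " + rg);
      }
    }
    System.out.println("ok");
  }
}
```

The peak resource footprint becomes proportional to one row group instead of the whole job, which is what keeps large write jobs from exhausting direct memory.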
   




> Late release of compressor memory in the Parquet writer
> ---
>
> Key: DRILL-8366
> URL: https://issues.apache.org/jira/browse/DRILL-8366
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - Parquet
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 1.20.3
>
>
> The Parquet writer waits until the end of the entire write before releasing
> its compression codec factory. The factory in turn releases compressors which
> release direct memory buffers used during compression. This deferred release
> leads to a build-up of direct memory use and can cause large write jobs to fail.
> The Parquet writer can instead release these resources each time that a
> file/row group is flushed.





[jira] [Commented] (DRILL-8359) Add mount and unmount command support to the filesystem plugin

2022-12-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17643851#comment-17643851
 ] 

ASF GitHub Bot commented on DRILL-8359:
---

jnturton commented on PR #2713:
URL: https://github.com/apache/drill/pull/2713#issuecomment-1339255735

   @cgivre 
   > I guess my first question is whose permissions will these commands run 
under?  
   > Another thing to think about is making sure that users can't arbitrarily 
add this code somehow to a query.
   
   They'll run under the Drill process user. That user doesn't need much access 
to the OS but it generally will have lots of access, possibly including write, 
to data storage.
   
   > Another thing to think about is making sure that users can't arbitrarily 
add this code somehow to a query.
   
   Can't table functions set format config options but not storage config 
options?




> Add mount and unmount command support to the filesystem plugin
> --
>
> Key: DRILL-8359
> URL: https://issues.apache.org/jira/browse/DRILL-8359
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 2.0.0
>
>
> This Jira proposes optional mount and unmount commands in the filesystem 
> plugin with the goal of enabling the dynamic definition of filesystem mounts 
> in the storage configuration. It is mainly anticipated that network and cloud
> filesystems that have FUSE drivers (sshfs, davfs, rclone, ...) will be used
> in this way but local device mounts and image/loop device mounts (ISO, IMG,
> squashfs, etc.) might also be of interest. Filesystems that can be mounted in
> this way become queryable by the Drill cluster without the burden of dedicated
> storage plugin development.
> The provided commands are executed in their own processes by the host OS and 
> run under the OS user that is running the Drill JVM. The mount command will 
> be executed when an enabled plugin is initialised (something that is done 
> lazily) and whenever it transitions from disabled to enabled. The provided 
> unmount command will be executed whenever a plugin transitions from enabled 
> to disabled and when the Drillbit shuts down while the plugin has been 
> initialised and is enabled.
> Example using udisks on Linux to mount and unmount an image of an ext4 
> filesystem.
> {code:java}
> {
>   "type" : "file",
>   "connection" : "file:///",
>   "mountCommand" : [ "sh", "-c", "udisksctl loop-setup -f /tmp/test.img && 
> udisksctl mount -b /dev/loop0" ],
>   "unmountCommand" : [ "sh", "-c", "udisksctl unmount -b /dev/loop0 && 
> udisksctl loop-delete -b /dev/loop0" ],
>   "workspaces" : {
>  ...{code}





[jira] [Commented] (DRILL-8359) Add mount and unmount command support to the filesystem plugin

2022-12-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17643807#comment-17643807
 ] 

ASF GitHub Bot commented on DRILL-8359:
---

cgivre commented on PR #2713:
URL: https://github.com/apache/drill/pull/2713#issuecomment-1339182473

   @jnturton 
   I guess my first question is whose permissions will these commands run 
under?  Are we impersonating the user for them? 
   I do like the idea of adding a boot level option to turn this on and off. 
   
   Another thing to think about is making sure that users can't arbitrarily add 
this code somehow to a query.
   
   IE:
   ```sql
   SELECT * 
   FROM table(dfs.test.something(mountCommand => 'rm -rf *'))
   ```
   
   
   




> Add mount and unmount command support to the filesystem plugin
> --
>
> Key: DRILL-8359
> URL: https://issues.apache.org/jira/browse/DRILL-8359
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 2.0.0
>
>
> This Jira proposes optional mount and unmount commands in the filesystem 
> plugin with the goal of enabling the dynamic definition of filesystem mounts 
> in the storage configuration. It is mainly anticipated that network and cloud
> filesystems that have FUSE drivers (sshfs, davfs, rclone, ...) will be used
> in this way but local device mounts and image/loop device mounts (ISO, IMG,
> squashfs, etc.) might also be of interest. Filesystems that can be mounted in
> this way become queryable by the Drill cluster without the burden of dedicated
> storage plugin development.
> The provided commands are executed in their own processes by the host OS and 
> run under the OS user that is running the Drill JVM. The mount command will 
> be executed when an enabled plugin is initialised (something that is done 
> lazily) and whenever it transitions from disabled to enabled. The provided 
> unmount command will be executed whenever a plugin transitions from enabled 
> to disabled and when the Drillbit shuts down while the plugin has been 
> initialised and is enabled.
> Example using udisks on Linux to mount and unmount an image of an ext4 
> filesystem.
> {code:java}
> {
>   "type" : "file",
>   "connection" : "file:///",
>   "mountCommand" : [ "sh", "-c", "udisksctl loop-setup -f /tmp/test.img && 
> udisksctl mount -b /dev/loop0" ],
>   "unmountCommand" : [ "sh", "-c", "udisksctl unmount -b /dev/loop0 && 
> udisksctl loop-delete -b /dev/loop0" ],
>   "workspaces" : {
>  ...{code}





[jira] [Commented] (DRILL-8359) Add mount and unmount command support to the filesystem plugin

2022-12-06 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17643802#comment-17643802
 ] 

ASF GitHub Bot commented on DRILL-8359:
---

jnturton commented on PR #2713:
URL: https://github.com/apache/drill/pull/2713#issuecomment-1339169730

   @cgivre @vvysotskyi before I come and fix the unit test failures here, what 
is a good road forward as far as security goes?
   
   1. Make this feature disabled by default with a boot option for enabling it 
and add a warning about arbitrary command execution.
   2. Decide that users and applications that want to run mount commands must 
handle that themselves externally and close this PR.




> Add mount and unmount command support to the filesystem plugin
> --
>
> Key: DRILL-8359
> URL: https://issues.apache.org/jira/browse/DRILL-8359
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 2.0.0
>
>
> This Jira proposes optional mount and unmount commands in the filesystem 
> plugin with the goal of enabling the dynamic definition of filesystem mounts 
> in the storage configuration. It is mainly anticipated that network and cloud
> filesystems that have FUSE drivers (sshfs, davfs, rclone, ...) will be used
> in this way but local device mounts and image/loop device mounts (ISO, IMG,
> squashfs, etc.) might also be of interest. Filesystems that can be mounted in
> this way become queryable by the Drill cluster without the burden of dedicated
> storage plugin development.
> The provided commands are executed in their own processes by the host OS and 
> run under the OS user that is running the Drill JVM. The mount command will 
> be executed when an enabled plugin is initialised (something that is done 
> lazily) and whenever it transitions from disabled to enabled. The provided 
> unmount command will be executed whenever a plugin transitions from enabled 
> to disabled and when the Drillbit shuts down while the plugin has been 
> initialised and is enabled.
> Example using udisks on Linux to mount and unmount an image of an ext4 
> filesystem.
> {code:java}
> {
>   "type" : "file",
>   "connection" : "file:///",
>   "mountCommand" : [ "sh", "-c", "udisksctl loop-setup -f /tmp/test.img && 
> udisksctl mount -b /dev/loop0" ],
>   "unmountCommand" : [ "sh", "-c", "udisksctl unmount -b /dev/loop0 && 
> udisksctl loop-delete -b /dev/loop0" ],
>   "workspaces" : {
>  ...{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8365) HTTP Plugin Places Parameters in Wrong Place

2022-12-05 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17643319#comment-17643319
 ] 

ASF GitHub Bot commented on DRILL-8365:
---

cgivre merged PR #2715:
URL: https://github.com/apache/drill/pull/2715




> HTTP Plugin Places Parameters in Wrong Place
> 
>
> Key: DRILL-8365
> URL: https://issues.apache.org/jira/browse/DRILL-8365
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HTTP
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.20.3
>
>
> When the requireTail option is set to true, and pagination is enabled, the 
> HTTP plugin puts the required parameters in the wrong place in the URL.  This 
> PR fixes that.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8365) HTTP Plugin Places Parameters in Wrong Place

2022-12-04 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17643074#comment-17643074
 ] 

ASF GitHub Bot commented on DRILL-8365:
---

cgivre opened a new pull request, #2715:
URL: https://github.com/apache/drill/pull/2715

   # [DRILL-8365](https://issues.apache.org/jira/browse/DRILL-8365): HTTP 
Plugin Places Parameters in Wrong Place
   
   ## Description
   This PR fixes a bug that occurred when a user configured an HTTP plugin with 
the `requireTail` option: the plugin was putting the URL parameters in the 
wrong place when pagination was used.
   
   ## Documentation
   N/A
   
   ## Testing
   Added unit test and ran existing unit tests.




> HTTP Plugin Places Parameters in Wrong Place
> 
>
> Key: DRILL-8365
> URL: https://issues.apache.org/jira/browse/DRILL-8365
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Storage - HTTP
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.20.3
>
>
> When the requireTail option is set to true, and pagination is enabled, the 
> HTTP plugin puts the required parameters in the wrong place in the URL.  This 
> PR fixes that.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8364) Add Support for OAuth Enabled File Systems

2022-12-03 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642943#comment-17642943
 ] 

ASF GitHub Bot commented on DRILL-8364:
---

cgivre opened a new pull request, #2714:
URL: https://github.com/apache/drill/pull/2714

   # [DRILL-8364](https://issues.apache.org/jira/browse/DRILL-8364): Add 
Support for OAuth Enabled File Systems
   
   ## Description
   This PR adds support for Drill to query file systems which use OAuth for 
authorization and authentication.  This PR also adds support for Drill to query 
[Box.com](https://box.com). 
   
   ## Documentation
   See README for Box for documentation. 
   
   ## Testing
   Added unit tests and tested manually.




> Add Support for OAuth Enabled File Systems
> --
>
> Key: DRILL-8364
> URL: https://issues.apache.org/jira/browse/DRILL-8364
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> Currently Drill supports reading from file systems such as HDFS, S3 and 
> others that use token based authentication.  This PR extends Drill's plugin 
> architecture so that Drill can connect with other file systems which use 
> OAuth 2.0 for authentication.
> This PR also adds support for Drill to query Box. 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2022-12-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642681#comment-17642681
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

kmatt commented on PR #2702:
URL: https://github.com/apache/drill/pull/2702#issuecomment-1335803778

   @cgivre @vvysotskyi Thanks, I missed the "will be" clause ;)




> Format plugin for Delta Lake
> 
>
> Key: DRILL-8353
> URL: https://issues.apache.org/jira/browse/DRILL-8353
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.20.2
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: Future
>
>
> Implement format plugin for Delta Lake.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2022-12-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642591#comment-17642591
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

cgivre commented on PR #2702:
URL: https://github.com/apache/drill/pull/2702#issuecomment-1335460790

   @kmatt This hasn't been implemented yet.   That's why the query doesn't yet 
work.  @vvysotskyi is working on that. :-) 




> Format plugin for Delta Lake
> 
>
> Key: DRILL-8353
> URL: https://issues.apache.org/jira/browse/DRILL-8353
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.20.2
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: Future
>
>
> Implement format plugin for Delta Lake.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2022-12-02 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642588#comment-17642588
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

kmatt commented on PR #2702:
URL: https://github.com/apache/drill/pull/2702#issuecomment-1335452201

   The version function seems not to parse:
   
   ```
   apache drill (dfs.delta)> select count(*) from 
table(dfs.delta.`delta_table`(type => 'delta'));
   ++
   | EXPR$0 |
   ++
   | 20 |
   ++
   1 row selected (0.157 seconds)
   
   apache drill (dfs.delta)> SELECT *
   2..semicolon> FROM table(dfs.delta.`delta_table`(type => 
'delta', version => 0));
   Error: VALIDATION ERROR: From line 2, column 22 to line 2, column 75: No 
match found for function signature delta_table(type => , version => 
)
   ```




> Format plugin for Delta Lake
> 
>
> Key: DRILL-8353
> URL: https://issues.apache.org/jira/browse/DRILL-8353
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.20.2
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: Future
>
>
> Implement format plugin for Delta Lake.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2022-12-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642321#comment-17642321
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

vvysotskyi commented on PR #2702:
URL: https://github.com/apache/drill/pull/2702#issuecomment-1334839612

   Hi @kmatt, no, it is not supported yet, but will be added in the near 
future. The version will be specified using the table function. Here is the 
example query for it:
   ```sql
   SELECT *
   FROM table(dfs.delta.`/tmp/delta-table`(type => 'delta', version => 0));
   ```




> Format plugin for Delta Lake
> 
>
> Key: DRILL-8353
> URL: https://issues.apache.org/jira/browse/DRILL-8353
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.20.2
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: Future
>
>
> Implement format plugin for Delta Lake.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2022-12-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642248#comment-17642248
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

kmatt commented on PR #2702:
URL: https://github.com/apache/drill/pull/2702#issuecomment-1334708491

   @vvysotskyi Does this support VERSION AS OF queries?
   
   
https://docs.delta.io/latest/quick-start.html#read-older-versions-of-data-using-time-travel
   
   Ex: `SELECT * FROM dfs.delta.`/tmp/delta-table` VERSION AS OF 0;`
   




> Format plugin for Delta Lake
> 
>
> Key: DRILL-8353
> URL: https://issues.apache.org/jira/browse/DRILL-8353
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.20.2
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: Future
>
>
> Implement format plugin for Delta Lake.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8359) Add mount and unmount command support to the filesystem plugin

2022-12-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642177#comment-17642177
 ] 

ASF GitHub Bot commented on DRILL-8359:
---

cgivre commented on code in PR #2713:
URL: https://github.com/apache/drill/pull/2713#discussion_r1037418363


##
contrib/storage-splunk/pom.xml:
##
@@ -42,7 +42,7 @@
 
   com.splunk
   splunk
-  1.9.1
+  1.9.2

Review Comment:
   Do we want to include the Splunk update on this PR?



##
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemConfig.java:
##
@@ -53,18 +53,23 @@ public class FileSystemConfig extends StoragePluginConfig {
   public static final String NAME = "file";
 
   private final String connection;
+  private final String[] mountCommand, unmountCommand;

Review Comment:
   Is there a reason we're using `String[]` and not an `ArrayList` here?



##
exec/java-exec/src/main/java/org/apache/drill/exec/store/dfs/FileSystemPlugin.java:
##
@@ -282,4 +286,82 @@ public Set 
getOptimizerRules(OptimizerRulesContext optimiz
   public Configuration getFsConf() {
 return new Configuration(fsConf);
   }
+
+  /**
+   * Runs the configured mount command if the mounted flag is unset
+   * @return true if the configured mount command was executed
+   */
+  private synchronized boolean mount() {
+String[] mountCmd = config.getMountCommand();
+if (ArrayUtils.isEmpty(mountCmd)) {
+  return false;
+}
+try {
+  Process proc = Runtime.getRuntime().exec(mountCmd);
+  if (proc.waitFor() != 0) {
+String stderrOutput = IOUtils.toString(proc.getErrorStream(), 
StandardCharsets.UTF_8);
+throw new IOException(stderrOutput);
+  }
+  logger.info("The mount command for plugin {} succeeded.", getName());
+  return true;
+} catch (IOException | InterruptedException e) {
+  logger.error("The mount command for plugin {} failed.", getName(), e);
+  throw UserException.pluginError(e)
+.message("The mount command for plugin %s failed.", getName())
+.build(logger);
+}
+  }
+
+  /**
+   * Runs the configured unmount command if the mounted flag is set
+   * @return true if the configured unmount command was executed
+   */
+  private synchronized boolean unmount() {
+String[] unmountCmd = config.getUnmountCommand();
+if (ArrayUtils.isEmpty(unmountCmd)) {

Review Comment:
   See above comment about `String[]` vs `ArrayList`.





> Add mount and unmount command support to the filesystem plugin
> --
>
> Key: DRILL-8359
> URL: https://issues.apache.org/jira/browse/DRILL-8359
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 2.0.0
>
>
> This Jira proposes optional mount and unmount commands in the filesystem 
> plugin with the goal of enabling the dynamic definition of filesystem mounts 
> in the storage configuration. It is mainly anticipated that network and cloud 
> filesystems that have FUSE drivers (sshfs, davfs, rclone, ...) will be used 
> in this way, but local device mounts and image/loop device mounts (ISO, IMG, 
> squashfs, etc.) might also be of interest. Filesystems that can be mounted in 
> this way become queryable by the Drill cluster without the burden of 
> dedicated storage plugin development.
> The provided commands are executed in their own processes by the host OS and 
> run under the OS user that is running the Drill JVM. The mount command will 
> be executed when an enabled plugin is initialised (something that is done 
> lazily) and whenever it transitions from disabled to enabled. The provided 
> unmount command will be executed whenever a plugin transitions from enabled 
> to disabled and when the Drillbit shuts down while the plugin has been 
> initialised and is enabled.
> Example using udisks on Linux to mount and unmount an image of an ext4 
> filesystem.
> {code:java}
> {
>   "type" : "file",
>   "connection" : "file:///",
>   "mountCommand" : [ "sh", "-c", "udisksctl loop-setup -f /tmp/test.img && 
> udisksctl mount -b /dev/loop0" ],
>   "unmountCommand" : [ "sh", "-c", "udisksctl unmount -b /dev/loop0 && 
> udisksctl loop-delete -b /dev/loop0" ],
>   "workspaces" : {
>  ...{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8359) Add mount and unmount command support to the filesystem plugin

2022-12-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8359?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17642074#comment-17642074
 ] 

ASF GitHub Bot commented on DRILL-8359:
---

jnturton opened a new pull request, #2713:
URL: https://github.com/apache/drill/pull/2713

   # [DRILL-8359](https://issues.apache.org/jira/browse/DRILL-8359): Add mount 
and unmount command support to the filesystem plugin
   
   ## Description
   
   Optional mount and unmount commands are added to the filesystem plugin with 
the goal of enabling the dynamic definition of filesystem mounts in the storage 
configuration. It is mainly anticipated that network and cloud filesystems that 
have FUSE drivers (sshfs, davfs, rclone, ...) will be used in this way, but 
local device mounts and image/loop device mounts (ISO, IMG, squashfs, etc.) 
might also be of interest. Filesystems that can be mounted in this way become 
queryable by Drill without the burden of dedicated storage plugin development.
   
   The provided commands are executed in their own processes by the host OS and 
run under the OS user that is running the Drill JVM. The mount command will be 
executed when an enabled plugin is initialised (something that is done lazily) 
and when it transitions from disabled to enabled. The provided unmount command 
will be executed whenever a plugin transitions from enabled to disabled and 
when the Drillbit shuts down while the plugin is enabled.
   
   Example using udisks on Linux to mount and unmount an image of an ext4 
filesystem.
   ```
   {
 "type" : "file",
 "connection" : "file:///",
 "mountCommand" : [ "sh", "-c", "udisksctl loop-setup -f /tmp/test.img && 
udisksctl mount -b /dev/loop0" ],
 "unmountCommand" : [ "sh", "-c", "udisksctl unmount -b /dev/loop0 && 
udisksctl loop-delete -b /dev/loop0" ],
 "workspaces" : {
...
   ```
   
   ## Documentation
   New sections to be added to the filesystem doc page.
   
   ## Testing
   New unit test TestMountCommand.
   Manual test of mount commands through different sequences of startup, 
enable, disable, shutdown.
   




> Add mount and unmount command support to the filesystem plugin
> --
>
> Key: DRILL-8359
> URL: https://issues.apache.org/jira/browse/DRILL-8359
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - File
>Affects Versions: 1.20.2
>Reporter: James Turton
>Assignee: James Turton
>Priority: Minor
> Fix For: 2.0.0
>
>
> This Jira proposes optional mount and unmount commands in the filesystem 
> plugin with the goal of enabling the dynamic definition of filesystem mounts 
> in the storage configuration. It is mainly anticipated that network and cloud 
> filesystems that have FUSE drivers (sshfs, davfs, rclone, ...) will be used 
> in this way, but local device mounts and image/loop device mounts (ISO, IMG, 
> squashfs, etc.) might also be of interest. Filesystems that can be mounted in 
> this way become queryable by the Drill cluster without the burden of 
> dedicated storage plugin development.
> The provided commands are executed in their own processes by the host OS and 
> run under the OS user that is running the Drill JVM. The mount command will 
> be executed when an enabled plugin is initialised (something that is done 
> lazily) and whenever it transitions from disabled to enabled. The provided 
> unmount command will be executed whenever a plugin transitions from enabled 
> to disabled and when the Drillbit shuts down while the plugin has been 
> initialised and is enabled.
> Example using udisks on Linux to mount and unmount an image of an ext4 
> filesystem.
> {code:java}
> {
>   "type" : "file",
>   "connection" : "file:///",
>   "mountCommand" : [ "sh", "-c", "udisksctl loop-setup -f /tmp/test.img && 
> udisksctl mount -b /dev/loop0" ],
>   "unmountCommand" : [ "sh", "-c", "udisksctl unmount -b /dev/loop0 && 
> udisksctl loop-delete -b /dev/loop0" ],
>   "workspaces" : {
>  ...{code}



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8358) Storage plugin for querying other Apache Drill clusters

2022-12-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641892#comment-17641892
 ] 

ASF GitHub Bot commented on DRILL-8358:
---

jnturton commented on code in PR #2709:
URL: https://github.com/apache/drill/pull/2709#discussion_r1037061461


##
contrib/storage-drill/src/main/java/org/apache/drill/exec/store/drill/plugin/DrillStoragePluginConfig.java:
##
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.drill.plugin;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.calcite.avatica.ConnectStringParser;
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.common.config.DrillProperties;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.common.logical.security.CredentialsProvider;
+import org.apache.drill.common.logical.security.PlainCredentialsProvider;
+import org.apache.drill.exec.client.DrillClient;
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.rpc.RpcException;
+
+import java.sql.SQLException;
+import java.util.Objects;
+import java.util.Optional;
+import java.util.Properties;
+
+@JsonTypeName(DrillStoragePluginConfig.NAME)
+public class DrillStoragePluginConfig extends StoragePluginConfig {
+  public static final String NAME = "drill";
+  public static final String CONNECTION_STRING_PREFIX = "jdbc:drill:";
+
+  private static final String DEFAULT_QUOTING_IDENTIFIER = "`";
+
+  private final String connection;
+  private final Properties properties;
+
+  @JsonCreator
+  public DrillStoragePluginConfig(
+  @JsonProperty("connection") String connection,
+  @JsonProperty("properties") Properties properties,
+  @JsonProperty("credentialsProvider") CredentialsProvider 
credentialsProvider) {
+super(getCredentialsProvider(credentialsProvider), credentialsProvider == 
null);
+this.connection = connection;
+this.properties = Optional.ofNullable(properties).orElse(new Properties());
+  }
+
+  private DrillStoragePluginConfig(DrillStoragePluginConfig that,
+CredentialsProvider credentialsProvider) {
+super(getCredentialsProvider(credentialsProvider),
+  credentialsProvider == null, that.authMode);
+this.connection = that.connection;
+this.properties = that.properties;
+  }
+
+  @JsonProperty("connection")
+  public String getConnection() {
+return connection;
+  }
+
+  @JsonProperty("properties")
+  public Properties getProperties() {
+return properties;
+  }
+
+  private static CredentialsProvider 
getCredentialsProvider(CredentialsProvider credentialsProvider) {
+return credentialsProvider != null ? credentialsProvider : 
PlainCredentialsProvider.EMPTY_CREDENTIALS_PROVIDER;
+  }
+
+  @JsonIgnore
+  public String getIdentifierQuoteString() {
+return properties.getProperty(DrillProperties.QUOTING_IDENTIFIERS, 
DEFAULT_QUOTING_IDENTIFIER);
+  }
+
+  @Override
+  public DrillStoragePluginConfig updateCredentialProvider(CredentialsProvider 
credentialsProvider) {
+return new DrillStoragePluginConfig(this, credentialsProvider);
+  }
+
+  @JsonIgnore
+  public DrillClient getDrillClient(String userName, BufferAllocator 
allocator) {
+try {
+  String urlSuffix = 
connection.substring(CONNECTION_STRING_PREFIX.length());
+  Properties props = ConnectStringParser.parse(urlSuffix, properties);
+  props.putAll(credentialsProvider.getUserCredentials(userName));

Review Comment:
   This getUserCredentials(String username) method is meant to fetch 
per-query-user credentials for plugins that are in user translation auth mode 
while the nullary method getUserCredentials() is meant for shared credentials. 
Only the plain and Vault providers currently support per-user credentials. You 
can see some logic for deciding which to call (via UsernamePasswordCredentials 
objects) in JdbcStorageConfig on line 142.
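
   As a standalone illustration of that dispatch (the names below mirror 
Drill's `CredentialsProvider` API and auth modes, but the classes here are 
hypothetical sketches, not Drill's actual implementations): a plugin in user 
translation auth mode calls the per-query-user overload, while any other mode 
calls the nullary shared-credentials method.

   ```java
   import java.util.Collections;
   import java.util.Map;

   public class CredentialsDispatchSketch {
     // Hypothetical stand-ins for Drill's auth modes and provider interface.
     enum AuthMode { SHARED_USER, USER_TRANSLATION }

     interface CredentialsProvider {
       Map<String, String> getUserCredentials();            // shared credentials
       Map<String, String> getUserCredentials(String user); // per-query-user credentials
     }

     // Chooses which overload to call based on the plugin's auth mode,
     // analogous to the selection logic described above.
     static Map<String, String> resolve(CredentialsProvider provider,
                                        AuthMode mode, String queryUser) {
       return mode == AuthMode.USER_TRANSLATION
           ? provider.getUserCredentials(queryUser)
           : provider.getUserCredentials();
     }

     // Simple map-backed provider, purely for demonstration.
     static CredentialsProvider demoProvider() {
       Map<String, String> shared = Map.of("username", "svc", "password", "sharedpw");
       Map<String, Map<String, String>> perUser =
           Map.of("alice", Map.of("username", "alice", "password", "alicepw"));
       return new CredentialsProvider() {
         @Override public Map<String, String> getUserCredentials() { return shared; }
         @Override public Map<String, String> getUserCredentials(String user) {
           return perUser.getOrDefault(user, Collections.emptyMap());
         }
       };
     }

     public static void main(String[] args) {
       CredentialsProvider p = demoProvider();
       // Shared mode ignores the query user; user translation mode does not.
       System.out.println(resolve(p, AuthMode.SHARED_USER, "alice").get("username"));
       System.out.println(resolve(p, AuthMode.USER_TRANSLATION, "alice").get("username"));
     }
   }
   ```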
   
   Those APIs wound up being a 

[jira] [Commented] (DRILL-8358) Storage plugin for querying other Apache Drill clusters

2022-12-01 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641891#comment-17641891
 ] 

ASF GitHub Bot commented on DRILL-8358:
---

jnturton commented on code in PR #2709:
URL: https://github.com/apache/drill/pull/2709#discussion_r1037061461


##
contrib/storage-drill/src/main/java/org/apache/drill/exec/store/drill/plugin/DrillStoragePluginConfig.java:
##
@@ -0,0 +1,126 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.drill.plugin;
+
+import com.fasterxml.jackson.annotation.JsonCreator;
+import com.fasterxml.jackson.annotation.JsonIgnore;
+import com.fasterxml.jackson.annotation.JsonProperty;
+import com.fasterxml.jackson.annotation.JsonTypeName;
+import org.apache.calcite.avatica.ConnectStringParser;
+import org.apache.drill.common.config.DrillConfig;
+import org.apache.drill.common.config.DrillProperties;
+import org.apache.drill.common.logical.StoragePluginConfig;
+import org.apache.drill.common.logical.security.CredentialsProvider;
+import org.apache.drill.common.logical.security.PlainCredentialsProvider;
+import org.apache.drill.exec.client.DrillClient;
+import org.apache.drill.exec.memory.BufferAllocator;
+import org.apache.drill.exec.rpc.RpcException;
+
+import java.sql.SQLException;
+import java.util.Objects;
+import java.util.Optional;
+import java.util.Properties;
+
+@JsonTypeName(DrillStoragePluginConfig.NAME)
+public class DrillStoragePluginConfig extends StoragePluginConfig {
+  public static final String NAME = "drill";
+  public static final String CONNECTION_STRING_PREFIX = "jdbc:drill:";
+
+  private static final String DEFAULT_QUOTING_IDENTIFIER = "`";
+
+  private final String connection;
+  private final Properties properties;
+
+  @JsonCreator
+  public DrillStoragePluginConfig(
+  @JsonProperty("connection") String connection,
+  @JsonProperty("properties") Properties properties,
+  @JsonProperty("credentialsProvider") CredentialsProvider 
credentialsProvider) {
+super(getCredentialsProvider(credentialsProvider), credentialsProvider == 
null);
+this.connection = connection;
+this.properties = Optional.ofNullable(properties).orElse(new Properties());
+  }
+
+  private DrillStoragePluginConfig(DrillStoragePluginConfig that,
+CredentialsProvider credentialsProvider) {
+super(getCredentialsProvider(credentialsProvider),
+  credentialsProvider == null, that.authMode);
+this.connection = that.connection;
+this.properties = that.properties;
+  }
+
+  @JsonProperty("connection")
+  public String getConnection() {
+return connection;
+  }
+
+  @JsonProperty("properties")
+  public Properties getProperties() {
+return properties;
+  }
+
+  private static CredentialsProvider 
getCredentialsProvider(CredentialsProvider credentialsProvider) {
+return credentialsProvider != null ? credentialsProvider : 
PlainCredentialsProvider.EMPTY_CREDENTIALS_PROVIDER;
+  }
+
+  @JsonIgnore
+  public String getIdentifierQuoteString() {
+return properties.getProperty(DrillProperties.QUOTING_IDENTIFIERS, 
DEFAULT_QUOTING_IDENTIFIER);
+  }
+
+  @Override
+  public DrillStoragePluginConfig updateCredentialProvider(CredentialsProvider 
credentialsProvider) {
+return new DrillStoragePluginConfig(this, credentialsProvider);
+  }
+
+  @JsonIgnore
+  public DrillClient getDrillClient(String userName, BufferAllocator 
allocator) {
+try {
+  String urlSuffix = 
connection.substring(CONNECTION_STRING_PREFIX.length());
+  Properties props = ConnectStringParser.parse(urlSuffix, properties);
+  props.putAll(credentialsProvider.getUserCredentials(userName));

Review Comment:
   This getUserCredentials(String username) method is meant to fetch 
per-query-user credentials for plugins that are in user translation auth mode 
while the nullary method getUserCredentials() is meant for shared credentials. 
Only the plain and Vault providers currently support per-user credentials. You 
can see some logic for deciding which to call (via UsernamePasswordCredentials 
objects) in JdbcStorageConfig on line 142.
   
   Those APIs wound up being a 

[jira] [Commented] (DRILL-8363) upgrade postgresql to 42.4.3 due to security issue

2022-11-30 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641569#comment-17641569
 ] 

ASF GitHub Bot commented on DRILL-8363:
---

cgivre merged PR #2712:
URL: https://github.com/apache/drill/pull/2712




> upgrade postgresql to 42.4.3 due to security issue
> --
>
> Key: DRILL-8363
> URL: https://issues.apache.org/jira/browse/DRILL-8363
> Project: Apache Drill
>  Issue Type: Task
>  Components: Storage - JDBC
>Reporter: PJ Fanning
>Priority: Major
>
> https://github.com/advisories/GHSA-562r-vg33-8x8h



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2022-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17641049#comment-17641049
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

kmatt commented on PR #2702:
URL: https://github.com/apache/drill/pull/2702#issuecomment-1331605371

   On Windows 10 `git clone` fails due to a path length in this patch. Repo 
clones successfully on Debian 11.
   
   ```
   git clone https://github.com/apache/drill.git
   Cloning into 'drill'...
   remote: Enumerating objects: 156537, done.
   remote: Counting objects: 100% (1323/1323), done.
   remote: Compressing objects: 100% (723/723), done.
   remote: Total 156537 (delta 322), reused 1119 (delta 218), pack-reused 155214
   Receiving objects: 100% (156537/156537), 65.97 MiB | 11.24 MiB/s, done.
   
   Resolving deltas: 100% (79075/79075), done.
   fatal: cannot create directory at 
'contrib/format-deltalake/src/test/resources/data-reader-partition-values/as_int=0/as_long=0/as_byte=0/as_short=0/as_boolean=true/as_float=0.0/as_double=0.0/as_string=0/as_string_lit_null=null/as_date=2021-09-08/as_timestamp=2021-09-08
 11%3A11%3A11': Filename too long
   warning: Clone succeeded, but checkout failed.
   You can inspect what was checked out with 'git status'
   and retry with 'git restore --source=HEAD :/'
   ```
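   
   This "Filename too long" failure is Git for Windows hitting the legacy 
Windows path-length limit rather than anything specific to this patch. A 
commonly suggested workaround (an assumption here, not verified against this 
repository's longest paths; it also requires Windows long-path support to be 
enabled at the OS level) is to turn on `core.longpaths` and retry:
   
   ```shell
   # Allow Git for Windows to handle paths longer than the legacy limit.
   # Windows itself may also need LongPathsEnabled set via registry/group policy.
   git config --global core.longpaths true
   
   # Retry the clone, or finish the failed checkout inside the cloned repo:
   git restore --source=HEAD :/
   ```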
   




> Format plugin for Delta Lake
> 
>
> Key: DRILL-8353
> URL: https://issues.apache.org/jira/browse/DRILL-8353
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.20.2
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: Future
>
>
> Implement format plugin for Delta Lake.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8362) upgrade excel-streaming-reader v4.0.5

2022-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17640736#comment-17640736
 ] 

ASF GitHub Bot commented on DRILL-8362:
---

cgivre merged PR #2711:
URL: https://github.com/apache/drill/pull/2711




> upgrade excel-streaming-reader v4.0.5
> -
>
> Key: DRILL-8362
> URL: https://issues.apache.org/jira/browse/DRILL-8362
> Project: Apache Drill
>  Issue Type: Task
>Reporter: PJ Fanning
>Priority: Major
>
> A few small issues have been fixed



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (DRILL-8362) upgrade excel-streaming-reader v4.0.5

2022-11-29 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17640631#comment-17640631
 ] 

ASF GitHub Bot commented on DRILL-8362:
---

pjfanning opened a new pull request, #2711:
URL: https://github.com/apache/drill/pull/2711

   ## Description
   
   A couple of small bugs were fixed. They may not actually affect Drill usage, 
but it is still tidier to upgrade.
   
   ## Documentation
   (Please describe user-visible changes similar to what should appear in the 
Drill documentation.)
   
   ## Testing
   (Please describe how this PR has been tested.)
   




> upgrade excel-streaming-reader v4.0.5
> -
>
> Key: DRILL-8362
> URL: https://issues.apache.org/jira/browse/DRILL-8362
> Project: Apache Drill
>  Issue Type: Task
>Reporter: PJ Fanning
>Priority: Major
>
> A few small issues have been fixed





[jira] [Commented] (DRILL-8360) Add Provided Schema for XML Reader

2022-11-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17640171#comment-17640171
 ] 

ASF GitHub Bot commented on DRILL-8360:
---

cgivre merged PR #2710:
URL: https://github.com/apache/drill/pull/2710




> Add Provided Schema for XML Reader
> --
>
> Key: DRILL-8360
> URL: https://issues.apache.org/jira/browse/DRILL-8360
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Format - XML
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> The XML reader does not support provisioned schema.  This PR adds that 
> support.





[jira] [Commented] (DRILL-8360) Add Provided Schema for XML Reader

2022-11-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17640102#comment-17640102
 ] 

ASF GitHub Bot commented on DRILL-8360:
---

cgivre commented on code in PR #2710:
URL: https://github.com/apache/drill/pull/2710#discussion_r1033690003


##
contrib/format-xml/src/main/java/org/apache/drill/exec/store/xml/XMLReader.java:
##
@@ -428,8 +435,67 @@ private void writeFieldData(String fieldName, String 
fieldValue, TupleWriter wri
   index = writer.addColumn(colSchema);
 }
 ScalarWriter colWriter = writer.scalar(index);
+ColumnMetadata columnMetadata = writer.tupleSchema().metadata(index);
+MinorType dataType = columnMetadata.schema().getType().getMinorType();
+String dateFormat;
+
+// Write the values depending on their data type.  This only applies to 
scalar fields.
 if (fieldValue != null && (currentState != xmlState.ROW_ENDED && 
currentState != xmlState.FIELD_ENDED)) {
-  colWriter.setString(fieldValue);
+  switch (dataType) {
+case BIT:
+  colWriter.setBoolean(Boolean.parseBoolean(fieldValue));
+  break;
+case TINYINT:
+case SMALLINT:
+case INT:
+  colWriter.setInt(Integer.parseInt(fieldValue));
+  break;
+case BIGINT:
+  colWriter.setLong(Long.parseLong(fieldValue));
+  break;
+case FLOAT4:
+case FLOAT8:
+  colWriter.setDouble(Double.parseDouble(fieldValue));
+  break;
+case DATE:
+  dateFormat = columnMetadata.property("drill.format");
+  LocalDate localDate;
+  if (Strings.isNullOrEmpty(dateFormat)) {
+localDate = LocalDate.parse(fieldValue);
+  } else {
+localDate = LocalDate.parse(fieldValue, 
DateTimeFormatter.ofPattern(dateFormat));
+  }
+  colWriter.setDate(localDate);
+  break;
+case TIME:
+  dateFormat = columnMetadata.property("drill.format");
+  LocalTime localTime;
+  if (Strings.isNullOrEmpty(dateFormat)) {
+localTime = LocalTime.parse(fieldValue);
+  } else {
+localTime = LocalTime.parse(fieldValue, 
DateTimeFormatter.ofPattern(dateFormat));
+  }
+  colWriter.setTime(localTime);
+  break;
+case TIMESTAMP:
+  dateFormat = columnMetadata.property("drill.format");
+  Instant timestamp = null;
+  if (Strings.isNullOrEmpty(dateFormat)) {
+timestamp = Instant.parse(fieldValue);
+  } else {
+try {
+  SimpleDateFormat simpleDateFormat = new 
SimpleDateFormat(dateFormat);
+  Date parsedDate = simpleDateFormat.parse(fieldValue);
+  timestamp = Instant.ofEpochMilli(parsedDate.getTime());
+} catch (ParseException e) {

Review Comment:
   @jnturton Thanks for the review comments.  I fixed this here and also in the 
PDF reader.
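The timestamp branch discussed in this review can be exercised in isolation with plain JDK classes. The sketch below is illustrative only, not Drill code: the class and method names (`TimestampParseSketch`, `parseTimestamp`) are invented for the example, and UTC is assumed for the `SimpleDateFormat` path so that the result is deterministic.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.Instant;
import java.util.TimeZone;

public class TimestampParseSketch {

  // Parse a timestamp string: with no pattern, expect ISO-8601 (Instant.parse);
  // with a pattern, use the legacy SimpleDateFormat path as in the diff above.
  static Instant parseTimestamp(String value, String pattern) {
    if (pattern == null || pattern.isEmpty()) {
      return Instant.parse(value);
    }
    try {
      SimpleDateFormat sdf = new SimpleDateFormat(pattern);
      sdf.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: treat input as UTC
      return Instant.ofEpochMilli(sdf.parse(value).getTime());
    } catch (ParseException e) {
      // Fail fast on invalid data rather than swallowing the error.
      throw new IllegalArgumentException("Invalid timestamp: " + value, e);
    }
  }

  public static void main(String[] args) {
    Instant iso = parseTimestamp("2021-09-08T11:11:11Z", null);
    Instant custom = parseTimestamp("2021-09-08 11:11:11", "yyyy-MM-dd HH:mm:ss");
    System.out.println(iso.equals(custom)); // both name the same UTC instant
  }
}
```

Note that `SimpleDateFormat` is lenient and locale-sensitive by default, which is one reason `DateTimeFormatter` is generally preferred for new code.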





> Add Provided Schema for XML Reader
> --
>
> Key: DRILL-8360
> URL: https://issues.apache.org/jira/browse/DRILL-8360
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Format - XML
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> The XML reader does not support provisioned schema.  This PR adds that 
> support.





[jira] [Commented] (DRILL-8360) Add Provided Schema for XML Reader

2022-11-28 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17639904#comment-17639904
 ] 

ASF GitHub Bot commented on DRILL-8360:
---

jnturton commented on PR #2710:
URL: https://github.com/apache/drill/pull/2710#issuecomment-1328783306

   > It seems like we're getting random failures in the Splunk unit tests as 
well.
   
   Were you able to capture any output? The most recent run didn't reveal 
anything.




> Add Provided Schema for XML Reader
> --
>
> Key: DRILL-8360
> URL: https://issues.apache.org/jira/browse/DRILL-8360
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Format - XML
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> The XML reader does not support provisioned schema.  This PR adds that 
> support.





[jira] [Commented] (DRILL-8360) Add Provided Schema for XML Reader

2022-11-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17639797#comment-17639797
 ] 

ASF GitHub Bot commented on DRILL-8360:
---

jnturton commented on code in PR #2710:
URL: https://github.com/apache/drill/pull/2710#discussion_r1033143646


##
contrib/format-xml/src/main/java/org/apache/drill/exec/store/xml/XMLReader.java:
##
@@ -428,8 +435,67 @@ private void writeFieldData(String fieldName, String 
fieldValue, TupleWriter wri
   index = writer.addColumn(colSchema);
 }
 ScalarWriter colWriter = writer.scalar(index);
+ColumnMetadata columnMetadata = writer.tupleSchema().metadata(index);
+MinorType dataType = columnMetadata.schema().getType().getMinorType();
+String dateFormat;
+
+// Write the values depending on their data type.  This only applies to 
scalar fields.
 if (fieldValue != null && (currentState != xmlState.ROW_ENDED && 
currentState != xmlState.FIELD_ENDED)) {
-  colWriter.setString(fieldValue);
+  switch (dataType) {
+case BIT:
+  colWriter.setBoolean(Boolean.parseBoolean(fieldValue));
+  break;
+case TINYINT:
+case SMALLINT:
+case INT:
+  colWriter.setInt(Integer.parseInt(fieldValue));
+  break;
+case BIGINT:
+  colWriter.setLong(Long.parseLong(fieldValue));
+  break;
+case FLOAT4:
+case FLOAT8:
+  colWriter.setDouble(Double.parseDouble(fieldValue));
+  break;
+case DATE:
+  dateFormat = columnMetadata.property("drill.format");
+  LocalDate localDate;
+  if (Strings.isNullOrEmpty(dateFormat)) {
+localDate = LocalDate.parse(fieldValue);
+  } else {
+localDate = LocalDate.parse(fieldValue, 
DateTimeFormatter.ofPattern(dateFormat));
+  }
+  colWriter.setDate(localDate);
+  break;
+case TIME:
+  dateFormat = columnMetadata.property("drill.format");
+  LocalTime localTime;
+  if (Strings.isNullOrEmpty(dateFormat)) {
+localTime = LocalTime.parse(fieldValue);
+  } else {
+localTime = LocalTime.parse(fieldValue, 
DateTimeFormatter.ofPattern(dateFormat));
+  }
+  colWriter.setTime(localTime);
+  break;
+case TIMESTAMP:
+  dateFormat = columnMetadata.property("drill.format");
+  Instant timestamp = null;
+  if (Strings.isNullOrEmpty(dateFormat)) {
+timestamp = Instant.parse(fieldValue);
+  } else {
+try {
+  SimpleDateFormat simpleDateFormat = new 
SimpleDateFormat(dateFormat);
+  Date parsedDate = simpleDateFormat.parse(fieldValue);
+  timestamp = Instant.ofEpochMilli(parsedDate.getTime());
+} catch (ParseException e) {

Review Comment:
   Not only dates and times but numerics and strings when invalid UTF-8 
sequences are encountered! I created 
[DRILL-8361](https://issues.apache.org/jira/browse/DRILL-8361) for this.
   
   I see we slipped up in the PDF plugin and we have two places where invalid 
data swallowing is present. Let's make both of them (XML and PDF) consistent 
with the rest of Drill (do not swallow invalid data) in this PR, and do the 
nulling of invalid data properly in 8361?





> Add Provided Schema for XML Reader
> --
>
> Key: DRILL-8360
> URL: https://issues.apache.org/jira/browse/DRILL-8360
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Format - XML
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> The XML reader does not support provisioned schema.  This PR adds that 
> support.





[jira] [Commented] (DRILL-8360) Add Provided Schema for XML Reader

2022-11-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17639704#comment-17639704
 ] 

ASF GitHub Bot commented on DRILL-8360:
---

cgivre commented on PR #2710:
URL: https://github.com/apache/drill/pull/2710#issuecomment-1328301591

   It seems like we're getting random failures in the Splunk unit tests as well.




> Add Provided Schema for XML Reader
> --
>
> Key: DRILL-8360
> URL: https://issues.apache.org/jira/browse/DRILL-8360
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Format - XML
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> The XML reader does not support provisioned schema.  This PR adds that 
> support.





[jira] [Commented] (DRILL-8360) Add Provided Schema for XML Reader

2022-11-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17639703#comment-17639703
 ] 

ASF GitHub Bot commented on DRILL-8360:
---

cgivre commented on code in PR #2710:
URL: https://github.com/apache/drill/pull/2710#discussion_r1032983292


##
contrib/format-xml/src/main/java/org/apache/drill/exec/store/xml/XMLReader.java:
##
@@ -428,8 +435,67 @@ private void writeFieldData(String fieldName, String 
fieldValue, TupleWriter wri
   index = writer.addColumn(colSchema);
 }
 ScalarWriter colWriter = writer.scalar(index);
+ColumnMetadata columnMetadata = writer.tupleSchema().metadata(index);
+MinorType dataType = columnMetadata.schema().getType().getMinorType();
+String dateFormat;
+
+// Write the values depending on their data type.  This only applies to 
scalar fields.
 if (fieldValue != null && (currentState != xmlState.ROW_ENDED && 
currentState != xmlState.FIELD_ENDED)) {
-  colWriter.setString(fieldValue);
+  switch (dataType) {
+case BIT:
+  colWriter.setBoolean(Boolean.parseBoolean(fieldValue));
+  break;
+case TINYINT:
+case SMALLINT:
+case INT:
+  colWriter.setInt(Integer.parseInt(fieldValue));
+  break;
+case BIGINT:
+  colWriter.setLong(Long.parseLong(fieldValue));
+  break;
+case FLOAT4:
+case FLOAT8:
+  colWriter.setDouble(Double.parseDouble(fieldValue));
+  break;
+case DATE:
+  dateFormat = columnMetadata.property("drill.format");
+  LocalDate localDate;
+  if (Strings.isNullOrEmpty(dateFormat)) {
+localDate = LocalDate.parse(fieldValue);
+  } else {
+localDate = LocalDate.parse(fieldValue, 
DateTimeFormatter.ofPattern(dateFormat));
+  }
+  colWriter.setDate(localDate);
+  break;
+case TIME:
+  dateFormat = columnMetadata.property("drill.format");
+  LocalTime localTime;
+  if (Strings.isNullOrEmpty(dateFormat)) {
+localTime = LocalTime.parse(fieldValue);
+  } else {
+localTime = LocalTime.parse(fieldValue, 
DateTimeFormatter.ofPattern(dateFormat));
+  }
+  colWriter.setTime(localTime);
+  break;
+case TIMESTAMP:
+  dateFormat = columnMetadata.property("drill.format");
+  Instant timestamp = null;
+  if (Strings.isNullOrEmpty(dateFormat)) {
+timestamp = Instant.parse(fieldValue);
+  } else {
+try {
+  SimpleDateFormat simpleDateFormat = new 
SimpleDateFormat(dateFormat);
+  Date parsedDate = simpleDateFormat.parse(fieldValue);
+  timestamp = Instant.ofEpochMilli(parsedDate.getTime());
+} catch (ParseException e) {

Review Comment:
   @jnturton In principle I agree.  The code I used for date parsing was 
borrowed from elsewhere in Drill, so at least the behavior is consistent if not 
ideal.  Should I create another JIRA to introduce a global/session option for 
behavior on date/time casts?





> Add Provided Schema for XML Reader
> --
>
> Key: DRILL-8360
> URL: https://issues.apache.org/jira/browse/DRILL-8360
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Format - XML
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> The XML reader does not support provisioned schema.  This PR adds that 
> support.





[jira] [Commented] (DRILL-8360) Add Provided Schema for XML Reader

2022-11-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17639685#comment-17639685
 ] 

ASF GitHub Bot commented on DRILL-8360:
---

jnturton commented on code in PR #2710:
URL: https://github.com/apache/drill/pull/2710#discussion_r1032980510


##
contrib/format-xml/src/main/java/org/apache/drill/exec/store/xml/XMLReader.java:
##
@@ -428,8 +435,67 @@ private void writeFieldData(String fieldName, String 
fieldValue, TupleWriter wri
   index = writer.addColumn(colSchema);
 }
 ScalarWriter colWriter = writer.scalar(index);
+ColumnMetadata columnMetadata = writer.tupleSchema().metadata(index);
+MinorType dataType = columnMetadata.schema().getType().getMinorType();
+String dateFormat;
+
+// Write the values depending on their data type.  This only applies to 
scalar fields.
 if (fieldValue != null && (currentState != xmlState.ROW_ENDED && 
currentState != xmlState.FIELD_ENDED)) {
-  colWriter.setString(fieldValue);
+  switch (dataType) {
+case BIT:
+  colWriter.setBoolean(Boolean.parseBoolean(fieldValue));
+  break;
+case TINYINT:
+case SMALLINT:
+case INT:
+  colWriter.setInt(Integer.parseInt(fieldValue));
+  break;
+case BIGINT:
+  colWriter.setLong(Long.parseLong(fieldValue));
+  break;
+case FLOAT4:
+case FLOAT8:
+  colWriter.setDouble(Double.parseDouble(fieldValue));
+  break;
+case DATE:
+  dateFormat = columnMetadata.property("drill.format");
+  LocalDate localDate;
+  if (Strings.isNullOrEmpty(dateFormat)) {
+localDate = LocalDate.parse(fieldValue);
+  } else {
+localDate = LocalDate.parse(fieldValue, 
DateTimeFormatter.ofPattern(dateFormat));
+  }
+  colWriter.setDate(localDate);
+  break;
+case TIME:
+  dateFormat = columnMetadata.property("drill.format");
+  LocalTime localTime;
+  if (Strings.isNullOrEmpty(dateFormat)) {
+localTime = LocalTime.parse(fieldValue);
+  } else {
+localTime = LocalTime.parse(fieldValue, 
DateTimeFormatter.ofPattern(dateFormat));
+  }
+  colWriter.setTime(localTime);
+  break;
+case TIMESTAMP:
+  dateFormat = columnMetadata.property("drill.format");
+  Instant timestamp = null;
+  if (Strings.isNullOrEmpty(dateFormat)) {
+timestamp = Instant.parse(fieldValue);
+  } else {
+try {
+  SimpleDateFormat simpleDateFormat = new 
SimpleDateFormat(dateFormat);
+  Date parsedDate = simpleDateFormat.parse(fieldValue);
+  timestamp = Instant.ofEpochMilli(parsedDate.getTime());
+} catch (ParseException e) {

Review Comment:
   Come to think of it, maybe this option is better as a global that is 
overridable at the session level. But until we have such an option, the default 
behaviour so far is that we fail on invalid data, and we shouldn't muddy things 
up.





> Add Provided Schema for XML Reader
> --
>
> Key: DRILL-8360
> URL: https://issues.apache.org/jira/browse/DRILL-8360
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Format - XML
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> The XML reader does not support provisioned schema.  This PR adds that 
> support.





[jira] [Commented] (DRILL-8360) Add Provided Schema for XML Reader

2022-11-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17639678#comment-17639678
 ] 

ASF GitHub Bot commented on DRILL-8360:
---

jnturton commented on code in PR #2710:
URL: https://github.com/apache/drill/pull/2710#discussion_r1032969131


##
contrib/format-xml/src/main/java/org/apache/drill/exec/store/xml/XMLReader.java:
##
@@ -428,8 +435,67 @@ private void writeFieldData(String fieldName, String 
fieldValue, TupleWriter wri
   index = writer.addColumn(colSchema);
 }
 ScalarWriter colWriter = writer.scalar(index);
+ColumnMetadata columnMetadata = writer.tupleSchema().metadata(index);
+MinorType dataType = columnMetadata.schema().getType().getMinorType();
+String dateFormat;
+
+// Write the values depending on their data type.  This only applies to 
scalar fields.
 if (fieldValue != null && (currentState != xmlState.ROW_ENDED && 
currentState != xmlState.FIELD_ENDED)) {
-  colWriter.setString(fieldValue);
+  switch (dataType) {
+case BIT:
+  colWriter.setBoolean(Boolean.parseBoolean(fieldValue));
+  break;
+case TINYINT:
+case SMALLINT:
+case INT:
+  colWriter.setInt(Integer.parseInt(fieldValue));
+  break;
+case BIGINT:
+  colWriter.setLong(Long.parseLong(fieldValue));
+  break;
+case FLOAT4:
+case FLOAT8:
+  colWriter.setDouble(Double.parseDouble(fieldValue));
+  break;
+case DATE:
+  dateFormat = columnMetadata.property("drill.format");
+  LocalDate localDate;
+  if (Strings.isNullOrEmpty(dateFormat)) {
+localDate = LocalDate.parse(fieldValue);
+  } else {
+localDate = LocalDate.parse(fieldValue, 
DateTimeFormatter.ofPattern(dateFormat));
+  }
+  colWriter.setDate(localDate);
+  break;
+case TIME:
+  dateFormat = columnMetadata.property("drill.format");
+  LocalTime localTime;
+  if (Strings.isNullOrEmpty(dateFormat)) {
+localTime = LocalTime.parse(fieldValue);
+  } else {
+localTime = LocalTime.parse(fieldValue, 
DateTimeFormatter.ofPattern(dateFormat));
+  }
+  colWriter.setTime(localTime);
+  break;
+case TIMESTAMP:
+  dateFormat = columnMetadata.property("drill.format");
+  Instant timestamp = null;
+  if (Strings.isNullOrEmpty(dateFormat)) {
+timestamp = Instant.parse(fieldValue);
+  } else {
+try {
+  SimpleDateFormat simpleDateFormat = new 
SimpleDateFormat(dateFormat);
+  Date parsedDate = simpleDateFormat.parse(fieldValue);
+  timestamp = Instant.ofEpochMilli(parsedDate.getTime());
+} catch (ParseException e) {

Review Comment:
   I think we should operate in one of two definite modes controlled by an 
option that we try to standardise across plugins.
   
   1. Invalid data fails hard and fast. Your results will not be silently 
distorted.
   2. Invalid data is swallowed and null is emitted. You know what you're doing 
and you'll provide appropriate logic for the nulls later.
   
   Users that want a default substituted have the simple `ifnull(x,-1)` to 
combine with (2). My opinion is that if we don't have a `nullInvalidData: true` 
option set then we should continue to fail the query on the spot, as we do 
elsewhere.
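The two modes proposed above can be sketched with a plain parser helper. This is a minimal illustration, not existing Drill code: the `nullInvalidData` flag and the `parseIntOrNull` helper are hypothetical names invented for this example.

```java
public class InvalidDataModes {

  // Hypothetical per-reader flag mirroring the proposed nullInvalidData option.
  static Integer parseIntOrNull(String raw, boolean nullInvalidData) {
    try {
      return Integer.parseInt(raw);
    } catch (NumberFormatException e) {
      if (nullInvalidData) {
        return null; // mode 2: swallow invalid data and emit null
      }
      throw e; // mode 1: fail hard and fast, results are never silently distorted
    }
  }

  public static void main(String[] args) {
    System.out.println(parseIntOrNull("42", false));  // prints 42
    System.out.println(parseIntOrNull("oops", true)); // prints null
  }
}
```

A user wanting a substituted default would then combine mode 2 with something like `ifnull(x, -1)` in SQL, as noted above.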





> Add Provided Schema for XML Reader
> --
>
> Key: DRILL-8360
> URL: https://issues.apache.org/jira/browse/DRILL-8360
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Format - XML
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> The XML reader does not support provisioned schema.  This PR adds that 
> support.





[jira] [Commented] (DRILL-8360) Add Provided Schema for XML Reader

2022-11-27 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8360?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17639671#comment-17639671
 ] 

ASF GitHub Bot commented on DRILL-8360:
---

cgivre opened a new pull request, #2710:
URL: https://github.com/apache/drill/pull/2710

   # [DRILL-8360](https://issues.apache.org/jira/browse/DRILL-8360): Add 
Provided Schema for XML Reader
   
   ## Description
   This PR adds several enhancements to XML functionality:
   
   * Allows for provided schema for XML files
   * Allows for provided schema for XML API requests
   * Allows for XML to be used in post bodies.
   
   This PR deprecates the `xmlDataLevel` in the HTTP plugin in favor of a new 
`xmlOptions` class which has the data level.  It does not remove the 
`xmlDataLevel` as that would be a breaking change.

   Also to note, at this time, the provided schema functionality only works 
with scalar data types.  In the future we can add complex type support such as 
lists and maps.
   
   ## Documentation
   Updated appropriate README files.
   
   ## Testing
   Added various unit tests.




> Add Provided Schema for XML Reader
> --
>
> Key: DRILL-8360
> URL: https://issues.apache.org/jira/browse/DRILL-8360
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Format - XML
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> The XML reader does not support provisioned schema.  This PR adds that 
> support.





[jira] [Commented] (DRILL-8358) Storage plugin for querying other Apache Drill clusters

2022-11-22 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8358?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17637588#comment-17637588
 ] 

ASF GitHub Bot commented on DRILL-8358:
---

vvysotskyi opened a new pull request, #2709:
URL: https://github.com/apache/drill/pull/2709

   # [DRILL-8358](https://issues.apache.org/jira/browse/DRILL-8358): Storage 
plugin for querying other Apache Drill clusters
   
   ## Description
   Uses the native client to query other Drill clusters. Added logic to perform 
various pushdowns when possible.
   Fixed the addition of an extra project in the case of star columns.
   Fixed the ignoring of columns with empty names in the Excel format.
   
   ## Documentation
   See README.md
   
   ## Testing
   Tested manually, added UT.
   




> Storage plugin for querying other Apache Drill clusters
> ---
>
> Key: DRILL-8358
> URL: https://issues.apache.org/jira/browse/DRILL-8358
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
>






[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2022-11-21 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636733#comment-17636733
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

cgivre merged PR #2702:
URL: https://github.com/apache/drill/pull/2702




> Format plugin for Delta Lake
> 
>
> Key: DRILL-8353
> URL: https://issues.apache.org/jira/browse/DRILL-8353
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.20.2
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: Future
>
>
> Implement format plugin for Delta Lake.





[jira] [Commented] (DRILL-8353) Format plugin for Delta Lake

2022-11-19 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17636136#comment-17636136
 ] 

ASF GitHub Bot commented on DRILL-8353:
---

vvysotskyi commented on code in PR #2702:
URL: https://github.com/apache/drill/pull/2702#discussion_r1027065550


##
contrib/format-deltalake/src/main/java/org/apache/drill/exec/store/delta/plan/DrillExprToDeltaTranslator.java:
##
@@ -0,0 +1,246 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.store.delta.plan;
+
+import io.delta.standalone.expressions.And;
+import io.delta.standalone.expressions.EqualTo;
+import io.delta.standalone.expressions.Expression;
+import io.delta.standalone.expressions.GreaterThan;
+import io.delta.standalone.expressions.GreaterThanOrEqual;
+import io.delta.standalone.expressions.IsNotNull;
+import io.delta.standalone.expressions.IsNull;
+import io.delta.standalone.expressions.LessThan;
+import io.delta.standalone.expressions.LessThanOrEqual;
+import io.delta.standalone.expressions.Literal;
+import io.delta.standalone.expressions.Not;
+import io.delta.standalone.expressions.Or;
+import io.delta.standalone.expressions.Predicate;
+import io.delta.standalone.types.StructType;
+import org.apache.drill.common.FunctionNames;
+import org.apache.drill.common.expression.FunctionCall;
+import org.apache.drill.common.expression.LogicalExpression;
+import org.apache.drill.common.expression.PathSegment;
+import org.apache.drill.common.expression.SchemaPath;
+import org.apache.drill.common.expression.ValueExpressions;
+import org.apache.drill.common.expression.visitors.AbstractExprVisitor;
+
+public class DrillExprToDeltaTranslator extends 
AbstractExprVisitor<Expression, Void, RuntimeException> {
+
+  private final StructType structType;
+
+  public DrillExprToDeltaTranslator(StructType structType) {
+this.structType = structType;
+  }
+
+  @Override
+  public Expression visitFunctionCall(FunctionCall call, Void value) {
+try {
+  return visitFunctionCall(call);
+} catch (Exception e) {
+  return null;
+}
+  }
+
+  private Predicate visitFunctionCall(FunctionCall call) {
+switch (call.getName()) {
+  case FunctionNames.AND: {
+Expression left = call.arg(0).accept(this, null);
+Expression right = call.arg(1).accept(this, null);
+if (left != null && right != null) {
+  return new And(left, right);
+}
+return null;
+  }
+  case FunctionNames.OR: {
+Expression left = call.arg(0).accept(this, null);
+Expression right = call.arg(1).accept(this, null);
+if (left != null && right != null) {
+  return new Or(left, right);
+}
+return null;
+  }
+  case FunctionNames.NOT: {
+Expression expression = call.arg(0).accept(this, null);
+if (expression != null) {
+  return new Not(expression);
+}
+return null;
+  }
+  case FunctionNames.IS_NULL: {
+LogicalExpression arg = call.arg(0);
+if (arg instanceof SchemaPath) {
+  String name = getPath((SchemaPath) arg);
+  return new IsNull(structType.column(name));
+}
+return null;
+  }
+  case FunctionNames.IS_NOT_NULL: {
+LogicalExpression arg = call.arg(0);
+if (arg instanceof SchemaPath) {
+  String name = getPath((SchemaPath) arg);
+  return new IsNotNull(structType.column(name));
+}
+return null;
+  }
+  case FunctionNames.LT: {
+LogicalExpression nameRef = call.arg(0);
+Expression expression = call.arg(1).accept(this, null);
+if (nameRef instanceof SchemaPath) {
+  String name = getPath((SchemaPath) nameRef);
+  return new LessThan(structType.column(name), expression);
+}
+return null;
+  }
+  case FunctionNames.LE: {
+LogicalExpression nameRef = call.arg(0);
+Expression expression = call.arg(1).accept(this, null);
+if (nameRef instanceof SchemaPath) {
+  String name = getPath((SchemaPath) nameRef);
+  return new LessThanOrEqual(structType.column(name), 

[jira] [Commented] (DRILL-8340) Add Additional Date Manipulation Functions (Part 1)

2022-11-18 Thread ASF GitHub Bot (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-8340?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17635976#comment-17635976
 ] 

ASF GitHub Bot commented on DRILL-8340:
---

cgivre closed pull request #2689: DRILL-8340: Add Additional Date Manipulation 
Functions
URL: https://github.com/apache/drill/pull/2689




> Add Additional Date Manipulation Functions (Part 1)
> ---
>
> Key: DRILL-8340
> URL: https://issues.apache.org/jira/browse/DRILL-8340
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Functions - Drill
>Affects Versions: 1.20.2
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 2.0.0
>
>
> This PR adds several utility functions to facilitate working with dates and 
> times.  These are modeled after the date/time functionality in MySQL.
> Specifically this adds:
> * YEARWEEK():  Returns an int combining year and week, e.g. 202002.
>  * TIME_STAMP():  Converts most anything that looks like a date 
> string into a timestamp.
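The YEARWEEK-style value described above can be approximated with the JDK's ISO week fields. This sketch (class and method names invented here) uses ISO week numbering and ignores MySQL's optional mode argument, whose year-boundary semantics can differ.

```java
import java.time.LocalDate;
import java.time.temporal.IsoFields;

public class YearWeekSketch {

  // Combine the ISO week-based year and week number into a single int,
  // e.g. 2020 week 2 -> 202002, matching the example in the issue.
  static int yearWeek(LocalDate date) {
    int weekYear = date.get(IsoFields.WEEK_BASED_YEAR);
    int week = date.get(IsoFields.WEEK_OF_WEEK_BASED_YEAR);
    return weekYear * 100 + week;
  }

  public static void main(String[] args) {
    System.out.println(yearWeek(LocalDate.of(2020, 1, 8))); // prints 202002
  }
}
```

Near year boundaries the ISO week-based year differs from the calendar year (e.g. 2021-01-01 falls in ISO week 53 of 2020), which is why the week-based year field is used rather than the plain year.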




