[GitHub] [drill] jnturton commented on pull request #2610: DRILL-8268: Fix Hadoop 2 and Netty lib exclusions, REST mem limiter disabled by default

2022-07-23 Thread GitBox


jnturton commented on PR #2610:
URL: https://github.com/apache/drill/pull/2610#issuecomment-1193118355

   @luocooong the main goal of this PR is to solve issues in RC0 of Drill 
1.20.2 and produce RC1. 
   
   Unrelated to that, I can expand a bit more on my reasoning for the commit 
disabling the heap memory usage limiter in the REST server by default. Since we 
stream rather than buffer REST API query results now, we no longer have an 
excuse for runaway heap usage. Further, while preventing the next REST query 
from running can save the Drillbit from an OOM death, when the user complains 
that they cannot run their query an admin is very likely to restart that 
Drillbit _anyway_. So we probably only delayed the inevitable even though the 
landing was made a bit softer and other processing on the Drillbit may have 
been able to continue.
   
   Since now the only things that can cause runaway heap usage (that I can see) 
under a constant load are leaks, unecessary object references in Java's case, I 
think we should bite the bullet and fail hard with OOM when we exhaust the 
heap. The Drillbit can easily be reincarnated upon its death by systemd or 
another supervisor, which is preferable to having it limp on declining queries 
arriving over REST. We also get to stop applying our own heap management rule 
which may interact badly with the GC in use. E.g. if we start turning REST 
queries away at 85% heap usage but 90% heap usage would have stirred the user's 
GC to do a deep clean (in reality I think they all kick in considerably earlier 
but I'm really making a design argument here) then we are working against our 
own JVM. Lastly any ensuing bug reports overtly containing "OutOfMemoryError" 
are more likely, I believe, to send us hunting heap memory leaks.
   
   cc @paul-rogers.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre commented on a diff in pull request #2585: DRILL-8235: Add Storage Plugin for Google Sheets

2022-07-23 Thread GitBox


cgivre commented on code in PR #2585:
URL: https://github.com/apache/drill/pull/2585#discussion_r928189718


##
exec/java-exec/src/main/java/org/apache/drill/exec/store/security/OAuthTokenCredentials.java:
##
@@ -0,0 +1,82 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.security;
+
+import org.apache.drill.common.logical.security.CredentialsProvider;
+
+import java.util.Map;
+
+public class OAuthTokenCredentials {

Review Comment:
   Good catch!  I had been working on this for some time, and I think I created 
that before we merged the final version of OAuth support.  In any event, the 
duplicate class has been removed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre commented on a diff in pull request #2585: DRILL-8235: Add Storage Plugin for Google Sheets

2022-07-23 Thread GitBox


cgivre commented on code in PR #2585:
URL: https://github.com/apache/drill/pull/2585#discussion_r928190871


##
contrib/storage-googlesheets/src/main/java/org/apache/drill/exec/store/googlesheets/columns/GoogleSheetsDateColumnWriter.java:
##
@@ -0,0 +1,45 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.googlesheets.columns;
+
+import org.apache.commons.lang3.StringUtils;
+import org.apache.drill.exec.physical.resultSet.RowSetLoader;
+
+import java.time.LocalDate;
+
+public class GoogleSheetsDateColumnWriter extends GoogleSheetsColumnWriter {

Review Comment:
   Done!
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre commented on a diff in pull request #2585: DRILL-8235: Add Storage Plugin for Google Sheets

2022-07-23 Thread GitBox


cgivre commented on code in PR #2585:
URL: https://github.com/apache/drill/pull/2585#discussion_r928194017


##
contrib/storage-googlesheets/src/test/java/org/apache/drill/exec/store/googlesheets/TestGoogleSheetsWriter.java:
##
@@ -0,0 +1,135 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.drill.exec.store.googlesheets;
+
+import com.fasterxml.jackson.databind.ObjectMapper;
+import org.apache.drill.categories.RowSetTest;
+import org.apache.drill.common.util.DrillFileUtils;
+import org.apache.drill.exec.oauth.PersistentTokenTable;
+import org.apache.drill.exec.physical.rowSet.RowSet;
+import org.apache.drill.exec.store.StoragePluginRegistry;
+import org.apache.drill.exec.store.StoragePluginRegistry.PluginException;
+import org.apache.drill.shaded.guava.com.google.common.base.Charsets;
+import org.apache.drill.shaded.guava.com.google.common.io.Files;
+import org.apache.drill.test.ClusterFixture;
+import org.apache.drill.test.ClusterTest;
+import org.apache.drill.test.QueryBuilder.QuerySummary;
+import org.junit.BeforeClass;
+import org.junit.Ignore;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+
+import java.nio.file.Paths;
+import java.util.ArrayList;
+import java.util.Arrays;
+import java.util.List;
+import java.util.Map;
+
+import static org.junit.Assert.assertTrue;
+import static org.junit.Assert.fail;
+
+@Category(RowSetTest.class)
+@Ignore("These tests require a live Google Sheets connection.  Please run 
manually.")
+public class TestGoogleSheetsWriter extends ClusterTest {
+  private static final String AUTH_URI = 
"https://accounts.google.com/o/oauth2/auth";;
+  private static final String TOKEN_URI = 
"https://oauth2.googleapis.com/token";;
+  private static final List REDIRECT_URI = new 
ArrayList<>(Arrays.asList("urn:ietf:wg:oauth:2.0:oob", "http://localhost";));
+
+  private static StoragePluginRegistry pluginRegistry;
+  private static String accessToken;
+  private static String refreshToken;
+
+  @BeforeClass
+  public static void init() throws Exception {
+ClusterTest.startCluster(ClusterFixture.builder(dirTestWatcher));
+dirTestWatcher.copyResourceToRoot(Paths.get(""));
+
+String oauthJson = 
Files.asCharSource(DrillFileUtils.getResourceAsFile("/tokens/oauth_tokens.json"),
 Charsets.UTF_8).read();
+
+ObjectMapper mapper = new ObjectMapper();
+Map tokenMap = mapper.readValue(oauthJson, Map.class);
+
+String clientID = tokenMap.get("client_id");
+String clientSecret = tokenMap.get("client_secret");
+accessToken = tokenMap.get("access_token");
+refreshToken = tokenMap.get("refresh_token");
+
+pluginRegistry = cluster.drillbit().getContext().getStorage();
+GoogleSheetsStoragePluginConfig config = 
GoogleSheetsStoragePluginConfig.builder()
+  .clientID(clientID)
+  .clientSecret(clientSecret)
+  .redirectUris(REDIRECT_URI)
+  .authUri(AUTH_URI)
+  .tokenUri(TOKEN_URI)
+  .build();
+
+config.setEnabled(true);
+pluginRegistry.validatedPut("googlesheets", config);
+  }
+
+  @Test
+  public void testBasicCTAS() throws Exception {
+try {
+  initializeTokens();
+} catch (PluginException e) {
+  fail(e.getMessage());
+}
+
+String query = "CREATE TABLE googlesheets.`test_sheet`.`test_table` (ID, 
NAME) AS " +
+  "SELECT * FROM (VALUES(1,2), (3,4))";
+// Create the table and insert the values
+QuerySummary insertResults = queryBuilder().sql(query).run();
+assertTrue(insertResults.succeeded());
+  }
+
+  @Test
+  public void testCTASFromFile() throws Exception {
+try {
+  initializeTokens();
+} catch (PluginException e) {
+  fail(e.getMessage());
+}
+
+/*String query = "CREATE TABLE googlesheets.`test_sheet`.`test_table` (ID, 
NAME) AS " +
+  "SELECT * FROM (VALUES(1,2), (3,4))";*/
+   String sql = "SELECT * FROM table(cp.`data/Drill_Test_Data.xlsx` (type => 
'excel', sheetName => 'MixedSheet'))";

Review Comment:
   I removed this and added a comment about it.  With the current 
implementation, when you execute a CTAS query, you do so with the desired file 
name. IE:
   
   ```sql
   CREATE TABLE googlesheets.myfile AS
   SELECT 
   F

[GitHub] [drill] cgivre commented on a diff in pull request #2585: DRILL-8235: Add Storage Plugin for Google Sheets

2022-07-23 Thread GitBox


cgivre commented on code in PR #2585:
URL: https://github.com/apache/drill/pull/2585#discussion_r928194222


##
logical/src/main/java/org/apache/drill/common/logical/StoragePluginConfig.java:
##
@@ -33,6 +34,7 @@
 
 @JsonTypeInfo(use = JsonTypeInfo.Id.NAME, property = "type")
 @JsonInclude(JsonInclude.Include.NON_DEFAULT)
+@JsonFormat(with = JsonFormat.Feature.ACCEPT_CASE_INSENSITIVE_PROPERTIES)

Review Comment:
   I can add this to the generic FormatPlugin super class.  Should that be a 
separate PR? 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [drill] cgivre commented on pull request #2585: DRILL-8235: Add Storage Plugin for Google Sheets

2022-07-23 Thread GitBox


cgivre commented on PR #2585:
URL: https://github.com/apache/drill/pull/2585#issuecomment-1193245110

   @jnturton Thank you very much for your review of this beast.  I believe I 
have addressed all your comments, as well as fixed the final bugs with the 
writer.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@drill.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org