[GitHub] drill pull request #672: DRILL-5085: Add / update description for dynamic UD...

2016-11-30 Thread arina-ielchiieva
GitHub user arina-ielchiieva opened a pull request:

https://github.com/apache/drill/pull/672

DRILL-5085: Add / update description for dynamic UDFs directories in …

…drill-env.sh and drill-module.conf

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/arina-ielchiieva/drill DRILL-5085

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/672.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #672


commit 6a67ea1a036054e7c81dc3a3339626dc142a8219
Author: Arina Ielchiieva 
Date:   2016-11-30T15:30:35Z

DRILL-5085: Add / update description for dynamic UDFs directories in 
drill-env.sh and drill-module.conf




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #672: DRILL-5085: Add / update description for dynamic UD...

2016-11-30 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/672#discussion_r90268383
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -45,7 +45,7 @@ drill.client: {
   supports-complex-types: true
 }
 
-// Directory is used as base for temporary storage of Dynamic UDF jars.
+// Directory is used as root for temporary storage of Dynamic UDF jars.
--- End diff --

Where does the user configure the suffix added to this directory to get the 
temp dir for a given cluster? For example, should we have something like:

drill.cluster-temp-dir: "${drill.tmp-dir}/${drill.exec.zk.root}"

Note that tmp itself is a shared resource. If it is in /tmp, then multiple 
clusters will use it. If it is in $DRILL_HOME/tmp, then still multiple clusters 
will use it. Only if it is configured as $DRILL_SITE/tmp can we be guaranteed 
only one cluster will use the directory (because only one cluster can use each 
$DRILL_SITE dir -- that's how $DRILL_SITE is defined.)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #672: DRILL-5085: Add / update description for dynamic UD...

2016-11-30 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/672#discussion_r90266159
  
--- Diff: distribution/src/resources/drill-env.sh ---
@@ -141,4 +141,9 @@
 # Arguments passed to sqlline (the Drill shell) at all times: whether
 # Drill is embedded in Sqlline or not.
 
-#export DRILL_SHELL_JAVA_OPTS="..."
\ No newline at end of file
+#export DRILL_SHELL_JAVA_OPTS="..."
+
+# Location Drill should use for temporary files, such as downloaded 
dynamic UDFs jars.
+# Set to "/tmp" by default.
+#
+# export DRILL_TMP_DIR="..."
--- End diff --

Nit: add a newline after this line.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #672: DRILL-5085: Add / update description for dynamic UD...

2016-11-30 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/672#discussion_r90267374
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -207,11 +207,12 @@ drill.exec: {
   // Set this property if custom absolute root should be used for 
remote directories, ex: root: "/app/drill".
   // root: "",
 
-  // Relative base directory for all udf directories (local and 
remote).
+  // Base directory for all udf directories (local and remote).
   base: ${drill.exec.zk.root}"/udf",
 
-  // Relative path to local udf directory.
-  // Generated temporary directory is used as root unless 
${DRILL_TMP_DIR} or ${drill.tmp-dir} is set.
+  // Relative local udf directory. Drill temporary directory is used 
as root for this directory.
+  // Drill temporary directory is set to generated temporary directory
+  // unless ${DRILL_TMP_DIR} or ${drill.tmp-dir} is set.
--- End diff --

Still not sure this entirely makes sense. This is a local file system 
directory. We describe ow the default is based on DRILL_TMP_DIR, etc. That all 
make sense.

But the actual default value is based on the base directory in the file 
system configured by fs, which might be HDFS.

And, since we specify a value here, the `drill.tmp.dir` will never be used.

Do we really mean:
```
local: "${drill-tmp-dir}/udfs"
```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #672: DRILL-5085: Add / update description for dynamic UD...

2016-12-01 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/672#discussion_r90483802
  
--- Diff: distribution/src/resources/drill-env.sh ---
@@ -141,4 +141,9 @@
 # Arguments passed to sqlline (the Drill shell) at all times: whether
 # Drill is embedded in Sqlline or not.
 
-#export DRILL_SHELL_JAVA_OPTS="..."
\ No newline at end of file
+#export DRILL_SHELL_JAVA_OPTS="..."
+
+# Location Drill should use for temporary files, such as downloaded 
dynamic UDFs jars.
+# Set to "/tmp" by default.
+#
+# export DRILL_TMP_DIR="..."
--- End diff --

Done.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #672: DRILL-5085: Add / update description for dynamic UD...

2016-12-01 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/672#discussion_r90484226
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -207,11 +207,12 @@ drill.exec: {
   // Set this property if custom absolute root should be used for 
remote directories, ex: root: "/app/drill".
   // root: "",
 
-  // Relative base directory for all udf directories (local and 
remote).
+  // Base directory for all udf directories (local and remote).
   base: ${drill.exec.zk.root}"/udf",
 
-  // Relative path to local udf directory.
-  // Generated temporary directory is used as root unless 
${DRILL_TMP_DIR} or ${drill.tmp-dir} is set.
+  // Relative local udf directory. Drill temporary directory is used 
as root for this directory.
+  // Drill temporary directory is set to generated temporary directory
+  // unless ${DRILL_TMP_DIR} or ${drill.tmp-dir} is set.
--- End diff --

Updated to local: "${drill.cluster-tmp-dir}/udf/local".


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #672: DRILL-5085: Add / update description for dynamic UD...

2016-12-02 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/672#discussion_r90688304
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/registry/RemoteFunctionRegistry.java
 ---
@@ -189,6 +188,7 @@ private void prepareStores(PersistentStoreProvider 
storeProvider, ClusterCoordin
* if not set, uses user home directory instead.
*/
   private void prepareAreas(DrillConfig config) {
+logger.info("Preparing three remote udf areas: staging, registry and 
tmp.");
--- End diff --

Although we suggested keeping the file path stuff as-is, the new logging 
messages are great; let's keep them.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #672: DRILL-5085: Add / update description for dynamic UD...

2016-12-02 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/672#discussion_r90687564
  
--- Diff: distribution/src/resources/drill-override-example.conf ---
@@ -170,7 +170,17 @@ drill.exec: {
 threadpool_size: 8,
 decode_threadpool_size: 1
   },
-  debug.error_on_leak: true
+  debug.error_on_leak: true,
+  udf: {
--- End diff --

Comment?

# Settings for Dynamic UDFs. See (link to docs)


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #672: DRILL-5085: Add / update description for dynamic UD...

2016-12-02 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/672#discussion_r90689224
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -45,11 +45,13 @@ drill.client: {
   supports-complex-types: true
 }
 
-// Directory is used as base for temporary storage of Dynamic UDF jars.
-// Set this property if you want to have custom temporary directory, 
instead of generated at runtime.
-// By default ${DRILL_TMP_DIR} is used if set.
-// drill.tmp-dir: "/tmp"
-// drill.tmp-dir: ${?DRILL_TMP_DIR}
+// Location Drill uses for temporary files, such as downloaded dynamic 
UDFs jars.
--- End diff --

Can we leave it that, if blank, it will use the system-generated directory 
as you have in 1.9? This means we'll still need your getTmpDir method. To do 
the check for blank.

Also, as it turns out, we already have a temp dir setting used elsewhere. 
Look in drill-override-example.conf under drill.sys.store.provider.local.path. 
It uses "/tmp/drill". Then it adds subdirs for storage plugins and what-not.

Can we use "/tmp/drill/udf" as the base temp storage location for udfs to 
be consistent? Then we'd add the cluster-id, etc. as in your "cluster-temp-dir" 
below.

And, yes, we have lots of places where we specify the temp file system 
(UDFs, the one cited above, the (unused) cache.hazel, the temp dirs for 
spilling, ...). We do need to create a project (later) to rationalize all of 
this...


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #672: DRILL-5085: Add / update description for dynamic UD...

2016-12-02 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/672#discussion_r90687630
  
--- Diff: distribution/src/resources/drill-override-example.conf ---
@@ -170,7 +170,17 @@ drill.exec: {
 threadpool_size: 8,
 decode_threadpool_size: 1
   },
-  debug.error_on_leak: true
+  debug.error_on_leak: true,
+  udf: {
+retry-attempts: 10,
--- End diff --

Comment to explain this? Just lift wording from the docs.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #672: DRILL-5085: Add / update description for dynamic UD...

2016-12-02 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/672#discussion_r90688097
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java
 ---
@@ -377,14 +374,12 @@ private ScanResult scan(ClassLoader classLoader, Path 
path, URL[] urls) throws I
* Creates local udf directory, if it doesn't exist.
* Checks if local udf directory is a directory and if current 
application has write rights on it.
* Attempts to clean up local udf directory in case jars were left after 
previous drillbit run.
-   * Local udf directory path is concatenated from drill temporary 
directory and ${drill.exec.udf.directory.local}.
*
* @param config drill config
* @return path to local udf directory
*/
   private Path getLocalUdfDir(DrillConfig config) {
-tmpDir = getTmpDir(config);
-File udfDir = new File(tmpDir, 
config.getString(ExecConstants.UDF_DIRECTORY_LOCAL));
+File udfDir = new 
File(config.getString(ExecConstants.UDF_DIRECTORY_LOCAL));
--- End diff --

Although we talked about using the new system you've implemented here, we 
have to consider backward compatibility.

Since the original behavior is already visible to users in Drill 1.8, I 
think we need to leave your original design.

At some point, we'll need to rationalize how Drill handles temp files and 
storage in DFS. But, until then, your 1.8 design is fine.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #672: DRILL-5085: Add / update description for dynamic UD...

2016-12-06 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/672#discussion_r91078578
  
--- Diff: distribution/src/resources/drill-override-example.conf ---
@@ -170,7 +170,17 @@ drill.exec: {
 threadpool_size: 8,
 decode_threadpool_size: 1
   },
-  debug.error_on_leak: true
+  debug.error_on_leak: true,
+  udf: {
+retry-attempts: 10,
--- End diff --

Done.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #672: DRILL-5085: Add / update description for dynamic UD...

2016-12-06 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/672#discussion_r91085335
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java
 ---
@@ -377,14 +374,12 @@ private ScanResult scan(ClassLoader classLoader, Path 
path, URL[] urls) throws I
* Creates local udf directory, if it doesn't exist.
* Checks if local udf directory is a directory and if current 
application has write rights on it.
* Attempts to clean up local udf directory in case jars were left after 
previous drillbit run.
-   * Local udf directory path is concatenated from drill temporary 
directory and ${drill.exec.udf.directory.local}.
*
* @param config drill config
* @return path to local udf directory
*/
   private Path getLocalUdfDir(DrillConfig config) {
-tmpDir = getTmpDir(config);
-File udfDir = new File(tmpDir, 
config.getString(ExecConstants.UDF_DIRECTORY_LOCAL));
+File udfDir = new 
File(config.getString(ExecConstants.UDF_DIRECTORY_LOCAL));
--- End diff --

Ok, reverted usage of generated temporary directory logic.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #672: DRILL-5085: Add / update description for dynamic UD...

2016-12-06 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/672#discussion_r91085385
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/registry/RemoteFunctionRegistry.java
 ---
@@ -189,6 +188,7 @@ private void prepareStores(PersistentStoreProvider 
storeProvider, ClusterCoordin
* if not set, uses user home directory instead.
*/
   private void prepareAreas(DrillConfig config) {
+logger.info("Preparing three remote udf areas: staging, registry and 
tmp.");
--- End diff --

Sure.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #672: DRILL-5085: Add / update description for dynamic UD...

2016-12-06 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/672#discussion_r91085531
  
--- Diff: exec/java-exec/src/main/resources/drill-module.conf ---
@@ -45,11 +45,13 @@ drill.client: {
   supports-complex-types: true
 }
 
-// Directory is used as base for temporary storage of Dynamic UDF jars.
-// Set this property if you want to have custom temporary directory, 
instead of generated at runtime.
-// By default ${DRILL_TMP_DIR} is used if set.
-// drill.tmp-dir: "/tmp"
-// drill.tmp-dir: ${?DRILL_TMP_DIR}
+// Location Drill uses for temporary files, such as downloaded dynamic 
UDFs jars.
--- End diff --

1. Reverted generated directory logic.
2. "/tmp/drill/udf" + cluster-id is not good idea, since local udf 
directory has similar location as remote. And remote directory is generated as 
cluster-id + /udf. So I suggest to keep it as is. Local registry is returned to 
use its previous composition logic.
3. Removed "cluster-temp-dir", we don't need it for now.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #672: DRILL-5085: Add / update description for dynamic UD...

2016-12-06 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/672#discussion_r91077966
  
--- Diff: distribution/src/resources/drill-override-example.conf ---
@@ -170,7 +170,17 @@ drill.exec: {
 threadpool_size: 8,
 decode_threadpool_size: 1
   },
-  debug.error_on_leak: true
+  debug.error_on_leak: true,
+  udf: {
--- End diff --

Done.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #672: DRILL-5085: Add / update description for dynamic UD...

2016-12-12 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/672


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---