[GitHub] drill pull request #701: DRILL-4963: Fix issues with dynamically loaded over...

2017-03-02 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/701


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #701: DRILL-4963: Fix issues with dynamically loaded over...

2017-02-24 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/701#discussion_r102921712
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java
 ---
@@ -260,76 +293,101 @@ public RemoteFunctionRegistry 
getRemoteFunctionRegistry() {
   }
 
   /**
-   * Attempts to load and register functions from remote function registry.
-   * First checks if there is no missing jars.
-   * If yes, enters synchronized block to prevent other loading the same 
jars.
-   * Again re-checks if there are no missing jars in case someone has 
already loaded them (double-check lock).
-   * If there are still missing jars, first copies jars to local udf area 
and prepares {@link JarScan} for each jar.
-   * Jar registration timestamp represented in milliseconds is used as 
suffix.
-   * Then registers all jars at the same time. Returns true when finished.
-   * In case if any errors during jars coping or registration, logs errors 
and proceeds.
+   * Purpose of this method is to synchronize remote and local function 
registries if needed
+   * and to inform if function registry was changed after given version.
*
-   * If no missing jars are found, checks current local registry version.
-   * Returns false if versions match, true otherwise.
+   * To make synchronization as much light-weigh as possible, first only 
versions of both registries are checked
+   * without any locking. If synchronization is needed, enters 
synchronized block to prevent others loading the same jars.
+   * The need of synchronization is checked again (double-check lock) 
before comparing jars.
+   * If any missing jars are found, they are downloaded to local udf area, 
each is wrapped into {@link JarScan}.
+   * Once jar download is finished, all missing jars are registered in one 
batch.
+   * In case if any errors during jars download / registration, these 
errors are logged.
*
-   * @param version local function registry version
-   * @return true if new jars were registered or local function registry 
version is different, false otherwise
+   * During registration local function registry is updated with remote 
function registry version it is synced with.
+   * When at least one jar of the missing jars failed to download / 
register,
+   * local function registry version are not updated but jars that where 
successfully downloaded / registered
+   * are added to local function registry.
+   *
+   * If synchronization between remote and local function registry was not 
needed,
+   * checks if given registry version matches latest sync version
+   * to inform if function registry was changed after given version.
+   *
+   * @param version remote function registry local function registry was 
based on
+   * @return true if remote and local function registries were 
synchronized after given version
*/
-  public boolean loadRemoteFunctions(long version) {
-List missingJars = getMissingJars(remoteFunctionRegistry, 
localFunctionRegistry);
-if (!missingJars.isEmpty()) {
+  public boolean syncWithRemoteRegistry(long version) {
+if 
(doSyncFunctionRegistries(remoteFunctionRegistry.getRegistryVersion(), 
localFunctionRegistry.getVersion())) {
   synchronized (this) {
-missingJars = getMissingJars(remoteFunctionRegistry, 
localFunctionRegistry);
-if (!missingJars.isEmpty()) {
-  logger.info("Starting dynamic UDFs lazy-init process.\n" +
-  "The following jars are going to be downloaded and 
registered locally: " + missingJars);
+long localRegistryVersion = localFunctionRegistry.getVersion();
+if 
(doSyncFunctionRegistries(remoteFunctionRegistry.getRegistryVersion(), 
localRegistryVersion))  {
+  DataChangeVersion remoteVersion = new DataChangeVersion();
+  List missingJars = 
getMissingJars(this.remoteFunctionRegistry, localFunctionRegistry, 
remoteVersion);
   List jars = Lists.newArrayList();
-  for (String jarName : missingJars) {
-Path binary = null;
-Path source = null;
-URLClassLoader classLoader = null;
-try {
-  binary = copyJarToLocal(jarName, remoteFunctionRegistry);
-  source = copyJarToLocal(JarUtil.getSourceName(jarName), 
remoteFunctionRegistry);
-  URL[] urls = {binary.toUri().toURL(), 
source.toUri().toURL()};
-  classLoader = new URLClassLoader(urls);
-  ScanResult scanResult = scan(classLoader, binary, urls);
-  localFunctionRegistry.validate(jarName, scanResult);
-  jars.add(new JarScan(jarName, scanResult, classLoader));
-} catch

[GitHub] drill pull request #701: DRILL-4963: Fix issues with dynamically loaded over...

2017-02-24 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/701#discussion_r102919942
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java
 ---
@@ -260,76 +293,101 @@ public RemoteFunctionRegistry 
getRemoteFunctionRegistry() {
   }
 
   /**
-   * Attempts to load and register functions from remote function registry.
-   * First checks if there is no missing jars.
-   * If yes, enters synchronized block to prevent other loading the same 
jars.
-   * Again re-checks if there are no missing jars in case someone has 
already loaded them (double-check lock).
-   * If there are still missing jars, first copies jars to local udf area 
and prepares {@link JarScan} for each jar.
-   * Jar registration timestamp represented in milliseconds is used as 
suffix.
-   * Then registers all jars at the same time. Returns true when finished.
-   * In case if any errors during jars coping or registration, logs errors 
and proceeds.
+   * Purpose of this method is to synchronize remote and local function 
registries if needed
+   * and to inform if function registry was changed after given version.
*
-   * If no missing jars are found, checks current local registry version.
-   * Returns false if versions match, true otherwise.
+   * To make synchronization as much light-weigh as possible, first only 
versions of both registries are checked
+   * without any locking. If synchronization is needed, enters 
synchronized block to prevent others loading the same jars.
+   * The need of synchronization is checked again (double-check lock) 
before comparing jars.
+   * If any missing jars are found, they are downloaded to local udf area, 
each is wrapped into {@link JarScan}.
+   * Once jar download is finished, all missing jars are registered in one 
batch.
+   * In case if any errors during jars download / registration, these 
errors are logged.
*
-   * @param version local function registry version
-   * @return true if new jars were registered or local function registry 
version is different, false otherwise
+   * During registration local function registry is updated with remote 
function registry version it is synced with.
+   * When at least one jar of the missing jars failed to download / 
register,
+   * local function registry version are not updated but jars that where 
successfully downloaded / registered
+   * are added to local function registry.
+   *
+   * If synchronization between remote and local function registry was not 
needed,
+   * checks if given registry version matches latest sync version
+   * to inform if function registry was changed after given version.
+   *
+   * @param version remote function registry local function registry was 
based on
+   * @return true if remote and local function registries were 
synchronized after given version
*/
-  public boolean loadRemoteFunctions(long version) {
-List missingJars = getMissingJars(remoteFunctionRegistry, 
localFunctionRegistry);
-if (!missingJars.isEmpty()) {
+  public boolean syncWithRemoteRegistry(long version) {
+if 
(doSyncFunctionRegistries(remoteFunctionRegistry.getRegistryVersion(), 
localFunctionRegistry.getVersion())) {
   synchronized (this) {
-missingJars = getMissingJars(remoteFunctionRegistry, 
localFunctionRegistry);
-if (!missingJars.isEmpty()) {
-  logger.info("Starting dynamic UDFs lazy-init process.\n" +
-  "The following jars are going to be downloaded and 
registered locally: " + missingJars);
+long localRegistryVersion = localFunctionRegistry.getVersion();
+if 
(doSyncFunctionRegistries(remoteFunctionRegistry.getRegistryVersion(), 
localRegistryVersion))  {
+  DataChangeVersion remoteVersion = new DataChangeVersion();
+  List missingJars = 
getMissingJars(this.remoteFunctionRegistry, localFunctionRegistry, 
remoteVersion);
   List jars = Lists.newArrayList();
-  for (String jarName : missingJars) {
-Path binary = null;
-Path source = null;
-URLClassLoader classLoader = null;
-try {
-  binary = copyJarToLocal(jarName, remoteFunctionRegistry);
-  source = copyJarToLocal(JarUtil.getSourceName(jarName), 
remoteFunctionRegistry);
-  URL[] urls = {binary.toUri().toURL(), 
source.toUri().toURL()};
-  classLoader = new URLClassLoader(urls);
-  ScanResult scanResult = scan(classLoader, binary, urls);
-  localFunctionRegistry.validate(jarName, scanResult);
-  jars.add(new JarScan(jarName, scanResult, classLoader));
-} catch

[GitHub] drill pull request #701: DRILL-4963: Fix issues with dynamically loaded over...

2017-02-24 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/701#discussion_r102920190
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java
 ---
@@ -260,76 +293,101 @@ public RemoteFunctionRegistry 
getRemoteFunctionRegistry() {
   }
 
   /**
-   * Attempts to load and register functions from remote function registry.
-   * First checks if there is no missing jars.
-   * If yes, enters synchronized block to prevent other loading the same 
jars.
-   * Again re-checks if there are no missing jars in case someone has 
already loaded them (double-check lock).
-   * If there are still missing jars, first copies jars to local udf area 
and prepares {@link JarScan} for each jar.
-   * Jar registration timestamp represented in milliseconds is used as 
suffix.
-   * Then registers all jars at the same time. Returns true when finished.
-   * In case if any errors during jars coping or registration, logs errors 
and proceeds.
+   * Purpose of this method is to synchronize remote and local function 
registries if needed
+   * and to inform if function registry was changed after given version.
*
-   * If no missing jars are found, checks current local registry version.
-   * Returns false if versions match, true otherwise.
+   * To make synchronization as much light-weigh as possible, first only 
versions of both registries are checked
+   * without any locking. If synchronization is needed, enters 
synchronized block to prevent others loading the same jars.
+   * The need of synchronization is checked again (double-check lock) 
before comparing jars.
+   * If any missing jars are found, they are downloaded to local udf area, 
each is wrapped into {@link JarScan}.
+   * Once jar download is finished, all missing jars are registered in one 
batch.
+   * In case if any errors during jars download / registration, these 
errors are logged.
*
-   * @param version local function registry version
-   * @return true if new jars were registered or local function registry 
version is different, false otherwise
+   * During registration local function registry is updated with remote 
function registry version it is synced with.
+   * When at least one jar of the missing jars failed to download / 
register,
+   * local function registry version are not updated but jars that where 
successfully downloaded / registered
+   * are added to local function registry.
+   *
+   * If synchronization between remote and local function registry was not 
needed,
+   * checks if given registry version matches latest sync version
+   * to inform if function registry was changed after given version.
+   *
+   * @param version remote function registry local function registry was 
based on
+   * @return true if remote and local function registries were 
synchronized after given version
*/
-  public boolean loadRemoteFunctions(long version) {
-List missingJars = getMissingJars(remoteFunctionRegistry, 
localFunctionRegistry);
-if (!missingJars.isEmpty()) {
+  public boolean syncWithRemoteRegistry(long version) {
+if 
(doSyncFunctionRegistries(remoteFunctionRegistry.getRegistryVersion(), 
localFunctionRegistry.getVersion())) {
   synchronized (this) {
-missingJars = getMissingJars(remoteFunctionRegistry, 
localFunctionRegistry);
-if (!missingJars.isEmpty()) {
-  logger.info("Starting dynamic UDFs lazy-init process.\n" +
-  "The following jars are going to be downloaded and 
registered locally: " + missingJars);
+long localRegistryVersion = localFunctionRegistry.getVersion();
+if 
(doSyncFunctionRegistries(remoteFunctionRegistry.getRegistryVersion(), 
localRegistryVersion))  {
+  DataChangeVersion remoteVersion = new DataChangeVersion();
+  List missingJars = 
getMissingJars(this.remoteFunctionRegistry, localFunctionRegistry, 
remoteVersion);
   List jars = Lists.newArrayList();
-  for (String jarName : missingJars) {
-Path binary = null;
-Path source = null;
-URLClassLoader classLoader = null;
-try {
-  binary = copyJarToLocal(jarName, remoteFunctionRegistry);
-  source = copyJarToLocal(JarUtil.getSourceName(jarName), 
remoteFunctionRegistry);
-  URL[] urls = {binary.toUri().toURL(), 
source.toUri().toURL()};
-  classLoader = new URLClassLoader(urls);
-  ScanResult scanResult = scan(classLoader, binary, urls);
-  localFunctionRegistry.validate(jarName, scanResult);
-  jars.add(new JarScan(jarName, scanResult, classLoader));
-} catch

[GitHub] drill pull request #701: DRILL-4963: Fix issues with dynamically loaded over...

2017-02-24 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/701#discussion_r102921207
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java
 ---
@@ -140,27 +142,39 @@ public void register(DrillOperatorTable 
operatorTable) {
   }
 
   /**
-   * Using the given functionResolver
-   * finds Drill function implementation for given 
functionCall.
-   * If function implementation was not found,
-   * loads all missing remote functions and tries to find Drill 
implementation one more time.
+   * First attempts to finds the Drill function implementation that 
matches the name, arg types and return type.
+   * If exact function implementation was not found,
+   * syncs local function registry with remote function registry if needed
+   * and tries to find function implementation one more time
--- End diff --

As agreed, we'll gone leave this implementation till MVCC will be 
implemented. MVCC will ensure we have up-to-date function registry before query 
planning or execution stages.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #701: DRILL-4963: Fix issues with dynamically loaded over...

2017-02-24 Thread arina-ielchiieva
Github user arina-ielchiieva commented on a diff in the pull request:

https://github.com/apache/drill/pull/701#discussion_r102920945
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java
 ---
@@ -50,13 +47,56 @@
   private DrillSqlWorker() {
   }
 
+  /**
+   * Converts sql query string into query physical plan.
+   *
+   * @param context query context
+   * @param sql sql query
+   * @return query physical plan
+   */
   public static PhysicalPlan getPlan(QueryContext context, String sql) 
throws SqlParseException, ValidationException,
   ForemanSetupException {
 return getPlan(context, sql, null);
   }
 
+  /**
+   * Converts sql query string into query physical plan.
+   * In case of any errors (that might occur due to missing function 
implementation),
+   * checks if local function registry should be synchronized with remote 
function registry.
+   * If sync took place, reloads drill operator table
+   * (since functions were added to / removed from local function registry)
+   * and attempts to converts sql query string into query physical plan 
one more time.
+   *
+   * @param context query context
+   * @param sql sql query
+   * @param textPlan text plan
+   * @return query physical plan
+   */
   public static PhysicalPlan getPlan(QueryContext context, String sql, 
Pointer textPlan)
   throws ForemanSetupException {
+Pointer textPlanCopy = textPlan == null ? null : new 
Pointer<>(textPlan.value);
+try {
+  return getQueryPlan(context, sql, textPlan);
+} catch (Exception e) {
--- End diff --

Not to be engaged into enumerating possible exceptions (which can added in 
the code later), we are checking if remote and local function registries are in 
sync on any error. Such check may only affect on queries that will fail anyway. 
So if failure time will take a little bit longer, it won't make significant 
difference. But this will be removed when MVCC (multi-version concurrency 
control) will be implemented: when query will fail, we won't check if our 
function registry is up-to-date, since MVCC will ensure we have before query 
startup.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #701: DRILL-4963: Fix issues with dynamically loaded over...

2017-02-06 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/701#discussion_r99649541
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java
 ---
@@ -260,76 +293,101 @@ public RemoteFunctionRegistry 
getRemoteFunctionRegistry() {
   }
 
   /**
-   * Attempts to load and register functions from remote function registry.
-   * First checks if there is no missing jars.
-   * If yes, enters synchronized block to prevent other loading the same 
jars.
-   * Again re-checks if there are no missing jars in case someone has 
already loaded them (double-check lock).
-   * If there are still missing jars, first copies jars to local udf area 
and prepares {@link JarScan} for each jar.
-   * Jar registration timestamp represented in milliseconds is used as 
suffix.
-   * Then registers all jars at the same time. Returns true when finished.
-   * In case if any errors during jars coping or registration, logs errors 
and proceeds.
+   * Purpose of this method is to synchronize remote and local function 
registries if needed
+   * and to inform if function registry was changed after given version.
*
-   * If no missing jars are found, checks current local registry version.
-   * Returns false if versions match, true otherwise.
+   * To make synchronization as much light-weigh as possible, first only 
versions of both registries are checked
+   * without any locking. If synchronization is needed, enters 
synchronized block to prevent others loading the same jars.
+   * The need of synchronization is checked again (double-check lock) 
before comparing jars.
+   * If any missing jars are found, they are downloaded to local udf area, 
each is wrapped into {@link JarScan}.
+   * Once jar download is finished, all missing jars are registered in one 
batch.
+   * In case if any errors during jars download / registration, these 
errors are logged.
*
-   * @param version local function registry version
-   * @return true if new jars were registered or local function registry 
version is different, false otherwise
+   * During registration local function registry is updated with remote 
function registry version it is synced with.
+   * When at least one jar of the missing jars failed to download / 
register,
+   * local function registry version are not updated but jars that where 
successfully downloaded / registered
+   * are added to local function registry.
+   *
+   * If synchronization between remote and local function registry was not 
needed,
+   * checks if given registry version matches latest sync version
+   * to inform if function registry was changed after given version.
+   *
+   * @param version remote function registry local function registry was 
based on
+   * @return true if remote and local function registries were 
synchronized after given version
*/
-  public boolean loadRemoteFunctions(long version) {
-List missingJars = getMissingJars(remoteFunctionRegistry, 
localFunctionRegistry);
-if (!missingJars.isEmpty()) {
+  public boolean syncWithRemoteRegistry(long version) {
+if 
(doSyncFunctionRegistries(remoteFunctionRegistry.getRegistryVersion(), 
localFunctionRegistry.getVersion())) {
   synchronized (this) {
-missingJars = getMissingJars(remoteFunctionRegistry, 
localFunctionRegistry);
-if (!missingJars.isEmpty()) {
-  logger.info("Starting dynamic UDFs lazy-init process.\n" +
-  "The following jars are going to be downloaded and 
registered locally: " + missingJars);
+long localRegistryVersion = localFunctionRegistry.getVersion();
+if 
(doSyncFunctionRegistries(remoteFunctionRegistry.getRegistryVersion(), 
localRegistryVersion))  {
+  DataChangeVersion remoteVersion = new DataChangeVersion();
+  List missingJars = 
getMissingJars(this.remoteFunctionRegistry, localFunctionRegistry, 
remoteVersion);
   List jars = Lists.newArrayList();
-  for (String jarName : missingJars) {
-Path binary = null;
-Path source = null;
-URLClassLoader classLoader = null;
-try {
-  binary = copyJarToLocal(jarName, remoteFunctionRegistry);
-  source = copyJarToLocal(JarUtil.getSourceName(jarName), 
remoteFunctionRegistry);
-  URL[] urls = {binary.toUri().toURL(), 
source.toUri().toURL()};
-  classLoader = new URLClassLoader(urls);
-  ScanResult scanResult = scan(classLoader, binary, urls);
-  localFunctionRegistry.validate(jarName, scanResult);
-  jars.add(new JarScan(jarName, scanResult, classLoader));
-} catch (Exce

[GitHub] drill pull request #701: DRILL-4963: Fix issues with dynamically loaded over...

2017-02-06 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/701#discussion_r99454180
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java
 ---
@@ -178,22 +192,41 @@ private String functionReplacement(FunctionCall 
functionCall) {
   }
 
   /**
-   * Find the Drill function implementation that matches the name, arg 
types and return type.
-   * If exact function implementation was not found,
-   * loads all missing remote functions and tries to find Drill 
implementation one more time.
+   * Finds the Drill function implementation that matches the name, arg 
types and return type.
+   *
+   * @param name function name
+   * @param argTypes input parameters types
+   * @param returnType function return type
+   * @return exactly matching function holder
--- End diff --

Thanks for adding the Javadoc! Very helpful.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #701: DRILL-4963: Fix issues with dynamically loaded over...

2017-02-06 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/701#discussion_r99650129
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java
 ---
@@ -50,13 +47,56 @@
   private DrillSqlWorker() {
   }
 
+  /**
+   * Converts sql query string into query physical plan.
+   *
+   * @param context query context
+   * @param sql sql query
+   * @return query physical plan
+   */
   public static PhysicalPlan getPlan(QueryContext context, String sql) 
throws SqlParseException, ValidationException,
   ForemanSetupException {
 return getPlan(context, sql, null);
   }
 
+  /**
+   * Converts sql query string into query physical plan.
+   * In case of any errors (that might occur due to missing function 
implementation),
+   * checks if local function registry should be synchronized with remote 
function registry.
+   * If sync took place, reloads drill operator table
+   * (since functions were added to / removed from local function registry)
+   * and attempts to converts sql query string into query physical plan 
one more time.
+   *
+   * @param context query context
+   * @param sql sql query
+   * @param textPlan text plan
+   * @return query physical plan
+   */
--- End diff --

Thanks for adding the Javadoc!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #701: DRILL-4963: Fix issues with dynamically loaded over...

2017-02-06 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/701#discussion_r99650496
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/DrillSqlWorker.java
 ---
@@ -50,13 +47,56 @@
   private DrillSqlWorker() {
   }
 
+  /**
+   * Converts sql query string into query physical plan.
+   *
+   * @param context query context
+   * @param sql sql query
+   * @return query physical plan
+   */
   public static PhysicalPlan getPlan(QueryContext context, String sql) 
throws SqlParseException, ValidationException,
   ForemanSetupException {
 return getPlan(context, sql, null);
   }
 
+  /**
+   * Converts sql query string into query physical plan.
+   * In case of any errors (that might occur due to missing function 
implementation),
+   * checks if local function registry should be synchronized with remote 
function registry.
+   * If sync took place, reloads drill operator table
+   * (since functions were added to / removed from local function registry)
+   * and attempts to converts sql query string into query physical plan 
one more time.
+   *
+   * @param context query context
+   * @param sql sql query
+   * @param textPlan text plan
+   * @return query physical plan
+   */
   public static PhysicalPlan getPlan(QueryContext context, String sql, 
Pointer textPlan)
   throws ForemanSetupException {
+Pointer textPlanCopy = textPlan == null ? null : new 
Pointer<>(textPlan.value);
+try {
+  return getQueryPlan(context, sql, textPlan);
+} catch (Exception e) {
--- End diff --

Should we be more specific in the error we catch? Wouldn't this mean that, 
even for a simple syntax error, we'd resync and retry? Can we catch only the 
specific function error of interest?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] drill pull request #701: DRILL-4963: Fix issues with dynamically loaded over...

2017-02-06 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/701#discussion_r99649785
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java
 ---
@@ -260,76 +293,101 @@ public RemoteFunctionRegistry 
getRemoteFunctionRegistry() {
   }
 
   /**
-   * Attempts to load and register functions from remote function registry.
-   * First checks if there is no missing jars.
-   * If yes, enters synchronized block to prevent other loading the same 
jars.
-   * Again re-checks if there are no missing jars in case someone has 
already loaded them (double-check lock).
-   * If there are still missing jars, first copies jars to local udf area 
and prepares {@link JarScan} for each jar.
-   * Jar registration timestamp represented in milliseconds is used as 
suffix.
-   * Then registers all jars at the same time. Returns true when finished.
-   * In case if any errors during jars coping or registration, logs errors 
and proceeds.
+   * Purpose of this method is to synchronize remote and local function 
registries if needed
+   * and to inform if function registry was changed after given version.
*
-   * If no missing jars are found, checks current local registry version.
-   * Returns false if versions match, true otherwise.
+   * To make synchronization as much light-weigh as possible, first only 
versions of both registries are checked
+   * without any locking. If synchronization is needed, enters 
synchronized block to prevent others loading the same jars.
+   * The need of synchronization is checked again (double-check lock) 
before comparing jars.
+   * If any missing jars are found, they are downloaded to local udf area, 
each is wrapped into {@link JarScan}.
+   * Once jar download is finished, all missing jars are registered in one 
batch.
+   * In case if any errors during jars download / registration, these 
errors are logged.
*
-   * @param version local function registry version
-   * @return true if new jars were registered or local function registry 
version is different, false otherwise
+   * During registration local function registry is updated with remote 
function registry version it is synced with.
+   * When at least one jar of the missing jars failed to download / 
register,
+   * local function registry version are not updated but jars that where 
successfully downloaded / registered
+   * are added to local function registry.
+   *
+   * If synchronization between remote and local function registry was not 
needed,
+   * checks if given registry version matches latest sync version
+   * to inform if function registry was changed after given version.
+   *
+   * @param version remote function registry local function registry was 
based on
+   * @return true if remote and local function registries were 
synchronized after given version
*/
-  public boolean loadRemoteFunctions(long version) {
-List missingJars = getMissingJars(remoteFunctionRegistry, 
localFunctionRegistry);
-if (!missingJars.isEmpty()) {
+  public boolean syncWithRemoteRegistry(long version) {
+if 
(doSyncFunctionRegistries(remoteFunctionRegistry.getRegistryVersion(), 
localFunctionRegistry.getVersion())) {
   synchronized (this) {
-missingJars = getMissingJars(remoteFunctionRegistry, 
localFunctionRegistry);
-if (!missingJars.isEmpty()) {
-  logger.info("Starting dynamic UDFs lazy-init process.\n" +
-  "The following jars are going to be downloaded and 
registered locally: " + missingJars);
+long localRegistryVersion = localFunctionRegistry.getVersion();
+if 
(doSyncFunctionRegistries(remoteFunctionRegistry.getRegistryVersion(), 
localRegistryVersion))  {
+  DataChangeVersion remoteVersion = new DataChangeVersion();
+  List missingJars = 
getMissingJars(this.remoteFunctionRegistry, localFunctionRegistry, 
remoteVersion);
   List jars = Lists.newArrayList();
-  for (String jarName : missingJars) {
-Path binary = null;
-Path source = null;
-URLClassLoader classLoader = null;
-try {
-  binary = copyJarToLocal(jarName, remoteFunctionRegistry);
-  source = copyJarToLocal(JarUtil.getSourceName(jarName), 
remoteFunctionRegistry);
-  URL[] urls = {binary.toUri().toURL(), 
source.toUri().toURL()};
-  classLoader = new URLClassLoader(urls);
-  ScanResult scanResult = scan(classLoader, binary, urls);
-  localFunctionRegistry.validate(jarName, scanResult);
-  jars.add(new JarScan(jarName, scanResult, classLoader));
-} catch (Exce

[GitHub] drill pull request #701: DRILL-4963: Fix issues with dynamically loaded over...

2017-02-06 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/701#discussion_r99454556
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java
 ---
@@ -260,76 +293,101 @@ public RemoteFunctionRegistry 
getRemoteFunctionRegistry() {
   }
 
   /**
-   * Attempts to load and register functions from remote function registry.
-   * First checks if there is no missing jars.
-   * If yes, enters synchronized block to prevent other loading the same 
jars.
-   * Again re-checks if there are no missing jars in case someone has 
already loaded them (double-check lock).
-   * If there are still missing jars, first copies jars to local udf area 
and prepares {@link JarScan} for each jar.
-   * Jar registration timestamp represented in milliseconds is used as 
suffix.
-   * Then registers all jars at the same time. Returns true when finished.
-   * In case if any errors during jars coping or registration, logs errors 
and proceeds.
+   * Purpose of this method is to synchronize remote and local function 
registries if needed
+   * and to inform if function registry was changed after given version.
*
-   * If no missing jars are found, checks current local registry version.
-   * Returns false if versions match, true otherwise.
+   * To make synchronization as much light-weigh as possible, first only 
versions of both registries are checked
+   * without any locking. If synchronization is needed, enters 
synchronized block to prevent others loading the same jars.
+   * The need of synchronization is checked again (double-check lock) 
before comparing jars.
+   * If any missing jars are found, they are downloaded to local udf area, 
each is wrapped into {@link JarScan}.
+   * Once jar download is finished, all missing jars are registered in one 
batch.
+   * In case if any errors during jars download / registration, these 
errors are logged.
*
-   * @param version local function registry version
-   * @return true if new jars were registered or local function registry 
version is different, false otherwise
+   * During registration local function registry is updated with remote 
function registry version it is synced with.
+   * When at least one jar of the missing jars failed to download / 
register,
+   * local function registry version are not updated but jars that where 
successfully downloaded / registered
+   * are added to local function registry.
+   *
+   * If synchronization between remote and local function registry was not 
needed,
+   * checks if given registry version matches latest sync version
+   * to inform if function registry was changed after given version.
+   *
+   * @param version remote function registry local function registry was 
based on
+   * @return true if remote and local function registries were 
synchronized after given version
*/
-  public boolean loadRemoteFunctions(long version) {
-List missingJars = getMissingJars(remoteFunctionRegistry, 
localFunctionRegistry);
-if (!missingJars.isEmpty()) {
+  public boolean syncWithRemoteRegistry(long version) {
+if 
(doSyncFunctionRegistries(remoteFunctionRegistry.getRegistryVersion(), 
localFunctionRegistry.getVersion())) {
   synchronized (this) {
-missingJars = getMissingJars(remoteFunctionRegistry, 
localFunctionRegistry);
-if (!missingJars.isEmpty()) {
-  logger.info("Starting dynamic UDFs lazy-init process.\n" +
-  "The following jars are going to be downloaded and 
registered locally: " + missingJars);
+long localRegistryVersion = localFunctionRegistry.getVersion();
+if 
(doSyncFunctionRegistries(remoteFunctionRegistry.getRegistryVersion(), 
localRegistryVersion))  {
+  DataChangeVersion remoteVersion = new DataChangeVersion();
+  List missingJars = 
getMissingJars(this.remoteFunctionRegistry, localFunctionRegistry, 
remoteVersion);
   List jars = Lists.newArrayList();
-  for (String jarName : missingJars) {
-Path binary = null;
-Path source = null;
-URLClassLoader classLoader = null;
-try {
-  binary = copyJarToLocal(jarName, remoteFunctionRegistry);
-  source = copyJarToLocal(JarUtil.getSourceName(jarName), 
remoteFunctionRegistry);
-  URL[] urls = {binary.toUri().toURL(), 
source.toUri().toURL()};
-  classLoader = new URLClassLoader(urls);
-  ScanResult scanResult = scan(classLoader, binary, urls);
-  localFunctionRegistry.validate(jarName, scanResult);
-  jars.add(new JarScan(jarName, scanResult, classLoader));
-} catch (Exce

[GitHub] drill pull request #701: DRILL-4963: Fix issues with dynamically loaded over...

2017-02-06 Thread paul-rogers
Github user paul-rogers commented on a diff in the pull request:

https://github.com/apache/drill/pull/701#discussion_r99454038
  
--- Diff: 
exec/java-exec/src/main/java/org/apache/drill/exec/expr/fn/FunctionImplementationRegistry.java
 ---
@@ -140,27 +142,39 @@ public void register(DrillOperatorTable 
operatorTable) {
   }
 
   /**
-   * Using the given functionResolver
-   * finds Drill function implementation for given 
functionCall.
-   * If function implementation was not found,
-   * loads all missing remote functions and tries to find Drill 
implementation one more time.
+   * First attempts to finds the Drill function implementation that 
matches the name, arg types and return type.
+   * If exact function implementation was not found,
+   * syncs local function registry with remote function registry if needed
+   * and tries to find function implementation one more time
--- End diff --

While this sounds pretty good, consider a possible condition. Suppose a 
user consistently uses an overloaded method. Every one of those queries will 
need to check with ZK. Drill is supposed to handle many concurrent queries. 
Each of those will trigger the update. Soon, we'll be pounding on ZK hundreds 
of times per second.

The "not found" case was fine to force a sync since a user would not, 
presumably, continually issue such queries if the function really were 
undefined. But, the overloaded function case is possible, and can lead to 
performance issues.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---