[jira] [Commented] (DRILL-7242) Query with range predicate hits IOBE when accessing histogram buckets
[ https://issues.apache.org/jira/browse/DRILL-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839009#comment-16839009 ]

ASF GitHub Bot commented on DRILL-7242:
---------------------------------------

gparai commented on pull request #1785: DRILL-7242: Handle additional boundary cases and compute better estim…
URL: https://github.com/apache/drill/pull/1785#discussion_r283598318

 ## File path:
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/DrillStatsTable.java
 ##
 @@ -483,7 +483,7 @@ public static ObjectMapper getMapper() {
     Map statisticsValues = new HashMap<>();
     Double ndv = statsProvider.getNdv(fieldName);
     if (ndv != null) {
-      statisticsValues.put(ColumnStatisticsKind.NVD, ndv);
+      statisticsValues.put(ColumnStatisticsKind.NDV, ndv);

 Review comment:
   I thought I had fixed the typo in an earlier PR. Not sure why it is still there.

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

> Query with range predicate hits IOBE when accessing histogram buckets
> ---------------------------------------------------------------------
>
>                 Key: DRILL-7242
>                 URL: https://issues.apache.org/jira/browse/DRILL-7242
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Query Planning & Optimization
>    Affects Versions: 1.16.0
>            Reporter: Aman Sinha
>            Assignee: Aman Sinha
>            Priority: Major
>             Fix For: 1.17.0
>
> The following query hits an IOBE during histogram access (make sure to run
> the ANALYZE command before running this query):
> {noformat}
> select 1 from dfs.tmp.employee where store_id > 24;
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 11
>   at org.apache.drill.exec.planner.common.NumericEquiDepthHistogram.getSelectedRows(NumericEquiDepthHistogram.java:215) ~[drill-java-exec-1.16.0.0-mapr.jar:1.16.0.0-mapr]
>   at org.apache.drill.exec.planner.common.NumericEquiDepthHistogram.estimatedSelectivity(NumericEquiDepthHistogram.java:130) ~[drill-java-exec-1.16.0.0-mapr.jar:1.16.0.0-mapr]
>   at org.apache.drill.exec.planner.cost.DrillRelMdSelectivity.computeRangeSelectivity(DrillRelMd
> {noformat}
> Here, 24.0 is the end point of the last histogram bucket and the boundary
> condition is not being correctly handled.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
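The boundary case behind this IOBE can be shown in isolation. The sketch below is a hedged illustration with hypothetical names, not Drill's actual getContainingBucket implementation: an equi-depth histogram with N buckets stores N+1 endpoints, so a value equal to the very last endpoint must map to bucket N-1 rather than index N (which is what produced ArrayIndexOutOfBoundsException: 11 for a 10-bucket histogram, since buckets[startBucket + 1] was then read at index 11).

```java
// Minimal sketch of the endpoint-vs-bucket-index boundary case.
// buckets[] holds N+1 endpoints for N buckets; names are illustrative.
public class EquiDepthBoundarySketch {
  static int containingBucket(double value, Double[] buckets) {
    int lastEndPointIndex = buckets.length - 1;
    for (int i = 0; i < lastEndPointIndex; i++) {
      // the upper endpoint is inclusive only for the last bucket,
      // so a value equal to the final endpoint lands in bucket N-1
      boolean lastBucket = (i == lastEndPointIndex - 1);
      if (value >= buckets[i]
          && (value < buckets[i + 1] || (lastBucket && value <= buckets[i + 1]))) {
        return i;
      }
    }
    return -1; // value outside the histogram's range
  }

  public static void main(String[] args) {
    Double[] buckets = {0.0, 8.0, 16.0, 24.0}; // 3 buckets, 4 endpoints
    System.out.println(containingBucket(24.0, buckets)); // last endpoint -> 2, not 3
    System.out.println(containingBucket(25.0, buckets)); // out of range -> -1
  }
}
```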
[jira] [Commented] (DRILL-7242) Query with range predicate hits IOBE when accessing histogram buckets
[ https://issues.apache.org/jira/browse/DRILL-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839000#comment-16839000 ]

ASF GitHub Bot commented on DRILL-7242:
---------------------------------------

amansinha100 commented on pull request #1785: DRILL-7242: Handle additional boundary cases and compute better estim…
URL: https://github.com/apache/drill/pull/1785#discussion_r283596709

 ## File path:
 exec/java-exec/src/main/java/org/apache/drill/exec/planner/common/NumericEquiDepthHistogram.java
 ##
 @@ -178,101 +185,136 @@ public Double estimatedSelectivity(final RexNode columnFilter, final long totalR
     return currentRange;
   }

-  private long getSelectedRows(final Range range) {
-    final int numBuckets = buckets.length - 1;
+  @VisibleForTesting
+  protected long getSelectedRows(final Range range) {
     double startBucketFraction = 1.0;
     double endBucketFraction = 1.0;
     long numRows = 0;
     int result;
     Double lowValue = null;
     Double highValue = null;
-    final int first = 0;
-    final int last = buckets.length - 1;
-    int startBucket = first;
-    int endBucket = last;
+    final int firstStartPointIndex = 0;
+    final int lastEndPointIndex = buckets.length - 1;
+    int startBucket = firstStartPointIndex;
+    int endBucket = lastEndPointIndex - 1;
     if (range.hasLowerBound()) {
       lowValue = (Double) range.lowerEndpoint();
-      // if low value is greater than the end point of the last bucket then none of the rows qualify
-      if (lowValue.compareTo(buckets[last]) > 0) {
+      // if low value is greater than the end point of the last bucket or if it is equal but the range is open (i.e
+      // predicate is of type > 5 where 5 is the end point of last bucket) then none of the rows qualify
+      result = lowValue.compareTo(buckets[lastEndPointIndex]);
+      if (result > 0 || result == 0 && range.lowerBoundType() == BoundType.OPEN) {
         return 0;
       }
-
-      result = lowValue.compareTo(buckets[first]);
+      result = lowValue.compareTo(buckets[firstStartPointIndex]);
       // if low value is less than or equal to the first bucket's start point then start with the first bucket and all
       // rows in first bucket are included
       if (result <= 0) {
-        startBucket = first;
+        startBucket = firstStartPointIndex;
         startBucketFraction = 1.0;
       } else {
         // Use a simplified logic where we treat > and >= the same when computing selectivity since the
         // difference is going to be very small for reasonable sized data sets
-        startBucket = getContainingBucket(lowValue, numBuckets);
+        startBucket = getContainingBucket(lowValue, lastEndPointIndex, true);
+        // expecting start bucket to be >= 0 since other conditions have been handled previously
         Preconditions.checkArgument(startBucket >= 0, "Expected start bucket id >= 0");
-        startBucketFraction = ((double) (buckets[startBucket + 1] - lowValue)) / (buckets[startBucket + 1] - buckets[startBucket]);
+
+        if (buckets[startBucket + 1].doubleValue() == buckets[startBucket].doubleValue()) {
+          // if start and end points of the bucket are the same, consider entire bucket
+          startBucketFraction = 1.0;
+        } else {
+          startBucketFraction = ((double) (buckets[startBucket + 1] - lowValue)) / (buckets[startBucket + 1] - buckets[startBucket]);
+        }
       }
     }
     if (range.hasUpperBound()) {
       highValue = (Double) range.upperEndpoint();
-      // if the high value is less than the start point of the first bucket then none of the rows qualify
-      if (highValue.compareTo(buckets[first]) < 0) {
+      // if the high value is less than the start point of the first bucket or if it is equal but the range is open (i.e
+      // predicate is of type < 1 where 1 is the start point of the first bucket) then none of the rows qualify
+      result = highValue.compareTo(buckets[firstStartPointIndex]);
+      if (result < 0 || (result == 0 && range.upperBoundType() == BoundType.OPEN)) {
        return 0;
      }
-      result = highValue.compareTo(buckets[last]);
+      result = highValue.compareTo(buckets[lastEndPointIndex]);
      // if high value is greater than or equal to the last bucket's end point then include the last bucket and all rows in
      // last bucket qualify
      if (result >= 0) {
-        endBucket = last;
+        endBucket = lastEndPointIndex - 1;
        endBucketFraction = 1.0;
      } else {
        // Use a simplified logic where we treat < and <= the same when computing selectivity since the
        // difference is going to be very small for reasonable sized data sets
-        endBucket = getContainingBucket(highValue, numBuckets);
+        endBucket = getContainingBucket(highValue, lastEndPointIndex, false);
+        // expecting end bucket to be >= 0 since other
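The equal-endpoints guard added in the hunk above avoids a division by zero when a bucket's start and end points coincide (e.g. a bucket filled with duplicates of one value). A minimal standalone sketch of that fraction computation, with illustrative names rather than the Drill method itself:

```java
// Sketch of the fraction-of-bucket interpolation the patch guards:
// when bucketEnd == bucketStart the linear interpolation would divide
// by zero, so the whole bucket is counted instead.
public class BucketFractionSketch {
  static double startBucketFraction(double lowValue, double bucketStart, double bucketEnd) {
    if (bucketEnd == bucketStart) {
      return 1.0; // degenerate bucket: include all of its rows
    }
    return (bucketEnd - lowValue) / (bucketEnd - bucketStart);
  }

  public static void main(String[] args) {
    System.out.println(startBucketFraction(15.0, 10.0, 20.0)); // 0.5
    System.out.println(startBucketFraction(5.0, 5.0, 5.0));    // 1.0 (guarded)
  }
}
```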
[jira] [Commented] (DRILL-7242) Query with range predicate hits IOBE when accessing histogram buckets
[ https://issues.apache.org/jira/browse/DRILL-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16839001#comment-16839001 ]

ASF GitHub Bot commented on DRILL-7242:
---------------------------------------

amansinha100 commented on issue #1785: DRILL-7242: Handle additional boundary cases and compute better estim…
URL: https://github.com/apache/drill/pull/1785#issuecomment-492045583

@gparai I have addressed your review comments. Could you pls take another look?
[jira] [Commented] (DRILL-7242) Query with range predicate hits IOBE when accessing histogram buckets
[ https://issues.apache.org/jira/browse/DRILL-7242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838999#comment-16838999 ]

ASF GitHub Bot commented on DRILL-7242:
---------------------------------------

amansinha100 commented on pull request #1785: DRILL-7242: Handle additional boundary cases and compute better estim…
URL: https://github.com/apache/drill/pull/1785#discussion_r283596694

 ## File path:
 exec/java-exec/src/test/java/org/apache/drill/exec/planner/common/TestNumericEquiDepthHistogram.java
 ##
 @@ -0,0 +1,111 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.planner.common;
+
+import org.apache.drill.categories.PlannerTest;
+
+import org.apache.drill.shaded.guava.com.google.common.collect.BoundType;
+import org.junit.Test;
+import org.junit.experimental.categories.Category;
+import org.junit.Assert;
+import org.apache.drill.shaded.guava.com.google.common.collect.Range;
+
+
+@Category(PlannerTest.class)
+public class TestNumericEquiDepthHistogram {
+
+  @Test
+  public void testHistogramWithUniqueEndpoints() throws Exception {
+    int numBuckets = 10;
+    int numRowsPerBucket = 250;
+
+    // init array with numBuckets + 1 values
+    Double[] buckets = {1.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0};
+
+    NumericEquiDepthHistogram histogram = new NumericEquiDepthHistogram(numBuckets);
+
+    for (int i = 0; i < buckets.length; i++) {
+      histogram.setBucketValue(i, buckets[i]);
+    }
+    histogram.setNumRowsPerBucket(numRowsPerBucket);
+
+    // Range: <= 1.0

 Review comment:
   Done
[jira] [Assigned] (DRILL-7256) Query over empty Hive tables fails, we will need to print heap usagePercent details in error message
[ https://issues.apache.org/jira/browse/DRILL-7256?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Khurram Faraaz reassigned DRILL-7256:
-------------------------------------

    Assignee: Khurram Faraaz

> Query over empty Hive tables fails, we will need to print heap usagePercent
> details in error message
> ---------------------------------------------------------------------------
>
>                 Key: DRILL-7256
>                 URL: https://issues.apache.org/jira/browse/DRILL-7256
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.15.0
>            Reporter: Khurram Faraaz
>            Assignee: Khurram Faraaz
>            Priority: Major
>
> The query below, run from Drill's web UI against Hive tables, failed because
> there was not enough heap memory to run it.
> It fails intermittently from the Drill web UI; note that the two Hive tables
> used in the query are empty, meaning they have no data in them. The query
> does not fail when run from sqlline.
> The error message does not provide information about the usagePercent of heap.
> It would be useful to include heap usagePercent information in the
> error message in QueryWrapper.java when usagePercent >
> HEAP_MEMORY_FAILURE_THRESHOLD.
> Drill 1.15.0
> Failing query:
> {noformat}
> SELECT a.event_id
> FROM hive.cust_bhsf_ce_blob a, hive.t_fct_clinical_event b
> where
> a.event_id=b.event_id
> and a.blob_contents not like '%dd:contenttype="TESTS"%'
> and b.EVENT_RELATIONSHIP_CD='B'
> and b.EVENT_CLASS_CD in ('DOC')
> and b.entry_mode_cd='Web'
> and b.RECORD_STATUS_CD='Active'
> and b.RESULT_STATUS_CD ='Auth (Verified)'
> and substring(b.valid_until_dt_tm,1,10) >='2017-12-30'
> and substring(b.event_end_date,1,10) >='2018-01-01'
> {noformat}
> Stack trace from drillbit.log
> {noformat}
> 2019-05-09 16:25:58,472 [qtp1934687-790] ERROR o.a.d.e.server.rest.QueryResources - Query from Web UI Failed
> org.apache.drill.common.exceptions.UserException: RESOURCE ERROR: There is not enough heap memory to run this query using the web interface.
> Please try a query with fewer columns or with a filter or limit condition to limit the data returned.
> You can also try an ODBC/JDBC client.
> [Error Id: 91668f42-d88e-426b-b1fe-c0d042700500 ]
>   at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[drill-common-1.15.0.5-mapr.jar:1.15.0.5-mapr]
>   at org.apache.drill.exec.server.rest.QueryWrapper.run(QueryWrapper.java:103) ~[drill-java-exec-1.15.0.5-mapr.jar:1.15.0.5-mapr]
>   at org.apache.drill.exec.server.rest.QueryResources.submitQueryJSON(QueryResources.java:72) ~[drill-java-exec-1.15.0.5-mapr.jar:1.15.0.5-mapr]
>   at org.apache.drill.exec.server.rest.QueryResources.submitQuery(QueryResources.java:87) ~[drill-java-exec-1.15.0.5-mapr.jar:1.15.0.5-mapr]
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[na:1.8.0_151]
>   at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) ~[na:1.8.0_151]
>   at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) ~[na:1.8.0_151]
>   at java.lang.reflect.Method.invoke(Method.java:498) ~[na:1.8.0_151]
>   at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke(ResourceMethodInvocationHandlerFactory.java:81) [jersey-server-2.8.jar:na]
>   at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$1.run(AbstractJavaResourceMethodDispatcher.java:151) [jersey-server-2.8.jar:na]
>   at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke(AbstractJavaResourceMethodDispatcher.java:171) [jersey-server-2.8.jar:na]
>   at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch(JavaResourceMethodDispatcherProvider.java:195) [jersey-server-2.8.jar:na]
>   at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch(AbstractJavaResourceMethodDispatcher.java:104) [jersey-server-2.8.jar:na]
>   at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke(ResourceMethodInvoker.java:387) [jersey-server-2.8.jar:na]
>   at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:331) [jersey-server-2.8.jar:na]
>   at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply(ResourceMethodInvoker.java:103) [jersey-server-2.8.jar:na]
>   at org.glassfish.jersey.server.ServerRuntime$1.run(ServerRuntime.java:269) [jersey-server-2.8.jar:na]
>   at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271) [jersey-common-2.8.jar:na]
>   at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267) [jersey-common-2.8.jar:na]
>   at org.glassfish.jersey.internal.Errors.process(Errors.java:315)
[jira] [Created] (DRILL-7256) Query over empty Hive tables fails, we will need to print heap usagePercent details in error message
Khurram Faraaz created DRILL-7256:
-------------------------------------

             Summary: Query over empty Hive tables fails, we will need to print heap usagePercent details in error message
                 Key: DRILL-7256
                 URL: https://issues.apache.org/jira/browse/DRILL-7256
             Project: Apache Drill
          Issue Type: Bug
          Components: Execution - Flow
    Affects Versions: 1.15.0
            Reporter: Khurram Faraaz
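DRILL-7256 asks for the heap usagePercent to appear in this RESOURCE ERROR message. A minimal, hedged sketch of how that value could be computed and surfaced; the threshold constant and message wording here are assumptions for illustration, not Drill's actual QueryWrapper code:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Sketch: compute heap usage as a fraction via the standard MemoryMXBean
// and fold it into the failure message. Threshold value is an assumption.
public class HeapUsageSketch {
  static final double HEAP_MEMORY_FAILURE_THRESHOLD = 0.85; // assumed placeholder

  static double heapUsagePercent() {
    MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
    return (double) heap.getUsed() / heap.getMax();
  }

  static String failureMessage(double usagePercent) {
    return String.format(
        "There is not enough heap memory to run this query using the web interface "
            + "(heap usage %.0f%% exceeded threshold %.0f%%).",
        usagePercent * 100, HEAP_MEMORY_FAILURE_THRESHOLD * 100);
  }

  public static void main(String[] args) {
    double usage = heapUsagePercent();
    if (usage > HEAP_MEMORY_FAILURE_THRESHOLD) {
      System.out.println(failureMessage(usage));
    }
  }
}
```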
[jira] [Commented] (DRILL-7191) RM blobs persistence in Zookeeper for Distributed RM
[ https://issues.apache.org/jira/browse/DRILL-7191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838875#comment-16838875 ]

ASF GitHub Bot commented on DRILL-7191:
---------------------------------------

HanumathRao commented on pull request #1762: [DRILL-7191 / DRILL-7026]: RM state blob persistence in Zookeeper and Integration of Distributed queue configuration with Planner
URL: https://github.com/apache/drill/pull/1762#discussion_r283541289

 ## File path:
 exec/java-exec/src/main/java/org/apache/drill/exec/resourcemgr/rmblobmgr/RMConsistentBlobStoreManager.java
 ##
 @@ -0,0 +1,354 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.drill.exec.resourcemgr.rmblobmgr;
+
+import avro.shaded.com.google.common.annotations.VisibleForTesting;
+import com.fasterxml.jackson.core.JsonGenerator;
+import com.fasterxml.jackson.core.JsonParser;
+import com.fasterxml.jackson.databind.ObjectMapper;
+import com.fasterxml.jackson.databind.SerializationFeature;
+import com.fasterxml.jackson.databind.module.SimpleModule;
+import org.apache.curator.framework.recipes.locks.InterProcessMutex;
+import org.apache.drill.common.scanner.persistence.ScanResult;
+import org.apache.drill.exec.coord.zk.ZKClusterCoordinator;
+import org.apache.drill.exec.exception.StoreException;
+import org.apache.drill.exec.resourcemgr.NodeResources;
+import org.apache.drill.exec.resourcemgr.NodeResources.NodeResourcesDe;
+import org.apache.drill.exec.resourcemgr.config.QueryQueueConfig;
+import org.apache.drill.exec.resourcemgr.rmblobmgr.exception.LeaderChangeException;
+import org.apache.drill.exec.resourcemgr.rmblobmgr.exception.RMBlobUpdateException;
+import org.apache.drill.exec.resourcemgr.rmblobmgr.exception.ResourceUnavailableException;
+import org.apache.drill.exec.resourcemgr.rmblobmgr.rmblob.ClusterStateBlob;
+import org.apache.drill.exec.resourcemgr.rmblobmgr.rmblob.ForemanQueueUsageBlob;
+import org.apache.drill.exec.resourcemgr.rmblobmgr.rmblob.ForemanResourceUsage;
+import org.apache.drill.exec.resourcemgr.rmblobmgr.rmblob.ForemanResourceUsage.ForemanResourceUsageDe;
+import org.apache.drill.exec.resourcemgr.rmblobmgr.rmblob.QueueLeadershipBlob;
+import org.apache.drill.exec.resourcemgr.rmblobmgr.rmblob.RMStateBlob;
+import org.apache.drill.exec.server.DrillbitContext;
+import org.apache.drill.exec.store.sys.PersistentStoreConfig;
+import org.apache.drill.exec.store.sys.store.ZookeeperTransactionalPersistenceStore;
+
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.HashMap;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Map;
+import java.util.stream.Collectors;
+
+/**
+ * RM state blobs manager which does all the update to the blobs under a global lock and in transactional manner.
+ * Since the blobs are updated by multiple Drillbit at same time to maintain the strongly consistent information in
+ * these blobs it uses a global lock shared across all the Drillbits.
+ */
+public class RMConsistentBlobStoreManager implements RMBlobStoreManager {
+  private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(RMConsistentBlobStoreManager.class);
+
+  private static final String RM_BLOBS_ROOT = "rm/blobs";
+
+  private static final String RM_LOCK_ROOT = "/rm/locks";
+
+  private static final String RM_BLOB_GLOBAL_LOCK_NAME = "/rm_blob_lock";
+
+  private static final String RM_BLOB_SER_DE_NAME = "RMStateBlobSerDeModules";
+
+  public static final int RM_STATE_BLOB_VERSION = 1;
+
+  private static final int MAX_ACQUIRE_RETRY = 3;
+
+  private final ZookeeperTransactionalPersistenceStore rmBlobStore;
+
+  private final InterProcessMutex globalBlobMutex;
+
+  private final DrillbitContext context;
+
+  private final ObjectMapper serDeMapper;
+
+  private final Map rmStateBlobs;
+
+  private final StringBuilder exceptionStringBuilder = new StringBuilder();
+
+  public RMConsistentBlobStoreManager(DrillbitContext context, Collection leafQueues) throws
+    StoreException {
+    try {
+      this.context = context;
+      this.serDeMapper = initializeMapper(context.getClasspathScan());
+      this.rmBlobStore =
[jira] [Commented] (DRILL-7177) Format Plugin for Excel Files
[ https://issues.apache.org/jira/browse/DRILL-7177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16838827#comment-16838827 ]

ASF GitHub Bot commented on DRILL-7177:
---------------------------------------

cgivre commented on issue #1749: DRILL-7177: Format Plugin for Excel Files
URL: https://github.com/apache/drill/pull/1749#issuecomment-491950505

@kkhatua This plugin only works with Excel files from Excel 2007 or later (the XML-based Excel files). It does data type detection, though with limits: all numeric columns are interpreted as doubles, and if a column is recognized as a time, the plugin maps it to a Drill date/time column. The plugin will execute formulas.

> Format Plugin for Excel Files
> -----------------------------
>
>                 Key: DRILL-7177
>                 URL: https://issues.apache.org/jira/browse/DRILL-7177
>             Project: Apache Drill
>          Issue Type: Improvement
>    Affects Versions: 1.17.0
>            Reporter: Charles Givre
>            Assignee: Charles Givre
>            Priority: Major
>              Labels: doc-impacting
>             Fix For: 1.17.0
>
> This pull request adds the functionality which enables Drill to query
> Microsoft Excel files.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (DRILL-7191) RM blobs persistence in Zookeeper for Distributed RM
[ https://issues.apache.org/jira/browse/DRILL-7191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838816#comment-16838816 ] ASF GitHub Bot commented on DRILL-7191: --- HanumathRao commented on pull request #1762: [DRILL-7191 / DRILL-7026]: RM state blob persistence in Zookeeper and Integration of Distributed queue configuration with Planner URL: https://github.com/apache/drill/pull/1762#discussion_r283490423 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/resourcemgr/rmblobmgr/RMConsistentBlobStoreManager.java ## @@ -0,0 +1,354 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.resourcemgr.rmblobmgr; + +import avro.shaded.com.google.common.annotations.VisibleForTesting; +import com.fasterxml.jackson.core.JsonGenerator; +import com.fasterxml.jackson.core.JsonParser; +import com.fasterxml.jackson.databind.ObjectMapper; +import com.fasterxml.jackson.databind.SerializationFeature; +import com.fasterxml.jackson.databind.module.SimpleModule; +import org.apache.curator.framework.recipes.locks.InterProcessMutex; +import org.apache.drill.common.scanner.persistence.ScanResult; +import org.apache.drill.exec.coord.zk.ZKClusterCoordinator; +import org.apache.drill.exec.exception.StoreException; +import org.apache.drill.exec.resourcemgr.NodeResources; +import org.apache.drill.exec.resourcemgr.NodeResources.NodeResourcesDe; +import org.apache.drill.exec.resourcemgr.config.QueryQueueConfig; +import org.apache.drill.exec.resourcemgr.rmblobmgr.exception.LeaderChangeException; +import org.apache.drill.exec.resourcemgr.rmblobmgr.exception.RMBlobUpdateException; +import org.apache.drill.exec.resourcemgr.rmblobmgr.exception.ResourceUnavailableException; +import org.apache.drill.exec.resourcemgr.rmblobmgr.rmblob.ClusterStateBlob; +import org.apache.drill.exec.resourcemgr.rmblobmgr.rmblob.ForemanQueueUsageBlob; +import org.apache.drill.exec.resourcemgr.rmblobmgr.rmblob.ForemanResourceUsage; +import org.apache.drill.exec.resourcemgr.rmblobmgr.rmblob.ForemanResourceUsage.ForemanResourceUsageDe; +import org.apache.drill.exec.resourcemgr.rmblobmgr.rmblob.QueueLeadershipBlob; +import org.apache.drill.exec.resourcemgr.rmblobmgr.rmblob.RMStateBlob; +import org.apache.drill.exec.server.DrillbitContext; +import org.apache.drill.exec.store.sys.PersistentStoreConfig; +import org.apache.drill.exec.store.sys.store.ZookeeperTransactionalPersistenceStore; + +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import 
java.util.stream.Collectors; + +/** + * RM state blob manager that performs all updates to the blobs under a global lock and in a transactional manner. + * Since the blobs are updated by multiple Drillbits at the same time, it uses a global lock shared across all + * Drillbits to keep the information in these blobs strongly consistent. + */ +public class RMConsistentBlobStoreManager implements RMBlobStoreManager { + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(RMConsistentBlobStoreManager.class); + + private static final String RM_BLOBS_ROOT = "rm/blobs"; Review comment: do we need to start this path with "/" ? This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org > RM blobs persistence in Zookeeper for Distributed RM > > > Key: DRILL-7191 > URL: https://issues.apache.org/jira/browse/DRILL-7191 > Project: Apache Drill > Issue Type: Sub-task > Components: Server, Query Planning Optimization >Affects Versions: 1.17.0 >Reporter: Hanumath Rao Maduri >Assignee: Sorabh Hamirwasia >Priority: Major > Fix For: 1.17.0 > > > Changes to support storing
[jira] [Commented] (DRILL-7191) RM blobs persistence in Zookeeper for Distributed RM
[ https://issues.apache.org/jira/browse/DRILL-7191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838818#comment-16838818 ] ASF GitHub Bot commented on DRILL-7191: --- HanumathRao commented on pull request #1762: [DRILL-7191 / DRILL-7026]: RM state blob persistence in Zookeeper and Integration of Distributed queue configuration with Planner URL: https://github.com/apache/drill/pull/1762#discussion_r283490473 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZookeeperClient.java ## @@ -301,6 +324,54 @@ public void put(final String path, final byte[] data, DataChangeVersion version) } } + public void putAsTransaction(Map pathsWithData) { +putAsTransaction(pathsWithData, null); + } + + /** + * Puts the given set of blobs and their data in a transactional manner. It expects all the blob paths to exist + * before calling this API. + * @param pathsWithData - map of blob paths to update and the final data + * @param version - version holder + */ + public void putAsTransaction(Map pathsWithData, DataChangeVersion version) { +Preconditions.checkNotNull(pathsWithData, "paths and their data to write as transaction is missing"); +List targetPaths = new ArrayList<>(); +CuratorTransaction transaction = curator.inTransaction(); +long totalDataBytes = 0; + +try { + for (Map.Entry entry : pathsWithData.entrySet()) { +final String target = PathUtils.join(root, entry.getKey()); + +if (version != null) { + transaction = transaction.setData().withVersion(version.getVersion()).forPath(target, entry.getValue()).and(); +} else { + transaction = transaction.setData().forPath(target, entry.getValue()).and(); +} +targetPaths.add(target); +totalDataBytes += entry.getValue().length; + } + + // If total set operator payload is greater than 1MB then curator set operation will fail Review comment: greater than or equal to? > RM blobs persistence in Zookeeper for Distributed RM > > > Key: DRILL-7191 > URL: https://issues.apache.org/jira/browse/DRILL-7191 > Project: Apache Drill > Issue Type: Sub-task > Components: Server, Query Planning Optimization >Affects Versions: 1.17.0 >Reporter: Hanumath Rao Maduri >Assignee: Sorabh Hamirwasia >Priority: Major > Fix For: 1.17.0 > > > Changes to support storing UUID for each Drillbit Service Instance locally to > be used by planner and execution layer. This UUID is used to uniquely > identify a Drillbit and register Drillbit information in the RM StateBlobs. > Introduced a PersistentStore named ZookeeperTransactionalPersistenceStore > with Transactional capabilities using Zookeeper Transactional API’s. This is > used for updating RM State blobs as all the updates need to happen in > transactional manner. Added RMStateBlobs definition and support for serde to > Zookeeper. > Implementation for DistributedRM and its corresponding QueryRM apis and state > management. > Updated the state management of Query in Foreman so that same Foreman object > can be submitted multiple times. Also introduced concept of 2 maps keeping > track of waiting and running queries. These were done to support for async > admit protocol which will be needed with Distributed RM. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
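The 1 MB concern in the review exchange above comes from ZooKeeper's default `jute.maxbuffer` request size limit. Below is a minimal sketch of a pre-flight size check for such a multi-path transaction; the class and blob path names are hypothetical, and note that the real limit applies to the whole serialized multi-op request, not just the sum of the blob payloads, so this check is only an approximation.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch (hypothetical names): approximate pre-flight size check before
// submitting a ZooKeeper multi-op, since the default jute.maxbuffer
// setting caps a single request at roughly 1 MB.
public class ZkTxnSizeCheck {
  // ZooKeeper's default jute.maxbuffer value, in bytes.
  public static final int DEFAULT_MAX_BUFFER = 1024 * 1024;

  // Sum the payload sizes of all blobs that would go into one transaction,
  // mirroring the totalDataBytes accumulation in the quoted diff.
  public static long totalPayloadBytes(Map<String, byte[]> pathsWithData) {
    long total = 0;
    for (byte[] data : pathsWithData.values()) {
      total += data.length;
    }
    return total;
  }

  // The reviewer's point: the request already fails once the limit is
  // reached, so the guard should use >= rather than >.
  public static boolean exceedsLimit(Map<String, byte[]> pathsWithData) {
    return totalPayloadBytes(pathsWithData) >= DEFAULT_MAX_BUFFER;
  }
}
```

A caller could run this check before building the Curator transaction and fail fast with a descriptive error instead of letting the ZooKeeper server reject the request.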
[jira] [Commented] (DRILL-7191) RM blobs persistence in Zookeeper for Distributed RM
[ https://issues.apache.org/jira/browse/DRILL-7191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838815#comment-16838815 ] ASF GitHub Bot commented on DRILL-7191: --- HanumathRao commented on pull request #1762: [DRILL-7191 / DRILL-7026]: RM state blob persistence in Zookeeper and Integration of Distributed queue configuration with Planner URL: https://github.com/apache/drill/pull/1762#discussion_r283490400 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/resourcemgr/rmblobmgr/RMConsistentBlobStoreManager.java ## @@ -0,0 +1,354 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.resourcemgr.rmblobmgr; + +import avro.shaded.com.google.common.annotations.VisibleForTesting; +import com.fasterxml.jackson.core.JsonGenerator; +import com.fasterxml.jackson.core.JsonParser; +import com.fasterxml.jackson.databind.ObjectMapper; +import com.fasterxml.jackson.databind.SerializationFeature; +import com.fasterxml.jackson.databind.module.SimpleModule; +import org.apache.curator.framework.recipes.locks.InterProcessMutex; +import org.apache.drill.common.scanner.persistence.ScanResult; +import org.apache.drill.exec.coord.zk.ZKClusterCoordinator; +import org.apache.drill.exec.exception.StoreException; +import org.apache.drill.exec.resourcemgr.NodeResources; +import org.apache.drill.exec.resourcemgr.NodeResources.NodeResourcesDe; +import org.apache.drill.exec.resourcemgr.config.QueryQueueConfig; +import org.apache.drill.exec.resourcemgr.rmblobmgr.exception.LeaderChangeException; +import org.apache.drill.exec.resourcemgr.rmblobmgr.exception.RMBlobUpdateException; +import org.apache.drill.exec.resourcemgr.rmblobmgr.exception.ResourceUnavailableException; +import org.apache.drill.exec.resourcemgr.rmblobmgr.rmblob.ClusterStateBlob; +import org.apache.drill.exec.resourcemgr.rmblobmgr.rmblob.ForemanQueueUsageBlob; +import org.apache.drill.exec.resourcemgr.rmblobmgr.rmblob.ForemanResourceUsage; +import org.apache.drill.exec.resourcemgr.rmblobmgr.rmblob.ForemanResourceUsage.ForemanResourceUsageDe; +import org.apache.drill.exec.resourcemgr.rmblobmgr.rmblob.QueueLeadershipBlob; +import org.apache.drill.exec.resourcemgr.rmblobmgr.rmblob.RMStateBlob; +import org.apache.drill.exec.server.DrillbitContext; +import org.apache.drill.exec.store.sys.PersistentStoreConfig; +import org.apache.drill.exec.store.sys.store.ZookeeperTransactionalPersistenceStore; + +import java.util.ArrayList; +import java.util.Collection; +import java.util.HashMap; +import java.util.Iterator; +import java.util.List; +import java.util.Map; +import 
java.util.stream.Collectors; + +/** + * RM state blob manager that performs all updates to the blobs under a global lock and in a transactional manner. + * Since the blobs are updated by multiple Drillbits at the same time, it uses a global lock shared across all + * Drillbits to keep the information in these blobs strongly consistent. + */ +public class RMConsistentBlobStoreManager implements RMBlobStoreManager { + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(RMConsistentBlobStoreManager.class); + + private static final String RM_BLOBS_ROOT = "rm/blobs"; + + private static final String RM_LOCK_ROOT = "/rm/locks"; + + private static final String RM_BLOB_GLOBAL_LOCK_NAME = "/rm_blob_lock"; + + private static final String RM_BLOB_SER_DE_NAME = "RMStateBlobSerDeModules"; + + public static final int RM_STATE_BLOB_VERSION = 1; + + private static final int MAX_ACQUIRE_RETRY = 3; + + private final ZookeeperTransactionalPersistenceStore rmBlobStore; + + private final InterProcessMutex globalBlobMutex; + + private final DrillbitContext context; + + private final ObjectMapper serDeMapper; + + private final Map rmStateBlobs; + + private final StringBuilder exceptionStringBuilder = new StringBuilder(); + + public RMConsistentBlobStoreManager(DrillbitContext context, Collection leafQueues) throws +StoreException { +try { + this.context = context; + this.serDeMapper = initializeMapper(context.getClasspathScan()); + this.rmBlobStore =
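The quoted manager guards blob updates with a Curator `InterProcessMutex` and a `MAX_ACQUIRE_RETRY` constant. A real `InterProcessMutex.acquire(timeout, unit)` call needs a live ZooKeeper ensemble, so here is a hedged sketch of just the bounded-retry idea with the lock attempt abstracted behind a `Supplier` (helper names are hypothetical, not Drill's actual API):

```java
import java.util.function.Supplier;

// Sketch (hypothetical helper): retry acquiring a shared lock a fixed
// number of times, in the spirit of the MAX_ACQUIRE_RETRY constant in
// the quoted RMConsistentBlobStoreManager.
public class LockRetry {
  // tryAcquire stands in for InterProcessMutex.acquire(timeout, unit),
  // which returns false when the timeout elapses without the lock.
  public static boolean acquireWithRetry(Supplier<Boolean> tryAcquire, int maxRetries) {
    for (int attempt = 1; attempt <= maxRetries; attempt++) {
      if (tryAcquire.get()) {
        return true; // lock held; caller must release it in a finally block
      }
    }
    return false; // give up after maxRetries attempts
  }
}
```

The caller would treat a `false` result as a blob-update failure (e.g. throw an `RMBlobUpdateException`) rather than proceeding without the global lock.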
[jira] [Commented] (DRILL-7191) RM blobs persistence in Zookeeper for Distributed RM
[ https://issues.apache.org/jira/browse/DRILL-7191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838778#comment-16838778 ] ASF GitHub Bot commented on DRILL-7191: --- HanumathRao commented on pull request #1762: [DRILL-7191 / DRILL-7026]: RM state blob persistence in Zookeeper and Integration of Distributed queue configuration with Planner URL: https://github.com/apache/drill/pull/1762#discussion_r283471547 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZookeeperClient.java ## @@ -301,6 +324,54 @@ public void put(final String path, final byte[] data, DataChangeVersion version) } } + public void putAsTransaction(Map pathsWithData) { +putAsTransaction(pathsWithData, null); + } + + /** + * Puts the given set of blobs and their data in a transactional manner. It expects all the blob paths to exist + * before calling this API. + * @param pathsWithData - map of blob paths to update and the final data + * @param version - version holder + */ + public void putAsTransaction(Map pathsWithData, DataChangeVersion version) { Review comment: Do we currently use a non-null version? If not, can you please mention in the comment that this is needed for future use?
[jira] [Commented] (DRILL-7191) RM blobs persistence in Zookeeper for Distributed RM
[ https://issues.apache.org/jira/browse/DRILL-7191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838777#comment-16838777 ] ASF GitHub Bot commented on DRILL-7191: --- HanumathRao commented on pull request #1762: [DRILL-7191 / DRILL-7026]: RM state blob persistence in Zookeeper and Integration of Distributed queue configuration with Planner URL: https://github.com/apache/drill/pull/1762#discussion_r283471493 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZookeeperClient.java ## @@ -227,13 +230,33 @@ public void create(final String path) { final String target = PathUtils.join(root, path); try { - curator.create().withMode(mode).forPath(target); + curator.create().creatingParentsIfNeeded().withMode(mode).forPath(target); getCache().rebuildNode(target); } catch (final Exception e) { throw new DrillRuntimeException("unable to put ", e); } } + public void createAsTransaction(List paths) { +Preconditions.checkNotNull(paths, "no paths provided to create"); Review comment: Is it also better to check for an empty list of paths? Also, it might be good to add a comment for this function.
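The review above suggests rejecting an empty path list in addition to the existing null check. A self-contained sketch of that validation in plain Java follows; Drill's code uses Guava's `Preconditions`, so the class and method names here are hypothetical stand-ins, not the actual patch.

```java
import java.util.List;

// Sketch (hypothetical names): validate a path list before building a
// ZooKeeper create-transaction, covering both the null case already
// checked in the quoted diff and the empty case the reviewer raised.
public class PathChecks {
  public static List<String> requireNonEmpty(List<String> paths) {
    if (paths == null) {
      // matches the intent of Preconditions.checkNotNull in the diff
      throw new NullPointerException("no paths provided to create");
    }
    if (paths.isEmpty()) {
      // the additional guard the reviewer asked about: an empty
      // transaction is almost certainly a caller bug
      throw new IllegalArgumentException("empty list of paths provided to create");
    }
    return paths;
  }
}
```

Returning the validated list lets the check be used inline, e.g. `for (String p : PathChecks.requireNonEmpty(paths)) { ... }`.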
[jira] [Commented] (DRILL-7191) RM blobs persistence in Zookeeper for Distributed RM
[ https://issues.apache.org/jira/browse/DRILL-7191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838774#comment-16838774 ] ASF GitHub Bot commented on DRILL-7191: --- HanumathRao commented on pull request #1762: [DRILL-7191 / DRILL-7026]: RM state blob persistence in Zookeeper and Integration of Distributed queue configuration with Planner URL: https://github.com/apache/drill/pull/1762#discussion_r283471413 ## File path: exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java ## @@ -249,14 +253,19 @@ public RegistrationHandle update(RegistrationHandle handle, State state) { */ @Override public Collection getOnlineEndPoints() { -Collection runningEndPoints = new ArrayList<>(); -for (DrillbitEndpoint endpoint: endpoints){ - if(isDrillbitInState(endpoint, State.ONLINE)) { -runningEndPoints.add(endpoint); +return getOnlineEndpointsUUID().keySet(); + } + + @Override + public Map getOnlineEndpointsUUID() { Review comment: Looks like this is duplicate code, similar to that of the LocalClusterCoordinator. Can you move this function into a common place and use it in both places?
[jira] [Commented] (DRILL-7177) Format Plugin for Excel Files
[ https://issues.apache.org/jira/browse/DRILL-7177?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838760#comment-16838760 ] ASF GitHub Bot commented on DRILL-7177: --- kkhatua commented on issue #1749: DRILL-7177: Format Plugin for Excel Files URL: https://github.com/apache/drill/pull/1749#issuecomment-491924798 @cgivre This would be a nice add-on. Can you rebase your commit and resolve the conflicts? Also, are there limitations, such as versions, data type detection, etc? > Format Plugin for Excel Files > - > > Key: DRILL-7177 > URL: https://issues.apache.org/jira/browse/DRILL-7177 > Project: Apache Drill > Issue Type: Improvement >Affects Versions: 1.17.0 >Reporter: Charles Givre >Assignee: Charles Givre >Priority: Major > Labels: doc-impacting > Fix For: 1.17.0 > > > This pull request adds the functionality which enables Drill to query > Microsoft Excel files.
[jira] [Commented] (DRILL-7222) Visualize estimated and actual row counts for a query
[ https://issues.apache.org/jira/browse/DRILL-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838752#comment-16838752 ] ASF GitHub Bot commented on DRILL-7222: --- kkhatua commented on issue #1779: DRILL-7222: Visualize estimated and actual row counts for a query URL: https://github.com/apache/drill/pull/1779#issuecomment-491922589 @arina-ielchiieva the feature allows an advanced user to compare estimated row counts with the actual row counts for the query. Since the estimates are part of the logical plan text, but in scientific format, we need to parse them to represent them as whole numbers with proper formatting (i.e. using a comma as the thousands separator). I used a GitHub snippet since that had the most references and usage, rather than look for a (possible?) JavaScript library for this feature. @sohami could you take a look as well? > Visualize estimated and actual row counts for a query > - > > Key: DRILL-7222 > URL: https://issues.apache.org/jira/browse/DRILL-7222 > Project: Apache Drill > Issue Type: Improvement > Components: Web Server >Affects Versions: 1.16.0 >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Labels: doc-impacting, user-experience > Fix For: 1.17.0 > > > With statistics in place, it would be useful to have the *estimated* rowcount > alongside the *actual* rowcount in the query profile's operator overview. > We can extract this from the Physical Plan section of the profile.
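To illustrate the parsing being discussed: the plan text reports row-count estimates in scientific notation (e.g. `1.2345E7`), and the UI renders them as whole numbers with thousands separators for comparison against actual row counts. The actual patch does this in JavaScript; the following is only a hedged Java sketch of the same conversion (class and method names are hypothetical).

```java
import java.text.NumberFormat;
import java.util.Locale;

// Sketch (hypothetical names): convert a scientific-notation estimate
// from the logical plan text into a grouped whole number for display.
public class RowCountFormat {
  public static String humanReadable(String scientific) {
    // Double.parseDouble accepts both plain ("24.0") and scientific
    // ("1.2345E7") notation; round to the nearest whole row count.
    long rows = Math.round(Double.parseDouble(scientific));
    // getIntegerInstance applies grouping, giving a comma as the
    // thousands separator in the US locale.
    return NumberFormat.getIntegerInstance(Locale.US).format(rows);
  }
}
```

For example, `RowCountFormat.humanReadable("1.2345E7")` yields `"12,345,000"`.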
[jira] [Created] (DRILL-7255) Support nulls for all levels of nesting
Igor Guzenko created DRILL-7255: --- Summary: Support nulls for all levels of nesting Key: DRILL-7255 URL: https://issues.apache.org/jira/browse/DRILL-7255 Project: Apache Drill Issue Type: Sub-task Reporter: Igor Guzenko
[jira] [Assigned] (DRILL-7255) Support nulls for all levels of nesting
[ https://issues.apache.org/jira/browse/DRILL-7255?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Guzenko reassigned DRILL-7255: --- Assignee: Igor Guzenko > Support nulls for all levels of nesting > --- > > Key: DRILL-7255 > URL: https://issues.apache.org/jira/browse/DRILL-7255 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Igor Guzenko >Assignee: Igor Guzenko >Priority: Major >
[jira] [Created] (DRILL-7254) Read Hive union w/o nulls
Igor Guzenko created DRILL-7254: --- Summary: Read Hive union w/o nulls Key: DRILL-7254 URL: https://issues.apache.org/jira/browse/DRILL-7254 Project: Apache Drill Issue Type: Sub-task Reporter: Igor Guzenko
[jira] [Assigned] (DRILL-7252) Read Hive map using canonical Map vector
[ https://issues.apache.org/jira/browse/DRILL-7252?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Guzenko reassigned DRILL-7252: --- Assignee: Igor Guzenko > Read Hive map using canonical Map vector > - > > Key: DRILL-7252 > URL: https://issues.apache.org/jira/browse/DRILL-7252 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Igor Guzenko >Assignee: Igor Guzenko >Priority: Major >
[jira] [Assigned] (DRILL-7253) Read Hive struct w/o nulls
[ https://issues.apache.org/jira/browse/DRILL-7253?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Guzenko reassigned DRILL-7253: --- Assignee: Igor Guzenko > Read Hive struct w/o nulls > -- > > Key: DRILL-7253 > URL: https://issues.apache.org/jira/browse/DRILL-7253 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Igor Guzenko >Assignee: Igor Guzenko >Priority: Major >
[jira] [Assigned] (DRILL-7254) Read Hive union w/o nulls
[ https://issues.apache.org/jira/browse/DRILL-7254?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Guzenko reassigned DRILL-7254: --- Assignee: Igor Guzenko > Read Hive union w/o nulls > - > > Key: DRILL-7254 > URL: https://issues.apache.org/jira/browse/DRILL-7254 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Igor Guzenko >Assignee: Igor Guzenko >Priority: Major >
[jira] [Created] (DRILL-7253) Read Hive struct w/o nulls
Igor Guzenko created DRILL-7253: --- Summary: Read Hive struct w/o nulls Key: DRILL-7253 URL: https://issues.apache.org/jira/browse/DRILL-7253 Project: Apache Drill Issue Type: Sub-task Reporter: Igor Guzenko
[jira] [Created] (DRILL-7252) Read Hive map using canonical Map vector
Igor Guzenko created DRILL-7252: --- Summary: Read Hive map using canonical Map vector Key: DRILL-7252 URL: https://issues.apache.org/jira/browse/DRILL-7252 Project: Apache Drill Issue Type: Sub-task Reporter: Igor Guzenko
[jira] [Updated] (DRILL-7097) Rename MapVector to StructVector
[ https://issues.apache.org/jira/browse/DRILL-7097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Guzenko updated DRILL-7097: Issue Type: Sub-task (was: Improvement) Parent: DRILL-3290 > Rename MapVector to StructVector > > > Key: DRILL-7097 > URL: https://issues.apache.org/jira/browse/DRILL-7097 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Igor Guzenko >Assignee: Igor Guzenko >Priority: Major > > For a long time Drill's MapVector was actually more suitable for representing > Struct data. And in Apache Arrow it was actually renamed to StructVector. To > align our code with Arrow and give space for planned implementation of > canonical Map (DRILL-7096) we need to rename existing MapVector and all > related classes.
[jira] [Assigned] (DRILL-7251) Read Hive array w/o nulls
[ https://issues.apache.org/jira/browse/DRILL-7251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Guzenko reassigned DRILL-7251: --- Assignee: Igor Guzenko > Read Hive array w/o nulls > - > > Key: DRILL-7251 > URL: https://issues.apache.org/jira/browse/DRILL-7251 > Project: Apache Drill > Issue Type: Sub-task > Components: Storage - Hive >Reporter: Igor Guzenko >Assignee: Igor Guzenko >Priority: Major >
[jira] [Updated] (DRILL-7096) Develop vector for canonical Map
[ https://issues.apache.org/jira/browse/DRILL-7096?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Igor Guzenko updated DRILL-7096: Issue Type: Sub-task (was: Improvement) Parent: DRILL-3290 > Develop vector for canonical Map > - > > Key: DRILL-7096 > URL: https://issues.apache.org/jira/browse/DRILL-7096 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Igor Guzenko >Assignee: Bohdan Kazydub >Priority: Major > > Canonical Map datatype can be represented using a combination of three > value vectors: > keysVector - vector for storing the keys of each map > valuesVector - vector for storing the values of each map > offsetsVector - vector for storing the start index of each map > So it's not very hard to create such a Map vector, but there is a major issue > with this map representation: it's hard to search map values by key in such a > vector, so we need to investigate some advanced techniques to make such a search > efficient, or find other more suitable options to represent the map datatype in > the world of vectors. > After a question about maps, Apache Arrow developers responded that for Java > they don't have a real Map vector; for now they just have a logical Map type definition where they define Map like: List<Struct<key: key_type, value: value_type>>. So an implementation of such a value vector would be useful for > Arrow too.
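The three-vector layout described in DRILL-7096 above can be illustrated with plain arrays standing in for value vectors (all names here are hypothetical): keys and values for all maps are stored back to back, and the offsets array marks where each map begins, with one extra trailing entry marking the end. The sketch also shows why key lookup is a linear scan within each map, the drawback the issue calls out.

```java
// Toy sketch of the canonical Map layout: keysVector, valuesVector, and
// offsetsVector flattened into plain Java arrays (hypothetical names).
public class MapVectorSketch {
  final String[] keys;   // keys of all maps, concatenated
  final int[] values;    // values of all maps, concatenated
  final int[] offsets;   // length = mapCount + 1; offsets[i]..offsets[i+1] is map i

  MapVectorSketch(String[] keys, int[] values, int[] offsets) {
    this.keys = keys;
    this.values = values;
    this.offsets = offsets;
  }

  // Look up a key in the mapIndex-th map. This is the linear scan that
  // makes search-by-key expensive in this representation.
  Integer get(int mapIndex, String key) {
    for (int j = offsets[mapIndex]; j < offsets[mapIndex + 1]; j++) {
      if (keys[j].equals(key)) {
        return values[j];
      }
    }
    return null; // key absent from this map
  }
}
```

For example, `offsets = {0, 2, 3}` encodes two maps: entries 0-1 belong to the first map and entry 2 to the second.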
[jira] [Created] (DRILL-7251) Read Hive array w/o nulls
Igor Guzenko created DRILL-7251: --- Summary: Read Hive array w/o nulls Key: DRILL-7251 URL: https://issues.apache.org/jira/browse/DRILL-7251 Project: Apache Drill Issue Type: Sub-task Components: Storage - Hive Reporter: Igor Guzenko
[jira] [Commented] (DRILL-7222) Visualize estimated and actual row counts for a query
[ https://issues.apache.org/jira/browse/DRILL-7222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16838537#comment-16838537 ] ASF GitHub Bot commented on DRILL-7222: --- arina-ielchiieva commented on issue #1779: DRILL-7222: Visualize estimated and actual row counts for a query URL: https://github.com/apache/drill/pull/1779#issuecomment-49182 @kkhatua could you please request someone to review the JS part? I am not a JS expert, but the changes do not look good to me. I am not sure why we need such advanced number parsing, so feedback from someone who knows what is going on will be very helpful.
[jira] [Updated] (DRILL-7206) Tuning hash join code using primitive int
[ https://issues.apache.org/jira/browse/DRILL-7206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-7206: Reviewer: Boaz Ben-Zvi > Tuning hash join code using primitive int > - > > Key: DRILL-7206 > URL: https://issues.apache.org/jira/browse/DRILL-7206 > Project: Apache Drill > Issue Type: Improvement >Reporter: weijie.tong >Assignee: weijie.tong >Priority: Major > Labels: ready-to-commit > Fix For: 1.17.0 > > > This issue tries to tune the HashJoin implementation code. For right or full > join types, the HashJoinProbe will create a List member to record > the unmatched composed indexes. It can be replaced by a primitive > IntArrayList without box/unbox operations, which also benefits the GC when there are > more unmatched items.
[jira] [Updated] (DRILL-7206) Tuning hash join code using primitive int
[ https://issues.apache.org/jira/browse/DRILL-7206?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-7206: Labels: ready-to-commit (was: ) > Tuning hash join code using primitive int > - > > Key: DRILL-7206 > URL: https://issues.apache.org/jira/browse/DRILL-7206 > Project: Apache Drill > Issue Type: Improvement >Reporter: weijie.tong >Assignee: weijie.tong >Priority: Major > Labels: ready-to-commit > Fix For: 1.17.0 > > > This issue tries to tune the HashJoin implementation code. For right or full > join types, the HashJoinProbe will create a List member to record > the unmatched composed indexes. It can be replaced by a primitive > IntArrayList without box/unbox operations, which also benefits the GC when there are > more unmatched items.
[jira] [Created] (DRILL-7250) Query with CTE fails when its name matches to the table name without access
Volodymyr Vysotskyi created DRILL-7250: -- Summary: Query with CTE fails when its name matches to the table name without access Key: DRILL-7250 URL: https://issues.apache.org/jira/browse/DRILL-7250 Project: Apache Drill Issue Type: Bug Affects Versions: 1.16.0 Reporter: Volodymyr Vysotskyi Assignee: Volodymyr Vysotskyi Fix For: 1.17.0 When impersonation is enabled and, for example, we have a {{lineitem}} table with permissions {{750}} which is owned by {{user0_1:group0_1}}, so {{user2_1}} doesn't have access to it, the following query: {code:sql} use mini_dfs_plugin.user0_1; with lineitem as (SELECT 1 as a) select * from lineitem {code} submitted from {{user2_1}} fails with the following error: {noformat} java.lang.Exception: org.apache.hadoop.security.AccessControlException: Permission denied: user=user2_1, access=READ_EXECUTE, inode="/user/user0_1/lineitem":user0_1:group0_1:drwxr-x--- at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:317) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:229) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:199) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1752) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1736) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPathAccess(FSDirectory.java:1710) at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getListingInt(FSDirStatAndListingOp.java:70) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4432) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:999) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:646) at
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2213) at ...(:0) ~[na:na] at org.apache.drill.exec.util.FileSystemUtil.listRecursive(FileSystemUtil.java:253) ~[classes/:na] at org.apache.drill.exec.util.FileSystemUtil.list(FileSystemUtil.java:208) ~[classes/:na] at org.apache.drill.exec.util.FileSystemUtil.listFiles(FileSystemUtil.java:104) ~[classes/:na] at org.apache.drill.exec.util.DrillFileSystemUtil.listFiles(DrillFileSystemUtil.java:86) ~[classes/:na] at org.apache.drill.exec.store.dfs.FileSelection.minusDirectories(FileSelection.java:178) ~[classes/:na] at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.detectEmptySelection(WorkspaceSchemaFactory.java:669) ~[classes/:na] at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:633) ~[classes/:na] at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:283) ~[classes/:na] at org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.getNewEntry(ExpandingConcurrentMap.java:96) ~[classes/:na] at org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.get(ExpandingConcurrentMap.java:90) ~[classes/:na] at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.getTable(WorkspaceSchemaFactory.java:439) ~[classes/:na] at 
org.apache.calcite.jdbc.SimpleCalciteSchema.getImplicitTable(SimpleCalciteSchema.java:83) ~[calcite-core-1.18.0-drill-r1.jar:1.18.0-drill-r1] at org.apache.calcite.jdbc.CalciteSchema.getTable(CalciteSchema.java:286) ~[calcite-core-1.18.0-drill-r1.jar:1.18.0-drill-r1] at org.apache.calcite.sql.validate.SqlValidatorUtil.getTableEntryFrom(SqlValidatorUtil.java:1046) ~[calcite-core-1.18.0-drill-r1.jar:1.18.0-drill-r1] at org.apache.calcite.sql.validate.SqlValidatorUtil.getTableEntry(SqlValidatorUtil.java:1003) ~[calcite-core-1.18.0-drill-r1.jar:1.18.0-drill-r1] at
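The expectation behind the report is that a CTE defined in the query shadows any schema table of the same name, so resolution should hit the CTE scope before ever touching the filesystem. A minimal sketch of that lookup order; the resolver shape and names are hypothetical illustrations, not Drill's actual planner code:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class CteLookupSketch {

  // Hypothetical resolver: CTE names declared in the query shadow schema
  // tables, so a CTE called "lineitem" is resolved from the CTE scope and
  // the underlying directory is never listed (no permission check needed).
  static Optional<String> resolve(Map<String, String> cteScope,
                                  Map<String, String> schemaTables,
                                  String name) {
    if (cteScope.containsKey(name)) {
      return Optional.of("cte:" + cteScope.get(name));        // CTE wins
    }
    return Optional.ofNullable(schemaTables.get(name))        // fall back to schema
                   .map(t -> "table:" + t);
  }

  public static void main(String[] args) {
    Map<String, String> ctes = new HashMap<>();
    ctes.put("lineitem", "SELECT 1 as a");
    Map<String, String> tables = new HashMap<>();
    tables.put("lineitem", "/user/user0_1/lineitem");  // inaccessible to user2_1

    // The CTE shadows the table, so the query should succeed for any user.
    System.out.println(resolve(ctes, tables, "lineitem").get());
  }
}
```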
[jira] [Updated] (DRILL-4782) TO_TIME function cannot separate time from date time string
[ https://issues.apache.org/jira/browse/DRILL-4782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Igor Guzenko updated DRILL-4782:
Labels: ready-to-commit (was: )

> TO_TIME function cannot separate time from date time string
> -----------------------------------------------------------
>
> Key: DRILL-4782
> URL: https://issues.apache.org/jira/browse/DRILL-4782
> Project: Apache Drill
> Issue Type: Improvement
> Components: Server
> Affects Versions: 1.6.0, 1.7.0
> Environment: CentOS 7
> Reporter: Matt Keranen
> Assignee: Dmytriy Grinchenko
> Priority: Minor
> Labels: ready-to-commit
> Fix For: 1.17.0
>
> TO_TIME('2016-03-03 00:00', 'yyyy-MM-dd HH:mm') returns "05:14:46.656" instead of the expected "00:00:00"
> Adding an additional split does work as expected: TO_TIME(SPLIT('2016-03-03 00:00', ' ')[1], 'HH:mm')

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
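For reference, the behaviour the reporter expects can be reproduced with plain java.time: parse the full string with the date-time pattern, then keep only the time component. This is a sketch of the expected semantics, not Drill's actual TO_TIME implementation:

```java
import java.time.LocalDateTime;
import java.time.LocalTime;
import java.time.format.DateTimeFormatter;

public class ToTimeSketch {

  // Sketch of the expected TO_TIME semantics: parse the whole date-time
  // string with the given pattern, then discard the date part.
  static LocalTime toTime(String value, String pattern) {
    return LocalDateTime.parse(value, DateTimeFormatter.ofPattern(pattern))
                        .toLocalTime();
  }

  public static void main(String[] args) {
    // Prints 00:00 -- the time portion the reporter expected.
    System.out.println(toTime("2016-03-03 00:00", "yyyy-MM-dd HH:mm"));
  }
}
```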
[jira] [Updated] (DRILL-7196) Queries are still runnable on disabled plugins
[ https://issues.apache.org/jira/browse/DRILL-7196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmytriy Grinchenko updated DRILL-7196:
--------------------------------------
Description:
The issue was partially addressed with the DRILL-6732 task. However, when issuing a select request with the full path to the table, the select is still able to process it regardless of the plugin state:
{code}
0: jdbc:drill:zk=local> select * from `mongo.my_binaries`.`master.files`;
0: jdbc:drill:zk=local> select md5, chunkSize from `mongo.ambari_binaries`.`master.files`;
+----------------------------------+-----------+
| md5                              | chunkSize |
+----------------------------------+-----------+
| 4e30dc965bb8aa064af4f2a9608fa7ae | 261120    |
| b223e44347e63fcae296e8432755e07c | 261120    |
| 2720c5e10e739142d1b28996349a1d7e | 261120    |
+----------------------------------+-----------+
0: jdbc:drill:zk=local> show schemas;
+--------------------+
| SCHEMA_NAME        |
+--------------------+
| cp.default         |
| dfs.default        |
| dfs.root           |
| dfs.test           |
| dfs.tmp            |
| hive.default       |
| information_schema |
| sys                |
+--------------------+
8 rows selected (0.127 seconds)
{code}

was:
The issue was partially addressed with the DRILL-6732 task. However, when issuing a select request with the full path to the table, the select is still able to process it regardless of the plugin state:
{code}
0: jdbc:drill:zk=local> select * from `mongo.my_binaries`.`master.files`;
+-----+----------------------------------+-----------+-----------+-------------------------+
| _id | md5                              | chunkSize | length    | uploadDate              |
+-----+----------------------------------+-----------+-----------+-------------------------+
|     | 4e30dc965bb8aa064af4f2a9608fa7ae | 261120    | 291830238 | 2019-04-24 07:16:22.355 |
|     | b223e44347e63fcae296e8432755e07c | 261120    | 649511    | 2019-05-08 06:59:33.618 |
|     | 2720c5e10e739142d1b28996349a1d7e | 261120    | 395283140 | 2019-05-10 08:23:27.041 |
+-----+----------------------------------+-----------+-----------+-------------------------+
0: jdbc:drill:zk=local> show schemas;
+--------------------+
| SCHEMA_NAME        |
+--------------------+
| cp.default         |
| dfs.default        |
| dfs.root           |
| dfs.test           |
| dfs.tmp            |
| hive.default       |
| information_schema |
| sys                |
+--------------------+
8 rows selected (0.127 seconds)
{code}

> Queries are still runnable on disabled plugins
> ----------------------------------------------
>
> Key: DRILL-7196
> URL: https://issues.apache.org/jira/browse/DRILL-7196
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning Optimization
> Affects Versions: 1.12.0
> Reporter: Dmytriy Grinchenko
> Assignee: Dmytriy Grinchenko
> Priority: Major
> Fix For: 1.17.0
>
> The issue was partially addressed with the DRILL-6732 task. However, when issuing a select request with the full path to the table, the select is still able to process it regardless of the plugin state:
> {code}
> 0: jdbc:drill:zk=local> select * from `mongo.my_binaries`.`master.files`;
> 0: jdbc:drill:zk=local> select md5, chunkSize from `mongo.ambari_binaries`.`master.files`;
> +----------------------------------+-----------+
> | md5                              | chunkSize |
> +----------------------------------+-----------+
> | 4e30dc965bb8aa064af4f2a9608fa7ae | 261120    |
> | b223e44347e63fcae296e8432755e07c | 261120    |
> | 2720c5e10e739142d1b28996349a1d7e | 261120    |
> +----------------------------------+-----------+
> 0: jdbc:drill:zk=local> show schemas;
> +--------------------+
> | SCHEMA_NAME        |
> +--------------------+
> | cp.default         |
> | dfs.default        |
> | dfs.root           |
> | dfs.test           |
> | dfs.tmp            |
> | hive.default       |
> | information_schema |
> | sys                |
> +--------------------+
> 8 rows selected (0.127 seconds)
> {code}
[jira] [Updated] (DRILL-7196) Queries are still runnable on disabled plugins
[ https://issues.apache.org/jira/browse/DRILL-7196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmytriy Grinchenko updated DRILL-7196:
--------------------------------------
Description:
The issue was partially addressed with the DRILL-6732 task. However, when issuing a select request with the full path to the table, the select is still able to process it regardless of the plugin state:
{code}
0: jdbc:drill:zk=local> select * from `mongo.my_binaries`.`master.files`;
+----------------------------------+-----------+
| md5                              | chunkSize |
+----------------------------------+-----------+
| 4e30dc965bb8aa064af4f2a9608fa7ae | 261120    |
| b223e44347e63fcae296e8432755e07c | 261120    |
| 2720c5e10e739142d1b28996349a1d7e | 261120    |
+----------------------------------+-----------+
0: jdbc:drill:zk=local> show schemas;
+--------------------+
| SCHEMA_NAME        |
+--------------------+
| cp.default         |
| dfs.default        |
| dfs.root           |
| dfs.test           |
| dfs.tmp            |
| hive.default       |
| information_schema |
| sys                |
+--------------------+
8 rows selected (0.127 seconds)
{code}

was:
The issue was partially addressed with the DRILL-6732 task. However, when issuing a select request with the full path to the table, the select is still able to process it regardless of the plugin state:
{code}
0: jdbc:drill:zk=local> select * from `mongo.my_binaries`.`master.files`;
0: jdbc:drill:zk=local> select md5, chunkSize from `mongo.ambari_binaries`.`master.files`;
+----------------------------------+-----------+
| md5                              | chunkSize |
+----------------------------------+-----------+
| 4e30dc965bb8aa064af4f2a9608fa7ae | 261120    |
| b223e44347e63fcae296e8432755e07c | 261120    |
| 2720c5e10e739142d1b28996349a1d7e | 261120    |
+----------------------------------+-----------+
0: jdbc:drill:zk=local> show schemas;
+--------------------+
| SCHEMA_NAME        |
+--------------------+
| cp.default         |
| dfs.default        |
| dfs.root           |
| dfs.test           |
| dfs.tmp            |
| hive.default       |
| information_schema |
| sys                |
+--------------------+
8 rows selected (0.127 seconds)
{code}

> Queries are still runnable on disabled plugins
> ----------------------------------------------
>
> Key: DRILL-7196
> URL: https://issues.apache.org/jira/browse/DRILL-7196
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning Optimization
> Affects Versions: 1.12.0
> Reporter: Dmytriy Grinchenko
> Assignee: Dmytriy Grinchenko
> Priority: Major
> Fix For: 1.17.0
>
> The issue was partially addressed with the DRILL-6732 task. However, when issuing a select request with the full path to the table, the select is still able to process it regardless of the plugin state:
> {code}
> 0: jdbc:drill:zk=local> select * from `mongo.my_binaries`.`master.files`;
> +----------------------------------+-----------+
> | md5                              | chunkSize |
> +----------------------------------+-----------+
> | 4e30dc965bb8aa064af4f2a9608fa7ae | 261120    |
> | b223e44347e63fcae296e8432755e07c | 261120    |
> | 2720c5e10e739142d1b28996349a1d7e | 261120    |
> +----------------------------------+-----------+
> 0: jdbc:drill:zk=local> show schemas;
> +--------------------+
> | SCHEMA_NAME        |
> +--------------------+
> | cp.default         |
> | dfs.default        |
> | dfs.root           |
> | dfs.test           |
> | dfs.tmp            |
> | hive.default       |
> | information_schema |
> | sys                |
> +--------------------+
> 8 rows selected (0.127 seconds)
> {code}
[jira] [Updated] (DRILL-7196) Queries are still runnable on disabled plugins
[ https://issues.apache.org/jira/browse/DRILL-7196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmytriy Grinchenko updated DRILL-7196:
--------------------------------------
Description:
The issue was partially addressed with the DRILL-6732 task. However, when issuing a select request with the full path to the table, the select is still able to process it regardless of the plugin state:
{code}
0: jdbc:drill:zk=local> select * from `mongo.my_binaries`.`master.files`;
+-----+----------------------------------+-----------+-----------+-------------------------+
| _id | md5                              | chunkSize | length    | uploadDate              |
+-----+----------------------------------+-----------+-----------+-------------------------+
|     | 4e30dc965bb8aa064af4f2a9608fa7ae | 261120    | 291830238 | 2019-04-24 07:16:22.355 |
|     | b223e44347e63fcae296e8432755e07c | 261120    | 649511    | 2019-05-08 06:59:33.618 |
|     | 2720c5e10e739142d1b28996349a1d7e | 261120    | 395283140 | 2019-05-10 08:23:27.041 |
+-----+----------------------------------+-----------+-----------+-------------------------+
0: jdbc:drill:zk=local> show schemas;
+--------------------+
| SCHEMA_NAME        |
+--------------------+
| cp.default         |
| dfs.default        |
| dfs.root           |
| dfs.test           |
| dfs.tmp            |
| hive.default       |
| information_schema |
| sys                |
+--------------------+
8 rows selected (0.127 seconds)
{code}

was:
Currently the Drillbit returns an exception like the one below:
{code}
Error: SYSTEM ERROR: AssertionError: Rule's description should be unique; existing rule=TestRuleName; new rule=TestRuleName
{code}
This error message is not representative of the real failure cause. Instead, we need to provide a clearer exception, for example:
{code}
"Unable to execute query using disabled storage plugin."
{code}

> Queries are still runnable on disabled plugins
> ----------------------------------------------
>
> Key: DRILL-7196
> URL: https://issues.apache.org/jira/browse/DRILL-7196
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning Optimization
> Affects Versions: 1.12.0
> Reporter: Dmytriy Grinchenko
> Assignee: Dmytriy Grinchenko
> Priority: Major
> Fix For: 1.17.0
>
> The issue was partially addressed with the DRILL-6732 task. However, when issuing a select request with the full path to the table, the select is still able to process it regardless of the plugin state:
> {code}
> 0: jdbc:drill:zk=local> select * from `mongo.my_binaries`.`master.files`;
> +-----+----------------------------------+-----------+-----------+-------------------------+
> | _id | md5                              | chunkSize | length    | uploadDate              |
> +-----+----------------------------------+-----------+-----------+-------------------------+
> |     | 4e30dc965bb8aa064af4f2a9608fa7ae | 261120    | 291830238 | 2019-04-24 07:16:22.355 |
> |     | b223e44347e63fcae296e8432755e07c | 261120    | 649511    | 2019-05-08 06:59:33.618 |
> |     | 2720c5e10e739142d1b28996349a1d7e | 261120    | 395283140 | 2019-05-10 08:23:27.041 |
> +-----+----------------------------------+-----------+-----------+-------------------------+
> 0: jdbc:drill:zk=local> show schemas;
> +--------------------+
> | SCHEMA_NAME        |
> +--------------------+
> | cp.default         |
> | dfs.default        |
> | dfs.root           |
> | dfs.test           |
> | dfs.tmp            |
> | hive.default       |
> | information_schema |
> | sys                |
> +--------------------+
> 8 rows selected (0.127 seconds)
> {code}
[jira] [Updated] (DRILL-7196) Queries are still runnable on disabled plugins
[ https://issues.apache.org/jira/browse/DRILL-7196?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dmytriy Grinchenko updated DRILL-7196:
--------------------------------------
Summary: Queries are still runnable on disabled plugins (was: Queries are still runable when plugin )

> Queries are still runnable on disabled plugins
> ----------------------------------------------
>
> Key: DRILL-7196
> URL: https://issues.apache.org/jira/browse/DRILL-7196
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning Optimization
> Affects Versions: 1.12.0
> Reporter: Dmytriy Grinchenko
> Assignee: Dmytriy Grinchenko
> Priority: Major
> Fix For: 1.17.0
>
> Currently the Drillbit returns an exception like the one below:
> {code}
> Error: SYSTEM ERROR: AssertionError: Rule's description should be unique; existing rule=TestRuleName; new rule=TestRuleName
> {code}
> This error message is not representative of the real failure cause. Instead, we need to provide a clearer exception, for example:
> {code}
> "Unable to execute query using disabled storage plugin."
> {code}
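The fix the ticket asks for amounts to an enabled-state check during plugin lookup, applied even when the table is addressed by its full path. A hypothetical sketch of such a guard; the registry shape and method names are illustrative, only the quoted error text comes from the ticket:

```java
import java.util.HashMap;
import java.util.Map;

public class DisabledPluginGuard {

  // Hypothetical stand-in for the storage plugin registry:
  // plugin name -> enabled flag.
  static final Map<String, Boolean> PLUGINS = new HashMap<>();

  // Every schema/table lookup -- including fully qualified paths like
  // `mongo.my_binaries`.`master.files` -- should pass through this check.
  static void requireEnabled(String plugin) {
    if (!PLUGINS.getOrDefault(plugin, false)) {
      throw new IllegalStateException(
          "Unable to execute query using disabled storage plugin: " + plugin);
    }
  }

  public static void main(String[] args) {
    PLUGINS.put("dfs", true);
    PLUGINS.put("mongo", false);  // disabled, yet reachable by full path today

    requireEnabled("dfs");        // passes
    try {
      requireEnabled("mongo");    // should fail with a clear message
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```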