[jira] [Commented] (DRILL-4430) Unable to execute drill jdbc from jboss container
[ https://issues.apache.org/jira/browse/DRILL-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166769#comment-15166769 ] abhishek agrawal commented on DRILL-4430:
-
20:30:25,430 INFO [org.apache.drill.common.config.DrillConfig] (http--0.0.0.0-8080-1) Configuration and plugin file(s) identified in 78ms.
Base Configuration:
- vfs:/content/Services.war/WEB-INF/lib/drill-common-1.5.0.jar/drill-default.conf
Intermediate Configuration and Plugin files, in order of precedence:
- vfs:/content/Services.war/WEB-INF/lib/drill-common-1.5.0.jar/drill-module.conf
- vfs:/content/Services.war/WEB-INF/lib/drill-memory-base-1.5.0.jar/drill-module.conf
- vfs:/content/Services.war/WEB-INF/lib/drill-logical-1.5.0.jar/drill-module.conf
- vfs:/content/Services.war/WEB-INF/lib/drill-java-exec-1.5.0.jar/drill-module.conf
20:30:25,452 INFO [org.apache.drill.common.config.DrillConfig] (http--0.0.0.0-8080-1) Configuration and plugin file(s) identified in 9ms.
Base Configuration:
- vfs:/content/Services.war/WEB-INF/lib/drill-common-1.5.0.jar/drill-default.conf
Intermediate Configuration and Plugin files, in order of precedence:
- vfs:/content/Services.war/WEB-INF/lib/drill-common-1.5.0.jar/drill-module.conf
- vfs:/content/Services.war/WEB-INF/lib/drill-memory-base-1.5.0.jar/drill-module.conf
- vfs:/content/Services.war/WEB-INF/lib/drill-logical-1.5.0.jar/drill-module.conf
- vfs:/content/Services.war/WEB-INF/lib/drill-java-exec-1.5.0.jar/drill-module.conf
20:30:25,585 INFO [org.apache.drill.common.config.DrillConfig] (http--0.0.0.0-8080-1) Configuration and plugin file(s) identified in 13ms.
Base Configuration:
- vfs:/content/Services.war/WEB-INF/lib/drill-common-1.5.0.jar/drill-default.conf
Intermediate Configuration and Plugin files, in order of precedence:
- vfs:/content/Services.war/WEB-INF/lib/drill-common-1.5.0.jar/drill-module.conf
- vfs:/content/Services.war/WEB-INF/lib/drill-memory-base-1.5.0.jar/drill-module.conf
- vfs:/content/Services.war/WEB-INF/lib/drill-logical-1.5.0.jar/drill-module.conf
- vfs:/content/Services.war/WEB-INF/lib/drill-java-exec-1.5.0.jar/drill-module.conf
20:30:25,633 INFO [org.apache.curator.framework.imps.CuratorFrameworkImpl] (http--0.0.0.0-8080-1) Starting
20:30:25,641 INFO [org.apache.zookeeper.ZooKeeper] (http--0.0.0.0-8080-1) Client environment:zookeeper.version=3.4.6-1569965, built on 02/20/2014 09:09 GMT
20:30:25,641 INFO [org.apache.zookeeper.ZooKeeper] (http--0.0.0.0-8080-1) Client environment:host.name=INBBRDSSVM265
20:30:25,641 INFO [org.apache.zookeeper.ZooKeeper] (http--0.0.0.0-8080-1) Client environment:java.version=1.7.0_91
20:30:25,642 INFO [org.apache.zookeeper.ZooKeeper] (http--0.0.0.0-8080-1) Client environment:java.vendor=Oracle Corporation
20:30:25,642 INFO [org.apache.zookeeper.ZooKeeper] (http--0.0.0.0-8080-1) Client environment:java.home=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.91.x86_64/jre
20:30:25,643 INFO [org.apache.zookeeper.ZooKeeper] (http--0.0.0.0-8080-1) Client environment:java.class.path=/u01/CIIRetail/retailuser/installedSoftware/jboss-as-7.1.1.Final/jboss-modules.jar
20:30:25,643 INFO [org.apache.zookeeper.ZooKeeper] (http--0.0.0.0-8080-1) Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
20:30:25,644 INFO [org.apache.zookeeper.ZooKeeper] (http--0.0.0.0-8080-1) Client environment:java.io.tmpdir=/tmp
20:30:25,644 INFO [org.apache.zookeeper.ZooKeeper] (http--0.0.0.0-8080-1) Client environment:java.compiler=
20:30:25,644 INFO [org.apache.zookeeper.ZooKeeper] (http--0.0.0.0-8080-1) Client environment:os.name=Linux
20:30:25,645 INFO [org.apache.zookeeper.ZooKeeper] (http--0.0.0.0-8080-1) Client environment:os.arch=amd64
20:30:25,645 INFO [org.apache.zookeeper.ZooKeeper] (http--0.0.0.0-8080-1) Client environment:os.version=2.6.32-573.8.1.el6.x86_64
20:30:25,645 INFO [org.apache.zookeeper.ZooKeeper] (http--0.0.0.0-8080-1) Client environment:user.name=retailuser
20:30:25,646 INFO [org.apache.zookeeper.ZooKeeper] (http--0.0.0.0-8080-1) Client environment:user.home=/u01/CIIRetail/retailuser
20:30:25,647 INFO [org.apache.zookeeper.ZooKeeper] (http--0.0.0.0-8080-1) Client environment:user.dir=/u01/CIIRetail/retailuser/installedSoftware/jboss-as-7.1.1.Final/bin
20:30:25,648 INFO [org.apache.zookeeper.ZooKeeper] (http--0.0.0.0-8080-1) Initiating client connection, connectString=172.18.129.2:2181 sessionTimeout=6 watcher=org.apache.curator.ConnectionState@42d1a0c9
20:30:25,690 WARN [org.apache.zookeeper.client.ZooKeeperSaslClient] (http--0.0.0.0-8080-1-SendThread(172.18.129.2:2181)) Could not login: the client is being asked for a password, but the Zookeeper client code does not currently support obtaining a password from the user. Make sure that the client is
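The WARN at the end of the log is the ZooKeeper client attempting SASL authentication. On an unsecured cluster that attempt can be suppressed with a standard ZooKeeper client system property; this is a generic ZooKeeper-level workaround sketch, not a Drill setting:

```java
// Disable the ZooKeeper client's SASL login attempt. "zookeeper.sasl.client"
// is a standard ZooKeeper client property (default "true"); setting it to
// "false" avoids the "Could not login" warning on unsecured clusters.
// This must run before the ZooKeeper/Curator client is created.
public class DisableZkSasl {
    public static void apply() {
        System.setProperty("zookeeper.sasl.client", "false");
    }
}
```

In a container such as JBoss, the same property can instead be passed on the JVM command line so it takes effect before any deployed code runs.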
[jira] [Resolved] (DRILL-4394) Can’t build the custom functions for Apache Drill 1.5.0
[ https://issues.apache.org/jira/browse/DRILL-4394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Altekruse resolved DRILL-4394. Resolution: Fixed Assignee: Jason Altekruse > Can’t build the custom functions for Apache Drill 1.5.0 > --- > > Key: DRILL-4394 > URL: https://issues.apache.org/jira/browse/DRILL-4394 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.5.0 >Reporter: Kumiko Yada >Assignee: Jason Altekruse >Priority: Critical > > I tried to build the custom functions for Drill 1.5.0, but I got the below > error: > Failure to find org.apache.drill.exec:drill-java-exec:jar:1.5.0 in > http://repo.maven.apache.org/maven2 was cached in the local repository. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-4435) Add YARN jar required for running Drill on cluster with Kerberos
Jason Altekruse created DRILL-4435: -- Summary: Add YARN jar required for running Drill on cluster with Kerberos Key: DRILL-4435 URL: https://issues.apache.org/jira/browse/DRILL-4435 Project: Apache Drill Issue Type: Improvement Reporter: Jason Altekruse Assignee: Jason Altekruse As described here, Drill currently requires adding a YARN jar to the classpath to run on Kerberos. If it doesn't conflict with any jars currently included with Drill we should just include this in the distribution to make this work out of the box. http://www.dremio.com/blog/securing-sql-on-hadoop-part-2-installing-and-configuring-drill/ -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3584) Drill Kerberos HDFS Support / Documentation
[ https://issues.apache.org/jira/browse/DRILL-3584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166597#comment-15166597 ] Jason Altekruse commented on DRILL-3584: It looks like actually getting Drill working on this setup requires including a YARN jar; we should just include this with Drill by default so this works out of the box. > Drill Kerberos HDFS Support / Documentation > --- > > Key: DRILL-3584 > URL: https://issues.apache.org/jira/browse/DRILL-3584 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.1.0 >Reporter: Hari Sekhon >Assignee: Jacques Nadeau >Priority: Blocker > > I'm trying to find Drill docs for Kerberos support for secure HDFS clusters > and it doesn't appear to be well tested / supported / documented yet. > This product is Dead-on-Arrival if it doesn't integrate well with secure > Hadoop clusters, specifically HDFS + Kerberos (plus obviously secure > kerberized Hive/HCatalog etc.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3584) Drill Kerberos HDFS Support / Documentation
[ https://issues.apache.org/jira/browse/DRILL-3584?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166595#comment-15166595 ] Jason Altekruse commented on DRILL-3584: [~ngriffith] wrote a nice pair of blogs on this; we might want to put some of the information here in the Drill docs as well. http://www.dremio.com/blog/securing-sql-on-hadoop-part-1-installing-cdh-and-kerberos/ http://www.dremio.com/blog/securing-sql-on-hadoop-part-2-installing-and-configuring-drill/ > Drill Kerberos HDFS Support / Documentation > --- > > Key: DRILL-3584 > URL: https://issues.apache.org/jira/browse/DRILL-3584 > Project: Apache Drill > Issue Type: New Feature >Affects Versions: 1.1.0 >Reporter: Hari Sekhon >Assignee: Jacques Nadeau >Priority: Blocker > > I'm trying to find Drill docs for Kerberos support for secure HDFS clusters > and it doesn't appear to be well tested / supported / documented yet. > This product is Dead-on-Arrival if it doesn't integrate well with secure > Hadoop clusters, specifically HDFS + Kerberos (plus obviously secure > kerberized Hive/HCatalog etc.) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (DRILL-3229) Create a new EmbeddedVector
[ https://issues.apache.org/jira/browse/DRILL-3229?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Altekruse resolved DRILL-3229. Resolution: Fixed Fix Version/s: (was: Future) 1.4.0 > Create a new EmbeddedVector > --- > > Key: DRILL-3229 > URL: https://issues.apache.org/jira/browse/DRILL-3229 > Project: Apache Drill > Issue Type: Sub-task > Components: Execution - Codegen, Execution - Data Types, Execution - > Relational Operators, Functions - Drill >Reporter: Jacques Nadeau >Assignee: Steven Phillips > Fix For: 1.4.0 > > > Embedded Vector will leverage a binary encoding for holding information about > type for each individual field. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (DRILL-284) Publish artifacts to maven for Drill
[ https://issues.apache.org/jira/browse/DRILL-284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jason Altekruse resolved DRILL-284. --- Resolution: Fixed Fix Version/s: (was: Future) 1.1.0 > Publish artifacts to maven for Drill > > > Key: DRILL-284 > URL: https://issues.apache.org/jira/browse/DRILL-284 > Project: Apache Drill > Issue Type: Task >Reporter: Timothy Chen > Fix For: 1.1.0 > > > We need to publish our artifacts and version to maven so other dependencies > (Whirr, or other ones that wants maven include) can use. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4434) Remove (or deprecate) GroupScan.enforceWidth and use GroupScan.getMinParallelization
[ https://issues.apache.org/jira/browse/DRILL-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166411#comment-15166411 ] ASF GitHub Bot commented on DRILL-4434: --- Github user sudheeshkatkam commented on the pull request: https://github.com/apache/drill/pull/390#issuecomment-188532721 Sounds good. > Remove (or deprecate) GroupScan.enforceWidth and use > GroupScan.getMinParallelization > > > Key: DRILL-4434 > URL: https://issues.apache.org/jira/browse/DRILL-4434 > Project: Apache Drill > Issue Type: Bug >Reporter: Venki Korukanti > > It seems enforceWidth, which is used only in > ExcessibleExchangeRemover, is not necessary. Instead we should rely on > GroupScan.getMinParallelization(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4434) Remove (or deprecate) GroupScan.enforceWidth and use GroupScan.getMinParallelization
[ https://issues.apache.org/jira/browse/DRILL-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166401#comment-15166401 ] ASF GitHub Bot commented on DRILL-4434: --- Github user vkorukanti commented on the pull request: https://github.com/apache/drill/pull/390#issuecomment-188531968 Not sure of the policy around changing public APIs, but I think it is better to keep the method around to avoid breaking existing storage plugins until the next major version release (2.x) > Remove (or deprecate) GroupScan.enforceWidth and use > GroupScan.getMinParallelization > > > Key: DRILL-4434 > URL: https://issues.apache.org/jira/browse/DRILL-4434 > Project: Apache Drill > Issue Type: Bug >Reporter: Venki Korukanti > > It seems enforceWidth, which is used only in > ExcessibleExchangeRemover, is not necessary. Instead we should rely on > GroupScan.getMinParallelization(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
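The compatibility option discussed in the comment above (keep the old method around until 2.x) can be sketched with the standard deprecation pattern. The interface below is a hypothetical stand-in, not Drill's actual GroupScan API:

```java
// Hypothetical sketch: retain the old enforceWidth() as a @Deprecated default
// method that delegates to the new minimum-parallelization API, so existing
// storage plugins keep compiling until the 2.x release removes it.
public class WidthApi {
    public interface GroupScan {
        /** New API: minimum number of fragments this scan must run in. */
        int getMinParallelizationWidth();

        /** Old API, retained for source compatibility; remove in 2.x. */
        @Deprecated
        default boolean enforceWidth() {
            return getMinParallelizationWidth() > 1;
        }
    }
}
```

A default method keeps the change binary-compatible for plugins that only call `enforceWidth()`, while new planner code can move to the width-based API.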
[jira] [Commented] (DRILL-4434) Remove (or deprecate) GroupScan.enforceWidth and use GroupScan.getMinParallelization
[ https://issues.apache.org/jira/browse/DRILL-4434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15166387#comment-15166387 ] ASF GitHub Bot commented on DRILL-4434: --- Github user sudheeshkatkam commented on the pull request: https://github.com/apache/drill/pull/390#issuecomment-188528781 +1 Can we remove the method? (There was a discussion quite a while back about removing this.) > Remove (or deprecate) GroupScan.enforceWidth and use > GroupScan.getMinParallelization > > > Key: DRILL-4434 > URL: https://issues.apache.org/jira/browse/DRILL-4434 > Project: Apache Drill > Issue Type: Bug >Reporter: Venki Korukanti > > It seems enforceWidth, which is used only in > ExcessibleExchangeRemover, is not necessary. Instead we should rely on > GroupScan.getMinParallelization(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4411) HashJoin should not only depend on number of records, but also on size
[ https://issues.apache.org/jira/browse/DRILL-4411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163916#comment-15163916 ] ASF GitHub Bot commented on DRILL-4411: --- Github user jaltekruse commented on a diff in the pull request: https://github.com/apache/drill/pull/381#discussion_r54017219
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/HashJoinProbeTemplate.java ---
@@ -47,7 +47,11 @@
   private HashJoinBatch outgoingJoinBatch = null;
-  private static final int TARGET_RECORDS_PER_BATCH = 4000;
+  private int targetRecordsPerBatch = 4000;
--- End diff --
I would add a comment here about when this value will be mutated, as we are moving it away from immutability; most of the other operators currently have this as an immutable value. > HashJoin should not only depend on number of records, but also on size > -- > > Key: DRILL-4411 > URL: https://issues.apache.org/jira/browse/DRILL-4411 > Project: Apache Drill > Issue Type: Bug > Components: Server >Reporter: MinJi Kim >Assignee: MinJi Kim > > In HashJoinProbeTemplate, each batch is limited to TARGET_RECORDS_PER_BATCH > (4000). But we should not only depend on the number of records, but also > size (in case of extremely large records). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4411) HashJoin should not only depend on number of records, but also on size
[ https://issues.apache.org/jira/browse/DRILL-4411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15164491#comment-15164491 ] ASF GitHub Bot commented on DRILL-4411: --- Github user jaltekruse commented on a diff in the pull request: https://github.com/apache/drill/pull/381#discussion_r54018324
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/HashJoinProbeTemplate.java ---
@@ -98,7 +102,9 @@ public void setupHashJoinProbe(FragmentContext context, VectorContainer buildBat }
   public void executeProjectRightPhase() {
-    while (outputRecords < TARGET_RECORDS_PER_BATCH && recordsProcessed < recordsToProcess) {
+    while (outputRecords < targetRecordsPerBatch
+        && recordsProcessed < recordsToProcess
+        && (!adjustTargetRecordsPerBatch || outgoingJoinBatch.getMemoryUsed() < TARGET_BATCH_SIZE_IN_BYTES)) {
--- End diff --
It seems like the thing we are testing for here isn't actually directly related to the condition we are trying to avoid. The overall memory consumed when outputting records will be a function of both the size of values and the number of columns. I think this is a reasonable approach for now, but we should open a follow-up JIRA to look at where things will break as we encounter cases where there are many wide columns in a dataset. > HashJoin should not only depend on number of records, but also on size > -- > > Key: DRILL-4411 > URL: https://issues.apache.org/jira/browse/DRILL-4411 > Project: Apache Drill > Issue Type: Bug > Components: Server >Reporter: MinJi Kim >Assignee: MinJi Kim > > In HashJoinProbeTemplate, each batch is limited to TARGET_RECORDS_PER_BATCH > (4000). But we should not only depend on the number of records, but also > size (in case of extremely large records). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4411) HashJoin should not only depend on number of records, but also on size
[ https://issues.apache.org/jira/browse/DRILL-4411?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163923#comment-15163923 ] ASF GitHub Bot commented on DRILL-4411: --- Github user jaltekruse commented on a diff in the pull request: https://github.com/apache/drill/pull/381#discussion_r54017535
--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/join/HashJoinProbeTemplate.java ---
@@ -47,7 +47,11 @@
   private HashJoinBatch outgoingJoinBatch = null;
-  private static final int TARGET_RECORDS_PER_BATCH = 4000;
+  private int targetRecordsPerBatch = 4000;
+
+  private boolean adjustTargetRecordsPerBatch = true;
--- End diff --
It looks like this flag is designed to allow the adjustment to happen only once; is that actually what we want? If the row size is growing, it would seem like a good idea to allow for several batch size adjustments. Dropping the flag would also remove another piece of boolean state to manage. > HashJoin should not only depend on number of records, but also on size > -- > > Key: DRILL-4411 > URL: https://issues.apache.org/jira/browse/DRILL-4411 > Project: Apache Drill > Issue Type: Bug > Components: Server >Reporter: MinJi Kim >Assignee: MinJi Kim > > In HashJoinProbeTemplate, each batch is limited to TARGET_RECORDS_PER_BATCH > (4000). But we should not only depend on the number of records, but also > size (in case of extremely large records). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
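The dual limit under review in the comments above (record-count target plus a byte budget) reduces to a small predicate. A minimal illustrative sketch, not the actual HashJoinProbeTemplate; the byte budget constant is an assumed value:

```java
// Illustrative batch limiter: a batch accepts more records only while BOTH
// the record-count target and an assumed byte-size budget are unmet, so a
// few extremely wide rows cannot blow up the batch's memory footprint.
public class BatchLimiter {
    static final int TARGET_RECORDS_PER_BATCH = 4000;          // from the diff
    static final long TARGET_BATCH_SIZE_IN_BYTES = 16L << 20;  // assumed budget

    /** Returns true while the current output batch may accept another record. */
    public static boolean hasRoom(int outputRecords, long batchBytesUsed) {
        return outputRecords < TARGET_RECORDS_PER_BATCH
            && batchBytesUsed < TARGET_BATCH_SIZE_IN_BYTES;
    }
}
```

With a byte budget in the conjunction, 100 rows of 1 MB each stop the batch just as surely as 4000 small rows do.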
[jira] [Commented] (DRILL-3745) Hive CHAR not supported
[ https://issues.apache.org/jira/browse/DRILL-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163844#comment-15163844 ] Venki Korukanti commented on DRILL-3745: Looks good. One other place where we need to handle CHAR is in Hive UDFs. > Hive CHAR not supported > --- > > Key: DRILL-3745 > URL: https://issues.apache.org/jira/browse/DRILL-3745 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.1.0 >Reporter: Nathaniel Auvil >Assignee: Arina Ielchiieva > > It doesn’t look like Drill 1.1.0 supports the Hive CHAR type? > In Hive: > create table development.foo > ( > bad CHAR(10) > ); > And then in sqlline: > > use `hive.development`; > > select * from foo; > Error: PARSE ERROR: Unsupported Hive data type CHAR. > Following Hive data types are supported in Drill INFORMATION_SCHEMA: > BOOLEAN, BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, DATE, TIMESTAMP, > BINARY, DECIMAL, STRING, VARCHAR, LIST, MAP, STRUCT and UNION > [Error Id: 58bf3940-3c09-4ad2-8f52-d052dffd4b17 on dtpg05:31010] > (state=,code=0) > This was originally found when getting failures trying to connect via JDBC > using SQuirreL. We have the Hive plugin enabled with tables using CHAR. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3745) Hive CHAR not supported
[ https://issues.apache.org/jira/browse/DRILL-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163837#comment-15163837 ] Arina Ielchiieva commented on DRILL-3745: - Since Drill basically handles all chars as varchars, to work correctly with Hive chars we can return them as varchars, trimmed beforehand so they can be compared to each other. Proposed solution - [https://github.com/arina-ielchiieva/drill/commit/0e3821e5d100d295e163d7d03d94a064329a4982] Can anybody look at it? > Hive CHAR not supported > --- > > Key: DRILL-3745 > URL: https://issues.apache.org/jira/browse/DRILL-3745 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.1.0 >Reporter: Nathaniel Auvil >Assignee: Arina Ielchiieva > > It doesn’t look like Drill 1.1.0 supports the Hive CHAR type? > In Hive: > create table development.foo > ( > bad CHAR(10) > ); > And then in sqlline: > > use `hive.development`; > > select * from foo; > Error: PARSE ERROR: Unsupported Hive data type CHAR. > Following Hive data types are supported in Drill INFORMATION_SCHEMA: > BOOLEAN, BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, DATE, TIMESTAMP, > BINARY, DECIMAL, STRING, VARCHAR, LIST, MAP, STRUCT and UNION > [Error Id: 58bf3940-3c09-4ad2-8f52-d052dffd4b17 on dtpg05:31010] > (state=,code=0) > This was originally found when getting failures trying to connect via JDBC > using SQuirreL. We have the Hive plugin enabled with tables using CHAR. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
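The trimming proposed in the comment above amounts to stripping trailing pad spaces before exposing a Hive CHAR value as a varchar. A minimal sketch of that idea; the method name is illustrative, not taken from the linked patch:

```java
// Treat a Hive CHAR(n) value as a varchar with its trailing pad spaces
// removed, so 'abc' stored in a CHAR(10) (i.e. "abc       ") compares
// equal to the varchar 'abc'. Leading and embedded spaces are preserved.
public class CharToVarchar {
    public static String fromHiveChar(String padded) {
        int end = padded.length();
        while (end > 0 && padded.charAt(end - 1) == ' ') {
            end--;
        }
        return padded.substring(0, end);
    }
}
```

Trimming only the trailing pad keeps the semantics of SQL CHAR comparison (pad characters are insignificant) without disturbing values that legitimately contain interior spaces.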
[jira] [Comment Edited] (DRILL-4415) Minimum parallelization width not respected in group scans
[ https://issues.apache.org/jira/browse/DRILL-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163561#comment-15163561 ] Venki Korukanti edited comment on DRILL-4415 at 2/24/16 7:29 PM: - The issue here: GroupScan.enforceWidth is not overridden, and UnionExchange is removed as part of the ExcessibleExchangeRemover. So there is only one fragment, which has GroupScan->Project->Screen. As the Screen has the max parallelization of 1, it overrides the GroupScan min parallelization. It seems the GroupScan.enforceWidth API is unnecessary; instead ExcessibleExchangeRemover should look at GroupScan.minParallelization. Also, in SimpleParallelizer we should throw exceptions if we encounter a situation where the max parallelization of one operator overrides the min parallelization of another operator in the same fragment. was (Author: vkorukanti): The issue here: GroupScan.enforceWidth is not overridden, UnionExchange is removed as part of the ExcessibleExchangeRemover. So there only one fragment which has GroupScan->Project->Screen. As the Screen has the max parallelization of 1, it overrides the GroupScan min parallelization. It seems GroupScan.enforceWidth API is unnecessary, instead ExcessibleExchangeRemover should look at GroupScan.minParallelization. Also in SimpleParallelizer we should through exceptions if encounter a situation where max parallelization of operator overrides min parallelization of another operator in the same fragment. > Minimum parallelization width not respected in group scans > -- > > Key: DRILL-4415 > URL: https://issues.apache.org/jira/browse/DRILL-4415 > Project: Apache Drill > Issue Type: Bug > Components: Server >Reporter: MinJi Kim > > Even if the minimum parallelization width is set to a value > 1, this value is > not always respected. > For example, in AggPruleBase, the decision to do two phase aggregation > (create2PhasePlan) only takes into account the number of rows. > So, if the table > is small enough, even if the width is set to a value > 1, the plan will ignore > the minimum width request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-4434) Remove (or deprecate) GroupScan.enforceWidth and use GroupScan.getMinParallelization
Venki Korukanti created DRILL-4434: -- Summary: Remove (or deprecate) GroupScan.enforceWidth and use GroupScan.getMinParallelization Key: DRILL-4434 URL: https://issues.apache.org/jira/browse/DRILL-4434 Project: Apache Drill Issue Type: Bug Reporter: Venki Korukanti It seems enforceWidth, which is used only in ExcessibleExchangeRemover, is not necessary. Instead we should rely on GroupScan.getMinParallelization(). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4415) Minimum parallelization width not respected in group scans
[ https://issues.apache.org/jira/browse/DRILL-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163561#comment-15163561 ] Venki Korukanti commented on DRILL-4415: The issue here: GroupScan.enforceWidth is not overridden, and UnionExchange is removed as part of the ExcessibleExchangeRemover. So there is only one fragment, which has GroupScan->Project->Screen. As the Screen has the max parallelization of 1, it overrides the GroupScan min parallelization. It seems the GroupScan.enforceWidth API is unnecessary; instead ExcessibleExchangeRemover should look at GroupScan.minParallelization. Also, in SimpleParallelizer we should throw exceptions if we encounter a situation where the max parallelization of one operator overrides the min parallelization of another operator in the same fragment. > Minimum parallelization width not respected in group scans > -- > > Key: DRILL-4415 > URL: https://issues.apache.org/jira/browse/DRILL-4415 > Project: Apache Drill > Issue Type: Bug > Components: Server >Reporter: MinJi Kim > > Even if the minimum parallelization width is set to a value > 1, this value is > not always respected. > For example, in AggPruleBase, the decision to do two phase aggregation > (create2PhasePlan) only takes into account the number of rows. So, if the table > is small enough, even if the width is set to a value > 1, the plan will ignore > the minimum width request. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
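The failure mode described above (a Screen's max width of 1 silently overriding a scan's min width) and the proposed fail-fast behavior can be sketched as follows. This is an illustrative reduction of the idea, not Drill's SimpleParallelizer:

```java
// Illustrative width resolution for a single fragment: when one operator's
// minimum width exceeds another operator's maximum width, fail loudly
// instead of silently planning below the requested minimum.
public class WidthPlanner {
    public static int planWidth(int minRequired, int maxAllowed) {
        if (minRequired > maxAllowed) {
            // the comment proposes throwing here rather than clamping
            throw new IllegalStateException(
                "min width " + minRequired + " exceeds max width " + maxAllowed);
        }
        return minRequired;
    }
}
```

Throwing makes the conflict visible at planning time, where the alternative of clamping to `maxAllowed` is exactly the silent override the bug report complains about.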
[jira] [Comment Edited] (DRILL-2282) Eliminate spaces, special characters from names in function templates
[ https://issues.apache.org/jira/browse/DRILL-2282?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15153973#comment-15153973 ] Vitalii Diravka edited comment on DRILL-2282 at 2/24/16 7:28 PM: - [~mehant] I tried to reproduce the issues mentioned in this jira but could not. Every query with spaces and special symbols in function names works properly. The test proving that such queries work successfully is available here: [Updated patch version|https://github.com/vdiravka/drill/commit/72aec00985b2a385f34c1861eb44a5fb83f0bb9b] was (Author: vitalii): [~mehant] I tried to reproduce issues mentioned in this jira but didn't get their. Every query with spaces and special symbols in functions works properly I mean. Here is available the test as proof for successful work such queries.[Updated patch|https://github.com/vdiravka/drill/commit/72aec00985b2a385f34c1861eb44a5fb83f0bb9b] > Eliminate spaces, special characters from names in function templates > - > > Key: DRILL-2282 > URL: https://issues.apache.org/jira/browse/DRILL-2282 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill >Reporter: Mehant Baid >Assignee: Vitalii Diravka > Fix For: 1.6.0 > > Attachments: DRILL-2282.patch > > > Having spaces in the name of the functions causes issues while deserializing > such expressions when we try to read the plan fragment. As part of this JIRA > we would like to clean up all the templates to not include special characters in > their names. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4433) Unnecessary entries created in ZK when a jdbc application uses a wrong connection URL
[ https://issues.apache.org/jira/browse/DRILL-4433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163528#comment-15163528 ] Jacques Nadeau commented on DRILL-4433: --- I would add an additional note here, as well: ZK is only used by the client to determine what nodes are available, not to maintain cluster cohesion. As such, it should be fast and not a long-lasting connection. I think that as part of this change we should also use a faster mechanism for doing this lookup. > Unnecessary entries created in ZK when a jdbc application uses a wrong > connection URL > -- > > Key: DRILL-4433 > URL: https://issues.apache.org/jira/browse/DRILL-4433 > Project: Apache Drill > Issue Type: Bug > Components: Client - JDBC >Reporter: Rahul Challapalli > > commit # : 6d5f4983003b8f5d351adcb0bc9881d2dc4c2d3f > commit date : Feb 22, 2016 > In my JDBC application, I accidentally used an invalid connection string like > below > {code} > jdbc:drill:zk=x.x.x.x:5181/zkroot-blah/cluster-name > {code} > While drill correctly reported a "No DrillbitEndpoint can be found" error, I > also observed that it created an entry in zookeeper by the name > "zkroot-blah". There seems to be no reason for doing this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-4433) Unnecessary entries created in ZK when a jdbc application uses a wrong connection URL
Rahul Challapalli created DRILL-4433: Summary: Unnecessary entries created in ZK when a jdbc application uses a wrong connection URL Key: DRILL-4433 URL: https://issues.apache.org/jira/browse/DRILL-4433 Project: Apache Drill Issue Type: Bug Components: Client - JDBC Reporter: Rahul Challapalli commit # : 6d5f4983003b8f5d351adcb0bc9881d2dc4c2d3f commit date : Feb 22, 2016 In my JDBC application, I accidentally used an invalid connection string like below {code} jdbc:drill:zk=x.x.x.x:5181/zkroot-blah/cluster-name {code} While drill correctly reported a "No DrillbitEndpoint can be found" error, I also observed that it created an entry in zookeeper by the name "zkroot-blah". There seems to be no reason for doing this. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
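The fix direction implied by the report above is for the client to only look up the configured zkroot and fail fast when it is absent, never creating it as a side effect of connecting. A sketch of that check, using a hypothetical stand-in interface rather than a real ZooKeeper/Curator client:

```java
// Hypothetical client-side check: verify the zkroot exists (read-only) and
// raise a clear error for a mistyped connection URL, instead of implicitly
// creating a stray znode like "zkroot-blah".
public class ClientLookup {
    /** Stand-in for a real ZooKeeper/Curator client's existence check. */
    public interface ZkReader {
        boolean exists(String path);
    }

    /** Throws for an absent zkroot; never creates znodes. */
    public static void checkRoot(ZkReader zk, String zkRoot) {
        if (!zk.exists(zkRoot)) {
            throw new IllegalArgumentException(
                "zkroot '" + zkRoot + "' not found; check the JDBC connection URL");
        }
    }
}
```

With Curator this read-only existence check would correspond to `checkExists().forPath(...)`, which returns null for a missing znode rather than creating it.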
[jira] [Commented] (DRILL-3745) Hive CHAR not supported
[ https://issues.apache.org/jira/browse/DRILL-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163438#comment-15163438 ] Parth Chandra commented on DRILL-3745: -- Ah. Good to know. > Hive CHAR not supported > --- > > Key: DRILL-3745 > URL: https://issues.apache.org/jira/browse/DRILL-3745 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.1.0 >Reporter: Nathaniel Auvil >Assignee: Arina Ielchiieva > > It doesn’t look like Drill 1.1.0 supports the Hive CHAR type? > In Hive: > create table development.foo > ( > bad CHAR(10) > ); > And then in sqlline: > > use `hive.development`; > > select * from foo; > Error: PARSE ERROR: Unsupported Hive data type CHAR. > Following Hive data types are supported in Drill INFORMATION_SCHEMA: > BOOLEAN, BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, DATE, TIMESTAMP, > BINARY, DECIMAL, STRING, VARCHAR, LIST, MAP, STRUCT and UNION > [Error Id: 58bf3940-3c09-4ad2-8f52-d052dffd4b17 on dtpg05:31010] > (state=,code=0) > This was originally found when getting failures trying to connect via JDBC > using SQuirreL. We have the Hive plugin enabled with tables using CHAR. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3745) Hive CHAR not supported
[ https://issues.apache.org/jira/browse/DRILL-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15163359#comment-15163359 ] Venki Korukanti commented on DRILL-3745: [~parthc] It was introduced in 0.13 and got missed in the Hive plugin upgrade (0.12 to 0.13). > Hive CHAR not supported > --- > > Key: DRILL-3745 > URL: https://issues.apache.org/jira/browse/DRILL-3745 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.1.0 >Reporter: Nathaniel Auvil >Assignee: Arina Ielchiieva > > It doesn’t look like Drill 1.1.0 supports the Hive CHAR type? > In Hive: > create table development.foo > ( > bad CHAR(10) > ); > And then in sqlline: > > use `hive.development`; > > select * from foo; > Error: PARSE ERROR: Unsupported Hive data type CHAR. > Following Hive data types are supported in Drill INFORMATION_SCHEMA: > BOOLEAN, BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, DATE, TIMESTAMP, > BINARY, DECIMAL, STRING, VARCHAR, LIST, MAP, STRUCT and UNION > [Error Id: 58bf3940-3c09-4ad2-8f52-d052dffd4b17 on dtpg05:31010] > (state=,code=0) > This was originally found when getting failures trying to connect via JDBC > using SQuirreL. We have the Hive plugin enabled with tables using CHAR. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-4432) Update error message to include all unsupported functions when frame clause is used
[ https://issues.apache.org/jira/browse/DRILL-4432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deneche A. Hakim updated DRILL-4432: Labels: window_function (was: ) > Update error message to include all unsupported functions when frame clause > is used > --- > > Key: DRILL-4432 > URL: https://issues.apache.org/jira/browse/DRILL-4432 > Project: Apache Drill > Issue Type: Bug > Components: SQL Parser >Affects Versions: 1.6.0 >Reporter: Khurram Faraaz >Priority: Minor > Labels: window_function > > We need to update the error message to include LEAD, LAG, NTILE, and ranking > functions; right now we only say ROW/RANGE is not allowed with RANK, > DENSE_RANK or ROW_NUMBER functions when a frame clause is used in the window > definition of any of these window functions. > Another typo that needs a fix in the error message: it is not ROW/RANGE, it should > be ROWS/RANGE (rows, not row) > An example of the current error message that we see: > {noformat} > 0: jdbc:drill:schema=dfs.tmp> select LEAD(columns[3]) OVER(PARTITION BY > CAST(columns[0] as integer) ORDER BY cast(columns[0] as integer) ROWS > UNBOUNDED PRECEDING) from dfs.tmp.`t_alltype.csv`; > Error: VALIDATION ERROR: From line 1, column 108 to line 1, column 111: > ROW/RANGE not allowed with RANK, DENSE_RANK or ROW_NUMBER functions > [Error Id: 98565028-bfd8-4f57-acdb-5235195d3d6d on centos-01.qa.lab:31010] > (state=,code=0) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
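The rule DRILL-4432 and DRILL-4431 converge on, following SQL:2011 6.10, is that a fixed set of window functions must reject an explicit ROWS/RANGE frame. A small sketch of that check; the set below reflects the functions named in these reports and is illustrative, not Drill's actual validator:

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Window functions whose window definition must NOT contain an explicit
// ROWS/RANGE frame clause; aggregates like SUM or AVG still allow one.
public class FrameClauseValidator {
    private static final Set<String> NO_FRAME_ALLOWED = new HashSet<>(Arrays.asList(
            "RANK", "DENSE_RANK", "ROW_NUMBER", "NTILE", "LEAD", "LAG"));

    public static boolean allowsFrameClause(String functionName) {
        return !NO_FRAME_ALLOWED.contains(functionName.toUpperCase());
    }
}
```

A validator built on such a set would both catch the NTILE case from DRILL-4431 and make it easy to list every offending function in the error message, as DRILL-4432 asks.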
[jira] [Updated] (DRILL-4431) NTILE function should NOT allow the use of frame clause in its window definition
[ https://issues.apache.org/jira/browse/DRILL-4431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deneche A. Hakim updated DRILL-4431: Labels: window_function (was: ) > NTILE function should NOT allow the use of frame clause in its window > definition > > > Key: DRILL-4431 > URL: https://issues.apache.org/jira/browse/DRILL-4431 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.6.0 >Reporter: Khurram Faraaz > Labels: window_function > Fix For: 1.6.0 > > > NTILE function should not allow the use of a frame clause in its window > definition; the query below should return an error and say the operation is > not supported. > Drill 1.6.0, commit ID: 6d5f4983 > The SQL spec says NTILE should not allow the use of a frame clause. > {noformat} > From the SQL SPEC on page 220 > ISO/IEC 9075-2:2011(E) > 6.10 > 7) > > If <ntile function>, <lead or lag function>, or > ROW_NUMBER is specified, then: > a) If <ntile function>, <lead or lag function>, RANK or DENSE_RANK is > specified, then the window > ordering clause WOC of WDX shall be present. > b) The window framing clause of WDX shall not be present. > {noformat} > {noformat} > 0: jdbc:drill:schema=dfs.tmp> select NTILE(3) OVER(PARTITION BY > CAST(columns[0] as integer) ORDER BY cast(columns[0] as integer) ROWS > UNBOUNDED PRECEDING) from dfs.tmp.`t_alltype.csv`; > +---------+ > | EXPR$0 | > +---------+ > | 1 | > | 1 | > | 1 | > ... > ... > | 1 | > | 1 | > | 1 | > +---------+ > 145 rows selected (0.322 seconds) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-4431) NTILE function should NOT allow the use of frame clause in its window definition
[ https://issues.apache.org/jira/browse/DRILL-4431?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deneche A. Hakim updated DRILL-4431: Fix Version/s: 1.6.0 > NTILE function should NOT allow the use of frame clause in its window > definition > > > Key: DRILL-4431 > URL: https://issues.apache.org/jira/browse/DRILL-4431 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.6.0 >Reporter: Khurram Faraaz > Labels: window_function > Fix For: 1.6.0 > > > NTILE function should not allow the use of a frame clause in its window > definition; the query below should return an error and say the operation is > not supported. > Drill 1.6.0, commit ID: 6d5f4983 > The SQL spec says NTILE should not allow the use of a frame clause. > {noformat} > From the SQL SPEC on page 220 > ISO/IEC 9075-2:2011(E) > 6.10 > 7) > > If <ntile function>, <lead or lag function>, or > ROW_NUMBER is specified, then: > a) If <ntile function>, <lead or lag function>, RANK or DENSE_RANK is > specified, then the window > ordering clause WOC of WDX shall be present. > b) The window framing clause of WDX shall not be present. > {noformat} > {noformat} > 0: jdbc:drill:schema=dfs.tmp> select NTILE(3) OVER(PARTITION BY > CAST(columns[0] as integer) ORDER BY cast(columns[0] as integer) ROWS > UNBOUNDED PRECEDING) from dfs.tmp.`t_alltype.csv`; > +---------+ > | EXPR$0 | > +---------+ > | 1 | > | 1 | > | 1 | > ... > ... > | 1 | > | 1 | > | 1 | > +---------+ > 145 rows selected (0.322 seconds) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-4432) Update error message to include all unsupported functions when frame clause is used
Khurram Faraaz created DRILL-4432: - Summary: Update error message to include all unsupported functions when frame clause is used Key: DRILL-4432 URL: https://issues.apache.org/jira/browse/DRILL-4432 Project: Apache Drill Issue Type: Bug Components: SQL Parser Affects Versions: 1.6.0 Reporter: Khurram Faraaz Priority: Minor We need to update the error message to include LEAD, LAG, NTILE, and ranking functions; right now we only say ROW/RANGE is not allowed with RANK, DENSE_RANK or ROW_NUMBER functions when a frame clause is used in the window definition of any of these window functions. Another typo needs a fix in the error message: it is not ROW/RANGE, it should be ROWS/RANGE (ROWS, not ROW). An example of the current error message we see: {noformat} 0: jdbc:drill:schema=dfs.tmp> select LEAD(columns[3]) OVER(PARTITION BY CAST(columns[0] as integer) ORDER BY cast(columns[0] as integer) ROWS UNBOUNDED PRECEDING) from dfs.tmp.`t_alltype.csv`; Error: VALIDATION ERROR: From line 1, column 108 to line 1, column 111: ROW/RANGE not allowed with RANK, DENSE_RANK or ROW_NUMBER functions [Error Id: 98565028-bfd8-4f57-acdb-5235195d3d6d on centos-01.qa.lab:31010] (state=,code=0) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
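For contrast, dropping only the frame clause should make the same LEAD query valid, since the validation error is triggered by the ROWS/RANGE frame rather than by LEAD itself (a sketch against the reporter's illustrative CSV):

```sql
-- Same query as in the error example, minus the ROWS UNBOUNDED PRECEDING
-- frame clause that triggers the validation error.
SELECT LEAD(columns[3]) OVER (
         PARTITION BY CAST(columns[0] AS INTEGER)
         ORDER BY CAST(columns[0] AS INTEGER))
FROM dfs.tmp.`t_alltype.csv`;
```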
[jira] [Created] (DRILL-4431) NTILE function should NOT allow the use of frame clause in its window definition
Khurram Faraaz created DRILL-4431: - Summary: NTILE function should NOT allow the use of frame clause in its window definition Key: DRILL-4431 URL: https://issues.apache.org/jira/browse/DRILL-4431 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.6.0 Reporter: Khurram Faraaz NTILE function should not allow the use of a frame clause in its window definition; the query below should return an error and say the operation is not supported. Drill 1.6.0, commit ID: 6d5f4983 The SQL spec says NTILE should not allow the use of a frame clause. {noformat} From the SQL SPEC on page 220 ISO/IEC 9075-2:2011(E) 6.10 7) If <ntile function>, <lead or lag function>, or ROW_NUMBER is specified, then: a) If <ntile function>, <lead or lag function>, RANK or DENSE_RANK is specified, then the window ordering clause WOC of WDX shall be present. b) The window framing clause of WDX shall not be present. {noformat} {noformat} 0: jdbc:drill:schema=dfs.tmp> select NTILE(3) OVER(PARTITION BY CAST(columns[0] as integer) ORDER BY cast(columns[0] as integer) ROWS UNBOUNDED PRECEDING) from dfs.tmp.`t_alltype.csv`; +---------+ | EXPR$0 | +---------+ | 1 | | 1 | | 1 | ... ... | 1 | | 1 | | 1 | +---------+ 145 rows selected (0.322 seconds) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
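The spec-compliant shape, then, keeps PARTITION BY and the (required) ORDER BY but omits any framing clause; this is the only NTILE form that should be accepted (a sketch against the same CSV):

```sql
-- NTILE with a window ordering clause but no window framing clause,
-- per the ISO/IEC 9075-2:2011 6.10 rule cited in the issue.
SELECT NTILE(3) OVER (
         PARTITION BY CAST(columns[0] AS INTEGER)
         ORDER BY CAST(columns[0] AS INTEGER))
FROM dfs.tmp.`t_alltype.csv`;
```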
[jira] [Assigned] (DRILL-3745) Hive CHAR not supported
[ https://issues.apache.org/jira/browse/DRILL-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva reassigned DRILL-3745: --- Assignee: Arina Ielchiieva > Hive CHAR not supported > --- > > Key: DRILL-3745 > URL: https://issues.apache.org/jira/browse/DRILL-3745 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.1.0 >Reporter: Nathaniel Auvil >Assignee: Arina Ielchiieva > > It doesn’t look like Drill 1.1.0 supports the Hive CHAR type? > In Hive: > create table development.foo > ( > bad CHAR(10) > ); > And then in sqlline: > > use `hive.development`; > > select * from foo; > Error: PARSE ERROR: Unsupported Hive data type CHAR. > Following Hive data types are supported in Drill INFORMATION_SCHEMA: > BOOLEAN, BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, DATE, TIMESTAMP, > BINARY, DECIMAL, STRING, VARCHAR, LIST, MAP, STRUCT and UNION > [Error Id: 58bf3940-3c09-4ad2-8f52-d052dffd4b17 on dtpg05:31010] > (state=,code=0) > This was originally found when getting failures trying to connect via JDBC > using Squirrel. We have the Hive plugin enabled with tables using CHAR. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4430) Unable to execute drill jdbc from jboss container
[ https://issues.apache.org/jira/browse/DRILL-4430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15163271#comment-15163271 ] Sudheesh Katkam commented on DRILL-4430: Is there a preceding error message? Can you look at the server side logs as well? > Unable to execute drill jdbc from jboss container > - > > Key: DRILL-4430 > URL: https://issues.apache.org/jira/browse/DRILL-4430 > Project: Apache Drill > Issue Type: Bug > Components: Client - JDBC >Affects Versions: 1.5.0 > Environment: Linux > Windows 7 > Jboss 7.1 >Reporter: abhishek agrawal > > Unable to execute a JDBC query from the JBoss application server. It's closing the > connection with an error: > 20:30:46,543 INFO > [org.apache.curator.framework.state.ConnectionStateManager] > (http--0.0.0.0-8080-1-EventThread) State change: CONNECTED > 20:30:47,609 ERROR [org.apache.drill.exec.rpc.RpcExceptionHandler] (Client-1) > Exception in RPC communication. Connection: /172.18.129.2:39687 <--> > INBBRDSSVM265/172.18.129.2:31010 (user client). Closing connection. > 20:30:47,614 INFO [org.apache.drill.exec.rpc.user.UserClient] (Client-1) > Channel closed /172.18.129.2:39687 <--> INBBRDSSVM265/172.18.129.2:31010. > 20:30:47,614 INFO [org.apache.drill.exec.rpc.user.QueryResultHandler] > (Client-1) User Error Occurred: > org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: > Connection /172.18.129.2:39687 <--> INBBRDSSVM265/172.18.129.2:31010 (user > client) closed unexpectedly. > Note: the same code with the right dependencies works fine as standalone Java > code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3745) Hive CHAR not supported
[ https://issues.apache.org/jira/browse/DRILL-3745?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15163195#comment-15163195 ] Parth Chandra commented on DRILL-3745: -- [~vkorukanti] Is this an oversight (i.e., we never implemented it)? > Hive CHAR not supported > --- > > Key: DRILL-3745 > URL: https://issues.apache.org/jira/browse/DRILL-3745 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.1.0 >Reporter: Nathaniel Auvil > > It doesn’t look like Drill 1.1.0 supports the Hive CHAR type? > In Hive: > create table development.foo > ( > bad CHAR(10) > ); > And then in sqlline: > > use `hive.development`; > > select * from foo; > Error: PARSE ERROR: Unsupported Hive data type CHAR. > Following Hive data types are supported in Drill INFORMATION_SCHEMA: > BOOLEAN, BYTE, SHORT, INT, LONG, FLOAT, DOUBLE, DATE, TIMESTAMP, > BINARY, DECIMAL, STRING, VARCHAR, LIST, MAP, STRUCT and UNION > [Error Id: 58bf3940-3c09-4ad2-8f52-d052dffd4b17 on dtpg05:31010] > (state=,code=0) > This was originally found when getting failures trying to connect via JDBC > using Squirrel. We have the Hive plugin enabled with tables using CHAR. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-4430) Unable to execute drill jdbc from jboss container
abhishek agrawal created DRILL-4430: --- Summary: Unable to execute drill jdbc from jboss container Key: DRILL-4430 URL: https://issues.apache.org/jira/browse/DRILL-4430 Project: Apache Drill Issue Type: Bug Components: Client - JDBC Affects Versions: 1.5.0 Environment: Linux Windows 7 Jboss 7.1 Reporter: abhishek agrawal Unable to execute a JDBC query from the JBoss application server. It's closing the connection with an error: 20:30:46,543 INFO [org.apache.curator.framework.state.ConnectionStateManager] (http--0.0.0.0-8080-1-EventThread) State change: CONNECTED 20:30:47,609 ERROR [org.apache.drill.exec.rpc.RpcExceptionHandler] (Client-1) Exception in RPC communication. Connection: /172.18.129.2:39687 <--> INBBRDSSVM265/172.18.129.2:31010 (user client). Closing connection. 20:30:47,614 INFO [org.apache.drill.exec.rpc.user.UserClient] (Client-1) Channel closed /172.18.129.2:39687 <--> INBBRDSSVM265/172.18.129.2:31010. 20:30:47,614 INFO [org.apache.drill.exec.rpc.user.QueryResultHandler] (Client-1) User Error Occurred: org.apache.drill.common.exceptions.UserException: CONNECTION ERROR: Connection /172.18.129.2:39687 <--> INBBRDSSVM265/172.18.129.2:31010 (user client) closed unexpectedly. Note: the same code with the right dependencies works fine as standalone Java code. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-4429) Need a better error message when window frame exclusion is used in frame-clause
Khurram Faraaz created DRILL-4429: - Summary: Need a better error message when window frame exclusion is used in frame-clause Key: DRILL-4429 URL: https://issues.apache.org/jira/browse/DRILL-4429 Project: Apache Drill Issue Type: Bug Components: SQL Parser Affects Versions: 1.6.0 Reporter: Khurram Faraaz Priority: Minor We need to report a proper error message to the user when window frame exclusion is used in the window frame-clause. Currently we do not support it and the current error message is not helpful. Drill 1.6.0 commit ID: 6d5f4983 {noformat} 0: jdbc:drill:schema=dfs.tmp> select count(*) OVER(PARTITION BY CAST(columns[0] as integer) ORDER BY cast(columns[0] as integer) RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED following EXCLUDE CURRENT ROW) from dfs.tmp.`t_alltype.csv`; Error: PARSE ERROR: Encountered "EXCLUDE" at line 1, column 158. Was expecting one of: ")" ... "ALLOW" ... "DISALLOW" ... while parsing SQL query: select count(*) OVER(PARTITION BY CAST(columns[0] as integer) ORDER BY cast(columns[0] as integer) RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED following EXCLUDE CURRENT ROW) from dfs.tmp.`t_alltype.csv` ^ [Error Id: ddfcd7fc-1a84-4e1f-8e51-34601e9155a6 on centos-04.qa.lab:31010] (state=,code=0) {noformat} From the SQL specification, we need to add these. {noformat} <window frame exclusion> ::= EXCLUDE CURRENT ROW | EXCLUDE GROUP | EXCLUDE TIES | EXCLUDE NO OTHERS {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
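Until the exclusion syntax is parsed, some cases can be emulated arithmetically; for example, COUNT(*) over an unbounded frame with EXCLUDE CURRENT ROW is just the partition-wide count minus one (a hedged rewrite sketch against the same illustrative CSV, not a general substitute for EXCLUDE GROUP or EXCLUDE TIES):

```sql
-- Emulates COUNT(*) OVER (... RANGE BETWEEN UNBOUNDED PRECEDING AND
-- UNBOUNDED FOLLOWING EXCLUDE CURRENT ROW): every row sees the
-- partition-wide count with itself removed.
SELECT COUNT(*) OVER (PARTITION BY CAST(columns[0] AS INTEGER)) - 1
FROM dfs.tmp.`t_alltype.csv`;
```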
[jira] [Commented] (DRILL-3325) Explicitly specified default window frame throws an error requiring order by
[ https://issues.apache.org/jira/browse/DRILL-3325?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15162937#comment-15162937 ] Khurram Faraaz commented on DRILL-3325: --- This case (RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) is fixed in Drill 1.6.0 commit ID: 6d5f4983 {noformat} 0: jdbc:drill:schema=dfs.tmp> select count(*) OVER(PARTITION BY CAST(columns[0] as integer) ORDER BY cast(columns[0] as integer) RANGE BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING) from dfs.tmp.`t_alltype.csv`; +---------+ | EXPR$0 | +---------+ | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | | 1 | ... | 1 | | 1 | | 1 | +---------+ 145 rows selected (0.299 seconds) {noformat} > Explicitly specified default window frame throws an error requiring order by > > > Key: DRILL-3325 > URL: https://issues.apache.org/jira/browse/DRILL-3325 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.0.0 >Reporter: Victoria Markman > Labels: window_function > Fix For: Future > > > Calcite requires an explicit ORDER BY clause when the "default" frame ("UNBOUNDED > PRECEDING AND CURRENT ROW") is specified: > {code} > 0: jdbc:drill:schema=dfs> select sum(a1) over(partition by b1 range between > unbounded preceding and unbounded following ) from t1; > Error: PARSE ERROR: From line 1, column 20 to line 1, column 95: Window > specification must contain an ORDER BY clause > [Error Id: 446c7fd3-f588-4832-b79d-08cec610ff24 on atsqa4-133.qa.lab:31010] > (state=,code=0) > {code} > I thought that we decided to make the query above equivalent to: > {code} > 0: jdbc:drill:schema=dfs> explain plan for select sum(a1) over(partition by > b1) from t1; > {code} > Explain plan (notice the frame and an empty "order by"): > {code} > | 00-00 Screen > 00-01 Project(EXPR$0=[CASE(>($2, 0), CAST($3):ANY, null)]) > 00-02 Window(window#0=[window(partition {1} order by [] range between > UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [COUNT($0), $SUM0($0)])]) > 00-03 SelectionVectorRemover > 00-04 Sort(sort0=[$1], dir0=[ASC]) > 00-05 Project(a1=[$1], b1=[$0]) > 00-06 Scan(groupscan=[ParquetGroupScan > [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/t1]], > selectionRoot=/drill/testdata/subqueries/t1, numFiles=1, columns=[`a1`, > `b1`]]]) > {code} > If I add order by, I get an error, because based on our design (see > DRILL-3188) order by only allows the frames "RANGE UNBOUNDED PRECEDING" and > "RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW": > {code} > 0: jdbc:drill:schema=dfs> select sum(a1) over(partition by b1 order by c1 > range between unbounded preceding and unbounded following ) from t1; > Error: UNSUPPORTED_OPERATION ERROR: This type of window frame is currently > not supported > See Apache Drill JIRA: DRILL-3188 > [Error Id: 7b2f1e39-0ad2-4584-aa4c-bdace84adfe4 on atsqa4-133.qa.lab:31010] > (state=,code=0) > {code} > "{color:red}ROWS{color} BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING" > works: > {code} > 0: jdbc:drill:schema=dfs> select sum(a1) over(partition by b1 rows between > unbounded preceding and unbounded following ) from t1; > +---------+ > | EXPR$0 | > +---------+ > | 1 | > | 2 | > | 3 | > | 5 | > | 6 | > | 7 | > | null | > | 9 | > | 10 | > | 4 | > +---------+ > 10 rows selected (0.312 seconds) > {code} > explain plan: notice the empty "order by" as well. > {code} > 00-01 Project(EXPR$0=[CASE(>($2, 0), CAST($3):ANY, null)]) > 00-02 Window(window#0=[window(partition {1} order by [] rows between > UNBOUNDED PRECEDING and UNBOUNDED FOLLOWING aggs [COUNT($0), $SUM0($0)])]) > 00-03 SelectionVectorRemover > 00-04 Sort(sort0=[$1], dir0=[ASC]) > 00-05 Project(a1=[$1], b1=[$0]) > 00-06 Scan(groupscan=[ParquetGroupScan > [entries=[ReadEntryWithPath [path=maprfs:///drill/testdata/subqueries/t1]], > selectionRoot=/drill/testdata/subqueries/t1, numFiles=1, columns=[`a1`, > `b1`]]]) > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-4428) Need a better error message when GROUPS keyword used in window frame-clause
Khurram Faraaz created DRILL-4428: - Summary: Need a better error message when GROUPS keyword used in window frame-clause Key: DRILL-4428 URL: https://issues.apache.org/jira/browse/DRILL-4428 Project: Apache Drill Issue Type: Bug Components: SQL Parser Affects Versions: 1.6.0 Reporter: Khurram Faraaz Priority: Minor We need to report a better message to the user that we do not currently support the GROUPS keyword as the window frame unit in the window frame clause. commit ID: 6d5f4983 Drill version : 1.6.0 {noformat} 0: jdbc:drill:schema=dfs.tmp> select count(*) OVER(PARTITION BY CAST(columns[0] as integer) GROUPS) from dfs.tmp.`t_alltype.csv`; Error: PARSE ERROR: Encountered "GROUPS" at line 1, column 63. Was expecting one of: ")" ... "," ... "ORDER" ... "ROWS" ... "RANGE" ... "ALLOW" ... "DISALLOW" ... "NOT" ... "IN" ... "BETWEEN" ... "LIKE" ... "SIMILAR" ... "=" ... ">" ... "<" ... "<=" ... ">=" ... "<>" ... "+" ... "-" ... "*" ... "/" ... "||" ... "AND" ... "OR" ... "IS" ... "MEMBER" ... "SUBMULTISET" ... "MULTISET" ... "[" ... while parsing SQL query: select count(*) OVER(PARTITION BY CAST(columns[0] as integer) GROUPS) from dfs.tmp.`t_alltype.csv` ^ [Error Id: f08e924a-bde9-48c4-bc7b-cb32c96908f2 on centos-04.qa.lab:31010] (state=,code=0) {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
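For reference, the standard GROUPS frame unit counts peer groups of the ORDER BY key rather than physical rows; a query shape the parser would eventually need to accept might look like the following (a sketch of SQL-standard syntax, not something Drill runs today):

```sql
-- GROUPS frame: one peer group before and after the current row's group.
-- An ORDER BY is required, since peer groups are defined by the sort key.
SELECT COUNT(*) OVER (
         ORDER BY CAST(columns[0] AS INTEGER)
         GROUPS BETWEEN 1 PRECEDING AND 1 FOLLOWING)
FROM dfs.tmp.`t_alltype.csv`;
```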