[jira] [Commented] (DRILL-1851) Need some samples for RANK(), ROW_NUMBER(), SubQuery in SELECTStatement - Apache Drill
[ https://issues.apache.org/jira/browse/DRILL-1851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978282#comment-14978282 ] Tom Barber commented on DRILL-1851: --- Hello folks, ROW_NUMBER works for me in 1.2 so I don't know how much of this ticket remains open: select ROW_NUMBER() OVER (ORDER BY columns[0]), columns[0] from dfs.`/home/bugg/tmp/hads/` limit 10; Just thought I'd let those watching know! Tom > Need some samples for RANK(), ROW_NUMBER(), SubQuery in SELECTStatement - > Apache Drill > -- > > Key: DRILL-1851 > URL: https://issues.apache.org/jira/browse/DRILL-1851 > Project: Apache Drill > Issue Type: New Feature > Components: Functions - Drill >Affects Versions: 0.6.0 > Environment: Drill SQL >Reporter: Chandru >Priority: Critical > Fix For: Future > > Attachments: Issue_Hugefiles-Drill.jpg > > Original Estimate: 2h > Remaining Estimate: 2h > > Provide some sample queries for the below scenarios, > 1.RANK( ) function > rank() over (ORDER BY columns[0] DESC) as rowcount > 2. ROW_NUMBER( ) function > row_number() over (PARTITION BY columns[0] ORDER BY columns[1] DESC) as > rowcount > 3. SubQuery in Select Statement. > SELECT paccnt.R_NAME, > CAST((SELECT N_NATIONKEY FROM > dfs.`/home/user/drill/drill-0.6.0/sample-data/nation.parquet`) AS CHAR) AS > NUMB > from dfs.`/home/user/drill/drill-0.6.0/sample-data/region.parquet` as paccnt; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
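Since the original request was for samples, here is a plain-Python sketch (not Drill itself; the data and key function are illustrative) of what ROW_NUMBER() and RANK() compute over an ORDER BY key, which is the semantic difference the two requested functions carry:

```python
# Sketch of window-function numbering semantics, outside any SQL engine.
def row_number(rows, key):
    # ROW_NUMBER(): a unique, gapless sequence in sort order
    return [(i + 1, r) for i, r in enumerate(sorted(rows, key=key))]

def rank(rows, key):
    # RANK(): tied keys share a rank; the next distinct key skips ahead
    out, prev, r = [], object(), 0
    for i, row in enumerate(sorted(rows, key=key)):
        k = key(row)
        if k != prev:
            r, prev = i + 1, k
        out.append((r, row))
    return out

scores = [30, 10, 20, 20]
print(row_number(scores, key=lambda x: x))  # [(1, 10), (2, 20), (3, 20), (4, 30)]
print(rank(scores, key=lambda x: x))        # [(1, 10), (2, 20), (2, 20), (4, 30)]
```

Note how the tie on 20 produces distinct row numbers (2, 3) but a shared rank (2, 2), with rank 3 skipped.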
[jira] [Updated] (DRILL-3990) Create a sys.fragments table
[ https://issues.apache.org/jira/browse/DRILL-3990?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacques Nadeau updated DRILL-3990: -- Description: Similar to DRILL-3989, we should create a table which lists all the currently executing fragments. This could include the query they are associated with, the start and stop time, the node they are executing on and maybe a couple of key metrics (e.g. records consumed, records produced, current and peak memory consumed). This could also be modeled after the sys.threads and sys.memory tables. (was: Similar to DRILL-3988, we should create a table which lists all the currently executing fragments. This could include the query they are associated with, the start and stop time, the node they are executing on and maybe a couple of key metrics (e.g. records consumed, records produced, current and peak memory consumed). This could also be modeled after the sys.threads and sys.memory tables.) > Create a sys.fragments table > > > Key: DRILL-3990 > URL: https://issues.apache.org/jira/browse/DRILL-3990 > Project: Apache Drill > Issue Type: Improvement > Components: Metadata >Reporter: Jacques Nadeau > Labels: newbie > > Similar to DRILL-3989, we should create a table which lists all the currently > executing fragments. This could include the query they are associated with, > the start and stop time, the node they are executing on and maybe a couple of > key metrics (e.g. records consumed, records produced, current and peak memory > consumed). This could also be modeled after the sys.threads and sys.memory > tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-3990) Create a sys.fragments table
Jacques Nadeau created DRILL-3990: - Summary: Create a sys.fragments table Key: DRILL-3990 URL: https://issues.apache.org/jira/browse/DRILL-3990 Project: Apache Drill Issue Type: Improvement Components: Metadata Reporter: Jacques Nadeau Similar to DRILL-3988, we should create a table which lists all the currently executing fragments. This could include the query they are associated with, the start and stop time, the node they are executing on and maybe a couple of key metrics (e.g. records consumed, records produced, current and peak memory consumed). This could also be modeled after the sys.threads and sys.memory tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3988) Create a sys.functions table to expose available Drill functions
[ https://issues.apache.org/jira/browse/DRILL-3988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacques Nadeau updated DRILL-3988: -- Labels: newbie (was: ) > Create a sys.functions table to expose available Drill functions > > > Key: DRILL-3988 > URL: https://issues.apache.org/jira/browse/DRILL-3988 > Project: Apache Drill > Issue Type: Improvement > Components: Metadata >Reporter: Jacques Nadeau > Labels: newbie > > Create a new sys.functions table that returns a list of all available > functions. > Key considerations: > - one row per name or one per argument set. I'm inclined to the latter so people > can use queries to get to the data. > - we need to create a delineation between user functions and internal > functions and only show user functions. 'CastInt' isn't something the user > should be able to see (or run). > - should we add a description annotation that could be included in the > sys.functions table? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3750) Not able to connect to HDFS and/or Hive
[ https://issues.apache.org/jira/browse/DRILL-3750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacques Nadeau updated DRILL-3750: -- Labels: (was: newbie) > Not able to connect to HDFS and/or Hive > --- > > Key: DRILL-3750 > URL: https://issues.apache.org/jira/browse/DRILL-3750 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill, Metadata, Storage - Hive, Storage - > Text & CSV >Affects Versions: 1.1.0 > Environment: apache hadoop and apache drill >Reporter: ravi ranjan kumar > Fix For: Future > > Original Estimate: 504h > Remaining Estimate: 504h > > I am not able to connect/fetch data using select queries from hive storage > and hdfs storage. > hive storage config - > { > "type": "hive", > "enabled": true, > "configProps": { > "hive.metastore.uris": "thrift://192.168.146.138:9083", > "javax.jdo.option.ConnectionURL": > "jdbc:derby:;databaseName=/home/ravi/bigdata/hive-1.0.1/metastore_db;create=true", > "hive.metastore.warehouse.dir": "/tmp/hive", > "fs.default.name": "hdfs://192.168.146.136:9000/", > "hive.metastore.sasl.enabled": "false" > } > } > Query - select * from hive.`customers` > ERROR - org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > EOFException [Error Id: 29746c1e-90fc-41f6-8263-ce943354b07e on ubuntu:31010] > HDFS storage config - > { > "type": "file", > "enabled": true, > "connection": "hdfs://192.168.146.136:9000/", > "workspaces": { > "root": { > "location": "/", > "writable": false, > "defaultInputFormat": null > }, > "tmp": { > "location": "/tmp", > "writable": true, > "defaultInputFormat": null > } > }, > "formats": { > "psv": { > "type": "text", > "extensions": [ > "tbl" > ], > "delimiter": "|" > }, > "csv": { > "type": "text", > "extensions": [ > "csv" > ], > "delimiter": "," > }, > "tsv": { > "type": "text", > "extensions": [ > "tsv" > ], > "delimiter": "\t" > }, > "parquet": { > "type": "parquet" > }, > "json": { > "type": "json" > }, > "avro": { > "type": "avro" > } > } 
> } > Query - select * from hdfs.`/customers.csv` > ERROR - org.apache.drill.common.exceptions.UserRemoteException: PARSE ERROR: > From line 1, column 15 to line 1, column 18: Table 'hdfs./customers.csv' not > found [Error Id: 13df2ccb-01bd-480f-966c-ceda7e1503a8 on ubuntu:31010] -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-3989) Create a sys.queries table
Jacques Nadeau created DRILL-3989: - Summary: Create a sys.queries table Key: DRILL-3989 URL: https://issues.apache.org/jira/browse/DRILL-3989 Project: Apache Drill Issue Type: Bug Components: Metadata Reporter: Jacques Nadeau We should create a sys.queries table that provides a clusterwide view of active queries. It could include the following columns: queryid, user, sql, current status, number of nodes involved, number of total fragments, number of fragments completed, start time This should be a pretty straightforward task as we should be able to leverage the capabilities around required affinity. A great model to build off of are the sys.memory and sys.threads tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3937) We are not pruning when we have a metadata cache and auto partitioned data in some cases
[ https://issues.apache.org/jira/browse/DRILL-3937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978457#comment-14978457 ] Aman Sinha commented on DRILL-3937: --- I found 2 more issues during testing... I will upload a new PR for this along with unit tests. > We are not pruning when we have a metadata cache and auto partitioned data in > some cases > > > Key: DRILL-3937 > URL: https://issues.apache.org/jira/browse/DRILL-3937 > Project: Apache Drill > Issue Type: Bug > Components: Metadata >Reporter: Rahul Challapalli >Assignee: Aman Sinha > Attachments: 1_0_9998.parquet, 1_0_.parquet > > > git.commit.id.abbrev=2736412 > The below plan indicates that we are not pruning > {code} > explain plan for select count(*) from dfs.`/drill/comscore/orders2` where > o_clerk='Clerk#79443'; > +--+--+ > | text | json | > +--+--+ > | 00-00Screen > 00-01 Project(EXPR$0=[$0]) > 00-02StreamAgg(group=[{}], EXPR$0=[COUNT()]) > 00-03 Project($f0=[0]) > 00-04SelectionVectorRemover > 00-05 Filter(condition=[=($0, 'Clerk#79443')]) > 00-06Scan(groupscan=[ParquetGroupScan > [entries=[ReadEntryWithPath > [path=maprfs:///drill/comscore/orders2/1_0_.parquet], ReadEntryWithPath > [path=maprfs:///drill/comscore/orders2/1_0_9998.parquet]], > selectionRoot=/drill/comscore/orders2, numFiles=2, usedMetadataFile=true, > columns=[`o_clerk`]]]) > {code} > Error from the logs > {code} > 2015-10-15 01:24:28,467 [29e0ffb4-1c91-f40a-8bf0-5e3665dcf107:foreman] WARN > o.a.d.e.p.l.partition.PruneScanRule - Exception while trying to prune > partition. 
> java.lang.ClassCastException: java.util.LinkedHashMap cannot be cast to > parquet.io.api.Binary > at > org.apache.drill.exec.store.parquet.ParquetGroupScan.populatePruningVector(ParquetGroupScan.java:414) > ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.exec.planner.ParquetPartitionDescriptor.populatePartitionVectors(ParquetPartitionDescriptor.java:96) > ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.exec.planner.logical.partition.PruneScanRule.doOnMatch(PruneScanRule.java:212) > ~[drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.exec.planner.logical.partition.ParquetPruneScanRule$2.onMatch(ParquetPruneScanRule.java:87) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:228) > [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6] > at > org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:808) > [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6] > at > org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:303) > [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6] > at > org.apache.calcite.prepare.PlannerImpl.transform(PlannerImpl.java:303) > [calcite-core-1.4.0-drill-r6.jar:1.4.0-drill-r6] > at > org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.logicalPlanningVolcanoAndLopt(DefaultSqlHandler.java:545) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:213) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:248) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.exec.planner.sql.handlers.ExplainHandler.getPlan(ExplainHandler.java:61) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > 
org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:178) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:905) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:244) > [drill-java-exec-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_71] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_71] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71] > {code} > The partition column type in this case is binary which could be causing the > issue. > Partition pruning seems to be working when we have Metadata Caching + Auto > Partitioned Files with integer partition column -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3951) Lexical Errors in ODBC Queries
[ https://issues.apache.org/jira/browse/DRILL-3951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacques Nadeau updated DRILL-3951: -- Labels: (was: newbie) > Lexical Errors in ODBC Queries > -- > > Key: DRILL-3951 > URL: https://issues.apache.org/jira/browse/DRILL-3951 > Project: Apache Drill > Issue Type: Bug > Components: Client - ODBC >Affects Versions: 1.1.0, 1.2.0 > Environment: Mac OS 10.11, Apache Drill v. 1.2, Python 3.4, >Reporter: Charles Givre > > I followed the instructions to install the latest version of Apache Drill, > and the Mapr ODBC drivers, but when I attempt to query a data source via > ODBC, I get the following errors: > Error: ('HY000', '[HY000] [MapR][Drill] (1040) Drill failed to execute the > query: `\n[30027]Query execution error. Details:[ \nPARSE > ERROR: Lexical error at line 1, column 1. Encountered: "\\ufffd" (65533), > after : ""\n\n\n[Error Id: 8e1f4049-f3e9-477f-9e3f-5df62c (1040) > (SQLExecDirectW)') > Here is the code which generates the errors: > import pyodbc > import pandas as pd > MY_DSN = > "DRIVER=/opt/mapr/drillodbc/lib/universal/libmaprdrillodbc.dylib;Host=localhost;Port=31010;ConnectionType=Direct;Catalog=Drill;Schema=mfs.views;AuthenticationType=No > Authentication" > conn = pyodbc.connect(MY_DSN, autocommit=True) > cursor = conn.cursor() > employee_query = "SELECT * FROM dfs.`employee.json`" > data = pd.read_sql( employee_query, conn ) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
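A note on the error above: "\ufffd" (65533) is the Unicode replacement character U+FFFD, so this lexical error usually points to an encoding mismatch between the client and the driver rather than to the SQL text itself. A minimal Python sketch of one plausible way such bytes arise (the UTF-16 scenario here is illustrative, not a confirmed diagnosis of the MapR driver):

```python
# U+FFFD appears when bytes cannot be decoded under the assumed encoding.
# Here a UTF-16 encoding of the query (which starts with a 2-byte BOM)
# is wrongly decoded as UTF-8: both BOM bytes are invalid UTF-8 and each
# becomes the replacement character the parser then rejects.
query = "SELECT * FROM dfs.`employee.json`"
wire_bytes = query.encode("utf-16")
decoded = wire_bytes.decode("utf-8", errors="replace")
print(repr(decoded[:2]))   # the BOM decodes to two '\ufffd' characters
print(ord("\ufffd"))       # 65533, matching the number in the error message
```

If this is the cause, forcing the connection and query string to a single agreed encoding on both sides is the usual remedy.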
[jira] [Updated] (DRILL-3925) Implementing and Configuring a Custom Authenticator
[ https://issues.apache.org/jira/browse/DRILL-3925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacques Nadeau updated DRILL-3925: -- Labels: (was: newbie) > Implementing and Configuring a Custom Authenticator > --- > > Key: DRILL-3925 > URL: https://issues.apache.org/jira/browse/DRILL-3925 > Project: Apache Drill > Issue Type: Bug > Components: Documentation, Execution - RPC, Functions - Drill >Affects Versions: 1.1.0 > Environment: MacOSX, Linux >Reporter: Tri Dung Le > > https://drill.apache.org/docs/configuring-user-authentication/#implementing-and-configuring-a-custom-authenticator > I have been reading this tutorial to implement custom authentication for > Apache Drill, but I get an error. Please help me figure out the problem. > Full details below: > {quote} > Error: Failure in starting embedded Drillbit: > org.apache.drill.exec.exception.DrillbitStartupException: Failed to find the > implementation of '{}' for type '{}' (state=,code=0) > java.sql.SQLException: Failure in starting embedded Drillbit: > org.apache.drill.exec.exception.DrillbitStartupException: Failed to find the > implementation of '{}' for type '{}' > at > org.apache.drill.jdbc.impl.DrillConnectionImpl.(DrillConnectionImpl.java:109) > at > org.apache.drill.jdbc.impl.DrillJdbc41Factory.newDrillConnection(DrillJdbc41Factory.java:66) > at > org.apache.drill.jdbc.impl.DrillFactory.newConnection(DrillFactory.java:69) > at > net.hydromatic.avatica.UnregisteredDriver.connect(UnregisteredDriver.java:126) > at org.apache.drill.jdbc.Driver.connect(Driver.java:78) > at sqlline.DatabaseConnection.connect(DatabaseConnection.java:167) > at sqlline.DatabaseConnection.getConnection(DatabaseConnection.java:213) > at sqlline.Commands.connect(Commands.java:1083) > at sqlline.Commands.connect(Commands.java:1015) > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at > sqlline.ReflectiveCommandHandler.execute(ReflectiveCommandHandler.java:36) > at sqlline.SqlLine.dispatch(SqlLine.java:734) > at sqlline.SqlLine.initArgs(SqlLine.java:519) > at sqlline.SqlLine.begin(SqlLine.java:587) > at sqlline.SqlLine.start(SqlLine.java:366) > at sqlline.SqlLine.main(SqlLine.java:259) > Caused by: org.apache.drill.exec.exception.DrillbitStartupException: Failed > to find the implementation of '{}' for type '{}' > at > org.apache.drill.exec.rpc.user.security.UserAuthenticatorFactory.createAuthenticator(UserAuthenticatorFactory.java:104) > at org.apache.drill.exec.rpc.user.UserServer.(UserServer.java:75) > at > org.apache.drill.exec.service.ServiceEngine.(ServiceEngine.java:57) > at org.apache.drill.exec.server.Drillbit.(Drillbit.java:184) > at > org.apache.drill.jdbc.impl.DrillConnectionImpl.(DrillConnectionImpl.java:99) > ... 18 more > apache drill 1.0.0 > "a drill is a terrible thing to waste" > {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3933) Error execute select command line sqlline -u -q
[ https://issues.apache.org/jira/browse/DRILL-3933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacques Nadeau updated DRILL-3933: -- Labels: bug (was: bug newbie) > Error execute select command line sqlline -u -q > --- > > Key: DRILL-3933 > URL: https://issues.apache.org/jira/browse/DRILL-3933 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.1.0 >Reporter: Jon > Labels: bug > > I'm a newbie with Drill and Jira, so sorry if this is not the correct site. > When I run: "sqlline -u 'jdbc:drill:drillbit=localhost' -q 'select * from > hive.database.table;' " it returns: > "select anaconda-ks.cfg build.out install.log install.log.syslog > ranger_tutorial sandbox.info start_ambari.sh start_hbase.sh start_solr.sh > stop_solr.sh from hive.database.table;" > Error: PARSE ERROR: Encountered "." at line 1, column 29. > Was expecting one of: > "FROM" ... > "," ... > So, to fix this, I would have to type out all the columns to make it work. > But if I use the UI at localhost:8047/query, the query works. Drill is > connected to Hive with the plugin, of course. Is this a bug, or a bad > configuration? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
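A possible explanation for the mangled query in the report above ("select anaconda-ks.cfg build.out install.log ... from hive.database.table"): the `*` was replaced by the file names of the current directory, which is exactly what shell filename expansion does to an unquoted `*`. Even though the user quoted the query on the command line, a wrapper script that later expands the argument unquoted re-triggers globbing. A sketch of the effect (the directory and file names are illustrative):

```shell
# Demonstrate how an unquoted expansion inside a script re-globs '*'
# even when the caller originally quoted it.
dir=$(mktemp -d)
cd "$dir"
touch anaconda-ks.cfg build.out install.log   # stand-ins for the home dir files

QUERY='select * from hive.db.t'
unquoted=$(echo $QUERY)    # unquoted: word-split, then '*' globs to file names
quoted=$(echo "$QUERY")    # quoted: the '*' survives literally

echo "$unquoted"           # select anaconda-ks.cfg build.out install.log from hive.db.t
echo "$quoted"             # select * from hive.db.t
```

If this is what happens, the fix belongs in the launcher script (quoting `"$@"`/the query variable), not in the user's invocation.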
[jira] [Commented] (DRILL-3987) Create a POC VV extraction
[ https://issues.apache.org/jira/browse/DRILL-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978633#comment-14978633 ] Jacques Nadeau commented on DRILL-3987: --- One of the things I'm looking at here is the right separation between describing a schema and Drill's concepts of MaterializedField and SchemaPath. It seems like we need a simplified MaterializedField in the vector classes and then a specialization that supports things like Drill's logical expressions in the Drill codebase. What do you think [~hgunes]? > Create a POC VV extraction > -- > > Key: DRILL-3987 > URL: https://issues.apache.org/jira/browse/DRILL-3987 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Jacques Nadeau >Assignee: Jacques Nadeau > > I'd like to start by looking at an extraction that pulls out the base > concepts of: > buffer allocation, value vectors and complexwriter/fieldreader. > I need to figure out how to resolve some of the cross-dependency issues (such > as the jdbc accessor connections). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-1491) Support for JDK 8
[ https://issues.apache.org/jira/browse/DRILL-1491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wong updated DRILL-1491: Attachment: DRILL-1491.1.patch.txt DRILL-1491.1.patch.txt - allow JDK 8 > Support for JDK 8 > - > > Key: DRILL-1491 > URL: https://issues.apache.org/jira/browse/DRILL-1491 > Project: Apache Drill > Issue Type: Task > Components: Tools, Build & Test >Reporter: Aditya Kishore > Fix For: Future > > Attachments: DRILL-1491.1.patch.txt > > > This will be the umbrella JIRA used to track and fix issues with JDK 8 > support. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3929) Support the ability to query database tables using external indices
[ https://issues.apache.org/jira/browse/DRILL-3929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978820#comment-14978820 ] Aman Sinha commented on DRILL-3929: --- Discussed the Phoenix integration with Calcite more on the dev list and through a Google Hangout. See the discussion on the Drill dev list: http://mail-archives.apache.org/mod_mbox/drill-dev/201510.mbox/%3ccajrw0orh+wfa2gfzgbglbrkqk9m6y4_aor5kh_rxhaag0cb...@mail.gmail.com%3e The approach of using projections and relying on materialized view rewrite in Calcite is predicated on how exactly Calcite does the MV matching and rewrites. There are at least 2 pending JIRAs, CALCITE-772 and CALCITE-773, that are known items needed for the Phoenix integration. However, I think even if they are addressed, the basic idea of converting each index column predicate into a join would not work well for Drill. It would add substantially to the join planning cost, which is not needed since we can do the secondary index planning during the physical planning stage rather than logical planning. > Support the ability to query database tables using external indices > -- > > Key: DRILL-3929 > URL: https://issues.apache.org/jira/browse/DRILL-3929 > Project: Apache Drill > Issue Type: New Feature > Components: Execution - Relational Operators, Query Planning & > Optimization >Reporter: Aman Sinha >Assignee: Aman Sinha > > This is a placeholder for adding support in Drill to query database tables > using external indices. I will add more details about the use case and a > preliminary design proposal. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3989) Create a sys.queries table
[ https://issues.apache.org/jira/browse/DRILL-3989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978892#comment-14978892 ] Jacques Nadeau commented on DRILL-3989: --- Initially I was focused on running queries. Completed queries are a much larger set; I'm not sure those should be in the same table. (One other note: someone can currently query the query log for this information.) > Create a sys.queries table > -- > > Key: DRILL-3989 > URL: https://issues.apache.org/jira/browse/DRILL-3989 > Project: Apache Drill > Issue Type: Bug > Components: Metadata >Reporter: Jacques Nadeau > Labels: newbie > > We should create a sys.queries table that provides a clusterwide view of > active queries. It could include the following columns: > queryid, user, sql, current status, number of nodes involved, number of total > fragments, number of fragments completed, start time > This should be a pretty straightforward task as we should be able to leverage > the capabilities around required affinity. A great model to build off of are > the sys.memory and sys.threads tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3989) Create a sys.queries table
[ https://issues.apache.org/jira/browse/DRILL-3989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978979#comment-14978979 ] Julian Hyde commented on DRILL-3989: I don’t think Oracle has a clear answer. Their nearest equivalent is v$sql, but they also have v$process. JDBC defines the terminology for most people, and they call it statement, albeit a JDBC statement can be executed multiple times. I don't know whether Drill gives each execution a new id, or uses the same statement id for each. MySQL gets it mixed up: “KILL QUERY terminates the statement the connection is currently executing, but leaves the connection itself intact.” https://dev.mysql.com/doc/refman/5.0/en/kill.html I'd define it as "things that are running that a DBA would like to kill". This includes SELECT queries, DML and DDL statements. Collectively, statements. Certainly, INSERT and CREATE TABLE AS SELECT can potentially take as much time & resources as queries. > Create a sys.queries table > -- > > Key: DRILL-3989 > URL: https://issues.apache.org/jira/browse/DRILL-3989 > Project: Apache Drill > Issue Type: Bug > Components: Metadata >Reporter: Jacques Nadeau > Labels: newbie > > We should create a sys.queries table that provides a clusterwide view of > active queries. It could include the following columns: > queryid, user, sql, current status, number of nodes involved, number of total > fragments, number of fragments completed, start time > This should be a pretty straightforward task as we should be able to leverage > the capabilities around required affinity. A great model to build off of are > the sys.memory and sys.threads tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3989) Create a sys.queries table
[ https://issues.apache.org/jira/browse/DRILL-3989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978874#comment-14978874 ] Julian Hyde commented on DRILL-3989: You should call it "statements". You never know... > Create a sys.queries table > -- > > Key: DRILL-3989 > URL: https://issues.apache.org/jira/browse/DRILL-3989 > Project: Apache Drill > Issue Type: Bug > Components: Metadata >Reporter: Jacques Nadeau > Labels: newbie > > We should create a sys.queries table that provides a clusterwide view of > active queries. It could include the following columns: > queryid, user, sql, current status, number of nodes involved, number of total > fragments, number of fragments completed, start time > This should be a pretty straightforward task as we should be able to leverage > the capabilities around required affinity. A great model to build off of are > the sys.memory and sys.threads tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3987) Create a POC VV extraction
[ https://issues.apache.org/jira/browse/DRILL-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978974#comment-14978974 ] Hanifi Gunes commented on DRILL-3987: - For the points above, i) > Create a POC VV extraction > -- > > Key: DRILL-3987 > URL: https://issues.apache.org/jira/browse/DRILL-3987 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Jacques Nadeau >Assignee: Jacques Nadeau > > I'd like to start by looking at an extraction that pulls out the base > concepts of: > buffer allocation, value vectors and complexwriter/fieldreader. > I need to figure out how to resolve some of the cross-dependency issues (such > as the jdbc accessor connections). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3987) Create a POC VV extraction
[ https://issues.apache.org/jira/browse/DRILL-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979006#comment-14979006 ] Jacques Nadeau commented on DRILL-3987: --- Some additional thoughts:

Packages (both codegen and code):
- org.apache.drill.common.types.TypeProtos (type stuff from protoc)
- org.apache.drill.exec.vector
- org.apache.drill.exec.vector.complex
- org.apache.drill.exec.vector.complex.impl
- org.apache.drill.exec.vector.complex.reader
- org.apache.drill.exec.vector.complex.writer

Classes:
- org.apache.drill.exec.proto.SchemaUserBitShared.SerializedField
- org.apache.drill.exec.util.CallBack
- org.apache.drill.exec.memory.BufferAllocator
- io.netty.buffer.DrillBuf

Other items:
- Need a basic version of VectorContainer (probably without the VectorWrapper and VectorAccessible)
- Need to subdivide SchemaPath/MaterializedField
- Need to extract OutOfMemory and some other exceptions
- What to do with holders...
- We probably need to extract some DrillBuf concepts externally (e.g. BufferManager)

> Create a POC VV extraction > -- > > Key: DRILL-3987 > URL: https://issues.apache.org/jira/browse/DRILL-3987 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Jacques Nadeau >Assignee: Jacques Nadeau > > I'd like to start by looking at an extraction that pulls out the base > concepts of: > buffer allocation, value vectors and complexwriter/fieldreader. > I need to figure out how to resolve some of the cross-dependency issues (such > as the jdbc accessor connections). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-2175) Provide an option to not display the list of files in the physical plan
[ https://issues.apache.org/jira/browse/DRILL-2175?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978793#comment-14978793 ] Rahul Challapalli commented on DRILL-2175: -- We are currently using the text plan in the extended tests for testing partition pruning. Whenever the order of files changes, we might run into test failures, and some effort is needed to fix the tests. So getting rid of the files scanned will be helpful. We should also make sure that this applies across all Scans (JSON, Parquet, Hive, etc.) Also, the HiveScan currently does not display the "numFiles" attribute in the scan. Without this attribute we cannot test the hive partition pruning if we end up getting rid of the list of scanned files/partitions https://issues.apache.org/jira/browse/DRILL-3634 > Provide an option to not display the list of files in the physical plan > --- > > Key: DRILL-2175 > URL: https://issues.apache.org/jira/browse/DRILL-2175 > Project: Apache Drill > Issue Type: Improvement > Components: Query Planning & Optimization >Reporter: Aman Sinha > Fix For: Future > > > The physical plan shown through explain (both the text and json version) > shows all the files to be read by the Scan node. This creates a problem > when the number of files is large (e.g hundreds) - I am unable to see the > entire plan even after raising the sqlline maxwidth to 500K (default of 10K > is too small). This is a usability issue. > We could provide an option - either through another version of Explain or > through a session option - to not display the entire list of files. Another > option is to show the parent directory and the number of files it contains. > The total number of files is shown already. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3989) Create a sys.queries table
[ https://issues.apache.org/jira/browse/DRILL-3989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978872#comment-14978872 ] Khurram Faraaz commented on DRILL-3989: --- Should we also have a column holding the total execution time for a query that has completed? > Create a sys.queries table > -- > > Key: DRILL-3989 > URL: https://issues.apache.org/jira/browse/DRILL-3989 > Project: Apache Drill > Issue Type: Bug > Components: Metadata >Reporter: Jacques Nadeau > Labels: newbie > > We should create a sys.queries table that provides a clusterwide view of > active queries. It could include the following columns: > queryid, user, sql, current status, number of nodes involved, number of total > fragments, number of fragments completed, start time > This should be a pretty straightforward task as we should be able to leverage > the capabilities around required affinity. A great model to build off of are > the sys.memory and sys.threads tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3989) Create a sys.queries table
[ https://issues.apache.org/jira/browse/DRILL-3989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978894#comment-14978894 ] Jacques Nadeau commented on DRILL-3989: --- Do you mean the table e.g. sys.statements? I can see that. Is that what Oracle calls it? > Create a sys.queries table > -- > > Key: DRILL-3989 > URL: https://issues.apache.org/jira/browse/DRILL-3989 > Project: Apache Drill > Issue Type: Bug > Components: Metadata >Reporter: Jacques Nadeau > Labels: newbie > > We should create a sys.queries table that provides a clusterwide view of > active queries. It could include the following columns: > queryid, user, sql, current status, number of nodes involved, number of total > fragments, number of fragments completed, start time > This should be a pretty straightforward task as we should be able to leverage > the capabilities around required affinity. A great model to build off of are > the sys.memory and sys.threads tables. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3991) Support schema changes in hash join operator
[ https://issues.apache.org/jira/browse/DRILL-3991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] amit hadke updated DRILL-3991: -- Description: Hash join should be able to support schema changes during execution. It should resolve edge cases when join columns are missing. Example: |Table A | Table B| |--|:---:| | k1 v1 | k2 v2| | 1 "a" | "2" "b"| | 2 "b" | 1"a"| | 2.0 "b" | 2.0 "b"| | 3 "c" | | A INNER JOIN B on A.k1=B.k2 |k1 | v1 | k2|v2| |---|::|--:|--:| | 1 | "a" | 1 | "a" | | 2 | "b" | 2.0 | "b" | | 2.0 | "b" | 2.0 | "b" | Where in output k1 is a union type (INTEGER, DOUBLE) k2 is a union type (INTEGER, DOUBLE, VARCHAR) was: Hash join should be able to support schema changes during execution. It should resolve edge cases when join columns are missing. Example: Table A Table B k1 v1k2 v2 1 "a" "2" "b" 2 "b"1"a" 2.0 "b" 2.0 "b" 3 "c" A inner join B on A.key=B.key k1 v1 k2v2 1 "a" 1 "a" 2 "b" 2.0 "b" 2.0 "b" 2.0 "b" Where in output k1 is a union type (INTEGER, DOUBLE) k2 is a union type (INTEGER, DOUBLE, VARCHAR) > Support schema changes in hash join operator > > > Key: DRILL-3991 > URL: https://issues.apache.org/jira/browse/DRILL-3991 > Project: Apache Drill > Issue Type: Improvement >Reporter: amit hadke > > Hash join should be able to support schema changes during execution. > It should resolve edge cases when join columns are missing. > Example: > |Table A | Table B| > |--|:---:| > | k1 v1 | k2 v2| > | 1 "a" | "2" "b"| > | 2 "b" | 1"a"| > | 2.0 "b" | 2.0 "b"| > | 3 "c" | | > > A INNER JOIN B on A.k1=B.k2 > |k1 | v1 | k2|v2| > |---|::|--:|--:| > | 1 | "a" | 1 | "a" | > | 2 | "b" | 2.0 | "b" | > | 2.0 | "b" | 2.0 | "b" | > Where in output > > k1 is a union type (INTEGER, DOUBLE) > k2 is a union type (INTEGER, DOUBLE, VARCHAR) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3991) Support schema changes in hash join operator
[ https://issues.apache.org/jira/browse/DRILL-3991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] amit hadke updated DRILL-3991: -- Description: Hash join should be able to support schema changes during execution. It should resolve edge cases when join columns are missing. Example: |Table A | Table B| | k1 v1 | k2 v2| | 1 "a" | "2" "b"| | 2 "b" | 1"a"| | 2.0 "b" | 2.0 "b"| | 3 "c" | | A INNER JOIN B on A.k1=B.k2 |k1 | v1 | k2|v2| | 1 | "a" | 1 | "a" | | 2 | "b" | 2.0 | "b" | | 2.0 | "b" | 2.0 | "b" | Where in output k1 is of union type (INTEGER, DOUBLE) k2 is of union type (INTEGER, DOUBLE, VARCHAR) was: Hash join should be able to support schema changes during execution. It should resolve edge cases when join columns are missing. Example: |Table A | Table B| | k1 v1 | k2 v2| | 1 "a" | "2" "b"| | 2 "b" | 1"a"| | 2.0 "b" | 2.0 "b"| | 3 "c" | | A INNER JOIN B on A.k1=B.k2 |k1 | v1 | k2|v2| | 1 | "a" | 1 | "a" | | 2 | "b" | 2.0 | "b" | | 2.0 | "b" | 2.0 | "b" | Where in output k1 is a union type (INTEGER, DOUBLE) k2 is a union type (INTEGER, DOUBLE, VARCHAR) > Support schema changes in hash join operator > > > Key: DRILL-3991 > URL: https://issues.apache.org/jira/browse/DRILL-3991 > Project: Apache Drill > Issue Type: Improvement >Reporter: amit hadke > > Hash join should be able to support schema changes during execution. > It should resolve edge cases when join columns are missing. > Example: > |Table A | Table B| > | k1 v1 | k2 v2| > | 1 "a" | "2" "b"| > | 2 "b" | 1"a"| > | 2.0 "b" | 2.0 "b"| > | 3 "c" | | > > A INNER JOIN B on A.k1=B.k2 > |k1 | v1 | k2|v2| > | 1 | "a" | 1 | "a" | > | 2 | "b" | 2.0 | "b" | > | 2.0 | "b" | 2.0 | "b" | > Where in output > > k1 is of union type (INTEGER, DOUBLE) > k2 is of union type (INTEGER, DOUBLE, VARCHAR) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (DRILL-2123) Order of columns in the Web UI is wrong when columns are explicitly specified in projection list
[ https://issues.apache.org/jira/browse/DRILL-2123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sudheesh Katkam reassigned DRILL-2123: -- Assignee: Sudheesh Katkam > Order of columns in the Web UI is wrong when columns are explicitly specified > in projection list > > > Key: DRILL-2123 > URL: https://issues.apache.org/jira/browse/DRILL-2123 > Project: Apache Drill > Issue Type: Bug > Components: Client - HTTP >Affects Versions: 0.8.0 >Reporter: Victoria Markman >Assignee: Sudheesh Katkam >Priority: Critical > Fix For: Future > > Attachments: Screen Shot 2015-01-29 at 4.08.06 PM.png > > > I'm running query: > {code} > select c_integer, >c_bigint, >nullif(c_integer, c_bigint) > from `dfs.aggregation`.t1 > order by c_integer > {code} > In sqlline I get correct order of columns: > {code} > 0: jdbc:drill:schema=dfs> select c_integer, c_bigint, nullif(c_integer, > c_bigint) from `dfs.aggregation`.t1; > ++++ > | c_integer | c_bigint | EXPR$2 | > ++++ > | 451237400 | -3477884857818808320 | 451237400 | > {code} > In Web UI - columns are sorted in alphabetical order. > Screenshot is attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3991) Support schema changes in hash join operator
[ https://issues.apache.org/jira/browse/DRILL-3991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] amit hadke updated DRILL-3991: -- Description: Hash join should be able to support schema changes during execution. It should resolve edge cases when join columns are missing. Example: |Table A | Table B| | k1 v1 | k2 v2| | 1 "a" | "2" "b"| | 2 "b" | 1"a"| | 2.0 "b" | 2.0 "b"| | 3 "c" | | A INNER JOIN B on A.k1=B.k2 |k1 | v1 | k2|v2| | 1 | "a" | 1 | "a" | | 2 | "b" | 2.0 | "b" | | 2.0 | "b" | 2.0 | "b" | Where in output k1 is a union type (INTEGER, DOUBLE) k2 is a union type (INTEGER, DOUBLE, VARCHAR) was: Hash join should be able to support schema changes during execution. It should resolve edge cases when join columns are missing. Example: |Table A | Table B| |--|:---:| | k1 v1 | k2 v2| | 1 "a" | "2" "b"| | 2 "b" | 1"a"| | 2.0 "b" | 2.0 "b"| | 3 "c" | | A INNER JOIN B on A.k1=B.k2 |k1 | v1 | k2|v2| |---|::|--:|--:| | 1 | "a" | 1 | "a" | | 2 | "b" | 2.0 | "b" | | 2.0 | "b" | 2.0 | "b" | Where in output k1 is a union type (INTEGER, DOUBLE) k2 is a union type (INTEGER, DOUBLE, VARCHAR) > Support schema changes in hash join operator > > > Key: DRILL-3991 > URL: https://issues.apache.org/jira/browse/DRILL-3991 > Project: Apache Drill > Issue Type: Improvement >Reporter: amit hadke > > Hash join should be able to support schema changes during execution. > It should resolve edge cases when join columns are missing. > Example: > |Table A | Table B| > | k1 v1 | k2 v2| > | 1 "a" | "2" "b"| > | 2 "b" | 1"a"| > | 2.0 "b" | 2.0 "b"| > | 3 "c" | | > > A INNER JOIN B on A.k1=B.k2 > |k1 | v1 | k2|v2| > | 1 | "a" | 1 | "a" | > | 2 | "b" | 2.0 | "b" | > | 2.0 | "b" | 2.0 | "b" | > Where in output > > k1 is a union type (INTEGER, DOUBLE) > k2 is a union type (INTEGER, DOUBLE, VARCHAR) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (DRILL-3991) Support schema changes in hash join operator
[ https://issues.apache.org/jira/browse/DRILL-3991?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] amit hadke reassigned DRILL-3991: - Assignee: amit hadke > Support schema changes in hash join operator > > > Key: DRILL-3991 > URL: https://issues.apache.org/jira/browse/DRILL-3991 > Project: Apache Drill > Issue Type: Improvement >Reporter: amit hadke >Assignee: amit hadke > > Hash join should be able to support schema changes during execution. > It should resolve edge cases when join columns are missing. > Example: > |Table A | Table B| > | k1 v1 | k2 v2| > | 1 "a" | "2" "b"| > | 2 "b" | 1"a"| > | 2.0 "b" | 2.0 "b"| > | 3 "c" | | > > A INNER JOIN B on A.k1=B.k2 > |k1 | v1 | k2|v2| > | 1 | "a" | 1 | "a" | > | 2 | "b" | 2.0 | "b" | > | 2.0 | "b" | 2.0 | "b" | > Where in output > > k1 is of union type (INTEGER, DOUBLE) > k2 is of union type (INTEGER, DOUBLE, VARCHAR) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
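The join behavior described in DRILL-3991 can be sketched outside Drill as a small hash join over mixed-type key columns. The following is an illustrative Python sketch, not Drill's implementation: numeric keys (INTEGER, DOUBLE) are canonicalized so 2 and 2.0 hash and compare equal, while the VARCHAR key "2" matches nothing, reproducing the expected output table from the issue.

```python
# Illustrative sketch (not Drill's operator code): a hash join whose key
# column holds a union of types. Numeric keys are mapped to a common
# representation so INTEGER 2 and DOUBLE 2.0 land in the same bucket,
# while the string "2" stays distinct.
def canon(key):
    # Canonicalize numeric values; leave strings (VARCHAR) alone.
    if isinstance(key, (int, float)):
        return float(key)
    return key

def hash_join(left, right):
    # left/right: lists of (key, value) pairs; returns inner-join rows
    # as (k1, v1, k2, v2) tuples.
    table = {}
    for k, v in left:
        table.setdefault(canon(k), []).append((k, v))
    out = []
    for k2, v2 in right:
        for k1, v1 in table.get(canon(k2), []):
            out.append((k1, v1, k2, v2))
    return out

# Tables A and B from the issue description.
a = [(1, "a"), (2, "b"), (2.0, "b"), (3, "c")]
b = [("2", "b"), (1, "a"), (2.0, "b")]
joined = hash_join(a, b)
```

Probing with DOUBLE 2.0 matches both the INTEGER 2 and DOUBLE 2.0 rows of A, while VARCHAR "2" and the unmatched key 3 produce no rows, matching the issue's expected result.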
[jira] [Comment Edited] (DRILL-3987) Create a POC VV extraction
[ https://issues.apache.org/jira/browse/DRILL-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14978974#comment-14978974 ] Hanifi Gunes edited comment on DRILL-3987 at 10/28/15 7:47 PM: --- Vectors should store specific types of values, supporting append-only writes & random reads, as well as exporting convenience functions for zero-copy buffer transfer and for accessing vector metadata like buffer size, schema, etc. So for the points above, we need to export i) a purified ByteBuf sub-interface; DrillBuf seems overly convoluted with operator, fragment-context, and similar operations. ii) a subset of Drill's BufferAllocator, removing Drill-specific logic like getFragmentLimit iii) builders to instantiate vectors, writers to support append-only writes, and readers to make random reads iv) Involving RPC-related code in the base library sounds out of scope; I would model transfers as happening amongst vectors. v) You can export a vector into a metadata & composite buffer; it would be really nice if you could build it back again. Exporting convenience classes/methods like VectorContainers and RecordBatchLoader (will need a better name here :) would be really complementary. vi) I would also propose revisiting the design to abstract out a ListVector and remove Repeated* types. vii) [~jnadeau] we had a lot of difficulty in the past due to the serialized/materialized mix, especially with computing hash codes and with materialized fields mismatching complex VV instances. At this point, I think having an immutable vector descriptor along with an immutable schema descriptor built lazily on demand (see BaseVV#getMetadataBuilder) would make sense. To me a barebones vector descriptor is as simple as a path/name + type (all immutable); we should be able to create a vector using just these two. We can still keep MField for carrying metadata info. Will look at this more as the PoC takes shape. 
was (Author: hgunes): For the points above, i) > Create a POC VV extraction > -- > > Key: DRILL-3987 > URL: https://issues.apache.org/jira/browse/DRILL-3987 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Jacques Nadeau >Assignee: Jacques Nadeau > > I'd like to start by looking at an extraction that pulls out the base > concepts of: > buffer allocation, value vectors and complexwriter/fieldreader. > I need to figure out how to resolve some of the cross-dependency issues (such > as the jdbc accessor connections). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
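As a rough illustration of the vector contract described in points i)–iii) above — append-only writes, random reads, and a descriptor built from just a name and a type — here is a toy Python sketch. The class and method names are hypothetical and do not correspond to the actual Drill or extracted-library API.

```python
# Toy sketch of the proposed vector contract (hypothetical names, not the
# real API): append-only writes, random reads, and metadata derived from
# an immutable name + type descriptor.
class AppendOnlyVector:
    def __init__(self, name, type_name):
        # The barebones descriptor: path/name + type, both immutable.
        self.name, self.type_name = name, type_name
        self._values = []

    def append(self, value):
        # Writer path: append-only, no in-place updates.
        self._values.append(value)

    def get(self, index):
        # Reader path: random access by position.
        return self._values[index]

    def metadata(self):
        # Lazily built metadata view, analogous to a schema descriptor.
        return {"name": self.name, "type": self.type_name,
                "value_count": len(self._values)}

v = AppendOnlyVector("n_nationkey", "INTEGER")
v.append(7)
v.append(9)
```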
[jira] [Commented] (DRILL-3994) Build Fails on Windows after DRILL-3742
[ https://issues.apache.org/jira/browse/DRILL-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979704#comment-14979704 ] Jacques Nadeau commented on DRILL-3994: --- Hey [~julienledem], sounds like this patch may have created a regression in the Windows build. Can you take a look? > Build Fails on Windows after DRILL-3742 > --- > > Key: DRILL-3994 > URL: https://issues.apache.org/jira/browse/DRILL-3994 > Project: Apache Drill > Issue Type: Bug > Components: Tools, Build & Test >Reporter: Sudheesh Katkam >Assignee: Julien Le Dem >Priority: Critical > > Build fails on Windows on the latest master: > {code} > c:\drill> mvn clean install -DskipTests > ... > [INFO] Rat check: Summary of files. Unapproved: 0 unknown: 0 generated: 0 > approved: 169 licence. > [INFO] > [INFO] <<< exec-maven-plugin:1.2.1:java (default) < validate @ drill-common > <<< > [INFO] > [INFO] --- exec-maven-plugin:1.2.1:java (default) @ drill-common --- > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". > SLF4J: Defaulting to no-operation (NOP) logger implementation > SLF4J: See > http://www.slf4j.org/codes.html#StaticLoggerBinder > for further details. > Scanning: C:\drill\common\target\classes > [WARNING] > java.lang.reflect.InvocationTargetException > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:297) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.IllegalArgumentException: > file:C:/drill/common/target/classes/ not in > [file:/C:/drill/common/target/classes/] > at > org.apache.drill.common.scanner.BuildTimeScan.main(BuildTimeScan.java:129) > ... 
6 more > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] Apache Drill Root POM .. SUCCESS [ 10.016 > s] > [INFO] tools/Parent Pom ... SUCCESS [ 1.062 > s] > [INFO] tools/freemarker codegen tooling ... SUCCESS [ 6.922 > s] > [INFO] Drill Protocol . SUCCESS [ 10.062 > s] > [INFO] Common (Logical Plan, Base expressions) FAILURE [ 9.954 > s] > [INFO] contrib/Parent Pom . SKIPPED > [INFO] contrib/data/Parent Pom SKIPPED > [INFO] contrib/data/tpch-sample-data .. SKIPPED > [INFO] exec/Parent Pom SKIPPED > [INFO] exec/Java Execution Engine . SKIPPED > [INFO] exec/JDBC Driver using dependencies SKIPPED > [INFO] JDBC JAR with all dependencies . SKIPPED > [INFO] contrib/mongo-storage-plugin ... SKIPPED > [INFO] contrib/hbase-storage-plugin ... SKIPPED > [INFO] contrib/jdbc-storage-plugin SKIPPED > [INFO] contrib/hive-storage-plugin/Parent Pom . SKIPPED > [INFO] contrib/hive-storage-plugin/hive-exec-shaded ... SKIPPED > [INFO] contrib/hive-storage-plugin/core ... SKIPPED > [INFO] contrib/drill-gis-plugin ... SKIPPED > [INFO] Packaging and Distribution Assembly SKIPPED > [INFO] contrib/sqlline SKIPPED > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > [INFO] Total time: 38.813 s > [INFO] Finished at: 2015-10-28T12:17:19-07:00 > [INFO] Final Memory: 67M/466M > [INFO] > > [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:java > (default) on project drill-common: An exception occured while executing the > Java class. null: InvocationTargetException: > file:C:/drill/common/target/classes/ not in > [file:/C:/drill/common/target/classes/] -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. 
> [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException > [ERROR] > [ERROR] After correcting the problems, you can resume the build
[jira] [Updated] (DRILL-3994) Build Fails on Windows after DRILL-3742
[ https://issues.apache.org/jira/browse/DRILL-3994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jacques Nadeau updated DRILL-3994: -- Assignee: Julien Le Dem > Build Fails on Windows after DRILL-3742 > --- > > Key: DRILL-3994 > URL: https://issues.apache.org/jira/browse/DRILL-3994 > Project: Apache Drill > Issue Type: Bug > Components: Tools, Build & Test >Reporter: Sudheesh Katkam >Assignee: Julien Le Dem >Priority: Critical > > Build fails on Windows on the latest master: > {code} > c:\drill> mvn clean install -DskipTests > ... > [INFO] Rat check: Summary of files. Unapproved: 0 unknown: 0 generated: 0 > approved: 169 licence. > [INFO] > [INFO] <<< exec-maven-plugin:1.2.1:java (default) < validate @ drill-common > <<< > [INFO] > [INFO] --- exec-maven-plugin:1.2.1:java (default) @ drill-common --- > SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". > SLF4J: Defaulting to no-operation (NOP) logger implementation > SLF4J: See > http://www.slf4j.org/codes.html#StaticLoggerBinder > for further details. > Scanning: C:\drill\common\target\classes > [WARNING] > java.lang.reflect.InvocationTargetException > at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) > at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > at java.lang.reflect.Method.invoke(Method.java:606) > at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:297) > at java.lang.Thread.run(Thread.java:745) > Caused by: java.lang.IllegalArgumentException: > file:C:/drill/common/target/classes/ not in > [file:/C:/drill/common/target/classes/] > at > org.apache.drill.common.scanner.BuildTimeScan.main(BuildTimeScan.java:129) > ... 6 more > [INFO] > > [INFO] Reactor Summary: > [INFO] > [INFO] Apache Drill Root POM .. SUCCESS [ 10.016 > s] > [INFO] tools/Parent Pom ... SUCCESS [ 1.062 > s] > [INFO] tools/freemarker codegen tooling ... 
SUCCESS [ 6.922 > s] > [INFO] Drill Protocol . SUCCESS [ 10.062 > s] > [INFO] Common (Logical Plan, Base expressions) FAILURE [ 9.954 > s] > [INFO] contrib/Parent Pom . SKIPPED > [INFO] contrib/data/Parent Pom SKIPPED > [INFO] contrib/data/tpch-sample-data .. SKIPPED > [INFO] exec/Parent Pom SKIPPED > [INFO] exec/Java Execution Engine . SKIPPED > [INFO] exec/JDBC Driver using dependencies SKIPPED > [INFO] JDBC JAR with all dependencies . SKIPPED > [INFO] contrib/mongo-storage-plugin ... SKIPPED > [INFO] contrib/hbase-storage-plugin ... SKIPPED > [INFO] contrib/jdbc-storage-plugin SKIPPED > [INFO] contrib/hive-storage-plugin/Parent Pom . SKIPPED > [INFO] contrib/hive-storage-plugin/hive-exec-shaded ... SKIPPED > [INFO] contrib/hive-storage-plugin/core ... SKIPPED > [INFO] contrib/drill-gis-plugin ... SKIPPED > [INFO] Packaging and Distribution Assembly SKIPPED > [INFO] contrib/sqlline SKIPPED > [INFO] > > [INFO] BUILD FAILURE > [INFO] > > [INFO] Total time: 38.813 s > [INFO] Finished at: 2015-10-28T12:17:19-07:00 > [INFO] Final Memory: 67M/466M > [INFO] > > [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:java > (default) on project drill-common: An exception occured while executing the > Java class. null: InvocationTargetException: > file:C:/drill/common/target/classes/ not in > [file:/C:/drill/common/target/classes/] -> [Help 1] > [ERROR] > [ERROR] To see the full stack trace of the errors, re-run Maven with the -e > switch. > [ERROR] Re-run Maven using the -X switch to enable full debug logging. > [ERROR] > [ERROR] For more information about the errors and possible solutions, please > read the following articles: > [ERROR] [Help 1] > http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException > [ERROR] > [ERROR] After correcting the problems, you can resume the build with the > command > [ERROR] mvn -rf :drill-common > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
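The failure above boils down to two spellings of the same Windows directory URL: the scanner produces file:C:/... while the classpath entry is file:/C:/..., so a literal membership check fails even though both name the same directory. The following Python sketch reproduces the mismatch and shows one way to normalize it; it illustrates the symptom only and is not the actual fix applied in BuildTimeScan.

```python
# Sketch of the path-to-URL mismatch behind the Windows build failure
# (assumption: the scanner compares URL strings built by two different
# code paths). "file:C:/..." and "file:/C:/..." denote the same directory
# but fail a naive string membership test.
def normalize(url):
    # Strip the scheme and any leading slashes so both spellings compare
    # equal; real code would parse the URI properly.
    path = url[len("file:"):] if url.startswith("file:") else url
    return path.lstrip("/")

scanned = "file:C:/drill/common/target/classes/"
classpath = ["file:/C:/drill/common/target/classes/"]

# Naive check, as in the failing build: False despite identical paths.
naive = scanned in classpath
# Normalized check: True.
fixed = any(normalize(scanned) == normalize(u) for u in classpath)
```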
[jira] [Created] (DRILL-3995) Scalar replacement bug with Common Subexpression Elimination
Steven Phillips created DRILL-3995: -- Summary: Scalar replacement bug with Common Subexpression Elimination Key: DRILL-3995 URL: https://issues.apache.org/jira/browse/DRILL-3995 Project: Apache Drill Issue Type: Bug Reporter: Steven Phillips The following query: {code} select t1.full_name from cp.`employee.json` t1, cp.`department.json` t2 where t1.department_id = t2.department_id and t1.position_id = t2.department_id {code} fails with the following: org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: RuntimeException: Error at instruction 43: Expected an object reference, but found . setValue(II)V 0 R I I . . . . : :L0 1 R I I . . . . : : LINENUMBER 249 L0 2 R I I . . . . : : ICONST_0 3 R I I . . . . : I : ISTORE 3 4 R I I I . . . : : LCONST_0 5 R I I I . . . : J : LSTORE 4 6 R I I I J . . : :L1 7 R I I I J . . : : LINENUMBER 251 L1 8 R I I I J . . : : ALOAD 0 9 R I I I J . . : R : GETFIELD org/apache/drill/exec/test/generated/HashTableGen2$BatchHolder.vv20 : Lorg/apache/drill/exec/vector/NullableBigIntVector; 00010 R I I I J . . : R : INVOKEVIRTUAL org/apache/drill/exec/vector/NullableBigIntVector.getAccessor ()Lorg/apache/drill/exec/vector/NullableBigIntVector$Accessor; 00011 R I I I J . . : R : ILOAD 1 00012 R I I I J . . : R I : INVOKEVIRTUAL org/apache/drill/exec/vector/NullableBigIntVector$Accessor.isSet (I)I 00013 R I I I J . . : I : ISTORE 3 00014 R I I I J . . : :L2 00015 R I I I J . . : : LINENUMBER 252 L2 00016 R I I I J . . : : ILOAD 3 00017 R I I I J . . : I : ICONST_1 00018 R I I I J . . : I I : IF_ICMPNE L3 00019 R I I I J . . : :L4 00020 ? : LINENUMBER 253 L4 00021 ? : ALOAD 0 00022 ? : GETFIELD org/apache/drill/exec/test/generated/HashTableGen2$BatchHolder.vv20 : Lorg/apache/drill/exec/vector/NullableBigIntVector; 00023 ? : INVOKEVIRTUAL org/apache/drill/exec/vector/NullableBigIntVector.getAccessor ()Lorg/apache/drill/exec/vector/NullableBigIntVector$Accessor; 00024 ? : ILOAD 1 00025 ? 
: INVOKEVIRTUAL org/apache/drill/exec/vector/NullableBigIntVector$Accessor.get (I)J 00026 ? : LSTORE 4 00027 R I I I J . . : :L3 00028 R I I I J . . : : LINENUMBER 256 L3 00029 R I I I J . . : : ILOAD 3 00030 R I I I J . . : I : ICONST_0 00031 R I I I J . . : I I : IF_ICMPEQ L5 00032 R I I I J . . : :L6 00033 ? : LINENUMBER 257 L6 00034 ? : ALOAD 0 00035 ? : GETFIELD org/apache/drill/exec/test/generated/HashTableGen2$BatchHolder.vv24 : Lorg/apache/drill/exec/vector/NullableBigIntVector; 00036 ? : INVOKEVIRTUAL org/apache/drill/exec/vector/NullableBigIntVector.getMutator ()Lorg/apache/drill/exec/vector/NullableBigIntVector$Mutator; 00037 ? : ILOAD 2 00038 ? : ILOAD 3 00039 ? : LLOAD 4 00040 ? : INVOKEVIRTUAL org/apache/drill/exec/vector/NullableBigIntVector$Mutator.set (IIJ)V 00041 R I I I J . . : :L5 00042 R I I I J . . : : LINENUMBER 259 L5 00043 R I I I J . . : : ALOAD 6 00044 ? : GETFIELD org/apache/drill/exec/expr/holders/NullableBigIntHolder.isSet : I 00045 ? : ICONST_0 00046 ? : IF_ICMPEQ L7 00047 ? :L8 00048 ? : LINENUMBER 260 L8 00049 ? : ALOAD 0 00050 ? : GETFIELD org/apache/drill/exec/test/generated/HashTableGen2$BatchHolder.vv27 : Lorg/apache/drill/exec/vector/NullableBigIntVector; 00051 ? : INVOKEVIRTUAL org/apache/drill/exec/vector/NullableBigIntVector.getMutator ()Lorg/apache/drill/exec/vector/NullableBigIntVector$Mutator; 00052 ? : ILOAD 2 00053 ? : ALOAD 6 00054 ? : GETFIELD org/apache/drill/exec/expr/holders/NullableBigIntHolder.isSet : I 00055 ? : ALOAD 6 00056 ? : GETFIELD org/apache/drill/exec/expr/holders/NullableBigIntHolder.value : J 00057 ? : INVOKEVIRTUAL org/apache/drill/exec/vector/NullableBigIntVector$Mutator.set (IIJ)V 00058 ? :L7 00059 ? : LINENUMBER 245 L7 00060 ? : RETURN 00061 ? :L9 when common subexpressions are eliminated (see DRILL-3912). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3912) Common subexpression elimination in code generation
[ https://issues.apache.org/jira/browse/DRILL-3912?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979747#comment-14979747 ] ASF GitHub Bot commented on DRILL-3912: --- Github user StevenMPhillips commented on the pull request: https://github.com/apache/drill/pull/189#issuecomment-152066327 @jinfengni I updated the PR. Could you take a look? > Common subexpression elimination in code generation > --- > > Key: DRILL-3912 > URL: https://issues.apache.org/jira/browse/DRILL-3912 > Project: Apache Drill > Issue Type: Improvement >Reporter: Steven Phillips >Assignee: Jinfeng Ni > > Drill currently will evaluate the full expression tree, even if there are > redundant subtrees. Many of these redundant evaluations can be eliminated by > reusing the results from previously evaluated expression trees. > For example, > {code} > select a + 1, (a + 1)* (a - 1) from t > {code} > Will compute the entire (a + 1) expression twice. With CSE, it will only be > evaluated once. > The benefit will be reducing the work done when evaluating expressions, as > well as reducing the amount of code that is generated, which could also lead > to better JIT optimization. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
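The effect described in DRILL-3912 can be illustrated with a small expression interpreter that caches subtree results, so a repeated subtree such as (a + 1) is evaluated only once. This is a hedged sketch of the general CSE idea in Python, not Drill's code-generation approach.

```python
# Minimal sketch of common subexpression elimination by caching expression
# subtrees: each distinct subtree is evaluated once and reused, so (a + 1)
# in "a + 1, (a + 1) * (a - 1)" is computed a single time.
def evaluate(expr, env, cache, counter):
    # expr: nested tuples like ("+", "a", 1); env: variable bindings;
    # counter tracks how many times each operator actually runs.
    if not isinstance(expr, tuple):
        return env.get(expr, expr)
    if expr in cache:
        return cache[expr]            # reuse previously evaluated subtree
    op, l, r = expr
    lv = evaluate(l, env, cache, counter)
    rv = evaluate(r, env, cache, counter)
    counter[op] = counter.get(op, 0) + 1
    result = {"+": lv + rv, "-": lv - rv, "*": lv * rv}[op]
    cache[expr] = result
    return result

a_plus_1 = ("+", "a", 1)
exprs = [a_plus_1, ("*", a_plus_1, ("-", "a", 1))]
cache, counter = {}, {}
results = [evaluate(e, {"a": 5}, cache, counter) for e in exprs]
```

With a = 5 the two select expressions evaluate to 6 and 24, and the counter shows "+" ran only once despite (a + 1) appearing twice.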
[jira] [Commented] (DRILL-3993) Rebase Drill on Calcite 1.5.0 release
[ https://issues.apache.org/jira/browse/DRILL-3993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979636#comment-14979636 ] Sudheesh Katkam commented on DRILL-3993: [~julianhyde] I completely agree with you. I think my question could have been phrased better. I wanted to know if there are any documented steps that we take every time to catch up. And thanks to the Calcite community for allowing Drillers to check for regressions before a release; that has been very helpful :) > Rebase Drill on Calcite 1.5.0 release > - > > Key: DRILL-3993 > URL: https://issues.apache.org/jira/browse/DRILL-3993 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.2.0 >Reporter: Sudheesh Katkam > > Calcite keeps moving, and now we need to catch up to Calcite 1.5, and ensure > there are no regressions. > Also, how do we resolve this 'catching up' issue in the long term? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3623) Hive query hangs with limit 0 clause
[ https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979707#comment-14979707 ] ASF GitHub Bot commented on DRILL-3623: --- Github user jacques-n commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-152059989 Just to add to my comment above, if you want to do a quick call or hangout to discuss I'm more than happy to. As I said above, it is possible I am misunderstanding. If so, I'll definitely revise my objection. > Hive query hangs with limit 0 clause > > > Key: DRILL-3623 > URL: https://issues.apache.org/jira/browse/DRILL-3623 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Hive >Affects Versions: 1.1.0 > Environment: MapR cluster >Reporter: Andries Engelbrecht >Assignee: Jinfeng Ni > Fix For: Future > > > Running a select * from hive.table limit 0 does not return (hangs). > Select * from hive.table limit 1 works fine > Hive table is about 6GB with 330 files with parquet using snappy compression. > Data types are int, bigint, string and double. > Querying directory with parquet files through the DFS plugin works fine > select * from dfs.root.`/user/hive/warehouse/database/table` limit 0; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3623) Hive query hangs with limit 0 clause
[ https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979632#comment-14979632 ] ASF GitHub Bot commented on DRILL-3623: --- Github user jacques-n commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-152047328 Interesting. Can you explain where the time is coming from? It isn't clear to me why this will have a big impact over what we had before. While you're pushing the limit down to just above the scan nodes, we already had an optimization which avoided parallelization. Since we're pipelined this really shouldn't matter much. Is limit zero not working right in the limit operator? It should terminate upon receiving schema, not wait until a batch of actual records (I'm wondering if it is doing the latter). Is sending zero records through causing operators to skip compilation? In what cases was this change taking something from hundreds of seconds to a few seconds? I'm asking these questions so I can better understand as I want to make sure there isn't a bug somewhere else. Thanks! > Hive query hangs with limit 0 clause > > > Key: DRILL-3623 > URL: https://issues.apache.org/jira/browse/DRILL-3623 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Hive >Affects Versions: 1.1.0 > Environment: MapR cluster >Reporter: Andries Engelbrecht >Assignee: Jinfeng Ni > Fix For: Future > > > Running a select * from hive.table limit 0 does not return (hangs). > Select * from hive.table limit 1 works fine > Hive table is about 6GB with 330 files with parquet using snappy compression. > Data types are int, bigint, string and double. > Querying directory with parquet files through the DFS plugin works fine > select * from dfs.root.`/user/hive/warehouse/database/table` limit 0; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3623) Hive query hangs with limit 0 clause
[ https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979671#comment-14979671 ] ASF GitHub Bot commented on DRILL-3623: --- Github user sudheeshkatkam commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-152053724 I think I see the source of confusion (sorry); this patch does not address that query in the JIRA, which is why Jinfeng asked me to change the title in one of his comments. Regarding that query, DRILL-3921 helps avoid most of the execution time, but we still incur the planning time. And my initial approach addresses this issue, but as mentioned above, it is blocked by DRILL-2288 and other things. The new approach actually addresses any query that has a limit 0 above a blocking operator that consumes all records. And avoiding parallelization made the query much worse. (Actually, was fast-schema supposed to still kick in? Did not seem like it from my experiments.) I tested against a query like `SELECT * FROM (SELECT COUNT(DISTINCT a), COUNT(DISTINCT b), COUNT(DISTINCT c) FROM very_large_table) T LIMIT 0` and this completed two orders of magnitude faster. > Hive query hangs with limit 0 clause > > > Key: DRILL-3623 > URL: https://issues.apache.org/jira/browse/DRILL-3623 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Hive >Affects Versions: 1.1.0 > Environment: MapR cluster >Reporter: Andries Engelbrecht >Assignee: Jinfeng Ni > Fix For: Future > > > Running a select * from hive.table limit 0 does not return (hangs). > Select * from hive.table limit 1 works fine > Hive table is about 6GB with 330 files with parquet using snappy compression. > Data types are int, bigint, string and double. > Querying directory with parquet files through the DFS plugin works fine > select * from dfs.root.`/user/hive/warehouse/database/table` limit 0; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3623) Hive query hangs with limit 0 clause
[ https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979701#comment-14979701 ] ASF GitHub Bot commented on DRILL-3623: --- Github user jacques-n commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-152056732 I'm sorry to say that I'm -1 on this change. It seems to be adding a planning rewrite rule where there should be a simple execution bug fix. Let's just fix the execution bug. Limit 0 should complete its execution the moment it receives a schema (as part of fast schema). It doesn't need to receive any records. You just described a situation where it is waiting for records from a blocking operator. That shouldn't be the case. If there is some other real benefit to this change after that execution bug is fixed, let's revisit in that light. If you think I'm misunderstanding your description of the execution behavior or the dynamics involved, please help me to better understand. > Hive query hangs with limit 0 clause > > > Key: DRILL-3623 > URL: https://issues.apache.org/jira/browse/DRILL-3623 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Hive >Affects Versions: 1.1.0 > Environment: MapR cluster >Reporter: Andries Engelbrecht >Assignee: Jinfeng Ni > Fix For: Future > > > Running a select * from hive.table limit 0 does not return (hangs). > Select * from hive.table limit 1 works fine > Hive table is about 6GB with 330 files with parquet using snappy compression. > Data types are int, bigint, string and double. > Querying directory with parquet files through the DFS plugin works fine > select * from dfs.root.`/user/hive/warehouse/database/table` limit 0; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
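The behavior Jacques describes — a LIMIT 0 operator completing as soon as the first schema-only batch arrives, without ever requesting record batches from a blocking upstream — can be sketched as a tiny pull-based pipeline. The operator and batch shapes below are hypothetical illustrations, not Drill's actual iterator protocol.

```python
# Sketch of fast-schema LIMIT 0 termination (hypothetical batch protocol,
# not Drill's RecordBatch API): the upstream yields a schema-only batch
# first, then record batches. With fetch=0 the limit operator returns
# after the schema batch, so the blocking upstream is never pulled for
# records.
def limit_operator(upstream, fetch):
    consumed = 0
    for kind, rows in upstream:
        if kind == "SCHEMA":
            yield ("SCHEMA", None)
            if fetch == 0:
                return              # terminate on schema; no records needed
        else:
            take = rows[: fetch - consumed]
            consumed += len(take)
            yield ("BATCH", take)
            if consumed >= fetch:
                return

def blocking_upstream():
    yield ("SCHEMA", None)
    # Reaching this point would mean the limit asked for actual records.
    raise RuntimeError("blocking operator was asked for records")

events = list(limit_operator(blocking_upstream(), fetch=0))
```

Because generators are lazy, the loop never advances the upstream past its schema batch, so the RuntimeError guard is never hit — mirroring the claim that a correct LIMIT 0 need not wait on a blocking operator.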
[jira] [Commented] (DRILL-3623) Hive query hangs with limit 0 clause
[ https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979198#comment-14979198 ] ASF GitHub Bot commented on DRILL-3623: --- Github user jinfengni commented on a diff in the pull request: https://github.com/apache/drill/pull/193#discussion_r43315255 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/FindLimit0Visitor.java --- @@ -46,6 +51,32 @@ public static boolean containsLimit0(RelNode rel) { return visitor.isContains(); } + public static DrillRel addLimitOnTopOfLeafNodes(final DrillRel rel) { +final RelShuttleImpl shuttle = new RelShuttleImpl() { + + private RelNode addLimitAsParent(RelNode node) { +final RexBuilder builder = node.getCluster().getRexBuilder(); +final RexLiteral offset = builder.makeExactLiteral(BigDecimal.ZERO); +final RexLiteral fetch = builder.makeExactLiteral(BigDecimal.ZERO); +return new DrillLimitRel(node.getCluster(), node.getTraitSet(), node, offset, fetch); --- End diff -- I understand that in your case you only put DrillLimitRel. But you may want to make this Visitor more general, such that it could create any kind of LimitRel, including DrillLimitRel, LogicalLimitRel, DrillLimitPrel, etc. You can do that by defining a LimitFactory and passing it to this Visitor. This is similar to what other Calcite rules do to make the code more general. > Hive query hangs with limit 0 clause > > > Key: DRILL-3623 > URL: https://issues.apache.org/jira/browse/DRILL-3623 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Hive >Affects Versions: 1.1.0 > Environment: MapR cluster >Reporter: Andries Engelbrecht >Assignee: Jinfeng Ni > Fix For: Future > > > Running a select * from hive.table limit 0 does not return (hangs). > Select * from hive.table limit 1 works fine > Hive table is about 6GB with 330 files with parquet using snappy compression. > Data types are int, bigint, string and double. 
> Querying directory with parquet files through the DFS plugin works fine > select * from dfs.root.`/user/hive/warehouse/database/table` limit 0; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
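The LimitFactory suggestion in the review comment above can be sketched with simplified stand-in types. None of the classes below are the real Calcite/Drill API — they only illustrate how passing a factory makes the leaf-wrapping shuttle reusable for any limit variant (DrillLimitRel, LogicalLimitRel, DrillLimitPrel, ...):

```java
import java.util.Optional;

public class LimitShuttleSketch {

    // Minimal stand-ins for Calcite RelNodes (illustrative, not the real API).
    static abstract class RelNode {
        abstract Optional<RelNode> input();
        abstract String describe();
    }

    static class Scan extends RelNode {
        Optional<RelNode> input() { return Optional.empty(); }
        String describe() { return "Scan"; }
    }

    static class Project extends RelNode {
        final RelNode child;
        Project(RelNode child) { this.child = child; }
        Optional<RelNode> input() { return Optional.of(child); }
        String describe() { return "Project(" + child.describe() + ")"; }
    }

    static class Limit extends RelNode {
        final RelNode child;
        final long offset, fetch;
        Limit(RelNode child, long offset, long fetch) {
            this.child = child; this.offset = offset; this.fetch = fetch;
        }
        Optional<RelNode> input() { return Optional.of(child); }
        String describe() { return "Limit[" + offset + "," + fetch + "](" + child.describe() + ")"; }
    }

    // The factory the reviewer asked for: callers decide which limit rel to build.
    interface LimitFactory {
        RelNode createLimit(RelNode input, long offset, long fetch);
    }

    // Walk the plan; wrap every leaf (TableScan -- and Values, per the other
    // review comment) in a LIMIT 0 produced by the factory.
    static RelNode addLimitOnLeaves(RelNode node, LimitFactory factory) {
        if (!node.input().isPresent()) {
            return factory.createLimit(node, 0L, 0L);
        }
        RelNode rewritten = addLimitOnLeaves(node.input().get(), factory);
        return (node instanceof Project) ? new Project(rewritten)
                                         : new Limit(rewritten, 0L, 0L);
    }

    public static void main(String[] args) {
        LimitFactory logical = Limit::new;   // swap in any other variant here
        RelNode plan = new Project(new Scan());
        System.out.println(addLimitOnLeaves(plan, logical).describe());
        // Project(Limit[0,0](Scan))
    }
}
```

The shuttle itself never names a concrete limit class, so the same traversal works for logical and physical rewrites alike — which is the generality the review asks for.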
[jira] [Created] (DRILL-3992) Unable to query Oracle
Eric Roma created DRILL-3992: Summary: Unable to query Oracle Key: DRILL-3992 URL: https://issues.apache.org/jira/browse/DRILL-3992 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.2.0 Environment: Windows 7 Enterprise 64-bit, Oracle 10g, Teradata 15.00 Reporter: Eric Roma Priority: Minor Fix For: 1.2.0 *See External Issue URL for Stack Overflow Post* *Appears to be similar issue at http://stackoverflow.com/questions/33370438/apache-drill-1-2-and-sql-server-jdbc** Using Apache Drill v1.2 and Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit in embedded mode. I'm curious if anyone has had any success connecting Apache Drill to an Oracle DB. I've updated the drill-override.conf with the following configurations (per documents): drill.exec: { cluster-id: "drillbits1", zk.connect: "localhost:2181", drill.exec.sys.store.provider.local.path = "/mypath" } and placed the ojdbc6.jar in \apache-drill-1.2.0\jars\3rdparty. I can successfully create the storage plug-in: { "type": "jdbc", "driver": "oracle.jdbc.driver.OracleDriver", "url": "jdbc:oracle:thin:@::", "username": "USERNAME", "password": "PASSWORD", "enabled": true } but when I issue a query such as: select * from ..`dual`; I get the following error: Query Failed: An Error Occurred org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: From line 1, column 15 to line 1, column 20: Table '..dual' not found [Error Id: 57a4153c-6378-4026-b90c-9bb727e131ae on :]. I've tried to query other schema/tables and get a similar result. I've also tried connecting to Teradata and get the same error. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3992) Unable to query Oracle DB using JDBC Storage Plug-In
[ https://issues.apache.org/jira/browse/DRILL-3992?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Eric Roma updated DRILL-3992: - Description: *See External Issue URL for Stack Overflow Post* *Appears to be similar issue at http://stackoverflow.com/questions/33370438/apache-drill-1-2-and-sql-server-jdbc* Using Apache Drill v1.2 and Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit in embedded mode. I'm curious if anyone has had any success connecting Apache Drill to an Oracle DB. I've updated the drill-override.conf with the following configurations (per documents): drill.exec: { cluster-id: "drillbits1", zk.connect: "localhost:2181", drill.exec.sys.store.provider.local.path = "/mypath" } and placed the ojdbc6.jar in \apache-drill-1.2.0\jars\3rdparty. I can successfully create the storage plug-in: { "type": "jdbc", "driver": "oracle.jdbc.driver.OracleDriver", "url": "jdbc:oracle:thin:@::", "username": "USERNAME", "password": "PASSWORD", "enabled": true } but when I issue a query such as: select * from ..`dual`; I get the following error: Query Failed: An Error Occurred org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: From line 1, column 15 to line 1, column 20: Table '..dual' not found [Error Id: 57a4153c-6378-4026-b90c-9bb727e131ae on :]. I've tried to query other schema/tables and get a similar result. I've also tried connecting to Teradata and get the same error. was: *See External Issue URL for Stack Overflow Post* *Appears to be similar issue at http://stackoverflow.com/questions/33370438/apache-drill-1-2-and-sql-server-jdbc** Using Apache Drill v1.2 and Oracle Database 10g Enterprise Edition Release 10.2.0.4.0 - 64bit in embedded mode. I'm curious if anyone has had any success connecting Apache Drill to an Oracle DB. 
I've updated the drill-override.conf with the following configurations (per documents): drill.exec: { cluster-id: "drillbits1", zk.connect: "localhost:2181", drill.exec.sys.store.provider.local.path = "/mypath" } and placed the ojdbc6.jar in \apache-drill-1.2.0\jars\3rdparty. I can successfully create the storage plug-in: { "type": "jdbc", "driver": "oracle.jdbc.driver.OracleDriver", "url": "jdbc:oracle:thin:@::", "username": "USERNAME", "password": "PASSWORD", "enabled": true } but when I issue a query such as: select * from ..`dual`; I get the following error: Query Failed: An Error Occurred org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: From line 1, column 15 to line 1, column 20: Table '..dual' not found [Error Id: 57a4153c-6378-4026-b90c-9bb727e131ae on :]. I've tried to query other schema/tables and get a similar result. I've also tried connecting to Teradata and get the same error. Summary: Unable to query Oracle DB using JDBC Storage Plug-In (was: Unable to query Oracle ) > Unable to query Oracle DB using JDBC Storage Plug-In > > > Key: DRILL-3992 > URL: https://issues.apache.org/jira/browse/DRILL-3992 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.2.0 > Environment: Windows 7 Enterprise 64-bit, Oracle 10g, Teradata 15.00 >Reporter: Eric Roma >Priority: Minor > Labels: newbie > Fix For: 1.2.0 > > > *See External Issue URL for Stack Overflow Post* > *Appears to be similar issue at > http://stackoverflow.com/questions/33370438/apache-drill-1-2-and-sql-server-jdbc* > Using Apache Drill v1.2 and Oracle Database 10g Enterprise Edition Release > 10.2.0.4.0 - 64bit in embedded mode. > I'm curious if anyone has had any success connecting Apache Drill to an > Oracle DB. 
I've updated the drill-override.conf with the following > configurations (per documents): > drill.exec: { > cluster-id: "drillbits1", > zk.connect: "localhost:2181", > drill.exec.sys.store.provider.local.path = "/mypath" > } > and placed the ojdbc6.jar in \apache-drill-1.2.0\jars\3rdparty. I can > successfully create the storage plug-in: > { > "type": "jdbc", > "driver": "oracle.jdbc.driver.OracleDriver", > "url": "jdbc:oracle:thin:@::", > "username": "USERNAME", > "password": "PASSWORD", > "enabled": true > } > but when I issue a query such as: > select * from ..`dual`; > I get the following error: > Query Failed: An Error Occurred > org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: > From line 1, column 15 to line 1, column 20: Table > '..dual' not found [Error Id: > 57a4153c-6378-4026-b90c-9bb727e131ae on :]. > I've tried to query other schema/tables and get a similar result. I've also > tried connecting to Teradata and get the same error. -- This message was sent
[jira] [Commented] (DRILL-3983) Small test improvements
[ https://issues.apache.org/jira/browse/DRILL-3983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979289#comment-14979289 ] ASF GitHub Bot commented on DRILL-3983: --- Github user julienledem commented on the pull request: https://github.com/apache/drill/pull/221#issuecomment-152001036 @adeneche Please see last commit. I made the output printing configurable so that it is less verbose in tests. https://github.com/apache/drill/commit/9b40f93122eb22055e9ebec287e5a5ebfa65a2fe > Small test improvements > --- > > Key: DRILL-3983 > URL: https://issues.apache.org/jira/browse/DRILL-3983 > Project: Apache Drill > Issue Type: Test >Reporter: Julien Le Dem >Assignee: Julien Le Dem > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3623) Hive query hangs with limit 0 clause
[ https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979209#comment-14979209 ] ASF GitHub Bot commented on DRILL-3623: --- Github user julianhyde commented on a diff in the pull request: https://github.com/apache/drill/pull/193#discussion_r43316183 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/FindLimit0Visitor.java --- @@ -46,6 +51,32 @@ public static boolean containsLimit0(RelNode rel) { return visitor.isContains(); } + public static DrillRel addLimitOnTopOfLeafNodes(final DrillRel rel) { +final RelShuttleImpl shuttle = new RelShuttleImpl() { + + private RelNode addLimitAsParent(RelNode node) { +final RexBuilder builder = node.getCluster().getRexBuilder(); +final RexLiteral offset = builder.makeExactLiteral(BigDecimal.ZERO); +final RexLiteral fetch = builder.makeExactLiteral(BigDecimal.ZERO); +return new DrillLimitRel(node.getCluster(), node.getTraitSet(), node, offset, fetch); --- End diff -- Agree with @jinfengni. In more recent versions of Calcite, use RelBuilder.limit() or .sortLimit(). The RelBuilder will be configured to create the appropriate Drill variants of all RelNodes. It might also do some useful canonization/optimization. We recommend using RelBuilder for most tasks involving creating RelNodes. > Hive query hangs with limit 0 clause > > > Key: DRILL-3623 > URL: https://issues.apache.org/jira/browse/DRILL-3623 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Hive >Affects Versions: 1.1.0 > Environment: MapR cluster >Reporter: Andries Engelbrecht >Assignee: Jinfeng Ni > Fix For: Future > > > Running a select * from hive.table limit 0 does not return (hangs). > Select * from hive.table limit 1 works fine > Hive table is about 6GB with 330 files with parquet using snappy compression. > Data types are int, bigint, string and double. 
> Querying directory with parquet files through the DFS plugin works fine > select * from dfs.root.`/user/hive/warehouse/database/table` limit 0; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3623) Hive query hangs with limit 0 clause
[ https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979213#comment-14979213 ] ASF GitHub Bot commented on DRILL-3623: --- Github user jinfengni commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-151987787 Please modify the title of JIRA DRILL-3623, since the new pull request is using a completely different approach to address the performance issue for "LIMIT 0". > Hive query hangs with limit 0 clause > > > Key: DRILL-3623 > URL: https://issues.apache.org/jira/browse/DRILL-3623 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Hive >Affects Versions: 1.1.0 > Environment: MapR cluster >Reporter: Andries Engelbrecht >Assignee: Jinfeng Ni > Fix For: Future > > > Running a select * from hive.table limit 0 does not return (hangs). > Select * from hive.table limit 1 works fine > Hive table is about 6GB with 330 files with parquet using snappy compression. > Data types are int, bigint, string and double. > Querying directory with parquet files through the DFS plugin works fine > select * from dfs.root.`/user/hive/warehouse/database/table` limit 0; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3623) Hive query hangs with limit 0 clause
[ https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979292#comment-14979292 ] ASF GitHub Bot commented on DRILL-3623: --- Github user jacques-n commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-152001143 What happened to the original strategy of short-circuiting on schema'd files? This approach still means we have to pay for all the operator compilations for no reason. > Hive query hangs with limit 0 clause > > > Key: DRILL-3623 > URL: https://issues.apache.org/jira/browse/DRILL-3623 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Hive >Affects Versions: 1.1.0 > Environment: MapR cluster >Reporter: Andries Engelbrecht >Assignee: Jinfeng Ni > Fix For: Future > > > Running a select * from hive.table limit 0 does not return (hangs). > Select * from hive.table limit 1 works fine > Hive table is about 6GB with 330 files with parquet using snappy compression. > Data types are int, bigint, string and double. > Querying directory with parquet files through the DFS plugin works fine > select * from dfs.root.`/user/hive/warehouse/database/table` limit 0; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3623) Hive query hangs with limit 0 clause
[ https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979189#comment-14979189 ] ASF GitHub Bot commented on DRILL-3623: --- Github user jinfengni commented on a diff in the pull request: https://github.com/apache/drill/pull/193#discussion_r43314473 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/FindLimit0Visitor.java --- @@ -46,6 +51,32 @@ public static boolean containsLimit0(RelNode rel) { return visitor.isContains(); } + public static DrillRel addLimitOnTopOfLeafNodes(final DrillRel rel) { +final RelShuttleImpl shuttle = new RelShuttleImpl() { + + private RelNode addLimitAsParent(RelNode node) { +final RexBuilder builder = node.getCluster().getRexBuilder(); +final RexLiteral offset = builder.makeExactLiteral(BigDecimal.ZERO); +final RexLiteral fetch = builder.makeExactLiteral(BigDecimal.ZERO); +return new DrillLimitRel(node.getCluster(), node.getTraitSet(), node, offset, fetch); + } + + @Override + public RelNode visit(TableScan scan) { --- End diff -- You also need to override visitValues, since Values can be a leaf operator as well, in addition to TableScan. > Hive query hangs with limit 0 clause > > > Key: DRILL-3623 > URL: https://issues.apache.org/jira/browse/DRILL-3623 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Hive >Affects Versions: 1.1.0 > Environment: MapR cluster >Reporter: Andries Engelbrecht >Assignee: Jinfeng Ni > Fix For: Future > > > Running a select * from hive.table limit 0 does not return (hangs). > Select * from hive.table limit 1 works fine > Hive table is about 6GB with 330 files with parquet using snappy compression. > Data types are int, bigint, string and double. > Querying directory with parquet files through the DFS plugin works fine > select * from dfs.root.`/user/hive/warehouse/database/table` limit 0; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3623) Hive query hangs with limit 0 clause
[ https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979348#comment-14979348 ] ASF GitHub Bot commented on DRILL-3623: --- Github user jinfengni commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-152013964 The original approach (skipping the execution phase for limit 0 completely) could actually have issues in some cases, due to differences between Calcite and Drill execution in how types are determined. For example, sum(int) in Calcite is resolved to int, while in Drill execution we change it to bigint. Another case is implicit cast. Currently, there are some small differences between Calcite and Drill execution. That means that if we skip the execution for limit 0, the types resolved in Calcite could differ from the types the query would get if it went through Drill execution. For a BI tool like Tableau, that means the type returned from the "limit 0" query and the type from a second query without "limit 0" could be different. If we want to avoid the above issues, we have to detect all those cases, which is painful. That's why Sudheesh and I are now more inclined towards this new approach. > Hive query hangs with limit 0 clause > > > Key: DRILL-3623 > URL: https://issues.apache.org/jira/browse/DRILL-3623 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Hive >Affects Versions: 1.1.0 > Environment: MapR cluster >Reporter: Andries Engelbrecht >Assignee: Jinfeng Ni > Fix For: Future > > > Running a select * from hive.table limit 0 does not return (hangs). > Select * from hive.table limit 1 works fine > Hive table is about 6GB with 330 files with parquet using snappy compression. > Data types are int, bigint, string and double. > Querying directory with parquet files through the DFS plugin works fine > select * from dfs.root.`/user/hive/warehouse/database/table` limit 0; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
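The sum(int) case in the comment above is easy to make concrete: accumulating in the input type can silently wrap on overflow, which is presumably why Drill widens the execution-time result of SUM over an int column to bigint. The class and method names below are illustrative only:

```java
public class SumTypeWidening {

    // Accumulate in the input type (int): silently wraps past Integer.MAX_VALUE.
    public static int sumAsInt(int[] values) {
        int total = 0;
        for (int v : values) {
            total += v;          // two's-complement wraparound on overflow
        }
        return total;
    }

    // Accumulate in the widened type (long / SQL bigint), as Drill's execution does.
    public static long sumAsBigint(int[] values) {
        long total = 0L;
        for (int v : values) {
            total += v;          // no wrap for any realistic number of int inputs
        }
        return total;
    }

    public static void main(String[] args) {
        int[] data = {Integer.MAX_VALUE, 1};
        System.out.println("int accumulator:    " + sumAsInt(data));     // -2147483648
        System.out.println("bigint accumulator: " + sumAsBigint(data));  // 2147483648
    }
}
```

A "limit 0" probe that reports the Calcite-derived int type would therefore disagree with the bigint column the full query actually returns — exactly the Tableau-style mismatch the comment warns about.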
[jira] [Assigned] (DRILL-3871) Exception on inner join when join predicate is int96 field generated by impala
[ https://issues.apache.org/jira/browse/DRILL-3871?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Deneche A. Hakim reassigned DRILL-3871: --- Assignee: Deneche A. Hakim (was: Parth Chandra) > Exception on inner join when join predicate is int96 field generated by impala > -- > > Key: DRILL-3871 > URL: https://issues.apache.org/jira/browse/DRILL-3871 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Data Types >Affects Versions: 1.2.0 >Reporter: Victoria Markman >Assignee: Deneche A. Hakim >Priority: Critical > Labels: int96 > Fix For: 1.3.0 > > Attachments: tables.tar > > > Both tables in the join where created by impala, with column c_timestamp > being parquet int96. > {code} > 0: jdbc:drill:schema=dfs> select > . . . . . . . . . . . . > max(t1.c_timestamp), > . . . . . . . . . . . . > min(t1.c_timestamp), > . . . . . . . . . . . . > count(t1.c_timestamp) > . . . . . . . . . . . . > from > . . . . . . . . . . . . > imp_t1 t1 > . . . . . . . . . . . . > inner join > . . . . . . . . . . . . > imp_t2 t2 > . . . . . . . . . . . . > on (t1.c_timestamp = t2.c_timestamp) > . . . . . . . . . . . . > ; > java.lang.RuntimeException: java.sql.SQLException: SYSTEM ERROR: > TProtocolException: Required field 'uncompressed_page_size' was not found in > serialized data! 
Struct: PageHeader(type:null, uncompressed_page_size:0, > compressed_page_size:0) > Fragment 0:0 > [Error Id: eb6a5df8-fc59-409b-957a-59cb1079b5b8 on atsqa4-133.qa.lab:31010] > at sqlline.IncrementalRows.hasNext(IncrementalRows.java:73) > at > sqlline.TableOutputFormat$ResizingRowsProvider.next(TableOutputFormat.java:87) > at sqlline.TableOutputFormat.print(TableOutputFormat.java:118) > at sqlline.SqlLine.print(SqlLine.java:1583) > at sqlline.Commands.execute(Commands.java:852) > at sqlline.Commands.sql(Commands.java:751) > at sqlline.SqlLine.dispatch(SqlLine.java:738) > at sqlline.SqlLine.begin(SqlLine.java:612) > at sqlline.SqlLine.start(SqlLine.java:366) > at sqlline.SqlLine.main(SqlLine.java:259) > {code} > drillbit.log > {code} > 2015-09-30 21:15:45,710 [29f3aefe-3209-a6e6-0418-500dac60a339:foreman] INFO > o.a.d.exec.store.parquet.Metadata - Took 0 ms to get file statuses > 2015-09-30 21:15:45,712 [29f3aefe-3209-a6e6-0418-500dac60a339:foreman] INFO > o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of > 1 using 1 threads. Time: 1ms total, 1.645381ms avg, 1ms max. > 2015-09-30 21:15:45,712 [29f3aefe-3209-a6e6-0418-500dac60a339:foreman] INFO > o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of > 1 using 1 threads. Earliest start: 1.332000 μs, Latest start: 1.332000 μs, > Average start: 1.332000 μs . 
> 2015-09-30 21:15:45,830 [29f3aefe-3209-a6e6-0418-500dac60a339:frag:0:0] INFO > o.a.d.e.w.fragment.FragmentExecutor - > 29f3aefe-3209-a6e6-0418-500dac60a339:0:0: State change requested > AWAITING_ALLOCATION --> RUNNING > 2015-09-30 21:15:45,830 [29f3aefe-3209-a6e6-0418-500dac60a339:frag:0:0] INFO > o.a.d.e.w.f.FragmentStatusReporter - > 29f3aefe-3209-a6e6-0418-500dac60a339:0:0: State to report: RUNNING > 2015-09-30 21:15:45,925 [29f3aefe-3209-a6e6-0418-500dac60a339:frag:0:0] INFO > o.a.d.e.w.fragment.FragmentExecutor - > 29f3aefe-3209-a6e6-0418-500dac60a339:0:0: State change requested RUNNING --> > FAILED > 2015-09-30 21:15:45,930 [29f3aefe-3209-a6e6-0418-500dac60a339:frag:0:0] INFO > o.a.d.e.w.fragment.FragmentExecutor - > 29f3aefe-3209-a6e6-0418-500dac60a339:0:0: State change requested FAILED --> > FINISHED > 2015-09-30 21:15:45,931 [29f3aefe-3209-a6e6-0418-500dac60a339:frag:0:0] ERROR > o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: TProtocolException: > Required field 'uncompressed_page_size' was not found in serialized data! > Struct: PageHeader(type:null, uncompressed_page_size:0, > compressed_page_size:0) > Fragment 0:0 > [Error Id: eb6a5df8-fc59-409b-957a-59cb1079b5b8 on atsqa4-133.qa.lab:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > TProtocolException: Required field 'uncompressed_page_size' was not found in > serialized data! Struct: PageHeader(type:null, uncompressed_page_size:0, > compressed_page_size:0) > Fragment 0:0 > [Error Id: eb6a5df8-fc59-409b-957a-59cb1079b5b8 on atsqa4-133.qa.lab:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534) > ~[drill-common-1.2.0-SNAPSHOT.jar:1.2.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:323) >
[jira] [Commented] (DRILL-3623) Hive query hangs with limit 0 clause
[ https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979395#comment-14979395 ] ASF GitHub Bot commented on DRILL-3623: --- Github user jacques-n commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-152018974 Got it. Thanks for the explanation. So this is a hack until we can solve those issues. I think we have to do this work, however. a 1-2 second response to a limit 0 query is too much. We should open up jiras for all of these inconsistency issues and then get Calcite and Drill in alignment. What do you think we're talking about: aggregation outputs, implicit casting. What else? > Hive query hangs with limit 0 clause > > > Key: DRILL-3623 > URL: https://issues.apache.org/jira/browse/DRILL-3623 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Hive >Affects Versions: 1.1.0 > Environment: MapR cluster >Reporter: Andries Engelbrecht >Assignee: Jinfeng Ni > Fix For: Future > > > Running a select * from hive.table limit 0 does not return (hangs). > Select * from hive.table limit 1 works fine > Hive table is about 6GB with 330 files with parquet using snappy compression. > Data types are int, bigint, string and double. > Querying directory with parquet files through the DFS plugin works fine > select * from dfs.root.`/user/hive/warehouse/database/table` limit 0; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-3993) Rebase Drill on Calcite 1.5.0 release
Sudheesh Katkam created DRILL-3993: -- Summary: Rebase Drill on Calcite 1.5.0 release Key: DRILL-3993 URL: https://issues.apache.org/jira/browse/DRILL-3993 Project: Apache Drill Issue Type: Bug Components: Query Planning & Optimization Affects Versions: 1.2.0 Reporter: Sudheesh Katkam Calcite keeps moving, and now we need to catch up to Calcite 1.5, and ensure there are no regressions. Also, how do we resolve this 'catching up' issue in the long term? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3993) Rebase Drill on Calcite 1.5.0 release
[ https://issues.apache.org/jira/browse/DRILL-3993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979474#comment-14979474 ] Jacques Nadeau commented on DRILL-3993: --- We just need to get off the fork. [~jni], can you outline the three main issues? Maybe [~sudheeshkatkam] can help resolve them. > Rebase Drill on Calcite 1.5.0 release > - > > Key: DRILL-3993 > URL: https://issues.apache.org/jira/browse/DRILL-3993 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.2.0 >Reporter: Sudheesh Katkam > > Calcite keeps moving, and now we need to catch up to Calcite 1.5, and ensure > there are no regressions. > Also, how do we resolve this 'catching up' issue in the long term? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (DRILL-3993) Rebase Drill on Calcite 1.5.0 release
[ https://issues.apache.org/jira/browse/DRILL-3993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979477#comment-14979477 ] Jacques Nadeau edited comment on DRILL-3993 at 10/28/15 11:32 PM: -- Actually, I think I remembered: * Schema Caching * Validator * AbstractConverter (e.g. trait pull-up) was (Author: jnadeau): Actually, I think I remembered: Schema Caching * Validator AbstractConverter (e.g. trait pull-up) > Rebase Drill on Calcite 1.5.0 release > - > > Key: DRILL-3993 > URL: https://issues.apache.org/jira/browse/DRILL-3993 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.2.0 >Reporter: Sudheesh Katkam > > Calcite keeps moving, and now we need to catch up to Calcite 1.5, and ensure > there are no regressions. > Also, how do we resolve this 'catching up' issue in the long term? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3993) Rebase Drill on Calcite 1.5.0 release
[ https://issues.apache.org/jira/browse/DRILL-3993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979477#comment-14979477 ] Jacques Nadeau commented on DRILL-3993: --- Actually, I think I remembered: Schema Caching * Validator AbstractConverter (e.g. trait pull-up) > Rebase Drill on Calcite 1.5.0 release > - > > Key: DRILL-3993 > URL: https://issues.apache.org/jira/browse/DRILL-3993 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.2.0 >Reporter: Sudheesh Katkam > > Calcite keeps moving, and now we need to catch up to Calcite 1.5, and ensure > there are no regressions. > Also, how do we resolve this 'catching up' issue in the long term? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3623) Hive query hangs with limit 0 clause
[ https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979502#comment-14979502 ] ASF GitHub Bot commented on DRILL-3623: --- Github user sudheeshkatkam commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-152034356 Also, on the execution side, I was actually hitting [DRILL-2288](https://issues.apache.org/jira/browse/DRILL-2288), where sending exactly one batch with schema and without data is not handled correctly by various RecordBatches. With a fix for that issue, we could add further optimization for schemaed tables (i.e. add the previous implementation) with this implementation as the fallback. > Hive query hangs with limit 0 clause > > > Key: DRILL-3623 > URL: https://issues.apache.org/jira/browse/DRILL-3623 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Hive >Affects Versions: 1.1.0 > Environment: MapR cluster >Reporter: Andries Engelbrecht >Assignee: Jinfeng Ni > Fix For: Future > > > Running a select * from hive.table limit 0 does not return (hangs). > Select * from hive.table limit 1 works fine > Hive table is about 6GB with 330 files with parquet using snappy compression. > Data types are int, bigint, string and double. > Querying directory with parquet files through the DFS plugin works fine > select * from dfs.root.`/user/hive/warehouse/database/table` limit 0; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3963) Read raw key value bytes from sequence files
[ https://issues.apache.org/jira/browse/DRILL-3963?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979526#comment-14979526 ] ASF GitHub Bot commented on DRILL-3963: --- Github user amithadke commented on the pull request: https://github.com/apache/drill/pull/214#issuecomment-152036324 @sudheeshkatkam I've added changes and tests for sequence file and Avro. They both use Hadoop's API to create a RecordReader. Thanks for helping out with the test. > Read raw key value bytes from sequence files > > > Key: DRILL-3963 > URL: https://issues.apache.org/jira/browse/DRILL-3963 > Project: Apache Drill > Issue Type: New Feature >Reporter: amit hadke >Assignee: amit hadke > > Sequence files store list of key-value pairs. Keys/values are of type hadoop > writable. > Provide a format plugin that reads raw bytes out of sequence files which can > be further deserialized by a udf(from hadoop writable -> drill type) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3993) Rebase Drill on Calcite 1.5.0 release
[ https://issues.apache.org/jira/browse/DRILL-3993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979538#comment-14979538 ] Julian Hyde commented on DRILL-3993: [~sudheeshkatkam], "Catching up" is always necessary when you separate two components into modules and version them separately. Changes in module A don't break module B's nightly builds, but B needs to periodically sync up, at a time of its choosing. I think that we put a lot of valuable features into Calcite that benefit Drill (some of them contributed by people who are also Drill committers), and I think we do a pretty good job at controlling change, so that things that do not directly benefit Drill at least do not break it. For example, we follow semantic versioning and do not remove APIs except in a major release. We have discovered with other projects that asking the downstream projects to kick the tires of a Calcite release in the run-up to a release is an effective way to find problems, and efficient in terms of time and effort for both projects. If there is anything else we can do in Calcite to make the process more efficient for Drill, let me know. > Rebase Drill on Calcite 1.5.0 release > - > > Key: DRILL-3993 > URL: https://issues.apache.org/jira/browse/DRILL-3993 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.2.0 >Reporter: Sudheesh Katkam > > Calcite keeps moving, and now we need to catch up to Calcite 1.5, and ensure > there are no regressions. > Also, how do we resolve this 'catching up' issue in the long term? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3993) Rebase Drill on Calcite 1.5.0 release
[ https://issues.apache.org/jira/browse/DRILL-3993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979492#comment-14979492 ] Jinfeng Ni commented on DRILL-3993: --- [~jnadeau], those are the three major differences between Drill's forked Calcite and Calcite master. I'm working on the first one. The third one, AbstractConverter, is also one of the reasons for Drill's long physical planning time in some cases. If we remove AbstractConverter and find a way to do trait pull-up, I believe planning time would be reduced significantly. > Rebase Drill on Calcite 1.5.0 release > - > > Key: DRILL-3993 > URL: https://issues.apache.org/jira/browse/DRILL-3993 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.2.0 >Reporter: Sudheesh Katkam > > Calcite keeps moving, and now we need to catch up to Calcite 1.5, and ensure > there are no regressions. > Also, how do we resolve this 'catching up' issue in the long term? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3623) Hive query hangs with limit 0 clause
[ https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979500#comment-14979500 ] ASF GitHub Bot commented on DRILL-3623: --- Github user jinfengni commented on the pull request: https://github.com/apache/drill/pull/193#issuecomment-152034091 Sudheesh and I feel this new approach is a big optimization step towards solving the performance issue for "limit 0" queries, rather than a hack solution: 1) It shows a quite significant reduction in query time, from hundreds of seconds to a couple of seconds in some cases. That's a big improvement. 2) It would benefit not only schema-based queries but also schema-less queries, while the original approach would apply only to schema-based queries. I agree we should continue to optimize "limit 0" queries, but for now, I think this new approach has its own merits. Aggregation and implicit casting are the two issues I can think of if we go with the schema-based approach. > Hive query hangs with limit 0 clause > > > Key: DRILL-3623 > URL: https://issues.apache.org/jira/browse/DRILL-3623 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Hive >Affects Versions: 1.1.0 > Environment: MapR cluster >Reporter: Andries Engelbrecht >Assignee: Jinfeng Ni > Fix For: Future > > > Running a select * from hive.table limit 0 does not return (hangs). > Select * from hive.table limit 1 works fine > Hive table is about 6GB with 330 files with parquet using snappy compression. > Data types are int, bigint, string and double. > Querying directory with parquet files through the DFS plugin works fine > select * from dfs.root.`/user/hive/warehouse/database/table` limit 0; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
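The rewrite under discussion pushes a LIMIT 0 onto each leaf of the plan tree so scans stop immediately after establishing schema. A toy, Calcite-free sketch of that shape (the `Node` class and operator names here are hypothetical stand-ins, not Drill's `DrillRel`/`DrillLimitRel` classes):

```java
import java.util.ArrayList;
import java.util.List;

public class Limit0Sketch {
    // Minimal plan node: an operator name plus child nodes.
    static class Node {
        final String op;
        final List<Node> inputs;
        Node(String op, List<Node> inputs) { this.op = op; this.inputs = inputs; }
    }

    // Wrap every leaf (e.g. a table scan) with a LIMIT 0 parent, leaving
    // interior nodes untouched -- the shape of the leaf-node rewrite.
    static Node addLimitOnLeaves(Node node) {
        if (node.inputs.isEmpty()) {
            List<Node> child = new ArrayList<>();
            child.add(node);
            return new Node("LIMIT 0", child);
        }
        List<Node> rewritten = new ArrayList<>();
        for (Node in : node.inputs) {
            rewritten.add(addLimitOnLeaves(in));
        }
        return new Node(node.op, rewritten);
    }

    // Render the tree as OP(child, child) for inspection.
    static String print(Node n) {
        StringBuilder sb = new StringBuilder(n.op);
        if (!n.inputs.isEmpty()) {
            sb.append("(");
            for (int i = 0; i < n.inputs.size(); i++) {
                if (i > 0) sb.append(", ");
                sb.append(print(n.inputs.get(i)));
            }
            sb.append(")");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<Node> scans = new ArrayList<>();
        scans.add(new Node("SCAN a", new ArrayList<>()));
        scans.add(new Node("SCAN b", new ArrayList<>()));
        Node join = new Node("JOIN", scans);
        System.out.println(print(addLimitOnLeaves(join)));
        // JOIN(LIMIT 0(SCAN a), LIMIT 0(SCAN b))
    }
}
```

Because the limits sit directly above the scans, no operator above them ever receives a row, which is why the rewrite helps schema-less sources as well: the plan still executes end to end, it just executes over zero rows.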
[jira] [Commented] (DRILL-3987) Create a POC VV extraction
[ https://issues.apache.org/jira/browse/DRILL-3987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979562#comment-14979562 ] Parth Chandra commented on DRILL-3987: -- I was thinking that we would start with something equivalent to the Parquet-Format project that fixes the format in an implementation- (and language-) independent way. That way, we can update the C++ implementation to keep in sync with the Java implementation as well. Also, I agree with Hanifi's comment (vii), above. A single immutable vector descriptor and a lazily built schema descriptor would be just right. > Create a POC VV extraction > -- > > Key: DRILL-3987 > URL: https://issues.apache.org/jira/browse/DRILL-3987 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Jacques Nadeau >Assignee: Jacques Nadeau > > I'd like to start by looking at an extraction that pulls out the base > concepts of: > buffer allocation, value vectors and complexwriter/fieldreader. > I need to figure out how to resolve some of the cross-dependency issues (such > as the jdbc accessor connections). -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-3994) Build Fails on Windows after DRILL-3742
Sudheesh Katkam created DRILL-3994: -- Summary: Build Fails on Windows after DRILL-3742 Key: DRILL-3994 URL: https://issues.apache.org/jira/browse/DRILL-3994 Project: Apache Drill Issue Type: Bug Components: Tools, Build & Test Reporter: Sudheesh Katkam Priority: Critical Build fails on Windows on the latest master: {code} c:\drill> mvn clean install -DskipTests ... [INFO] Rat check: Summary of files. Unapproved: 0 unknown: 0 generated: 0 approved: 169 licence. [INFO] [INFO] <<< exec-maven-plugin:1.2.1:java (default) < validate @ drill-common <<< [INFO] [INFO] --- exec-maven-plugin:1.2.1:java (default) @ drill-common --- SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder". SLF4J: Defaulting to no-operation (NOP) logger implementation SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further details. Scanning: C:\drill\common\target\classes [WARNING] java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.codehaus.mojo.exec.ExecJavaMojo$1.run(ExecJavaMojo.java:297) at java.lang.Thread.run(Thread.java:745) Caused by: java.lang.IllegalArgumentException: file:C:/drill/common/target/classes/ not in [file:/C:/drill/common/target/classes/] at org.apache.drill.common.scanner.BuildTimeScan.main(BuildTimeScan.java:129) ... 6 more [INFO] [INFO] Reactor Summary: [INFO] [INFO] Apache Drill Root POM .. SUCCESS [ 10.016 s] [INFO] tools/Parent Pom ... SUCCESS [ 1.062 s] [INFO] tools/freemarker codegen tooling ... SUCCESS [ 6.922 s] [INFO] Drill Protocol . SUCCESS [ 10.062 s] [INFO] Common (Logical Plan, Base expressions) FAILURE [ 9.954 s] [INFO] contrib/Parent Pom . SKIPPED [INFO] contrib/data/Parent Pom SKIPPED [INFO] contrib/data/tpch-sample-data .. 
SKIPPED [INFO] exec/Parent Pom SKIPPED [INFO] exec/Java Execution Engine . SKIPPED [INFO] exec/JDBC Driver using dependencies SKIPPED [INFO] JDBC JAR with all dependencies . SKIPPED [INFO] contrib/mongo-storage-plugin ... SKIPPED [INFO] contrib/hbase-storage-plugin ... SKIPPED [INFO] contrib/jdbc-storage-plugin SKIPPED [INFO] contrib/hive-storage-plugin/Parent Pom . SKIPPED [INFO] contrib/hive-storage-plugin/hive-exec-shaded ... SKIPPED [INFO] contrib/hive-storage-plugin/core ... SKIPPED [INFO] contrib/drill-gis-plugin ... SKIPPED [INFO] Packaging and Distribution Assembly SKIPPED [INFO] contrib/sqlline SKIPPED [INFO] [INFO] BUILD FAILURE [INFO] [INFO] Total time: 38.813 s [INFO] Finished at: 2015-10-28T12:17:19-07:00 [INFO] Final Memory: 67M/466M [INFO] [ERROR] Failed to execute goal org.codehaus.mojo:exec-maven-plugin:1.2.1:java (default) on project drill-common: An exception occured while executing the Java class. null: InvocationTargetException: file:C:/drill/common/target/classes/ not in [file:/C:/drill/common/target/classes/] -> [Help 1] [ERROR] [ERROR] To see the full stack trace of the errors, re-run Maven with the -e switch. [ERROR] Re-run Maven using the -X switch to enable full debug logging. [ERROR] [ERROR] For more information about the errors and possible solutions, please read the following articles: [ERROR] [Help 1] http://cwiki.apache.org/confluence/display/MAVEN/MojoExecutionException [ERROR] [ERROR] After correcting the problems, you can resume the build with the command [ERROR] mvn -rf :drill-common {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
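The root cause in the output above is two spellings of the same location: `file:C:/drill/common/target/classes/` (no slash after the scheme, which `java.net.URI` treats as opaque) versus `file:/C:/drill/common/target/classes/` (a hierarchical URI with a path). A quick sketch of why the two never compare equal, using only `java.net.URI` and independent of Drill's scanner code:

```java
import java.net.URI;

public class FileUriSketch {
    public static void main(String[] args) {
        // The form built on Windows: no slash after "file:".
        URI opaque = URI.create("file:C:/drill/common/target/classes/");
        // The canonical hierarchical form the comparison expected.
        URI hierarchical = URI.create("file:/C:/drill/common/target/classes/");

        System.out.println(opaque.isOpaque());           // true
        System.out.println(opaque.getPath());            // null -- no path component
        System.out.println(hierarchical.isOpaque());     // false
        System.out.println(hierarchical.getPath());      // /C:/drill/common/target/classes/
        System.out.println(opaque.equals(hierarchical)); // false
    }
}
```

An opaque URI has no path component at all, so any path-based containment check like the one in the stack trace ("file:C:/... not in [file:/C:/...]") is bound to fail even though both strings name the same directory.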
[jira] [Closed] (DRILL-3756) Consider loosening up the Maven checkstyle audit
[ https://issues.apache.org/jira/browse/DRILL-3756?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edmon Begoli closed DRILL-3756. --- Resolution: Fixed We agreed that this style check can stay. > Consider loosening up the Maven checkstyle audit > > > Key: DRILL-3756 > URL: https://issues.apache.org/jira/browse/DRILL-3756 > Project: Apache Drill > Issue Type: Wish > Components: Tools, Build & Test >Affects Versions: 1.1.0 > Environment: Maven build on any platform. >Reporter: Edmon Begoli >Priority: Minor > Fix For: Future > > Original Estimate: 1h > Remaining Estimate: 1h > > A space in javadoc before the end of line causes Maven build to fail on > checkstyle audit. > [INFO] --- maven-checkstyle-plugin:2.12.1:check (checkstyle-validation) @ > drill-java-exec --- > [INFO] Starting audit... > for example > /drill/exec/java-exec/src/main/java/org/apache/drill/exec/store/StoragePlugin.java:30: > Line matches the illegal pattern '\s+$'. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
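The "illegal pattern" in the report, `\s+$`, matches one or more whitespace characters immediately before the end of a line. The check can be reproduced outside Maven with plain `java.util.regex` (this mimics the rule; it is not the checkstyle plugin itself):

```java
import java.util.regex.Pattern;

public class TrailingWhitespaceCheck {
    // The "illegal pattern" from the checkstyle config: one or more
    // whitespace characters immediately before the end of the line.
    static final Pattern ILLEGAL = Pattern.compile("\\s+$");

    static boolean violates(String line) {
        return ILLEGAL.matcher(line).find();
    }

    public static void main(String[] args) {
        System.out.println(violates(" * A javadoc line with a trailing space. ")); // true
        System.out.println(violates(" * A clean javadoc line."));                  // false
    }
}
```

Note that `find()` rather than `matches()` is what reproduces the behavior: the pattern only needs to occur at the end of the line, not cover the whole line.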
[jira] [Updated] (DRILL-3726) Drill is not properly interpreting CRLF (0d0a). CR gets read as content.
[ https://issues.apache.org/jira/browse/DRILL-3726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Edmon Begoli updated DRILL-3726: Description: When we query the last attribute of a text file, we get missing characters. Looking at the row through Drill, a \r is included at the end of the last attribute. Looking in a text editor, it's not embedded into that attribute. I'm thinking that Drill is not interpreting CRLF (0d0a) as a new line, only the LF, resulting in the CR becoming part of the last attribute. was: When we query the last attribute of a text file, we get missing characters. Looking at the row through Drill, a \r is included at the end of the last attribute. Looking in a text editor, it's not embedded into that attribute. I'm thinking that Drill is not interpreting CRLF (0d0a) as a new line, only the LF, resulting in the CR becoming part of the last attribute. > Drill is not properly interpreting CRLF (0d0a). CR gets read as content. > > > Key: DRILL-3726 > URL: https://issues.apache.org/jira/browse/DRILL-3726 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Text & CSV >Affects Versions: 1.1.0 > Environment: Linux RHEL 6.6, OSX 10.9 >Reporter: Edmon Begoli > Fix For: Future > > Original Estimate: 120h > Remaining Estimate: 120h > > When we query the last attribute of a text file, we get missing characters. > Looking at the row through Drill, a \r is included at the end of the last > attribute. > Looking in a text editor, it's not embedded into that attribute. > I'm thinking that Drill is not interpreting CRLF (0d0a) as a new line, only > the LF, resulting in the CR becoming part of the last attribute. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
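The behavior described (splitting on LF alone, so the CR of each CRLF pair leaks into the last attribute) and the usual fix (strip a trailing CR after splitting) can be sketched as follows; this is an illustration, not Drill's actual text reader code:

```java
import java.util.ArrayList;
import java.util.List;

public class CrlfSketch {
    // Naive reader: split on LF only, as described in the bug -- the CR
    // from each CRLF pair stays attached to the last attribute.
    static List<String> splitLfOnly(String content) {
        List<String> rows = new ArrayList<>();
        for (String row : content.split("\n", -1)) {
            if (!row.isEmpty()) rows.add(row);
        }
        return rows;
    }

    // Fixed reader: after splitting on LF, strip a trailing CR if present,
    // so CRLF and LF line endings both parse cleanly.
    static List<String> splitCrlfAware(String content) {
        List<String> rows = new ArrayList<>();
        for (String row : content.split("\n", -1)) {
            if (row.endsWith("\r")) row = row.substring(0, row.length() - 1);
            if (!row.isEmpty()) rows.add(row);
        }
        return rows;
    }

    public static void main(String[] args) {
        String file = "a,b,c\r\nd,e,f\r\n";
        System.out.println(splitLfOnly(file).get(1).endsWith("\r"));  // true -- CR read as content
        System.out.println(splitCrlfAware(file).get(1));              // d,e,f
    }
}
```

This matches the symptom exactly: the row looks fine in a text editor (which understands CRLF) while the last attribute carries an invisible `\r` when queried.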
[jira] [Commented] (DRILL-3623) Hive query hangs with limit 0 clause
[ https://issues.apache.org/jira/browse/DRILL-3623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979610#comment-14979610 ] ASF GitHub Bot commented on DRILL-3623: --- Github user sudheeshkatkam commented on a diff in the pull request: https://github.com/apache/drill/pull/193#discussion_r43339305 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/planner/sql/handlers/FindLimit0Visitor.java --- @@ -46,6 +51,32 @@ public static boolean containsLimit0(RelNode rel) { return visitor.isContains(); } + public static DrillRel addLimitOnTopOfLeafNodes(final DrillRel rel) { +final RelShuttleImpl shuttle = new RelShuttleImpl() { + + private RelNode addLimitAsParent(RelNode node) { +final RexBuilder builder = node.getCluster().getRexBuilder(); +final RexLiteral offset = builder.makeExactLiteral(BigDecimal.ZERO); +final RexLiteral fetch = builder.makeExactLiteral(BigDecimal.ZERO); +return new DrillLimitRel(node.getCluster(), node.getTraitSet(), node, offset, fetch); --- End diff -- Thank you Julian, RelBuilder seems perfect for this case. Jinfeng, for now, making this visitor more general and using RelBuilder needs bigger changes, so I am adding a TODO(DRILL-3993). > Hive query hangs with limit 0 clause > > > Key: DRILL-3623 > URL: https://issues.apache.org/jira/browse/DRILL-3623 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Hive >Affects Versions: 1.1.0 > Environment: MapR cluster >Reporter: Andries Engelbrecht >Assignee: Jinfeng Ni > Fix For: Future > > > Running a select * from hive.table limit 0 does not return (hangs). > Select * from hive.table limit 1 works fine > Hive table is about 6GB with 330 files with parquet using snappy compression. > Data types are int, bigint, string and double. > Querying directory with parquet files through the DFS plugin works fine > select * from dfs.root.`/user/hive/warehouse/database/table` limit 0; -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3992) Unable to query Oracle DB using JDBC Storage Plug-In
[ https://issues.apache.org/jira/browse/DRILL-3992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979625#comment-14979625 ] Jacques Nadeau commented on DRILL-3992: --- Note that for reference, here are example configurations I was using to do initial testing: https://github.com/jacques-n/drill/blob/DRILL-3992/contrib/storage-jdbc/src/test/resources/bootstrap-storage-plugins.json > Unable to query Oracle DB using JDBC Storage Plug-In > > > Key: DRILL-3992 > URL: https://issues.apache.org/jira/browse/DRILL-3992 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.2.0 > Environment: Windows 7 Enterprise 64-bit, Oracle 10g, Teradata 15.00 >Reporter: Eric Roma >Priority: Minor > Labels: newbie > Fix For: 1.2.0 > > > *See External Issue URL for Stack Overflow Post* > *Appears to be similar issue at > http://stackoverflow.com/questions/33370438/apache-drill-1-2-and-sql-server-jdbc* > Using Apache Drill v1.2 and Oracle Database 10g Enterprise Edition Release > 10.2.0.4.0 - 64bit in embedded mode. > I'm curious if anyone has had any success connecting Apache Drill to an > Oracle DB. I've updated the drill-override.conf with the following > configurations (per documents): > drill.exec: { > cluster-id: "drillbits1", > zk.connect: "localhost:2181", > drill.exec.sys.store.provider.local.path = "/mypath" > } > and placed the ojdbc6.jar in \apache-drill-1.2.0\jars\3rdparty. 
I can > successfully create the storage plug-in: > { > "type": "jdbc", > "driver": "oracle.jdbc.driver.OracleDriver", > "url": "jdbc:oracle:thin:@::", > "username": "USERNAME", > "password": "PASSWORD", > "enabled": true > } > but when I issue a query such as: > select * from ..`dual`; > I get the following error: > Query Failed: An Error Occurred > org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: > From line 1, column 15 to line 1, column 20: Table > '..dual' not found [Error Id: > 57a4153c-6378-4026-b90c-9bb727e131ae on :]. > I've tried to query other schema/tables and get a similar result. I've also > tried connecting to Teradata and get the same error. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3992) Unable to query Oracle DB using JDBC Storage Plug-In
[ https://issues.apache.org/jira/browse/DRILL-3992?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14979623#comment-14979623 ] Jacques Nadeau commented on DRILL-3992: --- I have a fix which I believe resolves this issue. You can try it out by checking out the following commit and building Drill. https://github.com/jacques-n/drill/commit/b6a502652c8a8273802b79061b761d866871959b Let me know if this resolves your problem. > Unable to query Oracle DB using JDBC Storage Plug-In > > > Key: DRILL-3992 > URL: https://issues.apache.org/jira/browse/DRILL-3992 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning & Optimization >Affects Versions: 1.2.0 > Environment: Windows 7 Enterprise 64-bit, Oracle 10g, Teradata 15.00 >Reporter: Eric Roma >Priority: Minor > Labels: newbie > Fix For: 1.2.0 > > > *See External Issue URL for Stack Overflow Post* > *Appears to be similar issue at > http://stackoverflow.com/questions/33370438/apache-drill-1-2-and-sql-server-jdbc* > Using Apache Drill v1.2 and Oracle Database 10g Enterprise Edition Release > 10.2.0.4.0 - 64bit in embedded mode. > I'm curious if anyone has had any success connecting Apache Drill to an > Oracle DB. I've updated the drill-override.conf with the following > configurations (per documents): > drill.exec: { > cluster-id: "drillbits1", > zk.connect: "localhost:2181", > drill.exec.sys.store.provider.local.path = "/mypath" > } > and placed the ojdbc6.jar in \apache-drill-1.2.0\jars\3rdparty. 
I can > successfully create the storage plug-in: > { > "type": "jdbc", > "driver": "oracle.jdbc.driver.OracleDriver", > "url": "jdbc:oracle:thin:@::", > "username": "USERNAME", > "password": "PASSWORD", > "enabled": true > } > but when I issue a query such as: > select * from ..`dual`; > I get the following error: > Query Failed: An Error Occurred > org.apache.drill.common.exceptions.UserRemoteException: VALIDATION ERROR: > From line 1, column 15 to line 1, column 20: Table > '..dual' not found [Error Id: > 57a4153c-6378-4026-b90c-9bb727e131ae on :]. > I've tried to query other schema/tables and get a similar result. I've also > tried connecting to Teradata and get the same error. -- This message was sent by Atlassian JIRA (v6.3.4#6332)