[jira] [Commented] (DRILL-4155) Query with two-way join and flatten fails with "IllegalArgumentException: maxCapacity"
[ https://issues.apache.org/jira/browse/DRILL-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15037103#comment-15037103 ] Abhishek Girish commented on DRILL-4155: Looks like the profile doesn't get created (or possibly I couldn't find it). > Query with two-way join and flatten fails with "IllegalArgumentException: > maxCapacity" > -- > > Key: DRILL-4155 > URL: https://issues.apache.org/jira/browse/DRILL-4155 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Flow, Storage - JSON >Reporter: Abhishek Girish > Attachments: drillbit.log.txt > > > The following query on the Yelp Academic dataset fails to execute: > {code} > select u.name, b.name , flatten(b.categories) from > maprfs.yelp_tutorial.`yelp_academic_dataset_user.json` u, > maprfs.yelp_tutorial.`yelp_academic_dataset_business.json` b where > u.average_stars = b.stars limit 10 > Query Failed: An Error Occurred > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > IllegalArgumentException: maxCapacity: -104845 (expected: >= 0) Fragment 1:0 > [Error Id: b0d99a6c-3434-49ce-8aa6-181993cdd853 on atsqa6c62.qa.lab:31010] > {code} > Tried on multiple setups in distributed mode - consistently fails. > Dataset can be accessed from: > https://s3.amazonaws.com/apache-drill/files/yelp.tgz (uncompressed tar > archive) > Log attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-4155) Query with two-way join and flatten fails with "IllegalArgumentException: maxCapacity"
[ https://issues.apache.org/jira/browse/DRILL-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Girish updated DRILL-4155: --- Attachment: (was: profile.json.txt) > Query with two-way join and flatten fails with "IllegalArgumentException: > maxCapacity" > -- > > Key: DRILL-4155 > URL: https://issues.apache.org/jira/browse/DRILL-4155 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Flow, Storage - JSON >Reporter: Abhishek Girish > Attachments: drillbit.log.txt > > > The following query on the Yelp Academic dataset fails to execute: > {code} > select u.name, b.name , flatten(b.categories) from > maprfs.yelp_tutorial.`yelp_academic_dataset_user.json` u, > maprfs.yelp_tutorial.`yelp_academic_dataset_business.json` b where > u.average_stars = b.stars limit 10 > Query Failed: An Error Occurred > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > IllegalArgumentException: maxCapacity: -104845 (expected: >= 0) Fragment 1:0 > [Error Id: b0d99a6c-3434-49ce-8aa6-181993cdd853 on atsqa6c62.qa.lab:31010] > {code} > Tried on multiple setups in distributed mode - consistently fails. > Dataset can be accessed from: > https://s3.amazonaws.com/apache-drill/files/yelp.tgz (uncompressed tar > archive) > Log attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-4155) Query with two-way join and flatten fails with "IllegalArgumentException: maxCapacity"
[ https://issues.apache.org/jira/browse/DRILL-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Girish updated DRILL-4155: --- Attachment: profile.json.txt > Query with two-way join and flatten fails with "IllegalArgumentException: > maxCapacity" > -- > > Key: DRILL-4155 > URL: https://issues.apache.org/jira/browse/DRILL-4155 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Flow, Storage - JSON >Reporter: Abhishek Girish > Attachments: drillbit.log.txt, profile.json.txt > > > The following query on the Yelp Academic dataset fails to execute: > {code} > select u.name, b.name , flatten(b.categories) from > maprfs.yelp_tutorial.`yelp_academic_dataset_user.json` u, > maprfs.yelp_tutorial.`yelp_academic_dataset_business.json` b where > u.average_stars = b.stars limit 10 > Query Failed: An Error Occurred > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > IllegalArgumentException: maxCapacity: -104845 (expected: >= 0) Fragment 1:0 > [Error Id: b0d99a6c-3434-49ce-8aa6-181993cdd853 on atsqa6c62.qa.lab:31010] > {code} > Tried on multiple setups in distributed mode - consistently fails. > Dataset can be accessed from: > https://s3.amazonaws.com/apache-drill/files/yelp.tgz (uncompressed tar > archive) > Log attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-4155) Query with two-way join and flatten fails with "IllegalArgumentException: maxCapacity"
[ https://issues.apache.org/jira/browse/DRILL-4155?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Girish updated DRILL-4155: --- Attachment: drillbit.log.txt > Query with two-way join and flatten fails with "IllegalArgumentException: > maxCapacity" > -- > > Key: DRILL-4155 > URL: https://issues.apache.org/jira/browse/DRILL-4155 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Flow, Storage - JSON >Reporter: Abhishek Girish > Attachments: drillbit.log.txt > > > The following query on the Yelp Academic dataset fails to execute: > {code} > select u.name, b.name , flatten(b.categories) from > maprfs.yelp_tutorial.`yelp_academic_dataset_user.json` u, > maprfs.yelp_tutorial.`yelp_academic_dataset_business.json` b where > u.average_stars = b.stars limit 10 > Query Failed: An Error Occurred > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > IllegalArgumentException: maxCapacity: -104845 (expected: >= 0) Fragment 1:0 > [Error Id: b0d99a6c-3434-49ce-8aa6-181993cdd853 on atsqa6c62.qa.lab:31010] > {code} > Tried on multiple setups in distributed mode - consistently fails. > Dataset can be accessed from: > https://s3.amazonaws.com/apache-drill/files/yelp.tgz (uncompressed tar > archive) > Log attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-4155) Query with two-way join and flatten fails with "IllegalArgumentException: maxCapacity"
Abhishek Girish created DRILL-4155: -- Summary: Query with two-way join and flatten fails with "IllegalArgumentException: maxCapacity" Key: DRILL-4155 URL: https://issues.apache.org/jira/browse/DRILL-4155 Project: Apache Drill Issue Type: Bug Components: Execution - Flow, Storage - JSON Reporter: Abhishek Girish Attachments: drillbit.log.txt The following query on the Yelp Academic dataset fails to execute: {code} select u.name, b.name , flatten(b.categories) from maprfs.yelp_tutorial.`yelp_academic_dataset_user.json` u, maprfs.yelp_tutorial.`yelp_academic_dataset_business.json` b where u.average_stars = b.stars limit 10 Query Failed: An Error Occurred org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: IllegalArgumentException: maxCapacity: -104845 (expected: >= 0) Fragment 1:0 [Error Id: b0d99a6c-3434-49ce-8aa6-181993cdd853 on atsqa6c62.qa.lab:31010] {code} Tried on multiple setups in distributed mode - consistently fails. Dataset can be accessed from: https://s3.amazonaws.com/apache-drill/files/yelp.tgz (uncompressed tar archive) Log attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
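A negative maxCapacity like -104845 is the classic signature of a 32-bit integer overflow: a size computation wraps past Integer.MAX_VALUE before the bounds check runs, which is plausible here given how flatten() multiplies row counts by per-row width. The sketch below is illustrative only; the numbers and the computation are assumptions, not Drill's actual buffer code.
{code}
// Illustrative sketch (hypothetical numbers, not Drill's buffer code):
// a 32-bit size computation wraps to a negative value, tripping the
// "maxCapacity: ... (expected: >= 0)" check seen in this report.
public class MaxCapacityOverflow {
  public static void main(String[] args) {
    int rows = 70_000;          // assumed rows surviving the join
    int bytesPerRow = 30_700;   // assumed width after flattening categories
    int maxCapacity = rows * bytesPerRow;  // wraps past Integer.MAX_VALUE
    System.out.println(maxCapacity);       // prints -2145967296
    if (maxCapacity < 0) {                 // mirrors the failing precondition
      throw new IllegalArgumentException(
          "maxCapacity: " + maxCapacity + " (expected: >= 0)");
    }
  }
}
{code}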
[jira] [Updated] (DRILL-4154) Metadata Caching : Upgrading cache to v2 from v1 corrupts the cache in some scenarios
[ https://issues.apache.org/jira/browse/DRILL-4154?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Rahul Challapalli updated DRILL-4154: - Attachment: fewtypes_varcharpartition.tar.tgz old-cache.txt broken-cache.txt Also, I removed the cache file from a directory and copied in another cache file. The data in the directory has not been modified. Now when I run a query over that directory, I see that the cache file is updated. To my knowledge, this should not happen. Am I missing something? > Metadata Caching : Upgrading cache to v2 from v1 corrupts the cache in some > scenarios > - > > Key: DRILL-4154 > URL: https://issues.apache.org/jira/browse/DRILL-4154 > Project: Apache Drill > Issue Type: Bug >Reporter: Rahul Challapalli >Priority: Critical > Attachments: broken-cache.txt, fewtypes_varcharpartition.tar.tgz, > old-cache.txt > > > git.commit.id.abbrev=46c47a2 > I copied the data along with the cache file onto maprfs. I then ran the > upgrade tool (https://github.com/parthchandra/drill-upgrade) and then ran the > metadata_caching suite from the functional tests (concurrency 10) without the > datagen phase. I see 3 test failures, and when I looked at the cache file it > seems to contain wrong information for the varchar column. > Sample from the cache: > {code} > { > "name" : [ "varchar_col" ] > }, { > "name" : [ "float_col" ], > "mxValue" : 68797.22, > "nulls" : 0 > } > {code} > When I followed the same steps but executed the "REFRESH TABLE METADATA" > command (or any query on that folder) instead of running the suites, the > cache file was created properly. > I attached the data and cache files required. Let me know if you need anything. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-4154) Metadata Caching : Upgrading cache to v2 from v1 corrupts the cache in some scenarios
Rahul Challapalli created DRILL-4154: Summary: Metadata Caching : Upgrading cache to v2 from v1 corrupts the cache in some scenarios Key: DRILL-4154 URL: https://issues.apache.org/jira/browse/DRILL-4154 Project: Apache Drill Issue Type: Bug Reporter: Rahul Challapalli Priority: Critical git.commit.id.abbrev=46c47a2 I copied the data along with the cache file onto maprfs. I then ran the upgrade tool (https://github.com/parthchandra/drill-upgrade) and then ran the metadata_caching suite from the functional tests (concurrency 10) without the datagen phase. I see 3 test failures, and when I looked at the cache file it seems to contain wrong information for the varchar column. Sample from the cache: {code} { "name" : [ "varchar_col" ] }, { "name" : [ "float_col" ], "mxValue" : 68797.22, "nulls" : 0 } {code} When I followed the same steps but executed the "REFRESH TABLE METADATA" command (or any query on that folder) instead of running the suites, the cache file was created properly. I attached the data and cache files required. Let me know if you need anything. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
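For triage, the corruption pattern above (a column entry that kept its "name" but lost its statistics) can be scanned for mechanically. A hedged sketch follows, assuming only the field names shown in the sample ("name", "mxValue") and Jackson on the classpath; nothing else about the cache layout is assumed.
{code}
// Hedged diagnostic sketch: flag cache entries that carry a "name" but no
// "mxValue" statistic, matching the corruption pattern in the sample above.
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.File;

public class CacheStatsCheck {
  public static void main(String[] args) throws Exception {
    JsonNode root = new ObjectMapper().readTree(new File(args[0]));
    // findParents returns every object node that contains a "name" field.
    for (JsonNode entry : root.findParents("name")) {
      if (!entry.has("mxValue")) {
        System.out.println("column entry without stats: " + entry.get("name"));
      }
    }
  }
}
{code}
Note that a column containing only nulls might legitimately lack a max value, so treat hits as candidates to inspect rather than proof of corruption.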
[jira] [Commented] (DRILL-4109) NPE in RecordIterator
[ https://issues.apache.org/jira/browse/DRILL-4109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036994#comment-15036994 ] ASF GitHub Bot commented on DRILL-4109: --- Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/282 > NPE in RecordIterator > - > > Key: DRILL-4109 > URL: https://issues.apache.org/jira/browse/DRILL-4109 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.4.0 >Reporter: Victoria Markman >Assignee: amit hadke >Priority: Blocker > Fix For: 1.4.0 > > Attachments: 29ac6c1b-9b33-3457-8bc8-9e2dff6ad438.sys.drill, > 29b41f37-4803-d7ce-e05f-912d1f65da79.sys.drill, drillbit.log, > drillbit.log.debug > > > 4 node cluster > 36GB of direct memory > 4GB heap memory > planner.memory.max_query_memory_per_node=2GB (default) > planner.enable_hashjoin = false > Spill directory has 6.4T of space available: > {noformat} > [Tue Nov 17 18:23:18 /tmp/drill ] # df -H . > Filesystem Size Used Avail Use% Mounted on > localhost:/mapr 7.7T 1.4T 6.4T 18% /mapr > {noformat} > Run the query below: > framework/resources/Advanced/tpcds/tpcds_sf100/original/query15.sql > drillbit.log > {code} > 2015-11-18 02:22:12,639 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:9] INFO > o.a.d.e.p.i.xsort.ExternalSortBatch - Merging and spilling to > /tmp/drill/spill/29b41f37-4803-d7ce-e05f-912d1f65da79/major_fragment_3/minor_fragment_9/operator_17/7 > 2015-11-18 02:22:12,770 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:5] INFO > o.a.d.e.p.i.xsort.ExternalSortBatch - Merging and spilling to > /tmp/drill/spill/29b41f37-4803-d7ce-e05f-912d1f65da79/major_fragment_3/minor_fragment_5/operator_17/7 > 2015-11-18 02:22:13,345 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:17] INFO > o.a.d.e.p.i.xsort.ExternalSortBatch - Completed spilling to > /tmp/drill/spill/29b41f37-4803-d7ce-e05f-912d1f65da79/major_fragment_3/minor_fragment_17/operator_17/7 > 2015-11-18 02:22:13,346 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] INFO > o.a.d.e.p.i.xsort.ExternalSortBatch - Completed spilling to > /tmp/drill/spill/29b41f37-4803-d7ce-e05f-912d1f65da79/major_fragment_3/minor_fragment_13/operator_16/1 > 2015-11-18 02:22:13,346 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] WARN > o.a.d.e.p.i.xsort.ExternalSortBatch - Starting to merge. 34 batch groups.
> Current allocated memory: 2252186 > 2015-11-18 02:22:13,363 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] INFO > o.a.d.e.w.fragment.FragmentExecutor - > 29b41f37-4803-d7ce-e05f-912d1f65da79:3:13: State change requested RUNNING --> > FAILED > 2015-11-18 02:22:13,370 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] INFO > o.a.d.e.w.fragment.FragmentExecutor - > 29b41f37-4803-d7ce-e05f-912d1f65da79:3:13: State change requested FAILED --> > FINISHED > 2015-11-18 02:22:13,371 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] > ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: NullPointerException > Fragment 3:13 > [Error Id: c5d67dcb-16aa-4951-89f5-599b4b4eb54d on atsqa4-133.qa.lab:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > NullPointerException > Fragment 3:13 > [Error Id: c5d67dcb-16aa-4951-89f5-599b4b4eb54d on atsqa4-133.qa.lab:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534) > ~[drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:321) > [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:184) > [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:290) > [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_71] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_71] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71] > java.lang.NullPointerException: null > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
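For anyone trying to reproduce this, the non-default session options listed in the description can be applied over JDBC before running the failing query. A sketch follows; the ZooKeeper connection URL is a placeholder, and the final statement is a stand-in for the text of framework/resources/Advanced/tpcds/tpcds_sf100/original/query15.sql.
{code}
// Repro sketch; connection URL is a placeholder, final query is a stand-in.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;

public class Drill4109Repro {
  public static void main(String[] args) throws Exception {
    try (Connection conn =
             DriverManager.getConnection("jdbc:drill:zk=localhost:2181");
         Statement stmt = conn.createStatement()) {
      // Force merge join, as in the reported configuration.
      stmt.execute("ALTER SESSION SET `planner.enable_hashjoin` = false");
      // 2GB, the reported (default) per-node query memory, in bytes.
      stmt.execute("ALTER SESSION SET "
          + "`planner.memory.max_query_memory_per_node` = 2147483648");
      stmt.executeQuery("SELECT * FROM sys.version"); // substitute query15.sql
    }
  }
}
{code}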
[jira] [Commented] (DRILL-4109) NPE in RecordIterator
[ https://issues.apache.org/jira/browse/DRILL-4109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036959#comment-15036959 ] ASF GitHub Bot commented on DRILL-4109: --- Github user StevenMPhillips commented on the pull request: https://github.com/apache/drill/pull/282#issuecomment-161478598 +1 > NPE in RecordIterator > - > > Key: DRILL-4109 > URL: https://issues.apache.org/jira/browse/DRILL-4109 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.4.0 >Reporter: Victoria Markman >Assignee: amit hadke >Priority: Blocker > Fix For: 1.4.0 > > Attachments: 29ac6c1b-9b33-3457-8bc8-9e2dff6ad438.sys.drill, > 29b41f37-4803-d7ce-e05f-912d1f65da79.sys.drill, drillbit.log, > drillbit.log.debug > > > 4 node cluster > 36GB of direct memory > 4GB heap memory > planner.memory.max_query_memory_per_node=2GB (default) > planner.enable_hashjoin = false > Spill directory has 6.4T of space available: > {noformat} > [Tue Nov 17 18:23:18 /tmp/drill ] # df -H . > Filesystem Size Used Avail Use% Mounted on > localhost:/mapr 7.7T 1.4T 6.4T 18% /mapr > {noformat} > Run the query below: > framework/resources/Advanced/tpcds/tpcds_sf100/original/query15.sql > drillbit.log > {code} > 2015-11-18 02:22:12,639 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:9] INFO > o.a.d.e.p.i.xsort.ExternalSortBatch - Merging and spilling to > /tmp/drill/spill/29b41f37-4803-d7ce-e05f-912d1f65da79/major_fragment_3/minor_fragment_9/operator_17/7 > 2015-11-18 02:22:12,770 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:5] INFO > o.a.d.e.p.i.xsort.ExternalSortBatch - Merging and spilling to > /tmp/drill/spill/29b41f37-4803-d7ce-e05f-912d1f65da79/major_fragment_3/minor_fragment_5/operator_17/7 > 2015-11-18 02:22:13,345 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:17] INFO > o.a.d.e.p.i.xsort.ExternalSortBatch - Completed spilling to > /tmp/drill/spill/29b41f37-4803-d7ce-e05f-912d1f65da79/major_fragment_3/minor_fragment_17/operator_17/7 > 2015-11-18 02:22:13,346 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] INFO > o.a.d.e.p.i.xsort.ExternalSortBatch - Completed spilling to > /tmp/drill/spill/29b41f37-4803-d7ce-e05f-912d1f65da79/major_fragment_3/minor_fragment_13/operator_16/1 > 2015-11-18 02:22:13,346 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] WARN > o.a.d.e.p.i.xsort.ExternalSortBatch - Starting to merge. 34 batch groups.
> Current allocated memory: 2252186 > 2015-11-18 02:22:13,363 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] INFO > o.a.d.e.w.fragment.FragmentExecutor - > 29b41f37-4803-d7ce-e05f-912d1f65da79:3:13: State change requested RUNNING --> > FAILED > 2015-11-18 02:22:13,370 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] INFO > o.a.d.e.w.fragment.FragmentExecutor - > 29b41f37-4803-d7ce-e05f-912d1f65da79:3:13: State change requested FAILED --> > FINISHED > 2015-11-18 02:22:13,371 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] > ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: NullPointerException > Fragment 3:13 > [Error Id: c5d67dcb-16aa-4951-89f5-599b4b4eb54d on atsqa4-133.qa.lab:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > NullPointerException > Fragment 3:13 > [Error Id: c5d67dcb-16aa-4951-89f5-599b4b4eb54d on atsqa4-133.qa.lab:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534) > ~[drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:321) > [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:184) > [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:290) > [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_71] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_71] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71] > java.lang.NullPointerException: null > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Comment Edited] (DRILL-3572) Provide a simple interface to append metadata to files and directories (.drill)
[ https://issues.apache.org/jira/browse/DRILL-3572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036898#comment-15036898 ] Julien Le Dem edited comment on DRILL-3572 at 12/2/15 11:52 PM: I created separate sub-tickets for several aspects of the dotdrill file. Each of them can be implemented independently, assuming they are separate fields in a {{.drill}} file: {noformat} { version: ... format: { ... }, schema: { ... }, error_handling: { ... } } {noformat} was (Author: julienledem): I created separate sub-tickets for several aspects of the dotdrill file. Each one of them can be implemented independently assuming they are separate fields in a {{.drill}} file: {noformat} { version: ... format: { ... }, schema: { ... }, error_handling: { ... } } {noformat} > Provide a simple interface to append metadata to files and directories > (.drill) > --- > > Key: DRILL-3572 > URL: https://issues.apache.org/jira/browse/DRILL-3572 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Other >Reporter: Jacques Nadeau > Fix For: Future > > > We need a way to store small amounts of metadata about a file or a collection > of files. The current thinking is to have a "dot drill file" that > ascribes metadata to a particular asset. > An initial example file might include the following: > {code} > { > // Drill version identifier > version: "dd1" > > // Format Plugin Configuration > format: { > type: "httpd", > format: "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" > \"%{Cookie}i\"" > }, > > // Traits of underlying data (a.k.a physical properties) > traits: [ > {type: "sort_nulls_first", columns: ["request.uri", "client.host"]}, > {type: "unique", columns: ["abc"]}, > {type: "unique", columns: ["xy", "zz"]} > ], > > // Mappings between directory names and exposed columns > dirs: [ > {skip: true}, // don't include this directory name in the directory path. > {name: "year", type: "integer"}, > {name: "month", type: "integer"}, > {name: "day", type: "integer"} > ], > // whether or not a user can add new columns to the table through insert > rigid_table: true > > } > {code} > We also need to support adding more machine-generated/managed data such as > statistics. That should be done using a separate file from the human-written > description. > A user should be able to ascribe this metadata directly through the file > system as well as through SQL commands such as > {code} > ALTER TABLE ADD METADATA ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-3572) Provide a simple interface to append metadata to files and directories (.drill)
[ https://issues.apache.org/jira/browse/DRILL-3572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036898#comment-15036898 ] Julien Le Dem commented on DRILL-3572: -- I created separate sub-tickets for several aspects of the dotdrill file. Each of them can be implemented independently, assuming they are separate fields in a {{.drill}} file: {noformat} { version: ... format: { ... }, schema: { ... }, error_handling: { ... } } {noformat} > Provide a simple interface to append metadata to files and directories > (.drill) > --- > > Key: DRILL-3572 > URL: https://issues.apache.org/jira/browse/DRILL-3572 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Other >Reporter: Jacques Nadeau > Fix For: Future > > > We need a way to store small amounts of metadata about a file or a collection > of files. The current thinking is to have a "dot drill file" that > ascribes metadata to a particular asset. > An initial example file might include the following: > {code} > { > // Drill version identifier > version: "dd1" > > // Format Plugin Configuration > format: { > type: "httpd", > format: "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" > \"%{Cookie}i\"" > }, > > // Traits of underlying data (a.k.a physical properties) > traits: [ > {type: "sort_nulls_first", columns: ["request.uri", "client.host"]}, > {type: "unique", columns: ["abc"]}, > {type: "unique", columns: ["xy", "zz"]} > ], > > // Mappings between directory names and exposed columns > dirs: [ > {skip: true}, // don't include this directory name in the directory path. > {name: "year", type: "integer"}, > {name: "month", type: "integer"}, > {name: "day", type: "integer"} > ], > // whether or not a user can add new columns to the table through insert > rigid_table: true > > } > {code} > We also need to support adding more machine-generated/managed data such as > statistics. That should be done using a separate file from the human-written > description. > A user should be able to ascribe this metadata directly through the file > system as well as through SQL commands such as > {code} > ALTER TABLE ADD METADATA ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
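Since the sub-tickets are only independent if readers treat those as separate top-level fields, consuming code would dispatch on each field in isolation. A hedged sketch of such a reader follows; only the field names come from the comment above, the handler behavior is hypothetical, and because the proposed file format uses comments and unquoted keys, a lenient parser is assumed (the informal example may need an even more forgiving reader; this only shows the dispatch shape).
{code}
// Hedged sketch: read a .drill file and dispatch on the independent
// top-level fields named above. Field names from the comment; the rest
// is assumption.
import com.fasterxml.jackson.core.JsonParser;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import java.io.File;

public class DotDrillReader {
  public static void main(String[] args) throws Exception {
    ObjectMapper mapper = new ObjectMapper();
    // The proposed file format is not strict JSON, so be lenient.
    mapper.configure(JsonParser.Feature.ALLOW_COMMENTS, true);
    mapper.configure(JsonParser.Feature.ALLOW_UNQUOTED_FIELD_NAMES, true);
    JsonNode dotDrill = mapper.readTree(new File(args[0]));
    for (String field : new String[] {"version", "format", "schema", "error_handling"}) {
      if (dotDrill.has(field)) {
        System.out.println(field + " -> " + dotDrill.get(field)); // independent handling per field
      }
    }
  }
}
{code}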
[jira] [Commented] (DRILL-4066) support for format in dot drill file
[ https://issues.apache.org/jira/browse/DRILL-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036891#comment-15036891 ] Julien Le Dem commented on DRILL-4066: -- I'm currently not working on this, as I'm focusing on something else for now. I did a little bit of investigation into the feasibility. My take is that DynamicDrillTable can have a compound selection object with different FormatPlugin associations for different subsets of the paths to read. That would allow using multiple FormatPlugins when the dot drill files do not all point to the same one, which is useful when the format has changed over time. > support for format in dot drill file > > > Key: DRILL-4066 > URL: https://issues.apache.org/jira/browse/DRILL-4066 > Project: Apache Drill > Issue Type: Sub-task > Components: Storage - Other >Reporter: Julien Le Dem > Fix For: Future > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
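A hedged sketch of that compound-selection idea: group the selected paths by the format their governing dot drill file declares, so one logical table can fan out to multiple format plugins. All class and method names here are illustrative, not actual Drill types.
{code}
// Illustrative sketch of a compound selection: file paths grouped by the
// format declared for them, so each group can go to a different
// FormatPlugin. Types and names are hypothetical.
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class CompoundSelection {
  // format name -> paths to read with that format
  private final Map<String, List<String>> pathsByFormat = new HashMap<>();

  public void add(String formatName, String path) {
    pathsByFormat.computeIfAbsent(formatName, k -> new ArrayList<>()).add(path);
  }

  public Map<String, List<String>> groups() {
    return pathsByFormat;
  }

  public static void main(String[] args) {
    CompoundSelection sel = new CompoundSelection();
    sel.add("csv-2014", "/logs/2014/jan.csv"); // older files, older format
    sel.add("csv-2015", "/logs/2015/jan.csv"); // format changed over time
    System.out.println(sel.groups());
  }
}
{code}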
[jira] [Updated] (DRILL-4066) support for format in dot drill file
[ https://issues.apache.org/jira/browse/DRILL-4066?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Le Dem updated DRILL-4066: - Assignee: (was: Julien Le Dem) > support for format in dot drill file > > > Key: DRILL-4066 > URL: https://issues.apache.org/jira/browse/DRILL-4066 > Project: Apache Drill > Issue Type: Sub-task > Components: Storage - Other >Reporter: Julien Le Dem > Fix For: Future > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-3572) Provide a simple interface to append metadata to files and directories (.drill)
[ https://issues.apache.org/jira/browse/DRILL-3572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Julien Le Dem updated DRILL-3572: - Assignee: (was: Julien Le Dem) > Provide a simple interface to append metadata to files and directories > (.drill) > --- > > Key: DRILL-3572 > URL: https://issues.apache.org/jira/browse/DRILL-3572 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Other >Reporter: Jacques Nadeau > Fix For: Future > > > We need a way to store small amounts of metadata about a file or a collection > of files. The current thinking is to have a "dot drill file" that > ascribes metadata to a particular asset. > An initial example file might include the following: > {code} > { > // Drill version identifier > version: "dd1" > > // Format Plugin Configuration > format: { > type: "httpd", > format: "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\" > \"%{Cookie}i\"" > }, > > // Traits of underlying data (a.k.a physical properties) > traits: [ > {type: "sort_nulls_first", columns: ["request.uri", "client.host"]}, > {type: "unique", columns: ["abc"]}, > {type: "unique", columns: ["xy", "zz"]} > ], > > // Mappings between directory names and exposed columns > dirs: [ > {skip: true}, // don't include this directory name in the directory path. > {name: "year", type: "integer"}, > {name: "month", type: "integer"}, > {name: "day", type: "integer"} > ], > // whether or not a user can add new columns to the table through insert > rigid_table: true > > } > {code} > We also need to support adding more machine-generated/managed data such as > statistics. That should be done using a separate file from the human-written > description. > A user should be able to ascribe this metadata directly through the file > system as well as through SQL commands such as > {code} > ALTER TABLE ADD METADATA ... > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4109) NPE in RecordIterator
[ https://issues.apache.org/jira/browse/DRILL-4109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036883#comment-15036883 ] Venki Korukanti commented on DRILL-4109: I just learned that Vicky is out of office today. I will check whether anybody else has the same configuration and can verify. > NPE in RecordIterator > - > > Key: DRILL-4109 > URL: https://issues.apache.org/jira/browse/DRILL-4109 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.4.0 >Reporter: Victoria Markman >Assignee: amit hadke >Priority: Blocker > Fix For: 1.4.0 > > Attachments: 29ac6c1b-9b33-3457-8bc8-9e2dff6ad438.sys.drill, > 29b41f37-4803-d7ce-e05f-912d1f65da79.sys.drill, drillbit.log, > drillbit.log.debug > > > 4 node cluster > 36GB of direct memory > 4GB heap memory > planner.memory.max_query_memory_per_node=2GB (default) > planner.enable_hashjoin = false > Spill directory has 6.4T of space available: > {noformat} > [Tue Nov 17 18:23:18 /tmp/drill ] # df -H . > Filesystem Size Used Avail Use% Mounted on > localhost:/mapr 7.7T 1.4T 6.4T 18% /mapr > {noformat} > Run the query below: > framework/resources/Advanced/tpcds/tpcds_sf100/original/query15.sql > drillbit.log > {code} > 2015-11-18 02:22:12,639 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:9] INFO > o.a.d.e.p.i.xsort.ExternalSortBatch - Merging and spilling to > /tmp/drill/spill/29b41f37-4803-d7ce-e05f-912d1f65da79/major_fragment_3/minor_fragment_9/operator_17/7 > 2015-11-18 02:22:12,770 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:5] INFO > o.a.d.e.p.i.xsort.ExternalSortBatch - Merging and spilling to > /tmp/drill/spill/29b41f37-4803-d7ce-e05f-912d1f65da79/major_fragment_3/minor_fragment_5/operator_17/7 > 2015-11-18 02:22:13,345 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:17] INFO > o.a.d.e.p.i.xsort.ExternalSortBatch - Completed spilling to > /tmp/drill/spill/29b41f37-4803-d7ce-e05f-912d1f65da79/major_fragment_3/minor_fragment_17/operator_17/7 > 2015-11-18 02:22:13,346 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] INFO > o.a.d.e.p.i.xsort.ExternalSortBatch - Completed spilling to > /tmp/drill/spill/29b41f37-4803-d7ce-e05f-912d1f65da79/major_fragment_3/minor_fragment_13/operator_16/1 > 2015-11-18 02:22:13,346 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] WARN > o.a.d.e.p.i.xsort.ExternalSortBatch - Starting to merge. 34 batch groups.
> Current allocated memory: 2252186 > 2015-11-18 02:22:13,363 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] INFO > o.a.d.e.w.fragment.FragmentExecutor - > 29b41f37-4803-d7ce-e05f-912d1f65da79:3:13: State change requested RUNNING --> > FAILED > 2015-11-18 02:22:13,370 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] INFO > o.a.d.e.w.fragment.FragmentExecutor - > 29b41f37-4803-d7ce-e05f-912d1f65da79:3:13: State change requested FAILED --> > FINISHED > 2015-11-18 02:22:13,371 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] > ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: NullPointerException > Fragment 3:13 > [Error Id: c5d67dcb-16aa-4951-89f5-599b4b4eb54d on atsqa4-133.qa.lab:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > NullPointerException > Fragment 3:13 > [Error Id: c5d67dcb-16aa-4951-89f5-599b4b4eb54d on atsqa4-133.qa.lab:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534) > ~[drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:321) > [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:184) > [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:290) > [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_71] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_71] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71] > java.lang.NullPointerException: null > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4134) Incorporate remaining patches from DRILL-1942 Allocator refactor
[ https://issues.apache.org/jira/browse/DRILL-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036869#comment-15036869 ] ASF GitHub Bot commented on DRILL-4134: --- Github user adeneche commented on a diff in the pull request: https://github.com/apache/drill/pull/283#discussion_r46493263 --- Diff: exec/memory/base/src/main/java/io/netty/buffer/DrillBuf.java --- @@ -230,20 +249,31 @@ public synchronized boolean release() { */ @Override public synchronized boolean release(int decrement) { +if (isEmpty) { + return false; +} -if(rootBuffer){ - final long newRefCnt = this.rootRefCnt.addAndGet(-decrement); - Preconditions.checkArgument(newRefCnt > -1, "Buffer has negative reference count."); - if (newRefCnt == 0) { -b.release(decrement); -acct.release(this, length); -return true; - }else{ -return false; - } -}else{ - return b.release(decrement); +if (decrement < 1) { + throw new IllegalStateException(String.format("release(%d) argument is not positive. Buffer Info: %s", + decrement, toVerboseString())); } + +if (BaseAllocator.DEBUG) { + historicalLog.recordEvent("release(%d)", decrement); --- End diff -- I think it would be helpful to also record the new `refCnt` > Incorporate remaining patches from DRILL-1942 Allocator refactor > > > Key: DRILL-4134 > URL: https://issues.apache.org/jira/browse/DRILL-4134 > Project: Apache Drill > Issue Type: Sub-task > Components: Execution - Flow >Reporter: Jacques Nadeau >Assignee: Jacques Nadeau > Fix For: 1.4.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
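If the suggestion is adopted, the recorded event would also carry the count the buffer is dropping to. A hedged sketch of what the amended fragment might look like follows; this spells out the reviewer's idea, not the committed change, and the post-decrement count is computed here only for logging.
{code}
// Hedged sketch of the review suggestion above (not the committed change):
// record the resulting reference count alongside the decrement.
if (BaseAllocator.DEBUG) {
  final int newRefCnt = refCnt() - decrement;  // count after this release
  historicalLog.recordEvent("release(%d). New refCnt: %d", decrement, newRefCnt);
}
{code}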
[jira] [Resolved] (DRILL-4124) Make all uses of AutoCloseables use addSuppressed exceptions to avoid noise in logs
[ https://issues.apache.org/jira/browse/DRILL-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venki Korukanti resolved DRILL-4124. Resolution: Fixed Fix Version/s: 1.4.0 > Make all uses of AutoCloseables use addSuppressed exceptions to avoid noise > in logs > --- > > Key: DRILL-4124 > URL: https://issues.apache.org/jira/browse/DRILL-4124 > Project: Apache Drill > Issue Type: Improvement >Reporter: Julien Le Dem >Assignee: Julien Le Dem > Fix For: 1.4.0 > > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4124) Make all uses of AutoCloseables use addSuppressed exceptions to avoid noise in logs
[ https://issues.apache.org/jira/browse/DRILL-4124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036859#comment-15036859 ] ASF GitHub Bot commented on DRILL-4124: --- Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/281 > Make all uses of AutoCloseables use addSuppressed exceptions to avoid noise > in logs > --- > > Key: DRILL-4124 > URL: https://issues.apache.org/jira/browse/DRILL-4124 > Project: Apache Drill > Issue Type: Improvement >Reporter: Julien Le Dem >Assignee: Julien Le Dem > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
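The pattern behind this change: when closing a list of resources, a failure during one close() should be attached to the primary exception via Throwable#addSuppressed rather than being logged as separate noise. A minimal self-contained sketch of that pattern follows; it is illustrative, not Drill's actual AutoCloseables utility.
{code}
// Minimal sketch of the suppressed-exception close pattern; illustrative,
// not Drill's actual AutoCloseables implementation.
import java.util.List;

public class CloseAll {
  public static void close(List<? extends AutoCloseable> resources) throws Exception {
    Exception primary = null;
    for (AutoCloseable resource : resources) {
      if (resource == null) {
        continue;
      }
      try {
        resource.close();
      } catch (Exception e) {
        if (primary == null) {
          primary = e;              // first failure becomes the thrown exception
        } else {
          primary.addSuppressed(e); // later failures attach instead of being logged
        }
      }
    }
    if (primary != null) {
      throw primary;
    }
  }
}
{code}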
[jira] [Commented] (DRILL-4145) IndexOutOfBoundsException raised during select * query on S3 csv file
[ https://issues.apache.org/jira/browse/DRILL-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036816#comment-15036816 ] ASF GitHub Bot commented on DRILL-4145: --- Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/287 > IndexOutOfBoundsException raised during select * query on S3 csv file > - > > Key: DRILL-4145 > URL: https://issues.apache.org/jira/browse/DRILL-4145 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill >Affects Versions: 1.3.0 > Environment: Drill 1.3.0 on a 3 node distributed-mode cluster on AWS. > Data files on S3. > S3 storage plugin configuration: > { > "type": "file", > "enabled": true, > "connection": "s3a://", > "workspaces": { > "root": { > "location": "/", > "writable": false, > "defaultInputFormat": null > }, > "views": { > "location": "/processed", > "writable": true, > "defaultInputFormat": null > }, > "tmp": { > "location": "/tmp", > "writable": true, > "defaultInputFormat": null > } > }, > "formats": { > "psv": { > "type": "text", > "extensions": [ > "tbl" > ], > "delimiter": "|" > }, > "csv": { > "type": "text", > "extensions": [ > "csv" > ], > "extractHeader": true, > "delimiter": "," > }, > "tsv": { > "type": "text", > "extensions": [ > "tsv" > ], > "delimiter": "\t" > }, > "parquet": { > "type": "parquet" > }, > "json": { > "type": "json" > }, > "avro": { > "type": "avro" > }, > "sequencefile": { > "type": "sequencefile", > "extensions": [ > "seq" > ] > }, > "csvh": { > "type": "text", > "extensions": [ > "csvh", > "csv" > ], > "extractHeader": true, > "delimiter": "," > } > } > } >Reporter: Peter McTaggart >Assignee: Jacques Nadeau > Attachments: apps1-bad.csv, apps1.csv > > > When trying to query (via sqlline or WebUI) a .csv file I am getting an > IndexOutofBoundsException: > {noformat} 0: jdbc:drill:> select * from > s3data.root.`staging/data/apps1-bad.csv` limit 1; > Error: SYSTEM ERROR: IndexOutOfBoundsException: index: 16384, length: 4 > (expected: range(0, 16384)) > Fragment 0:0 > [Error Id: be9856d2-0b80-4b9c-94a4-a1ca38ec5db0 on > ip-X.compute.internal:31010] (state=,code=0) > 0: jdbc:drill:> select * from s3data.root.`staging/data/apps1.csv` limit 1; > +--+--+--+--+--++--++--+--+---+--+---+---+---+---+---+---+---+--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ > | FIELD_1 | FIELD_2| FIELD_3 | FIELD_4 | FIELD_5 | FIELD_6 > | FIELD_7 | FIELD_8 | FIELD_9 | FIELD_10 | FIELD_11 | > FIELD_12 | FIELD_13 | FIELD_14 | FIELD_15 | FIELD_16 | FIELD_17 | > FIELD_18 | FIELD_19 | FIELD_20 | FIELD_21 | FIELD_22 | > FIELD_23 | FIELD_24 | FIELD_25 | FIELD_26 | FIELD_27 | FIELD_28 | > FIELD_29 | FIELD_30 | FIELD_31 | FIELD_32 | FIELD_33 | FIELD_34 | > FIELD_35 | > +--+--+--+--+--++--++--+--+---+--+---+---+---+---+---+---+---+--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ > | 489517 | 27/10/2015 02:05:27 | 261 | 1130232 | 0| > 925630488 | 0| 925630488 | -1 | 19531580547 | | > 27/10/2015 02:00:00 | | 30| 300 | 0 | 0 >| | | 27/10/2015 02:05:27 | 0 | 1 | 0 > | 35.0 | | | | 505 | 872.0 > | | aBc | | | | | > +--+--+--+--+--++--++--+--+---+--+---+---+---+---+---+---+---+--+---+---+---+---+--
[jira] [Commented] (DRILL-4109) NPE in RecordIterator
[ https://issues.apache.org/jira/browse/DRILL-4109?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036757#comment-15036757 ] amit hadke commented on DRILL-4109: --- [~vicky] I pushed my changes for DRILL-4125. Could you run query15.sql on the latest change? Repo: https://github.com/amithadke/drill Branch: DRILL-4109 > NPE in RecordIterator > - > > Key: DRILL-4109 > URL: https://issues.apache.org/jira/browse/DRILL-4109 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.4.0 >Reporter: Victoria Markman >Assignee: amit hadke >Priority: Blocker > Fix For: 1.4.0 > > Attachments: 29ac6c1b-9b33-3457-8bc8-9e2dff6ad438.sys.drill, > 29b41f37-4803-d7ce-e05f-912d1f65da79.sys.drill, drillbit.log, > drillbit.log.debug > > > 4 node cluster > 36GB of direct memory > 4GB heap memory > planner.memory.max_query_memory_per_node=2GB (default) > planner.enable_hashjoin = false > Spill directory has 6.4T of space available: > {noformat} > [Tue Nov 17 18:23:18 /tmp/drill ] # df -H . > Filesystem Size Used Avail Use% Mounted on > localhost:/mapr 7.7T 1.4T 6.4T 18% /mapr > {noformat} > Run the query below: > framework/resources/Advanced/tpcds/tpcds_sf100/original/query15.sql > drillbit.log > {code} > 2015-11-18 02:22:12,639 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:9] INFO > o.a.d.e.p.i.xsort.ExternalSortBatch - Merging and spilling to > /tmp/drill/spill/29b41f37-4803-d7ce-e05f-912d1f65da79/major_fragment_3/minor_fragment_9/operator_17/7 > 2015-11-18 02:22:12,770 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:5] INFO > o.a.d.e.p.i.xsort.ExternalSortBatch - Merging and spilling to > /tmp/drill/spill/29b41f37-4803-d7ce-e05f-912d1f65da79/major_fragment_3/minor_fragment_5/operator_17/7 > 2015-11-18 02:22:13,345 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:17] INFO > o.a.d.e.p.i.xsort.ExternalSortBatch - Completed spilling to > /tmp/drill/spill/29b41f37-4803-d7ce-e05f-912d1f65da79/major_fragment_3/minor_fragment_17/operator_17/7 > 2015-11-18 02:22:13,346 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] INFO > o.a.d.e.p.i.xsort.ExternalSortBatch - Completed spilling to > /tmp/drill/spill/29b41f37-4803-d7ce-e05f-912d1f65da79/major_fragment_3/minor_fragment_13/operator_16/1 > 2015-11-18 02:22:13,346 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] WARN > o.a.d.e.p.i.xsort.ExternalSortBatch - Starting to merge. 34 batch groups.
> Current allocated memory: 2252186 > 2015-11-18 02:22:13,363 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] INFO > o.a.d.e.w.fragment.FragmentExecutor - > 29b41f37-4803-d7ce-e05f-912d1f65da79:3:13: State change requested RUNNING --> > FAILED > 2015-11-18 02:22:13,370 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] INFO > o.a.d.e.w.fragment.FragmentExecutor - > 29b41f37-4803-d7ce-e05f-912d1f65da79:3:13: State change requested FAILED --> > FINISHED > 2015-11-18 02:22:13,371 [29b41f37-4803-d7ce-e05f-912d1f65da79:frag:3:13] > ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: NullPointerException > Fragment 3:13 > [Error Id: c5d67dcb-16aa-4951-89f5-599b4b4eb54d on atsqa4-133.qa.lab:31010] > org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: > NullPointerException > Fragment 3:13 > [Error Id: c5d67dcb-16aa-4951-89f5-599b4b4eb54d on atsqa4-133.qa.lab:31010] > at > org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:534) > ~[drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:321) > [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:184) > [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:290) > [drill-java-exec-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > [drill-common-1.4.0-SNAPSHOT.jar:1.4.0-SNAPSHOT] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_71] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_71] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_71] > java.lang.NullPointerException: null > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4127) HiveSchema.getSubSchema() should use lazy loading of all the table names
[ https://issues.apache.org/jira/browse/DRILL-4127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036677#comment-15036677 ] Jinfeng Ni commented on DRILL-4127: --- For a hive storage plugin with about 8 schemas/databases, if I run a simple query like this: select count(*) from hive.table1; From hive.log, we saw the following # of Hive metastore API calls (without the patch; impersonation is turned on): 1. # of get_all_databases API calls: 31 2. # of get_all_tables API calls: 30 3. # of get_table API calls: 2 That explains why some Drill users report that Drill spent 20-30 seconds on planning for such a simple query, making the query not "interactive" at all. > HiveSchema.getSubSchema() should use lazy loading of all the table names > > > Key: DRILL-4127 > URL: https://issues.apache.org/jira/browse/DRILL-4127 > Project: Apache Drill > Issue Type: Bug >Reporter: Jinfeng Ni >Assignee: Jinfeng Ni > > Currently, HiveSchema.getSubSchema() pre-loads all the table names when > it constructs the subschema, even though those table names are not requested > at all. This can cause considerable performance overhead, especially > when the hive schema contains a large # of objects (thousands of tables/views > are not uncommon in some use cases). > Instead, we should load table names on demand: only when > all table names are requested do we load them into the hive schema. > This should help "show schemas", since it only requires the schema name, not > the table names in the schema. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
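The lazy-loading shape being proposed reduces those calls to at most one get_all_tables per schema, issued only when table names are actually needed. A hedged sketch of the memoized on-demand pattern follows; the names are illustrative, not Drill's HiveSchema code.
{code}
// Hedged sketch of on-demand, memoized table-name loading, so that
// "show schemas" never triggers get_all_tables. Names are illustrative.
import java.util.List;
import java.util.function.Supplier;

public class LazySchema {
  private final Supplier<List<String>> loader; // wraps the metastore call
  private volatile List<String> tableNames;    // null until first requested

  public LazySchema(Supplier<List<String>> loader) {
    this.loader = loader;
  }

  public List<String> getTableNames() {
    List<String> names = tableNames;
    if (names == null) {
      synchronized (this) {
        if (tableNames == null) {
          tableNames = loader.get(); // a single get_all_tables, on demand
        }
        names = tableNames;
      }
    }
    return names;
  }
}
{code}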
[jira] [Commented] (DRILL-4134) Incorporate remaining patches from DRILL-1942 Allocator refactor
[ https://issues.apache.org/jira/browse/DRILL-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036659#comment-15036659 ] ASF GitHub Bot commented on DRILL-4134: --- Github user adeneche commented on a diff in the pull request: https://github.com/apache/drill/pull/283#discussion_r46477393 --- Diff: exec/memory/base/src/main/java/org/apache/drill/exec/memory/BaseAllocator.java --- @@ -0,0 +1,689 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.memory; + +import io.netty.buffer.ByteBufAllocator; +import io.netty.buffer.DrillBuf; +import io.netty.buffer.UnsafeDirectLittleEndian; + +import java.util.Arrays; +import java.util.IdentityHashMap; +import java.util.Set; +import java.util.concurrent.atomic.AtomicInteger; +import java.util.concurrent.atomic.AtomicLong; + +import org.apache.drill.common.HistoricalLog; +import org.apache.drill.exec.exception.OutOfMemoryException; +import org.apache.drill.exec.memory.AllocatorManager.BufferLedger; +import org.apache.drill.exec.ops.BufferManager; +import org.apache.drill.exec.util.AssertionUtil; + +import com.google.common.base.Preconditions; + +public abstract class BaseAllocator extends Accountant implements BufferAllocator { + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(BaseAllocator.class); + + public static final String DEBUG_ALLOCATOR = "drill.memory.debug.allocator"; + + private static final AtomicLong ID_GENERATOR = new AtomicLong(0); + private static final int CHUNK_SIZE = AllocatorManager.INNER_ALLOCATOR.getChunkSize(); + + public static final int DEBUG_LOG_LENGTH = 6; + public static final boolean DEBUG = AssertionUtil.isAssertionsEnabled() + || Boolean.parseBoolean(System.getProperty(DEBUG_ALLOCATOR, "false")); + private final Object DEBUG_LOCK = DEBUG ?
new Object() : null; + + private final BaseAllocator parentAllocator; + private final ByteBufAllocator thisAsByteBufAllocator; + private final IdentityHashMap childAllocators; + private final DrillBuf empty; + + private volatile boolean isClosed = false; // the allocator has been closed + + // Package exposed for sharing between AllocatorManger and BaseAllocator objects + final long id = ID_GENERATOR.incrementAndGet(); // unique ID assigned to each allocator + final String name; + final RootAllocator root; + + // members used purely for debugging + private final IdentityHashMap childLedgers; + private final IdentityHashMap reservations; + private final HistoricalLog historicalLog; + + protected BaseAllocator( + final BaseAllocator parentAllocator, + final String name, + final long initReservation, + final long maxAllocation) throws OutOfMemoryException { +super(parentAllocator, initReservation, maxAllocation); + +if (parentAllocator != null) { + this.root = parentAllocator.root; + empty = parentAllocator.empty; +} else if (this instanceof RootAllocator) { + this.root = (RootAllocator) this; + empty = createEmpty(); +} else { + throw new IllegalStateException("An parent allocator must either carry a root or be the root."); +} + +this.parentAllocator = parentAllocator; +this.name = name; + +// TODO: DRILL-4131 +// this.thisAsByteBufAllocator = new DrillByteBufAllocator(this); +this.thisAsByteBufAllocator = AllocatorManager.INNER_ALLOCATOR.allocator; + +if (DEBUG) { + childAllocators = new IdentityHashMap<>(); + reservations = new IdentityHashMap<>(); + childLedgers = new IdentityHashMap<>(); + historicalLog = new HistoricalLog(DEBUG_LOG_LENGTH, "allocator[%d]", id); + hist("created by \"%s\", owned = %d", name.toString(), this.getAllocatedMemory());
[jira] [Commented] (DRILL-4134) Incorporate remaining patches from DRILL-1942 Allocator refactor
[ https://issues.apache.org/jira/browse/DRILL-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036655#comment-15036655 ] ASF GitHub Bot commented on DRILL-4134: --- Github user adeneche commented on a diff in the pull request: https://github.com/apache/drill/pull/283#discussion_r46477028 --- Diff: exec/memory/base/src/main/java/org/apache/drill/exec/memory/AllocatorManager.java --- @@ -0,0 +1,386 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.memory; + +import static org.apache.drill.exec.memory.BaseAllocator.indent; +import io.netty.buffer.DrillBuf; +import io.netty.buffer.PooledByteBufAllocatorL; +import io.netty.buffer.UnsafeDirectLittleEndian; + +import java.util.IdentityHashMap; +import java.util.concurrent.atomic.AtomicInteger; +import java.util.concurrent.atomic.AtomicLong; +import java.util.concurrent.locks.Lock; +import java.util.concurrent.locks.ReadWriteLock; +import java.util.concurrent.locks.ReentrantReadWriteLock; + +import org.apache.drill.common.HistoricalLog; +import org.apache.drill.exec.memory.BaseAllocator.Verbosity; +import org.apache.drill.exec.metrics.DrillMetrics; +import org.apache.drill.exec.ops.BufferManager; + +import com.carrotsearch.hppc.LongObjectOpenHashMap; +import com.google.common.base.Preconditions; + +/** + * Manages the relationship between one or more allocators and a particular UDLE. Ensures that one allocator owns the + * memory that multiple allocators may be referencing. Manages a BufferLedger between each of its associated allocators. + * This class is also responsible for managing when memory is allocated and returned to the Netty-based + * PooledByteBufAllocatorL. + * + * The only reason that this isn't package private is we're forced to put DrillBuf in Netty's package which need access + * to these objects or methods. + * + * Threading: AllocatorManager manages thread-safety internally. Operations within the context of a single BufferLedger + * are lockless in nature and can be leveraged by multiple threads. Operations that cross the context of two ledgers + * will acquire a lock on the AllocatorManager instance. Important note, there is one AllocatorManager per + * UnsafeDirectLittleEndian buffer allocation. As such, there will be thousands of these in a typical query. The + * contention of acquiring a lock on AllocatorManager should be very low.
+ * + */ +public class AllocatorManager { + // private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(AllocatorManager.class); + + private static final AtomicLong LEDGER_ID_GENERATOR = new AtomicLong(0); + static final PooledByteBufAllocatorL INNER_ALLOCATOR = new PooledByteBufAllocatorL(DrillMetrics.getInstance()); + + private final RootAllocator root; + private volatile BufferLedger owningLedger; + private final int size; + private final UnsafeDirectLittleEndian underlying; + private final ReadWriteLock lock = new ReentrantReadWriteLock(); + private final LongObjectOpenHashMap map = new LongObjectOpenHashMap<>(); + private final AutoCloseableLock readLock = new AutoCloseableLock(lock.readLock()); + private final AutoCloseableLock writeLock = new AutoCloseableLock(lock.writeLock()); + private final IdentityHashMap buffers = + BaseAllocator.DEBUG ? new IdentityHashMap() : null; + + AllocatorManager(BaseAllocator accountingAllocator, int size) { +Preconditions.checkNotNull(accountingAllocator); +this.root = accountingAllocator.root; +this.underlying = INNER_ALLOCATOR.allocate(size); +this.owningLedger = associate(accountingAllocator); +this.size = underlying.capacity(); + } + + /** + * Associate the existing underlying buffer with a new allocator. + * + * @param allocator + * The target alloc
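The diff above takes its read/write locks through AutoCloseableLock wrappers so that acquisitions can live in try-with-resources blocks. A hedged reconstruction of that wrapper pattern follows; the class exists in the patch, but this body is illustrative and the actual implementation may differ.
{code}
// Illustrative reconstruction of an AutoCloseableLock-style wrapper, so a
// lock acquisition can be scoped by try-with-resources.
import java.util.concurrent.locks.Lock;

public class AutoCloseableLock implements AutoCloseable {
  private final Lock lock;

  public AutoCloseableLock(Lock lock) {
    this.lock = lock;
  }

  public AutoCloseableLock open() {
    lock.lock();     // acquire when the try block opens
    return this;
  }

  @Override
  public void close() {
    lock.unlock();   // release automatically when the try block exits
  }
}
{code}
Usage would look like try (AutoCloseableLock l = readLock.open()) { ... }, which keeps the lock scope explicit and exception-safe.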
[jira] [Commented] (DRILL-4145) IndexOutOfBoundsException raised during select * query on S3 csv file
[ https://issues.apache.org/jira/browse/DRILL-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036572#comment-15036572 ] John Omernik commented on DRILL-4145: - Looks like Steven pushed a change. Steven, does that one-line addition fix this? That's awesome if that's all it took! I did confirm that I have the same issue on MapRFS as well. Peter, the other issue I saw you mention was that adding extractHeader to the csv format didn't actually have the desired effect. That may be a bug too; do you want to open a JIRA for it? (It should work, and when I did my testing, it didn't either.) Thanks for your work on this, Peter; it's great to find bugs like this. Helps everyone! John > IndexOutOfBoundsException raised during select * query on S3 csv file > - > > Key: DRILL-4145 > URL: https://issues.apache.org/jira/browse/DRILL-4145 > Project: Apache Drill > Issue Type: Bug > Components: Functions - Drill >Affects Versions: 1.3.0 > Environment: Drill 1.3.0 on a 3 node distributed-mode cluster on AWS. > Data files on S3. > S3 storage plugin configuration: > { > "type": "file", > "enabled": true, > "connection": "s3a://", > "workspaces": { > "root": { > "location": "/", > "writable": false, > "defaultInputFormat": null > }, > "views": { > "location": "/processed", > "writable": true, > "defaultInputFormat": null > }, > "tmp": { > "location": "/tmp", > "writable": true, > "defaultInputFormat": null > } > }, > "formats": { > "psv": { > "type": "text", > "extensions": [ > "tbl" > ], > "delimiter": "|" > }, > "csv": { > "type": "text", > "extensions": [ > "csv" > ], > "extractHeader": true, > "delimiter": "," > }, > "tsv": { > "type": "text", > "extensions": [ > "tsv" > ], > "delimiter": "\t" > }, > "parquet": { > "type": "parquet" > }, > "json": { > "type": "json" > }, > "avro": { > "type": "avro" > }, > "sequencefile": { > "type": "sequencefile", > "extensions": [ > "seq" > ] > }, > "csvh": { > "type": "text", > "extensions": [ > "csvh", > "csv" > ], > "extractHeader": true, > "delimiter": "," > } > } > } >Reporter: Peter McTaggart >Assignee: Jacques Nadeau > Attachments: apps1-bad.csv, apps1.csv > > > When trying to query (via sqlline or WebUI) a .csv file I am getting an > IndexOutofBoundsException: > {noformat} 0: jdbc:drill:> select * from > s3data.root.`staging/data/apps1-bad.csv` limit 1; > Error: SYSTEM ERROR: IndexOutOfBoundsException: index: 16384, length: 4 > (expected: range(0, 16384)) > Fragment 0:0 > [Error Id: be9856d2-0b80-4b9c-94a4-a1ca38ec5db0 on > ip-X.compute.internal:31010] (state=,code=0) > 0: jdbc:drill:> select * from s3data.root.`staging/data/apps1.csv` limit 1; > +--+--+--+--+--++--++--+--+---+--+---+---+---+---+---+---+---+--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ > | FIELD_1 | FIELD_2| FIELD_3 | FIELD_4 | FIELD_5 | FIELD_6 > | FIELD_7 | FIELD_8 | FIELD_9 | FIELD_10 | FIELD_11 | > FIELD_12 | FIELD_13 | FIELD_14 | FIELD_15 | FIELD_16 | FIELD_17 | > FIELD_18 | FIELD_19 | FIELD_20 | FIELD_21 | FIELD_22 | > FIELD_23 | FIELD_24 | FIELD_25 | FIELD_26 | FIELD_27 | FIELD_28 | > FIELD_29 | FIELD_30 | FIELD_31 | FIELD_32 | FIELD_33 | FIELD_34 | > FIELD_35 | > +--+--+--+--+--++--++--+--+---+--+---+---+---+---+---+---+---+--+---+---+---+---+---+---+---+---+---+---+---+---+---+---+---+ > | 489517 | 27/10/2015 02:05:27 | 261 | 1130232 | 0| > 925630488 | 0| 925630488 | -1 | 19531580547 | | > 27/10/2015 02:00:00 | | 30| 300 | 0 | 0 >| | | 27/10/2015 02:05:27 | 0 | 1 | 0 > | 35.0 | | |
[jira] [Commented] (DRILL-4134) Incorporate remaining patches from DRILL-1942 Allocator refactor
[ https://issues.apache.org/jira/browse/DRILL-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036515#comment-15036515 ] ASF GitHub Bot commented on DRILL-4134: --- Github user adeneche commented on a diff in the pull request: https://github.com/apache/drill/pull/283#discussion_r46468356 --- Diff: exec/memory/base/src/main/java/org/apache/drill/exec/memory/BaseAllocator.java --- @@ -0,0 +1,689 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ +package org.apache.drill.exec.memory; + +import io.netty.buffer.ByteBufAllocator; +import io.netty.buffer.DrillBuf; +import io.netty.buffer.UnsafeDirectLittleEndian; + +import java.util.Arrays; +import java.util.IdentityHashMap; +import java.util.Set; +import java.util.concurrent.atomic.AtomicInteger; +import java.util.concurrent.atomic.AtomicLong; + +import org.apache.drill.common.HistoricalLog; +import org.apache.drill.exec.exception.OutOfMemoryException; +import org.apache.drill.exec.memory.AllocatorManager.BufferLedger; +import org.apache.drill.exec.ops.BufferManager; +import org.apache.drill.exec.util.AssertionUtil; + +import com.google.common.base.Preconditions; + +public abstract class BaseAllocator extends Accountant implements BufferAllocator { + private static final org.slf4j.Logger logger = org.slf4j.LoggerFactory.getLogger(BaseAllocator.class); + + public static final String DEBUG_ALLOCATOR = "drill.memory.debug.allocator"; + + private static final AtomicLong ID_GENERATOR = new AtomicLong(0); + private static final int CHUNK_SIZE = AllocatorManager.INNER_ALLOCATOR.getChunkSize(); + + public static final int DEBUG_LOG_LENGTH = 6; + public static final boolean DEBUG = AssertionUtil.isAssertionsEnabled() + || Boolean.parseBoolean(System.getProperty(DEBUG_ALLOCATOR, "false")); + private final Object DEBUG_LOCK = DEBUG ? 
new Object() : null; + + private final BaseAllocator parentAllocator; + private final ByteBufAllocator thisAsByteBufAllocator; + private final IdentityHashMap childAllocators; + private final DrillBuf empty; + + private volatile boolean isClosed = false; // the allocator has been closed + + // Package exposed for sharing between AllocatorManger and BaseAllocator objects + final long id = ID_GENERATOR.incrementAndGet(); // unique ID assigned to each allocator + final String name; + final RootAllocator root; + + // members used purely for debugging + private final IdentityHashMap childLedgers; + private final IdentityHashMap reservations; + private final HistoricalLog historicalLog; + + protected BaseAllocator( + final BaseAllocator parentAllocator, + final String name, + final long initReservation, + final long maxAllocation) throws OutOfMemoryException { +super(parentAllocator, initReservation, maxAllocation); + +if (parentAllocator != null) { + this.root = parentAllocator.root; + empty = parentAllocator.empty; +} else if (this instanceof RootAllocator) { + this.root = (RootAllocator) this; + empty = createEmpty(); +} else { + throw new IllegalStateException("An parent allocator must either carry a root or be the root."); +} + +this.parentAllocator = parentAllocator; +this.name = name; + +// TODO: DRILL-4131 +// this.thisAsByteBufAllocator = new DrillByteBufAllocator(this); +this.thisAsByteBufAllocator = AllocatorManager.INNER_ALLOCATOR.allocator; + +if (DEBUG) { + childAllocators = new IdentityHashMap<>(); + reservations = new IdentityHashMap<>(); + childLedgers = new IdentityHashMap<>(); + historicalLog = new HistoricalLog(DEBUG_LOG_LENGTH, "allocator[%d]", id); + hist("created by \"%s\", owned = %d", name.toString(), this.getAllocatedMemory());
[jira] [Updated] (DRILL-1760) Count on a map fails with SchemaChangeException
[ https://issues.apache.org/jira/browse/DRILL-1760?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hanifi Gunes updated DRILL-1760: Fix Version/s: (was: 1.4.0) > Count on a map fails with SchemaChangeException > --- > > Key: DRILL-1760 > URL: https://issues.apache.org/jira/browse/DRILL-1760 > Project: Apache Drill > Issue Type: Improvement > Components: Functions - Drill >Affects Versions: 1.0.0 >Reporter: Hanifi Gunes >Assignee: Hanifi Gunes > > Take the yelp business dataset and run > {code:sql} > select count(attributes) from dfs.`/path/to/yelp-business.json` > {code} > you should see > {code:java} > org.apache.drill.exec.exception.SchemaChangeException: Failure while > materializing expression. > Error in expression at index -1. Error: Missing function implementation: > [count(MAP-REQUIRED)]. Full expression: --UNKNOWN EXPRESSION--. > at > org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.createAggregatorInternal(StreamingAggBatch.java:221) > [classes/:na] > at > org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.createAggregator(StreamingAggBatch.java:173) > [classes/:na] > at > org.apache.drill.exec.physical.impl.aggregate.StreamingAggBatch.buildSchema(StreamingAggBatch.java:89) > [classes/:na] > at > org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.buildSchema(IteratorValidatorBatchIterator.java:80) > [classes/:na] > at > org.apache.drill.exec.record.AbstractSingleRecordBatch.buildSchema(AbstractSingleRecordBatch.java:109) > [classes/:na] > at > org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.buildSchema(IteratorValidatorBatchIterator.java:80) > [classes/:na] > at > org.apache.drill.exec.physical.impl.svremover.RemovingRecordBatch.buildSchema(RemovingRecordBatch.java:64) > [classes/:na] > at > org.apache.drill.exec.physical.impl.validate.IteratorValidatorBatchIterator.buildSchema(IteratorValidatorBatchIterator.java:80) > [classes/:na] > at > org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.buildSchema(ScreenCreator.java:95) > [classes/:na] > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:111) > [classes/:na] > at > org.apache.drill.exec.work.WorkManager$RunnableWrapper.run(WorkManager.java:249) > [classes/:na] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > [na:1.7.0_65] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > [na:1.7.0_65] > at java.lang.Thread.run(Thread.java:745) [na:1.7.0_65] > {code} > I would expect to be able to run a count query on the `attributes` field given that I > can run a select on the same field. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4134) Incorporate remaining patches from DRILL-1942 Allocator refactor
[ https://issues.apache.org/jira/browse/DRILL-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036481#comment-15036481 ] ASF GitHub Bot commented on DRILL-4134: --- Github user adeneche commented on a diff in the pull request: https://github.com/apache/drill/pull/283#discussion_r46466143 --- Diff: exec/memory/base/src/main/java/org/apache/drill/exec/memory/BaseAllocator.java --- (same BaseAllocator.java diff as quoted in the comment above)
[jira] [Commented] (DRILL-4111) turn tests off in travis as they don't work there
[ https://issues.apache.org/jira/browse/DRILL-4111?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036468#comment-15036468 ] ASF GitHub Bot commented on DRILL-4111: --- Github user julienledem commented on the pull request: https://github.com/apache/drill/pull/267#issuecomment-161412734 Thank you! > turn tests off in travis as they don't work there > - > > Key: DRILL-4111 > URL: https://issues.apache.org/jira/browse/DRILL-4111 > Project: Apache Drill > Issue Type: Task >Reporter: Julien Le Dem >Assignee: Julien Le Dem > Fix For: 1.4.0 > > > Since the travis build always fails, we should just turn it off for now. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-4153) Query with "select columnName, *" fails with IOB
[ https://issues.apache.org/jira/browse/DRILL-4153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Abhishek Girish updated DRILL-4153: --- Attachment: drillbit.log.txt > Query with "select columnName, *" fails with IOB > > > Key: DRILL-4153 > URL: https://issues.apache.org/jira/browse/DRILL-4153 > Project: Apache Drill > Issue Type: Bug > Components: SQL Parser >Reporter: Abhishek Girish > Attachments: drillbit.log.txt > > > Query with select columnName, * fails with IOB: > {code} > select c_customer_sk, * as c from dfs.tpcds_sf1_parquet.customer limit 1; > Query Failed: An Error Occurred > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > IndexOutOfBoundsException: index (-1) must not be negative [Error Id: > 05b29f77-7668-48e3-a423-a13f0fe9e79a on atsqa6c64.qa.lab:31010] > {code} > This issue isn't seen when * precedes columnName. > Log attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (DRILL-4153) Query with "select columnName, *" fails with IOB
Abhishek Girish created DRILL-4153: -- Summary: Query with "select columnName, *" fails with IOB Key: DRILL-4153 URL: https://issues.apache.org/jira/browse/DRILL-4153 Project: Apache Drill Issue Type: Bug Components: SQL Parser Reporter: Abhishek Girish Query with select columnName, * fails with IOB: {code} select c_customer_sk, * as c from dfs.tpcds_sf1_parquet.customer limit 1; Query Failed: An Error Occurred org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: IndexOutOfBoundsException: index (-1) must not be negative [Error Id: 05b29f77-7668-48e3-a423-a13f0fe9e79a on atsqa6c64.qa.lab:31010] {code} This issue isn't seen when * precedes columnName. Log attached. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (DRILL-2419) UDF that returns string representation of expression type
[ https://issues.apache.org/jira/browse/DRILL-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mehant Baid resolved DRILL-2419. Resolution: Fixed Fix Version/s: (was: Future) 1.3.0 Fixed in eb6325dc9b59291582cd7d3c3e5d02efd5d15906. > UDF that returns string representation of expression type > - > > Key: DRILL-2419 > URL: https://issues.apache.org/jira/browse/DRILL-2419 > Project: Apache Drill > Issue Type: Improvement > Components: Functions - Drill >Reporter: Victoria Markman >Assignee: Steven Phillips > Fix For: 1.3.0 > > > Suggested name: typeof (credit goes to Aman) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-2419) UDF that returns string representation of expression type
[ https://issues.apache.org/jira/browse/DRILL-2419?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Mehant Baid updated DRILL-2419: --- Assignee: Steven Phillips (was: Mehant Baid) > UDF that returns string representation of expression type > - > > Key: DRILL-2419 > URL: https://issues.apache.org/jira/browse/DRILL-2419 > Project: Apache Drill > Issue Type: Improvement > Components: Functions - Drill >Reporter: Victoria Markman >Assignee: Steven Phillips > Fix For: Future > > > Suggested name: typeof (credit goes to Aman) -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4134) Incorporate remaining patches from DRILL-1942 Allocator refactor
[ https://issues.apache.org/jira/browse/DRILL-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036320#comment-15036320 ] ASF GitHub Bot commented on DRILL-4134: --- Github user adeneche commented on a diff in the pull request: https://github.com/apache/drill/pull/283#discussion_r46453393 --- Diff: exec/memory/base/src/main/java/org/apache/drill/exec/memory/BaseAllocator.java --- (same BaseAllocator.java diff as quoted earlier)
[jira] [Commented] (DRILL-4134) Incorporate remaining patches from DRILL-1942 Allocator refactor
[ https://issues.apache.org/jira/browse/DRILL-4134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036314#comment-15036314 ] ASF GitHub Bot commented on DRILL-4134: --- Github user adeneche commented on a diff in the pull request: https://github.com/apache/drill/pull/283#discussion_r46452877 --- Diff: exec/memory/base/src/main/java/org/apache/drill/exec/memory/BaseAllocator.java --- (same BaseAllocator.java diff as quoted earlier)
[jira] [Created] (DRILL-4152) Add additional logging and metrics to the Parquet reader
Parth Chandra created DRILL-4152: Summary: Add additional logging and metrics to the Parquet reader Key: DRILL-4152 URL: https://issues.apache.org/jira/browse/DRILL-4152 Project: Apache Drill Issue Type: Bug Components: Storage - Parquet Reporter: Parth Chandra Assignee: Parth Chandra In some cases, we see the Parquet reader as the bottleneck in reading from the file system. RWSpeedTest is able to read 10x faster than the Parquet reader, so reading from disk is not the issue. This issue is to add more instrumentation to the Parquet reader so that speed bottlenecks can be better diagnosed. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
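For a sense of what such instrumentation might look like, here is a hypothetical sketch (invented class name and log format, not the actual DRILL-4152 change), using Guava's Stopwatch, which Drill already depends on:
{code:java}
// Hypothetical instrumentation sketch, not the actual DRILL-4152 patch:
// wrap a low-level read with a timer and log throughput so slow spots in
// the Parquet read path can be attributed.
import com.google.common.base.Stopwatch;
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.channels.ReadableByteChannel;
import java.util.concurrent.TimeUnit;

class TimedReader {
  private static final org.slf4j.Logger logger =
      org.slf4j.LoggerFactory.getLogger(TimedReader.class);

  int readWithMetrics(ReadableByteChannel channel, ByteBuffer buf) throws IOException {
    Stopwatch watch = Stopwatch.createStarted();
    int bytesRead = channel.read(buf);
    long micros = watch.elapsed(TimeUnit.MICROSECONDS);
    // bytes per microsecond is numerically MB/s (decimal megabytes)
    logger.debug("read {} bytes in {} us ({} MB/s)", bytesRead, micros,
        micros == 0 ? 0.0 : bytesRead / (double) micros);
    return bytesRead;
  }
}
{code}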
[jira] [Created] (DRILL-4151) CSV Support with multiline header
Jaroslaw Sosnicki created DRILL-4151: Summary: CSV Support with multiline header Key: DRILL-4151 URL: https://issues.apache.org/jira/browse/DRILL-4151 Project: Apache Drill Issue Type: Wish Components: Functions - Drill Reporter: Jaroslaw Sosnicki Modern data sources produce CSV files with two header lines: the first line contains field descriptions, while the second line contains field types. Would it be feasible to implement such a format in Drill as an additional storage format type? This example demonstrates an output CSV header from one of the data sources. LDEV_COUNT,MONITORED_LDEV_COUNT,READ_IO_COUNT,READ_IO_RATE,READ_HIT_IO_COUNT,READ_HIT_RATE,WRITE_IO_COUNT,WRITE_IO_RATE,WRITE_HIT_IO_COUNT,WRITE_HIT_RATE,READ_MBYTES,READ_XFER_RATE,WRITE_MBYTES,WRITE_XFER_RATE,INTERVAL,INPUT_RECORD_TYPE,RECORD_TIME ulong,ulong,double,double,double,double,double,double,double,double,double,double,double,double,ulong,string(8),time_t -- This message was sent by Atlassian JIRA (v6.3.4#6332)
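As a rough sketch of how such a two-line header could be consumed (hypothetical standalone code, not an existing Drill storage plugin): the first line yields the column names, the second a parallel list of type names:
{code:java}
// Hypothetical sketch: read a two-line CSV header (names, then types)
// into an ordered name -> type map. Not Drill code.
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

public class TwoLineHeader {
  public static Map<String, String> readHeader(String path) throws IOException {
    try (BufferedReader in = new BufferedReader(new FileReader(path))) {
      String[] names = in.readLine().split(",");
      String[] types = in.readLine().split(",");
      Map<String, String> schema = new LinkedHashMap<>();
      for (int i = 0; i < names.length; i++) {
        // e.g. LDEV_COUNT -> ulong, RECORD_TIME -> time_t
        schema.put(names[i], i < types.length ? types[i] : "string");
      }
      return schema;
    }
  }
}
{code}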
[jira] [Resolved] (DRILL-4081) Handle schema changes in ExternalSort
[ https://issues.apache.org/jira/browse/DRILL-4081?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venki Korukanti resolved DRILL-4081. Resolution: Fixed Fix Version/s: 1.4.0 > Handle schema changes in ExternalSort > - > > Key: DRILL-4081 > URL: https://issues.apache.org/jira/browse/DRILL-4081 > Project: Apache Drill > Issue Type: Improvement >Reporter: Steven Phillips >Assignee: Steven Phillips > Fix For: 1.4.0 > > > This improvement will make use of the Union vector to handle schema changes. > When a new schema appears, the schema will be "merged" with the previous > schema. The result will be a new schema that uses the Union type to store the > columns where there is a type conflict. All of the batches (including the > batches that have already arrived) will be coerced into this new schema. > A new comparison function will be included to handle the comparison of Union > type. Comparison of union type will work as follows: > 1. All numeric types can be mutually compared, and will be compared using > Drill implicit cast rules. > 2. All other types will not be compared against other types, but only among > values of the same type. > 3. There will be an overall precedence of types with regards to ordering. > This precedence is not yet defined, but will be defined as part of the work on this > issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
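To make the three comparison rules in DRILL-4081 concrete, here is a hypothetical sketch (plain Java objects standing in for Drill's Union vector values, and an invented precedence order, since the issue leaves the real one undefined):
{code:java}
// Hypothetical sketch of the three comparison rules, not Drill's generated
// comparison code. Values are plain Objects standing in for Union values.
import java.util.Comparator;

class UnionComparatorSketch implements Comparator<Object> {
  // Rule 3: an overall type precedence. Assumed here: all numeric types in
  // one bucket, everything else ordered by class name.
  private static String typeKey(Object v) {
    return (v instanceof Number) ? "0:number" : "1:" + v.getClass().getName();
  }

  @Override
  @SuppressWarnings({"unchecked", "rawtypes"})
  public int compare(Object a, Object b) {
    int byType = typeKey(a).compareTo(typeKey(b));
    if (byType != 0) {
      return byType;                       // rule 3: cross-type ordering
    }
    if (a instanceof Number) {             // rule 1: numeric types compare
      return Double.compare(((Number) a).doubleValue(),   // via promotion,
                            ((Number) b).doubleValue());  // like implicit cast
    }
    return ((Comparable) a).compareTo(b);  // rule 2: same-type comparison only
  }
}
{code}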
[jira] [Resolved] (DRILL-4094) Respect -DskipTests=true for JDBC plugin tests
[ https://issues.apache.org/jira/browse/DRILL-4094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venki Korukanti resolved DRILL-4094. Resolution: Fixed Fix Version/s: 1.4.0 > Respect -DskipTests=true for JDBC plugin tests > -- > > Key: DRILL-4094 > URL: https://issues.apache.org/jira/browse/DRILL-4094 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Other >Reporter: Andrew >Assignee: Andrew >Priority: Trivial > Fix For: 1.4.0 > > > The maven config for the JDBC storage plugin does not respect the -DskipTests > option. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (DRILL-4047) Select with options
[ https://issues.apache.org/jira/browse/DRILL-4047?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venki Korukanti resolved DRILL-4047. Resolution: Fixed Fix Version/s: 1.4.0 > Select with options > --- > > Key: DRILL-4047 > URL: https://issues.apache.org/jira/browse/DRILL-4047 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Relational Operators >Reporter: Julien Le Dem >Assignee: Julien Le Dem > Fix For: 1.4.0 > > > Add a mechanism to pass parameters down to the StoragePlugin when writing a > Select statement. > Some discussion here: > http://mail-archives.apache.org/mod_mbox/drill-dev/201510.mbox/%3CCAO%2Bvc4AcGK3%2B3QYvQV1-xPPdpG3Tc%2BfG%3D0xDGEUPrhd6ktHv5Q%40mail.gmail.com%3E > http://mail-archives.apache.org/mod_mbox/drill-dev/201511.mbox/%3ccao+vc4clzylvjevisfjqtcyxb-zsmfy4bqrm-jhbidwzgqf...@mail.gmail.com%3E -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-4063) Missing files/classes needed for S3a access
[ https://issues.apache.org/jira/browse/DRILL-4063?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venki Korukanti updated DRILL-4063: --- Fix Version/s: 1.3.0 > Missing files/classes needed for S3a access > --- > > Key: DRILL-4063 > URL: https://issues.apache.org/jira/browse/DRILL-4063 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Other >Affects Versions: 1.3.0 > Environment: All >Reporter: Nathan Griffith >Assignee: Abhijit Pol > Labels: aws, aws-s3, s3, storage > Fix For: 1.3.0 > > > Specifying > {code} > "connection": "s3a://" > {code} > results in the following error: > {code} > Error: SYSTEM ERROR: ClassNotFoundException: Class > org.apache.hadoop.fs.s3a.S3AFileSystem not found > {code} > I can fix this by dropping in these files from the hadoop binary tarball: > hadoop-aws-2.6.2.jar > aws-java-sdk-1.7.4.jar > And then adding this to my core-site.xml: > {code:xml} > <property> > <name>fs.s3a.access.key</name> > <value>ACCESSKEY</value> > </property> > <property> > <name>fs.s3a.secret.key</name> > <value>SECRETKEY</value> > </property> > {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Closed] (DRILL-3997) JDBC v1.2 Error: java.lang.IndexOutOfBoundsException: Index: 0
[ https://issues.apache.org/jira/browse/DRILL-3997?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jaroslaw Sosnicki closed DRILL-3997. Resolution: Not A Problem No longer a problem > JDBC v1.2 Error: java.lang.IndexOutOfBoundsException: Index: 0 > -- > > Key: DRILL-3997 > URL: https://issues.apache.org/jira/browse/DRILL-3997 > Project: Apache Drill > Issue Type: Bug > Components: Client - JDBC >Affects Versions: 1.2.0 > Environment: Windows Linux >Reporter: Jaroslaw Sosnicki > > Connecting to Apache Drill v1.2 using the JDBC driver supplied with v1.2, > configured in Squirrel 3.7, produces this error: > Error: Drill_Dev v1.2: java.sql.SQLException: Unexpected RuntimeException: > java.lang.IndexOutOfBoundsException: Index: 0 > A connection alias configured using the v1.1 version of the JDBC driver does not > produce this error and the connection succeeds. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (DRILL-4053) Reduce metadata cache file size
[ https://issues.apache.org/jira/browse/DRILL-4053?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venki Korukanti resolved DRILL-4053. Resolution: Fixed > Reduce metadata cache file size > --- > > Key: DRILL-4053 > URL: https://issues.apache.org/jira/browse/DRILL-4053 > Project: Apache Drill > Issue Type: Improvement > Components: Metadata >Affects Versions: 1.3.0 >Reporter: Parth Chandra >Assignee: Parth Chandra > Fix For: 1.4.0 > > > The parquet metadata cache file has a fair amount of redundant metadata that > causes the size of the cache file to bloat. Two things that we can reduce are: > 1) The schema is repeated for every row group. We can keep a merged schema > (similar to what was discussed for the insert-into functionality). 2) The max and > min values in the stats are used for partition pruning when the values are the > same. We can keep only the maxValue, and only when it is the same as > the minValue. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
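As a rough sketch of point 2 (a hypothetical helper, not the actual metadata-cache writer): a column's stats only need a stored value when min equals max, since that is the case partition pruning can exploit:
{code:java}
// Hypothetical sketch for DRILL-4053 point 2: keep a single value only when
// minValue == maxValue (the prunable case); otherwise store nothing.
class StatsCompaction {
  static Object compactColumnStats(Object minValue, Object maxValue) {
    return (minValue != null && minValue.equals(maxValue)) ? maxValue : null;
  }
}
{code}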
[jira] [Resolved] (DRILL-4108) Query on csv file w/ header fails with an exception when non existing column is requested
[ https://issues.apache.org/jira/browse/DRILL-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Venki Korukanti resolved DRILL-4108. Resolution: Fixed Assignee: Abhijit Pol > Query on csv file w/ header fails with an exception when non existing column > is requested > - > > Key: DRILL-4108 > URL: https://issues.apache.org/jira/browse/DRILL-4108 > Project: Apache Drill > Issue Type: Bug > Components: Storage - Text & CSV >Affects Versions: 1.3.0 >Reporter: Abhi Pol >Assignee: Abhijit Pol > Fix For: 1.4.0 > > > A Drill query on a csv file with a header, requesting column(s) that do not exist > in the header, fails with an exception. > *Current behavior:* once extractHeader is enabled, query columns must be > columns from the header > *Expected behavior:* non-existing columns should appear with 'null' values, > like default Drill behavior > {noformat} > 0: jdbc:drill:zk=local> select Category from dfs.`/tmp/cars.csvh` limit 10; > java.lang.ArrayIndexOutOfBoundsException: -1 > at > org.apache.drill.exec.store.easy.text.compliant.FieldVarCharOutput.<init>(FieldVarCharOutput.java:104) > at > org.apache.drill.exec.store.easy.text.compliant.CompliantTextRecordReader.setup(CompliantTextRecordReader.java:118) > at > org.apache.drill.exec.physical.impl.ScanBatch.<init>(ScanBatch.java:108) > at > org.apache.drill.exec.store.dfs.easy.EasyFormatPlugin.getReaderBatch(EasyFormatPlugin.java:198) > at > org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:35) > at > org.apache.drill.exec.store.dfs.easy.EasyReaderBatchCreator.getBatch(EasyReaderBatchCreator.java:28) > at > org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:151) > at > org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174) > at > org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131) > at > org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174) > at > org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131) > at > org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174) > at > org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch(ImplCreator.java:131) > at > org.apache.drill.exec.physical.impl.ImplCreator.getChildren(ImplCreator.java:174) > at > org.apache.drill.exec.physical.impl.ImplCreator.getRootExec(ImplCreator.java:105) > at > org.apache.drill.exec.physical.impl.ImplCreator.getExec(ImplCreator.java:79) > at > org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:230) > at > org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615) > at java.lang.Thread.run(Thread.java:745) > Error: SYSTEM ERROR: ArrayIndexOutOfBoundsException: -1 > Fragment 0:0 > [Error Id: f272960e-fa2f-408e-918c-722190398cd3 on blackhole:31010] > (state=,code=0) > {noformat} -- This message was sent by Atlassian JIRA (v6.3.4#6332)
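A minimal sketch of the expected behavior described above (a hypothetical helper, not Drill's FieldVarCharOutput): resolving a requested column against the header should yield a null value when the column is absent, rather than the index -1 seen in the trace:
{code:java}
// Hypothetical sketch: project requested columns against a CSV header,
// returning null for columns missing from the header rather than failing
// with an ArrayIndexOutOfBoundsException on index -1.
import java.util.Arrays;
import java.util.List;

class HeaderProjection {
  static String[] project(String[] header, List<String> requested, String[] row) {
    List<String> headerList = Arrays.asList(header);
    String[] out = new String[requested.size()];
    for (int i = 0; i < requested.size(); i++) {
      int idx = headerList.indexOf(requested.get(i));
      out[i] = (idx < 0) ? null : row[idx];  // absent column -> null value
    }
    return out;
  }
}
{code}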
[jira] [Commented] (DRILL-4147) Union All operator runs in a single fragment
[ https://issues.apache.org/jira/browse/DRILL-4147?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15036014#comment-15036014 ] ASF GitHub Bot commented on DRILL-4147: --- GitHub user hsuanyi opened a pull request: https://github.com/apache/drill/pull/288 DRILL-4147: Change UnionPrel's DrillDistributionTrait to be ANY to al… …low Union-All to be done in parallel You can merge this pull request into a Git repository by running: $ git pull https://github.com/hsuanyi/incubator-drill DRILL-4147 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/288.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #288 commit 9f31a4c04c2cb219237519070b35d5fae3010908 Author: Hsuan-Yi Chu Date: 2015-12-02T00:46:51Z DRILL-4147: Change UnionPrel's DrillDistributionTrait to be ANY to allow Union-All to be done in parallel > Union All operator runs in a single fragment > > > Key: DRILL-4147 > URL: https://issues.apache.org/jira/browse/DRILL-4147 > Project: Apache Drill > Issue Type: Bug >Reporter: amit hadke >Assignee: Sean Hsuan-Yi Chu > > A user noticed that running a select from a single directory is much faster > than a union all on two directories. > (https://drill.apache.org/blog/2014/12/09/running-sql-queries-on-amazon-s3/#comment-2349732267) > > It seems the UNION ALL operator doesn't parallelize sub-scans (it's using > SINGLETON for the distribution type). Everything runs in a single fragment. > We may have to use SubsetTransformer in UnionAllPrule. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (DRILL-4145) IndexOutOfBoundsException raised during select * query on S3 csv file
[ https://issues.apache.org/jira/browse/DRILL-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Phillips updated DRILL-4145: --- Assignee: Jacques Nadeau (was: Steven Phillips) > IndexOutOfBoundsException raised during select * query on S3 csv file > - -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4145) IndexOutOfBoundsException raised during select * query on S3 csv file
[ https://issues.apache.org/jira/browse/DRILL-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15035590#comment-15035590 ] ASF GitHub Bot commented on DRILL-4145: --- GitHub user StevenMPhillips opened a pull request: https://github.com/apache/drill/pull/287 DRILL-4145: Handle empty final field in Text reader correctly You can merge this pull request into a Git repository by running: $ git pull https://github.com/StevenMPhillips/drill drill-4145 Alternatively you can review and apply these changes as the patch at: https://github.com/apache/drill/pull/287.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #287 commit 8f56250aeb29d5d21bcdc6c727cec89607150224 Author: Steven Phillips Date: 2015-12-02T10:09:20Z DRILL-4145: Handle empty final field in Text reader correctly > IndexOutOfBoundsException raised during select * query on S3 csv file > - -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (DRILL-4145) IndexOutOfBoundsException raised during select * query on S3 csv file
[ https://issues.apache.org/jira/browse/DRILL-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Steven Phillips reassigned DRILL-4145: -- Assignee: Steven Phillips > IndexOutOfBoundsException raised during select * query on S3 csv file > - -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (DRILL-4145) IndexOutOfBoundsException raised during select * query on S3 csv file
[ https://issues.apache.org/jira/browse/DRILL-4145?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15035589#comment-15035589 ] Steven Phillips commented on DRILL-4145: There is a bug in the case where there is an empty string for the last field. Basically, when the parser sees the pattern <delimiter><newline>, it calls the "endEmptyField()" method of the TextOutput. This was OK when using the RepeatedVarCharOutput, because calling this method resulted in an empty-string element being added to the array. But in the FieldVarCharOutput, ending the field doesn't do anything unless you first start the field. > IndexOutOfBoundsException raised during select * query on S3 csv file > - -- This message was sent by Atlassian JIRA (v6.3.4#6332)
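To make the failure mode above concrete, here is a minimal hypothetical sketch; it mirrors only the shape of Drill's text-output classes, not their actual code:
{code:java}
// Hypothetical, simplified output sketch illustrating the bug described
// above: for a trailing empty field ("a,b," then newline) the parser calls
// endEmptyField() without ever starting the field.
import java.util.ArrayList;
import java.util.List;

class FieldOutputSketch {
  private boolean fieldStarted = false;
  private final List<String> row = new ArrayList<>();

  void startField() { fieldStarted = true; }

  // Buggy behavior: nothing happens unless the field was started, so the
  // trailing empty field is dropped and the row comes up one column short,
  // which later surfaces as an index-out-of-bounds.
  void endEmptyFieldBuggy() {
    if (fieldStarted) {
      row.add("");
      fieldStarted = false;
    }
  }

  // Fixed behavior: an empty field is recorded whether or not it was started.
  void endEmptyField() {
    row.add("");
    fieldStarted = false;
  }
}
{code}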
[jira] [Commented] (DRILL-4108) Query on csv file w/ header fails with an exception when non existing column is requested
[ https://issues.apache.org/jira/browse/DRILL-4108?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15035584#comment-15035584 ] ASF GitHub Bot commented on DRILL-4108: --- Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/269 > Query on csv file w/ header fails with an exception when non existing column > is requested > - -- This message was sent by Atlassian JIRA (v6.3.4#6332)