[jira] [Commented] (DRILL-7787) Apache drill failed to start

2020-09-08 Thread Abhishek Girish (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17192314#comment-17192314
 ] 

Abhishek Girish commented on DRILL-7787:


[~shivamsaxena] you can use the master Docker images. 1.19.0 isn't released 
yet, so you will need to try on master.

> Apache drill failed to start
> 
>
> Key: DRILL-7787
> URL: https://issues.apache.org/jira/browse/DRILL-7787
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Om Prasad Surapu
>Priority: Major
> Fix For: 1.19.0
>
>
> Hi Team,
> I have an Apache Drill cluster set up with apache-drill-1.17.0, started in 
> distributed mode (with ZooKeeper). Drill started and no issues were reported.
>  
> I installed apache-drill-1.18.0 to fix DRILL-7786, but Drill failed to 
> start with the exception below (I have tried ZooKeeper versions 3.5.8 and 
> 3.4.11). Could you help me fix this issue?
> Exception in thread "main" org.apache.drill.exec.exception.DrillbitStartupException: Failure during initial startup of Drillbit.
>  at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:588)
>  at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:554)
>  at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:550)
> Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: unable to put
>  at org.apache.drill.exec.coord.zk.ZookeeperClient.putIfAbsent(ZookeeperClient.java:326)
>  at org.apache.drill.exec.store.sys.store.ZookeeperPersistentStore.putIfAbsent(ZookeeperPersistentStore.java:119)
>  at org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.prepareStores(RemoteFunctionRegistry.java:201)
>  at org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.init(RemoteFunctionRegistry.java:108)
>  at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:233)
>  at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:584)
>  ... 2 more
> Caused by: org.apache.zookeeper.KeeperException$UnimplementedException: KeeperErrorCode = Unimplemented for /drill/udf/registry
>  at org.apache.zookeeper.KeeperException.create(KeeperException.java:106)
>  at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
>  at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:1637)
>  at org.apache.curator.framework.imps.CreateBuilderImpl$17.call(CreateBuilderImpl.java:1180)
>  at org.apache.curator.framework.imps.CreateBuilderImpl$17.call(CreateBuilderImpl.java:1156)
>  at org.apache.curator.connection.StandardConnectionHandlingPolicy.callWithRetry(StandardConnectionHandlingPolicy.java:67)
>  at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:81)
>  at org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:1153)
>  at org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:607)
>  at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:597)
>  at org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:51)
>  at org.apache.drill.exec.coord.zk.ZookeeperClient.putIfAbsent(ZookeeperClient.java:318)
>  ... 7 more



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7751) Add Storage Plugin for Splunk

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7751?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7751:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Add Storage Plugin for Splunk
> -
>
> Key: DRILL-7751
> URL: https://issues.apache.org/jira/browse/DRILL-7751
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Other
>Affects Versions: 1.17.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.19.0
>
>
> # Drill Connector for Splunk
> This plugin enables Drill to query Splunk. 
> ## Configuration
> To connect Drill to Splunk, create a new storage plugin with the following 
> configuration. Note that Splunk's management interface listens on port 
> `8089`; this port must be open for Drill to query Splunk.
> ```json
> {
>"type":"splunk",
>"username": "admin",
>"password": "changeme",
>"hostname": "localhost",
>"port": 8089,
>"earliestTime": "-14d",
>"latestTime": "now",
>"enabled": false
> }
> ```
> ## Understanding Splunk's Data Model
> Splunk's primary use case is analyzing event logs with a timestamp. As such, 
> data is indexed by the timestamp, with the most recent data being indexed 
> first. By default, Splunk will sort the data in reverse chronological order. 
> Large Splunk installations will put older data into buckets of hot, warm and 
> cold storage, with the "cold" storage on the slowest and cheapest disks.
>   
> With this understood, it is **very** important to put time boundaries on 
> your Splunk queries. The Drill plugin allows you to set default values in 
> the configuration such that every query you run will be bounded by these 
> boundaries. Alternatively, you can set the time boundaries at query time. In 
> either case, you will achieve the best performance when you are asking 
> Splunk for the smallest amount of data possible.
>   
> ## Understanding Drill's Data Model with Splunk
> Drill treats Splunk indexes as tables. Splunk's access model restricts 
> access to the actual data but not to the catalog. It is therefore possible 
> to see the names of indexes to which you do not have access. You can view 
> the list of available indexes with a `SHOW TABLES IN splunk` query.
>
> ```
> apache drill> SHOW TABLES IN splunk;
> +--++
> | TABLE_SCHEMA |   TABLE_NAME   |
> +--++
> | splunk   | summary|
> | splunk   | splunklogger   |
> | splunk   | _thefishbucket |
> | splunk   | _audit |
> | splunk   | _internal  |
> | splunk   | _introspection |
> | splunk   | main   |
> | splunk   | history|
> | splunk   | _telemetry |
> +--++
> 9 rows selected (0.304 seconds)
> ```
> To query Splunk from Drill, use the following format (`<fields>` and 
> `<index>` are placeholders): 
> ```sql
> SELECT <fields>
> FROM splunk.<index>
> ```
>   
> ## Bounding Your Queries
> When you learn to query Splunk via its interface, the first thing you learn 
> is to bound your queries so that they look at the shortest time span 
> possible. When using Drill to query Splunk, it is advisable to do the same 
> thing, and Drill offers two ways to accomplish this: via the configuration 
> and at query time.
>
> ### Bounding Your Queries at Query Time
> The easiest way to bound your query is to do so at query time via special 
> filters in the `WHERE` clause. There are two special fields, `earliestTime` 
> and `latestTime`, which can be set to bound the query. If they are not set, 
> the query will be bounded to the defaults set in the configuration.
>
> You can use any of the time formats specified in the Splunk documentation 
> here:
> https://docs.splunk.com/Documentation/Splunk/8.0.3/SearchReference/SearchTimeModifiers
>
> So if you wanted to see your data for the last 15 minutes, you could execute 
> the following query:
> ```sql
> SELECT <fields>
> FROM splunk.<index>
> WHERE earliestTime='-15m' AND latestTime='now'
> ```
> The variables set in a query override the defaults from the configuration. 
>   
> ## Data Types
> Splunk does not have sophisticated data types and unfortunately does not 
> provide metadata with its query results. With the exception of the fields 
> below, Drill will interpret all fields as `VARCHAR`, and hence you will have 
> to convert them to the appropriate data type at query time.
>   
> #### Timestamp Fields
>   * `_indextime`
>   * `_time` 
>   
> #### Numeric Fields
>   * `date_hour` 
>   * `date_mday`
>   * `date_minute`
>   * `date_second` 
>   * `date_year`
>   * `linecount`
>   
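> Because most fields arrive as `VARCHAR`, they can be cast at query time. A 
> brief sketch (the field name `linecount` comes from the list above; the 
> index `main` is illustrative):
> ```sql
> SELECT CAST(linecount AS INT) AS linecount
> FROM splunk.main
> WHERE earliestTime='-1h' AND latestTime='now'
> ```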
>  ### Nested Data
>  Splunk has two different types of nested data which 

[jira] [Updated] (DRILL-7763) Add Limit Pushdown to File Based Storage Plugins

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7763?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7763:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Add Limit Pushdown to File Based Storage Plugins
> 
>
> Key: DRILL-7763
> URL: https://issues.apache.org/jira/browse/DRILL-7763
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.17.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.19.0
>
>
> As currently implemented, when querying a file, Drill will read the entire 
> file even if a limit is specified in the query.  This PR does a few things:
>  # Refactors the EasyGroupScan, EasySubScan, and EasyFormatConfig to allow 
> the option of pushing down limits.
>  # Applies this to all the EVF based format plugins which are: LogRegex, 
> PCAP, SPSS, Esri, Excel and Text (CSV). 
> Due to JSON's fluid schema, it would be unwise to adopt the limit pushdown as 
> it could result in very inconsistent schemata.





[jira] [Updated] (DRILL-7223) Make the timeout in TimedCallable a configurable boot time parameter

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7223?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7223:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Make the timeout in TimedCallable a configurable boot time parameter
> 
>
> Key: DRILL-7223
> URL: https://issues.apache.org/jira/browse/DRILL-7223
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.16.0
>Reporter: Aman Sinha
>Assignee: Boaz Ben-Zvi
>Priority: Minor
> Fix For: 1.19.0
>
>
> The 
> [TimedCallable.TIMEOUT_PER_RUNNABLE_IN_MSECS|https://github.com/apache/drill/blob/master/exec/java-exec/src/main/java/org/apache/drill/exec/store/TimedCallable.java#L52]
>  is currently an internal Drill constant defined as 15 secs, and has been 
> there since the feature was introduced. Drill's TimedCallable implements 
> Java's Callable interface to create timed threads. It is used by the REFRESH 
> METADATA command, which creates multiple threads on the Foreman node to 
> gather Parquet metadata to build the metadata cache.
> Depending on the load on the system, or for a very large number of Parquet 
> files (millions), it is possible to exceed this timeout.  While the exact 
> root cause of exceeding the timeout is being investigated, it makes sense to 
> make this timeout a configurable parameter to aid with large-scale testing. 
> This JIRA is to make this a configurable bootstrapping option in the 
> drill-override.
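> A hedged sketch of what such a boot option might look like in 
> drill-override.conf (the property name below is illustrative, not a 
> committed name):
> {noformat}
> drill.exec: {
>   store.timed_callable.timeout_ms: 30000
> }
> {noformat}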





[jira] [Updated] (DRILL-6953) Merge row set-based JSON reader

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-6953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-6953:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Merge row set-based JSON reader
> ---
>
> Key: DRILL-6953
> URL: https://issues.apache.org/jira/browse/DRILL-6953
> Project: Apache Drill
>  Issue Type: Sub-task
>Affects Versions: 1.15.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.19.0
>
>
> The final step in the ongoing "result set loader" saga is to merge the 
> revised JSON reader into master. This reader does three key things:
> * Demonstrates the prototypical "late schema" style of data reading (discover 
> schema while reading).
> * Implements many tricks and hacks to handle schema changes while loading.
> * Shows that, even with all these tricks, the only true solution is to 
> actually have a schema.
> The new JSON reader:
> * Uses an expanded state machine when parsing rather than the complex set of 
> if-statements in the current version.
> * Handles reading a run of nulls before seeing the first data value (as long 
> as the data value shows up in the first record batch).
> * Uses the result-set loader to generate fixed-size batches regardless of the 
> complexity, depth of structure, or width of variable-length fields.
> While the JSON reader itself is helpful, the key contribution is that it 
> shows how to use the entire kit of parts: result set loader, projection 
> framework, and so on. Since the projection framework can handle an external 
> schema, it is also a handy foundation for the ongoing schema project.
> Key work to complete after this merger will be to reconcile actual data with 
> the external schema. For example, if we know a column is supposed to be a 
> VarChar, then read the column as a VarChar regardless of the type JSON itself 
> picks. Or, if a column is supposed to be a Double, then convert Int and 
> String JSON values into Doubles.
> The Row Set framework was designed to allow inserting custom column writers. 
> This would be a great opportunity to do the work needed to create them. Then, 
> use the new JSON framework to allow parsing a JSON field as a specified Drill 
> type.





[jira] [Updated] (DRILL-7728) Drill SPI framework

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7728?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7728:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Drill SPI framework
> ---
>
> Key: DRILL-7728
> URL: https://issues.apache.org/jira/browse/DRILL-7728
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.18.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.19.0
>
>
> Provide the basic framework to load an extension in Drill, modelled after the 
> Java Service Provider concept. Excludes full class loader isolation for now.





[jira] [Updated] (DRILL-7554) Convert LTSV Format Plugin to EVF

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7554?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7554:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Convert LTSV Format Plugin to EVF
> -
>
> Key: DRILL-7554
> URL: https://issues.apache.org/jira/browse/DRILL-7554
> Project: Apache Drill
>  Issue Type: Improvement
>  Components: Storage - Text & CSV
>Affects Versions: 1.17.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.19.0
>
>






[jira] [Updated] (DRILL-7535) Convert Ltsv to EVF

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7535?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7535:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Convert Ltsv to EVF
> ---
>
> Key: DRILL-7535
> URL: https://issues.apache.org/jira/browse/DRILL-7535
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Arina Ielchiieva
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.19.0
>
>






[jira] [Updated] (DRILL-7729) Use java.time in column accessors

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7729?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7729:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Use java.time in column accessors
> -
>
> Key: DRILL-7729
> URL: https://issues.apache.org/jira/browse/DRILL-7729
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.17.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.19.0
>
>
> Use {{java.time}} classes in the column accessors, except for {{Interval}}, 
> which has no {{java.time}} equivalent. Doing so allows us to create a row-set 
> version of Drill's JSON writer.





[jira] [Updated] (DRILL-7112) Code Cleanup for HTTPD Format Plugin

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7112:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Code Cleanup for HTTPD Format Plugin
> 
>
> Key: DRILL-7112
> URL: https://issues.apache.org/jira/browse/DRILL-7112
> Project: Apache Drill
>  Issue Type: New Feature
>Affects Versions: 1.15.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Minor
> Fix For: 1.19.0
>
>
> Address code clean up issues cited in 
> https://github.com/apache/drill/pull/1635.





[jira] [Updated] (DRILL-7733) Use streaming for REST JSON queries

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7733?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7733:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Use streaming for REST JSON queries
> ---
>
> Key: DRILL-7733
> URL: https://issues.apache.org/jira/browse/DRILL-7733
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.17.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.19.0
>
>
> Several users on the user and dev mail lists have complained about the memory 
> overhead when running a REST JSON query: {{http://node:8047/query.json}}. 
> The current implementation buffers the entire result set in memory, then lets 
> Jersey/Jetty convert the results to JSON. The result is very heavy heap use 
> for larger query result sets.
> This ticket requests a change to use streaming. As each batch arrives at the 
> Screen operator, convert that batch to JSON and directly stream the results 
> to the client network connection, much as is done for the native client 
> connection.
> For backward compatibility, the form of the JSON must be the same as the 
> current API.
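> A hedged sketch of the requested batch-at-a-time flow (the names below are 
> illustrative, not Drill's actual API):
> {code:java}
> // Serialize each batch as it arrives at Screen and flush immediately,
> // instead of accumulating the whole result set before serialization.
> while (screen.next(batch)) {
>   jsonWriter.writeBatch(batch); // convert only this batch to JSON
>   out.flush();                  // stream to the client connection
> }
> {code}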





[jira] [Updated] (DRILL-7458) Base storage plugin framework

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7458:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Base storage plugin framework
> -
>
> Key: DRILL-7458
> URL: https://issues.apache.org/jira/browse/DRILL-7458
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>  Labels: doc-impacting
> Fix For: 1.19.0
>
>
> The "Easy" framework allows third-parties to add format plugins to Drill with 
> moderate effort. (The process could be easier, but "Easy" makes it as simple 
> as possible given the current structure.)
> At present, no such "starter" framework exists for storage plugins. Further, 
> multiple storage plugins have implemented filter push down, seemingly by 
> copying large blocks of code.
> This ticket offers a "base" framework for storage plugins and for filter 
> push-downs. The framework builds on the EVF, allowing plugins to also support 
> project push down.
> The framework has a "test mule" storage plugin to verify functionality, and 
> was used as the basis of a REST-like plugin.





[jira] [Updated] (DRILL-4232) Support for EXCEPT set operator

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-4232?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-4232:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Support for EXCEPT set operator
> ---
>
> Key: DRILL-4232
> URL: https://issues.apache.org/jira/browse/DRILL-4232
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Query Planning & Optimization
>Reporter: Victoria Markman
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.19.0
>
>






[jira] [Updated] (DRILL-7558) Generalize filter push-down planner phase

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7558:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Generalize filter push-down planner phase
> -
>
> Key: DRILL-7558
> URL: https://issues.apache.org/jira/browse/DRILL-7558
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.18.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.19.0
>
>
> DRILL-7458 provides a base framework for storage plugins, including a 
> simplified filter push-down mechanism. [~volodymyr] notes that it may be 
> *too* simple:
> {quote}
> What about the case when this rule was applied for one filter, but planner at 
> some point pushed another filter above the scan, for example, if we have such 
> case:
> {code}
> Filter(a=2)
>   Join(t1.b=t2.b, type=inner)
> Filter(b=3)
> Scan(t1)
> Scan(t2)
> {code}
> Filter b=3 will be pushed into scan, planner will push filter above join:
> {code}
> Join(t1.b=t2.b, type=inner)
> Filter(a=2)
> Scan(t1, b=3)
> Scan(t2)
> {code}
> In this case, check whether filter was pushed is not enough.
> {quote}
> Drill divides planning into a number of *phases*, each defined by a set of 
> *rules*. Most storage plugins perform filter push-down during the physical 
> planning stage. However, by this point, Drill has already decided on the 
> degree of parallelism: it is too late to use filter push-down to set the 
> degree of parallelism. Yet, if using something like a REST API, we want to 
> use filters to help us shard the query (that is, to set the degree of 
> parallelism.)
>  
> DRILL-7458 performs filter push-down at *logical* planning time to work 
> around the above limitation. (In Drill, there are three different phases that 
> could be considered the logical phase, depending on which planning options 
> are set to control Calcite.)
> [~volodymyr] points out that the logical plan phase may be the wrong place 
> because it will perform rewrites of the type he cited.
> Thus, we need to research where to insert filter push down. It must come:
> * After rewrites of the kind described above.
> * After join equivalence computations. (See DRILL-7556.)
> * Before the decision is made about the number of minor fragments.
> The goal of this ticket is to either:
> * Research to identify an existing phase which satisfies these requirements, 
> or
> * Create a new phase.
> Due to the way Calcite works, it is not a good idea to have a single phase 
> handle two tasks that depend on one another. That is, we cannot do filter 
> push-down in the phase which defines the filters, nor in the phase that 
> chooses parallelism.
> Background: Calcite is a rule-based query planner inspired by 
> [Volcano|https://paperhub.s3.amazonaws.com/dace52a42c07f7f8348b08dc2b186061.pdf].
> The above issue is a flaw with rule-based planners and was identified as 
> early as the [Cascades query framework 
> paper|https://www.csd.uoc.gr/~hy460/pdf/CascadesFrameworkForQueryOptimization.pdf]
>  which was the follow-up to Volcano.





[jira] [Updated] (DRILL-7550) Add Storage Plugin for Cassandra

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7550:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Add Storage Plugin for Cassandra
> 
>
> Key: DRILL-7550
> URL: https://issues.apache.org/jira/browse/DRILL-7550
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.19.0
>
>
> Apache Cassandra is a free and open-source, distributed, wide column store, 
> NoSQL database management system designed to handle large amounts of data 
> across many commodity servers, providing high availability with no single 
> point of failure. [1]
> This PR would enable Drill to query Cassandra data stores.
>  
> [1]: https://en.wikipedia.org/wiki/Apache_Cassandra





[jira] [Updated] (DRILL-7551) Improve Error Reporting

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7551:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Improve Error Reporting
> ---
>
> Key: DRILL-7551
> URL: https://issues.apache.org/jira/browse/DRILL-7551
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.17.0
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.19.0
>
>
> This Jira is to serve as a master issue for improving the usability of 
> error messages. Instead of dumping stack traces, the overall goal is to give 
> the user something that can actually explain:
>  # What went wrong
>  # How to fix it
> Related work should be created as subtasks.





[jira] [Updated] (DRILL-7712) Fix issues after ZK upgrade

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7712:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Fix issues after ZK upgrade
> ---
>
> Key: DRILL-7712
> URL: https://issues.apache.org/jira/browse/DRILL-7712
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.18.0
>Reporter: Arina Ielchiieva
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: 1.19.0
>
>
> Warnings during jdbc-all build (absent when building with Mapr profile):
> {noformat}
> netty-transport-native-epoll-4.1.45.Final.jar, 
> netty-transport-native-epoll-4.0.48.Final-linux-x86_64.jar define 46 
> overlapping classes: 
>   - io.netty.channel.epoll.AbstractEpollStreamChannel$2
>   - io.netty.channel.epoll.AbstractEpollServerChannel$EpollServerSocketUnsafe
>   - io.netty.channel.epoll.EpollDatagramChannel
>   - io.netty.channel.epoll.AbstractEpollStreamChannel$SpliceInChannelTask
>   - io.netty.channel.epoll.NativeDatagramPacketArray
>   - io.netty.channel.epoll.EpollSocketChannelConfig
>   - io.netty.channel.epoll.EpollTcpInfo
>   - io.netty.channel.epoll.EpollEventArray
>   - io.netty.channel.epoll.EpollEventLoop
>   - io.netty.channel.epoll.EpollSocketChannel
>   - 36 more...
> netty-transport-native-unix-common-4.1.45.Final.jar, 
> netty-transport-native-epoll-4.0.48.Final-linux-x86_64.jar define 15 
> overlapping classes: 
>   - io.netty.channel.unix.Errors$NativeConnectException
>   - io.netty.channel.unix.ServerDomainSocketChannel
>   - io.netty.channel.unix.DomainSocketAddress
>   - io.netty.channel.unix.Socket
>   - io.netty.channel.unix.NativeInetAddress
>   - io.netty.channel.unix.DomainSocketChannelConfig
>   - io.netty.channel.unix.Errors$NativeIoException
>   - io.netty.channel.unix.DomainSocketReadMode
>   - io.netty.channel.unix.ErrorsStaticallyReferencedJniMethods
>   - io.netty.channel.unix.UnixChannel
>   - 5 more...
> maven-shade-plugin has detected that some class files are
> present in two or more JARs. When this happens, only one
> single version of the class is copied to the uber jar.
> Usually this is not harmful and you can skip these warnings,
> otherwise try to manually exclude artifacts based on
> mvn dependency:tree -Ddetail=true and the above output.
> See http://maven.apache.org/plugins/maven-shade-plugin/
> {noformat}
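> One hedged way to address the overlap, as the warning itself suggests, is to 
> exclude the older epoll artifact from whichever dependency drags it in (the 
> coordinates below come from the warning; the dependency to patch would need 
> to be found via {{mvn dependency:tree -Ddetail=true}}):
> {noformat}
> <exclusion>
>   <groupId>io.netty</groupId>
>   <artifactId>netty-transport-native-epoll</artifactId>
> </exclusion>
> {noformat}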
> Additional warning build with Mapr profile:
> {noformat}
> The following patterns were never triggered in this artifact inclusion filter:
> o  'org.apache.zookeeper:zookeeper-jute'
> {noformat}
> NPEs in tests (though tests do not fail):
> {noformat}
> [INFO] Running org.apache.drill.exec.coord.zk.TestZookeeperClient
> java.lang.NullPointerException
>   at org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:269)
>   at org.apache.zookeeper.server.ZKDatabase.fastForwardDataBase(ZKDatabase.java:251)
>   at org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:583)
>   at org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:546)
>   at org.apache.zookeeper.server.NIOServerCnxnFactory.shutdown(NIOServerCnxnFactory.java:
> {noformat}
> {noformat}
> [INFO] Running org.apache.drill.exec.coord.zk.TestEphemeralStore
> java.lang.NullPointerException
>   at org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:269)
>   at org.apache.zookeepe
> {noformat}
> {noformat}
> [INFO] Running org.apache.drill.yarn.zk.TestAmRegistration
> java.lang.NullPointerException
>   at org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:269)
>   at org.apache.zookeeper.server.ZKDatabase.fastForwardDataBase(ZKDatabase.java:251)
>   at org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:583)
>   at org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:546)
>   at org.apache.zookeeper.server.NIOServerCnxnFactory.shutdown(NIOServerCnxnFactory.java:929)
>   at org.apache.curator.t
> {noformat}
> {noformat}
> org.apache.drill.yarn.client.TestCommandLineOptions
> java.lang.NullPointerException
>   at org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:269)
>   at org.apache.zookeeper.server.ZKDatabase.fastForwardDataBase(ZKDatabase.java:251)
>   at org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:583)
>   at org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:546)
>   

[jira] [Updated] (DRILL-7366) Improve Null Handling for UDFs with Complex Output

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7366:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Improve Null Handling for UDFs with Complex Output
> --
>
> Key: DRILL-7366
> URL: https://issues.apache.org/jira/browse/DRILL-7366
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.16.0
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.19.0
>
>
> If a UDF has a complex field (Map or List) as output, Drill does not allow 
> the UDF to have nullable input, which creates additional complexity when 
> writing these kinds of UDFs. 
> I therefore would like to propose that two options be added to the 
> FunctionTemplate for null handling: {{EMPTY_LIST_IF_NULL}} and 
> {{EMPTY_MAP_IF_NULL}}, which would simplify UDF creation. I'm envisioning 
> that if either of these options were selected, and the UDF receives any null 
> value as input, the UDF will return either an empty map or list. 
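> A hedged sketch of how the proposed option might look on a function template 
> (the option, function, and class names below are illustrative, not a 
> committed API):
> {code:java}
> @FunctionTemplate(
>     name = "parse_to_map",
>     scope = FunctionTemplate.FunctionScope.SIMPLE,
>     nulls = FunctionTemplate.NullHandling.EMPTY_MAP_IF_NULL) // proposed option
> public static class ParseToMap implements DrillSimpleFunc { /* ... */ }
> {code}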





[jira] [Updated] (DRILL-7597) Read selected JSON colums as JSON text

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7597:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Read selected JSON columns as JSON text
> --
>
> Key: DRILL-7597
> URL: https://issues.apache.org/jira/browse/DRILL-7597
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.17.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.19.0
>
>
> See DRILL-7598. The use case wishes to read selected JSON columns as JSON 
> text rather than parsing the JSON into a relational structure as is done 
> today in the JSON reader.
> The JSON reader supports "all text mode", but, despite the name, this mode 
> only works for scalars (primitives) such as numbers. It does not work for 
> structured types such as objects or arrays: such types are always parsed into 
> Drill structures (which causes the conflict described in DRILL-7598).
> Instead, we need a feature to read an entire JSON value, including structure, 
> as a JSON string.
> This feature would work best when the user can parse some parts of a JSON 
> input file into a relational structure and read others as JSON. (This is the 
> use case the user on the mailing list faced.) So, we need a way to do that.
> Drill has a "provided schema" feature, which, at present, is used only for 
> text files (and recently with limited support in Avro.) We are working on a 
> project to add such support for JSON.
> Perhaps we can leverage this feature to allow the JSON reader to read chunks 
> of JSON as text which can be manipulated by those future JSON functions. In 
> the example, column "c" would be read as JSON text; Drill would not attempt 
> to parse it into a relational structure.
> As it turns out, the "new" JSON reader we're working on originally had a 
> feature to do just that, but we took it out because we were not sure it was 
> needed. Sounds like we should restore it as part of our "provided schema" 
> support. It could work this way: if you CREATE SCHEMA with column "c" as 
> VARCHAR (maybe with a hint to read as JSON), the JSON parser would read the 
> entire nested structure as JSON without trying to parse it into a relational 
> structure.
> This ticket asks to build the concept:
>  * Allow a `CREATE SCHEMA` option (to be designed) to designate a JSON field 
> to be read as JSON.
>  * Implement the "read column as JSON" feature in the new EVF-based JSON 
> reader.
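What "read column c as JSON text" means can be illustrated with a toy extractor. This is not Drill code and not the proposed implementation, just a minimal demonstration of returning a field's value as raw JSON by matching delimiters instead of parsing into a structure (it ignores whitespace after colons and strings that contain braces).

```java
public class RawJsonSketch {
    // Return the raw JSON text of `field`'s value, structure included.
    static String rawValue(String json, String field) {
        int i = json.indexOf("\"" + field + "\":");
        if (i < 0) return null;
        i += field.length() + 3;               // skip past `"field":`
        char open = json.charAt(i);
        if (open != '{' && open != '[') {      // scalar: read up to , or }
            int end = i;
            while (",}".indexOf(json.charAt(end)) < 0) end++;
            return json.substring(i, end);
        }
        char close = open == '{' ? '}' : ']';  // object or array: match depth
        int depth = 0, end = i;
        do {
            char c = json.charAt(end++);
            if (c == open) depth++;
            else if (c == close) depth--;
        } while (depth > 0);
        return json.substring(i, end);
    }

    public static void main(String[] args) {
        String doc = "{\"a\":1,\"c\":{\"x\":[1,2],\"y\":\"z\"}}";
        System.out.println(rawValue(doc, "c")); // {"x":[1,2],"y":"z"}
    }
}
```

Under the proposal, declaring column "c" as VARCHAR in the provided schema would make the JSON reader hand back exactly this kind of raw text.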



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7557) Revise "Base" storage plugin filter push-down listener with a builder

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7557:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Revise "Base" storage plugin filter push-down listener with a builder
> --
>
> Key: DRILL-7557
> URL: https://issues.apache.org/jira/browse/DRILL-7557
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.18.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.19.0
>
>
> DRILL-7458 introduces a base framework for storage plugins and includes a 
> simplified mechanism for filter push down. Part of that mechanism includes a 
> "listener", with the bulk of the work done in a single method:
> {code:java}
> Pair<GroupScan, List<RexNode>> transform(GroupScan groupScan,
>   List<Pair<RexNode, RelOp>> andTerms, Pair<RexNode, DisjunctionFilterSpec> orTerm);
> {code}
> Reviewers correctly pointed out that this method might be a bit too complex.
> The listener pattern pretty much forced the present design. To improve it, 
> we'd want to use a different design; maybe some kind of builder which might:
> * Accept the CNF and DNF terms via dedicated methods.
> * Perform a processing step.
> * Provide a number of methods to communicate the results, such as 1) whether 
> a new group scan is needed, 2) any CNF terms to retain, and 3) any DNF terms 
> to retain.
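The proposed builder could take a shape like the following sketch. All names here are invented for illustration (this is not the DRILL-7458 API), and the analyze step is a stand-in for whatever push-down decision a real plugin would make; the point is the flow: terms in via dedicated methods, one processing step, results out via getters.

```java
import java.util.ArrayList;
import java.util.List;

public class FilterPushDownBuilderSketch {
    private final List<String> cnfTerms = new ArrayList<>();
    private final List<String> retainedCnf = new ArrayList<>();
    private boolean needsNewScan;

    // Accept CNF terms via a dedicated method (a DNF method would mirror this).
    FilterPushDownBuilderSketch addCnf(String term) { cnfTerms.add(term); return this; }

    // Processing step: pretend the plugin can push down equality terms only;
    // everything else is retained for Drill to evaluate downstream.
    FilterPushDownBuilderSketch analyze() {
        for (String t : cnfTerms) {
            if (t.contains("=")) needsNewScan = true; else retainedCnf.add(t);
        }
        return this;
    }

    // Results come out via separate methods instead of one composite Pair.
    boolean needsNewGroupScan() { return needsNewScan; }
    List<String> retainedCnfTerms() { return retainedCnf; }

    public static void main(String[] args) {
        FilterPushDownBuilderSketch b = new FilterPushDownBuilderSketch()
            .addCnf("a = 10").addCnf("b < 5").analyze();
        System.out.println(b.needsNewGroupScan());  // true
        System.out.println(b.retainedCnfTerms());   // [b < 5]
    }
}
```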



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7531) Convert format plugins to EVF

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7531:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Convert format plugins to EVF
> -
>
> Key: DRILL-7531
> URL: https://issues.apache.org/jira/browse/DRILL-7531
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Arina Ielchiieva
>Priority: Major
> Fix For: 1.19.0
>
>
> This is an umbrella Jira to track the process of converting format plugins 
> to EVF.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7671) Fix builds for cdh and hdp profiles

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7671:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Fix builds for cdh and hdp profiles
> ---
>
> Key: DRILL-7671
> URL: https://issues.apache.org/jira/browse/DRILL-7671
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: 1.19.0
>
>
> The cdh and hdp profiles use obsolete versions of Hadoop and other 
> libraries, so attempting to build the project with these profiles fails 
> with compilation errors.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7325) Many operators do not set container record count

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7325:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Many operators do not set container record count
> 
>
> Key: DRILL-7325
> URL: https://issues.apache.org/jira/browse/DRILL-7325
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.19.0
>
>
> See DRILL-7324. The following are problems found because some operators fail 
> to set the record count for their containers.
> h4. Scan
> TestComplexTypeReader, on cluster setup, using the PojoRecordReader:
> {noformat}
> ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors 
> from ScanBatch
> ScanBatch: Container record count not set
> {noformat}
> Reason: ScanBatch never sets the record count of its container (this is a 
> generic issue, not specific to the PojoRecordReader).
> h4. Filter
> {{TestComplexTypeReader.testNonExistentFieldConverting()}}:
> {noformat}
> ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors 
> from FilterRecordBatch
> FilterRecordBatch: Container record count not set
> {noformat}
> h4. Hash Join
> {{TestComplexTypeReader.test_array()}}:
> {noformat}
> ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors 
> from HashJoinBatch
> HashJoinBatch: Container record count not set
> {noformat}
> Occurs on the first batch in which the hash join returns {{OK_NEW_SCHEMA}} 
> with no records.
> h4. Project
> {{TestCsvWithHeaders.testEmptyFile()}} (when the text reader returned empty, 
> schema-only batches):
> {noformat}
> ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors 
> from ProjectRecordBatch
> ProjectRecordBatch: Container record count not set
> {noformat}
> Occurs in {{ProjectRecordBatch.handleNullInput()}}: it sets up the schema but 
> does not set the value count to 0.
> h4. Unordered Receiver
> {{TestCsvWithSchema.testMultiFileSchema()}}:
> {noformat}
> ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors 
> from UnorderedReceiverBatch
> UnorderedReceiverBatch: Container record count not set
> {noformat}
> The problem is that {{RecordBatchLoader.load()}} does not set the container 
> record count.
> h4. Streaming Aggregate
> {{TestJsonReader.testSumWithTypeCase()}}:
> {noformat}
> ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors 
> from StreamingAggBatch
> StreamingAggBatch: Container record count not set
> {noformat}
> The problem is that {{StreamingAggBatch.buildSchema()}} does not set the 
> container record count to 0.
> h4. Limit
> {{TestJsonReader.testDrill_1419()}}:
> {noformat}
> ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors 
> from LimitRecordBatch
> LimitRecordBatch: Container record count not set
> {noformat}
> None of the paths in {{LimitRecordBatch.innerNext()}} set the container 
> record count.
> h4. Union All
> {{TestJsonReader.testKvgenWithUnionAll()}}:
> {noformat}
> ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors 
> from UnionAllRecordBatch
> UnionAllRecordBatch: Container record count not set
> {noformat}
> When {{UnionAllRecordBatch}} calls 
> {{VectorAccessibleUtilities.setValueCount()}}, it does not also set the 
> container count.
> h4. Hash Aggregate
> {{TestJsonReader.drill_4479()}}:
> {noformat}
> ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors 
> from HashAggBatch
> HashAggBatch: Container record count not set
> {noformat}
> Problem is that {{HashAggBatch.buildSchema()}} does not set the container 
> record count to 0 for the first, empty batch sent for {{OK_NEW_SCHEMA}}.
> h4. And Many More
> It turns out that most operators fail to set one of the many row count 
> variables somewhere in their code path: maybe in the schema setup path, maybe 
> when building a batch along one of the many paths that operators follow. 
> Further, we have multiple row counts that must be set:
> * Values in each vector ({{setValueCount()}}),
> * Row count in the container ({{setRecordCount()}}), which must be the same 
> as the vector value count.
> * Row count in the operator (batch), which is the (possibly filtered) count 
> of records presented to downstream operators. It must be less than or equal 
> to the container row count (except for an SV4.)
> * The SV2 record count, which is the number of entries in the SV2 and must be 
> the same as the batch row count (and less or equal to the container row 
> count.)
> * The SV2 actual batch record count, which must be the same as the container 
> row count.
> * The SV4 record count, which must be the same as the batch record count. 
> 
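The invariants listed above can be sketched as plain checks. This is a toy model, not Drill's BatchValidator: it only captures the relationships among the counts (vector value counts equal the container count, the batch row count never exceeds it, and the SV2 count equals the batch count).

```java
public class RecordCountInvariantsSketch {
    static void validate(int[] vectorValueCounts, int containerCount,
                         int batchRowCount, int sv2Count) {
        for (int v : vectorValueCounts) {
            // Every vector's value count must match the container record count.
            if (v != containerCount)
                throw new IllegalStateException("vector count " + v
                    + " != container count " + containerCount);
        }
        // The (possibly filtered) batch row count cannot exceed the container's.
        if (batchRowCount > containerCount)
            throw new IllegalStateException("batch count exceeds container count");
        // The SV2 entry count must equal the batch row count.
        if (sv2Count != batchRowCount)
            throw new IllegalStateException("SV2 count != batch count");
    }

    public static void main(String[] args) {
        // A filtered batch: 100 rows in the container, 40 selected by the SV2.
        validate(new int[]{100, 100}, 100, 40, 40);
        System.out.println("valid");
    }
}
```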

[jira] [Updated] (DRILL-7556) Generalize the "Base" storage plugin filter push down mechanism

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7556:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Generalize the "Base" storage plugin filter push down mechanism
> ---
>
> Key: DRILL-7556
> URL: https://issues.apache.org/jira/browse/DRILL-7556
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.18.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.19.0
>
>
> DRILL-7458 adds a Base framework for storage plugins which includes a 
> simplified representation of filters that can be pushed down into Drill. It 
> makes the assumption that plugins can generally only handle filters of the 
> form:
> {code}
> column relop constant
> {code}
> For example, {{`foo` < 10}} or {{`bar` = "Fred"}}. (The code "flips" 
> expressions of the form {{constant relop column}}.)
> [~volodymyr] suggests this is too narrow and suggests two additional cases:
> {code}
> column-expr relop constant
> fn(column) = constant
> {code}
> Examples:
> {code:sql}
> foo + 10 = 20
> substr(bar, 2, 6) = 'Fred'
> {code}
> The first case should be handled by a general expression rewriter: simplify 
> constant expressions:
> {code:sql}
> foo + 10 = 20 --> foo = 10
> {code}
> Then, filter push-down need only handle the simplified expression rather than 
> every push-down mechanism needing to do the simplification.
> For this ticket, we wish to handle the second case: any expression that 
> contains a single column associated with the target table. Provide a new 
> push-down node to handle the non-relop case so that simple plugins can simply 
> ignore such expressions, but more complex plugins (such as Parquet) can 
> optionally handle them.
> A second improvement is to handle the more complex case: two or more columns, 
> all of which come from the same target table. For example:
> {code:sql}
> foo + bar = 20
> {code}
> Where both {{foo}} and {{bar}} are from the same table. It would be a very 
> sophisticated plugin indeed (maybe the JDBC storage plugin) which can handle 
> this case, but it should be available.
> As part of this work, we must handle join-equivalent columns:
> {code:sql}
> SELECT ... FROM t1, t2
>   WHERE t1.a = t2.b
>   AND t1.a = 20
> {code}
> If the plugin for table {{t2}} can handle filter push-down, then the 
> expression {{t1.a = 20}} is join-equivalent to {{t2.b = 20}}.
> It is not clear if the Drill logical plan already handles join equivalence. 
> If not, it should be added. If so, the filter push-down mechanism should add 
> documentation that describes how the mechanism works.
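The constant-simplification rewrite mentioned for the first case ({{foo + 10 = 20 --> foo = 10}}) is, in miniature, just moving the constant across the relop. A toy sketch, not the proposed rewriter, restricted to the `col + c1 relop c2` pattern:

```java
public class FoldConstantSketch {
    // Rewrite `col + addend relop rhs` into `col relop (rhs - addend)`,
    // so push-down only ever sees the `column relop constant` form.
    static String fold(String col, int addend, String relop, int rhs) {
        return col + " " + relop + " " + (rhs - addend);
    }

    public static void main(String[] args) {
        System.out.println(fold("foo", 10, "=", 20)); // foo = 10
    }
}
```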



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7270) Fix non-https dependency urls and add checksum checks

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7270:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Fix non-https dependency urls and add checksum checks
> -
>
> Key: DRILL-7270
> URL: https://issues.apache.org/jira/browse/DRILL-7270
> Project: Apache Drill
>  Issue Type: Task
>  Components: Security
>Affects Versions: 1.16.0
>Reporter: Arina Ielchiieva
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.19.0
>
>
> Review any build scripts and configurations for insecure urls and make 
> appropriate fixes to use secure urls.
> Projects like Lucene do checksum whitelists of all their build dependencies, 
> and you may wish to consider that as a
> protection against threats beyond just MITM.
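The checksum-whitelist idea amounts to verifying each downloaded artifact's digest against a pinned expected value before using it. A minimal sketch using the JDK's MessageDigest (not the actual build-script change):

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HexFormat;

public class ChecksumCheckSketch {
    // Compare the artifact's SHA-256 against a pinned hex digest.
    static boolean verify(byte[] artifact, String expectedSha256) {
        try {
            byte[] digest = MessageDigest.getInstance("SHA-256").digest(artifact);
            return HexFormat.of().formatHex(digest).equals(expectedSha256);
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e); // SHA-256 is always available
        }
    }

    public static void main(String[] args) {
        byte[] data = "drill-artifact".getBytes();
        // In a real build the pinned digest lives in the whitelist file.
        String pinned = HexFormat.of().formatHex(
            MessageDigest.getInstance("SHA-256").digest(data));
        System.out.println(verify(data, pinned)); // true
    }

    // main() declares no throws only because verify() wraps the exception;
    // the pinned-digest line above would need its own try/catch in practice.
    static { }
}
```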



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7284) reusing the hashCodes computed at exchange nodes

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7284:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> reusing the hashCodes computed at exchange nodes
> 
>
> Key: DRILL-7284
> URL: https://issues.apache.org/jira/browse/DRILL-7284
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Weijie Tong
>Assignee: Weijie Tong
>Priority: Major
> Fix For: 1.19.0
>
>
> For HashJoin or HashAggregate, we shuffle the input data at the exchange 
> nodes according to hash codes of the join conditions or group-by keys. The 
> same hash codes are then recomputed at the HashJoin or HashAggregate nodes. 
> We could instead send the hash codes computed at the exchange nodes to the 
> upper nodes, so the HashJoin or HashAggregate nodes would not need to do the 
> hash computation again.
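The idea in miniature (a concept sketch, not Drill internals): the exchange computes the key's hash once and forwards it alongside the row, so the downstream hash aggregate buckets by the forwarded hash without rehashing.

```java
import java.util.HashMap;
import java.util.Map;

public class HashReuseSketch {
    // Count rows per forwarded hash. A real hash aggregate would still
    // compare actual keys within a bucket, since distinct keys can collide.
    static Map<Integer, Integer> countByForwardedHash(String[] keys) {
        Map<Integer, Integer> counts = new HashMap<>();
        for (String k : keys) {
            int hash = k.hashCode();              // computed once, at the "exchange"
            counts.merge(hash, 1, Integer::sum);  // reused downstream, no rehash
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(
            countByForwardedHash(new String[]{"a", "b", "a"}).get("a".hashCode())); // 2
    }
}
```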



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7192) Drill limits rows when autoLimit is disabled

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7192:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Drill limits rows when autoLimit is disabled
> 
>
> Key: DRILL-7192
> URL: https://issues.apache.org/jira/browse/DRILL-7192
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: 1.19.0
>
>
> In DRILL-7048, autoLimit was implemented for JDBC and REST clients.
> *Steps to reproduce the issue:*
>  1. Check that autoLimit is disabled; if not, disable it and restart Drill.
>  2. Submit any query and verify that the row count is correct, for example,
> {code:sql}
> SELECT * FROM cp.`employee.json`;
> {code}
> returns 1,155 rows
>  3. Enable autoLimit for the SqlLine client:
> {code:sql}
> !set rowLimit 10
> {code}
> 4. Submit the same query and verify that the result has 10 rows.
>  5. Disable autoLimit:
> {code:sql}
> !set rowLimit 0
> {code}
> 6. Submit the same query, but this time *it returns 10 rows instead of 
> 1,155*.
> The correct row count is returned only after creating a new connection.
> The same issue is also observed for the SQuirreL SQL client; Postgres, for 
> example, works correctly.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7525) Convert SequenceFiles to EVF

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7525:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Convert SequenceFiles to EVF
> 
>
> Key: DRILL-7525
> URL: https://issues.apache.org/jira/browse/DRILL-7525
> Project: Apache Drill
>  Issue Type: Sub-task
>Affects Versions: 1.17.0
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>Priority: Major
> Fix For: 1.19.0
>
>
> Convert SequenceFiles to EVF



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7133) Duplicate Corrupt PCAP Functionality in PCAP-NG Plugin

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7133:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Duplicate Corrupt PCAP Functionality in PCAP-NG Plugin
> --
>
> Key: DRILL-7133
> URL: https://issues.apache.org/jira/browse/DRILL-7133
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.19.0
>
>
> There was a JIRA (https://issues.apache.org/jira/browse/DRILL-7032) which 
> resulted in some improvements to the PCAP format plugin: it converted the 
> TCP flags to boolean format and also added an {{is_corrupt}} boolean field. 
> This field allows users to look for packets that are corrupt. 
> Unfortunately, this functionality is not duplicated in the PCAP-NG format 
> plugin, so this JIRA proposes to do that.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7621) Refactor ExecConstants and PlannerSettings constant classes

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7621:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Refactor ExecConstants and PlannerSettings constant classes
> ---
>
> Key: DRILL-7621
> URL: https://issues.apache.org/jira/browse/DRILL-7621
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Igor Guzenko
>Assignee: Igor Guzenko
>Priority: Major
> Fix For: 1.19.0
>
>
> According to [the 
> discussion|http://mail-archives.apache.org/mod_mbox/drill-dev/202003.mbox/%3CBCB4CFC2-8BC5-43C6-8BD4-956F66F6D0D3%40gmail.com%3E],
>  it makes sense to split the classes into multiple constant interfaces and 
> get rid of validator constants. Then the validator instances won't be used 
> for getting option values, and the general approach will be to get a 
> type-specific option value by string key from the config instance. 
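The target approach can be sketched as follows. The class and method names are hypothetical (not the actual ExecConstants/PlannerSettings refactor), and the option keys are used purely as example strings; the point is typed lookup by string key rather than going through validator instances.

```java
import java.util.Map;

public class OptionLookupSketch {
    private final Map<String, Object> options;

    OptionLookupSketch(Map<String, Object> options) { this.options = options; }

    // Type-specific accessors keyed by option name, no validator objects.
    long getLong(String key)       { return (Long) options.get(key); }
    boolean getBoolean(String key) { return (Boolean) options.get(key); }

    public static void main(String[] args) {
        OptionLookupSketch opts = new OptionLookupSketch(Map.of(
            "exec.queue.enable", Boolean.TRUE,       // example keys only
            "planner.width.max_per_node", 4L));
        System.out.println(opts.getLong("planner.width.max_per_node")); // 4
        System.out.println(opts.getBoolean("exec.queue.enable"));       // true
    }
}
```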



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7787) Apache drill failed to start

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7787:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Apache drill failed to start
> 
>
> Key: DRILL-7787
> URL: https://issues.apache.org/jira/browse/DRILL-7787
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Om Prasad Surapu
>Priority: Major
> Fix For: 1.19.0
>
>
> Hi Team,
> I have an Apache Drill cluster set up with apache-drill-1.17.0, started in 
> distributed mode (with ZooKeeper). Drill started and no issues were reported.
>  
> I installed apache-drill-1.18.0 to fix DRILL-7786, but Drill failed to start 
> with the exception below. I have tried ZooKeeper versions 3.5.8 and 3.4.11. 
> Could you help me fix this issue?
> Exception in thread "main" 
> org.apache.drill.exec.exception.DrillbitStartupException: Failure during 
> initial startup of Drillbit.
>  at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:588)
>  at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:554)
>  at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:550)
> Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: unable 
> to put 
>  at 
> org.apache.drill.exec.coord.zk.ZookeeperClient.putIfAbsent(ZookeeperClient.java:326)
>  at 
> org.apache.drill.exec.store.sys.store.ZookeeperPersistentStore.putIfAbsent(ZookeeperPersistentStore.java:119)
>  at 
> org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.prepareStores(RemoteFunctionRegistry.java:201)
>  at 
> org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.init(RemoteFunctionRegistry.java:108)
>  at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:233)
>  at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:584)
>  ... 2 more
> Caused by: org.apache.zookeeper.KeeperException$UnimplementedException: 
> KeeperErrorCode = Unimplemented for /drill/udf/registry
>  at org.apache.zookeeper.KeeperException.create(KeeperException.java:106)
>  at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
>  at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:1637)
>  at 
> org.apache.curator.framework.imps.CreateBuilderImpl$17.call(CreateBuilderImpl.java:1180)
>  at 
> org.apache.curator.framework.imps.CreateBuilderImpl$17.call(CreateBuilderImpl.java:1156)
>  at 
> org.apache.curator.connection.StandardConnectionHandlingPolicy.callWithRetry(StandardConnectionHandlingPolicy.java:67)
>  at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:81)
>  at 
> org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:1153)
>  at 
> org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:607)
>  at 
> org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:597)
>  at 
> org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:51)
>  at 
> org.apache.drill.exec.coord.zk.ZookeeperClient.putIfAbsent(ZookeeperClient.java:318)
>  ... 7 more



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7526) Assertion Error when only type is used with schema in table function

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7526:
---
Fix Version/s: (was: 1.18.0)
   1.19.0

> Assertion Error when only type is used with schema in table function
> 
>
> Key: DRILL-7526
> URL: https://issues.apache.org/jira/browse/DRILL-7526
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Arina Ielchiieva
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: 1.19.0
>
>
> {{org.apache.drill.TestSchemaWithTableFunction}}
> {noformat}
>   @Test
>   public void testWithTypeAndSchema() {
> String query = "select Year from 
> table(dfs.`store/text/data/cars.csvh`(type=> 'text', " +
>   "schema=>'inline=(`Year` int)')) where Make = 'Ford'";
> queryBuilder().sql(query).print();
>   }
> {noformat}
> {noformat}
> Caused by: java.lang.AssertionError: BOOLEAN
>   at 
> org.apache.calcite.sql.type.SqlTypeExplicitPrecedenceList.compareTypePrecedence(SqlTypeExplicitPrecedenceList.java:140)
>   at org.apache.calcite.sql.SqlUtil.bestMatch(SqlUtil.java:687)
>   at 
> org.apache.calcite.sql.SqlUtil.filterRoutinesByTypePrecedence(SqlUtil.java:656)
>   at 
> org.apache.calcite.sql.SqlUtil.lookupSubjectRoutines(SqlUtil.java:515)
>   at org.apache.calcite.sql.SqlUtil.lookupRoutine(SqlUtil.java:435)
>   at org.apache.calcite.sql.SqlFunction.deriveType(SqlFunction.java:240)
>   at org.apache.calcite.sql.SqlFunction.deriveType(SqlFunction.java:218)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:5640)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:5627)
>   at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.deriveTypeImpl(SqlValidatorImpl.java:1692)
>   at 
> org.apache.calcite.sql.validate.ProcedureNamespace.validateImpl(ProcedureNamespace.java:53)
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:1009)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:969)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3129)
>   at 
> org.apache.drill.exec.planner.sql.conversion.DrillValidator.validateFrom(DrillValidator.java:63)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3111)
>   at 
> org.apache.drill.exec.planner.sql.conversion.DrillValidator.validateFrom(DrillValidator.java:63)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3383)
>   at 
> org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60)
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:1009)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:969)
>   at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:216)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:944)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:651)
>   at 
> org.apache.drill.exec.planner.sql.conversion.SqlConverter.validate(SqlConverter.java:189)
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode(DefaultSqlHandler.java:648)
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert(DefaultSqlHandler.java:196)
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:170)
>   at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:283)
>   at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan(DrillSqlWorker.java:163)
>   at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:128)
>   at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:93)
>   at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:590)
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:275)
>   ... 1 more
> {noformat}
> Note: when other format options are used or schema is used alone, everything 
> works fine.
> See test examples: 
> 

[jira] [Updated] (DRILL-7551) Improve Error Reporting

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7551?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7551:
---
Target Version/s: 1.19.0

> Improve Error Reporting
> ---
>
> Key: DRILL-7551
> URL: https://issues.apache.org/jira/browse/DRILL-7551
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.17.0
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.18.0
>
>
> This Jira serves as a master issue for improving the usability of error 
> messages. Instead of dumping stack traces, the overall goal is to give the 
> user something that can actually explain:
>  # What went wrong
>  # How to fix it
> Related work should be created as subtasks.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7550) Add Storage Plugin for Cassandra

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7550:
---
Target Version/s: 1.19.0

> Add Storage Plugin for Cassandra
> 
>
> Key: DRILL-7550
> URL: https://issues.apache.org/jira/browse/DRILL-7550
> Project: Apache Drill
>  Issue Type: New Feature
>  Components: Storage - Other
>Affects Versions: 1.18.0
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.18.0
>
>
> Apache Cassandra is a free and open-source, distributed, wide column store, 
> NoSQL database management system designed to handle large amounts of data 
> across many commodity servers, providing high availability with no single 
> point of failure. [1]
> This PR would enable Drill to query Cassandra data stores.
>  
> [1]: https://en.wikipedia.org/wiki/Apache_Cassandra



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7284) reusing the hashCodes computed at exchange nodes

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7284?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7284:
---
Target Version/s: 1.19.0

> reusing the hashCodes computed at exchange nodes
> 
>
> Key: DRILL-7284
> URL: https://issues.apache.org/jira/browse/DRILL-7284
> Project: Apache Drill
>  Issue Type: New Feature
>Reporter: Weijie Tong
>Assignee: Weijie Tong
>Priority: Major
> Fix For: 1.18.0
>
>
> For HashJoin or HashAggregate, we shuffle the input data at the exchange 
> nodes according to hash codes of the join conditions or group-by keys. The 
> same hash codes are then recomputed at the HashJoin or HashAggregate nodes. 
> We could instead send the hash codes computed at the exchange nodes to the 
> upper nodes, so the HashJoin or HashAggregate nodes would not need to do the 
> hash computation again.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7597) Read selected JSON columns as JSON text

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7597?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7597:
---
Target Version/s: 1.19.0

> Read selected JSON columns as JSON text
> --
>
> Key: DRILL-7597
> URL: https://issues.apache.org/jira/browse/DRILL-7597
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.17.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.18.0
>
>
> See DRILL-7598. The use case wishes to read selected JSON columns as JSON 
> text rather than parsing the JSON into a relational structure as is done 
> today in the JSON reader.
> The JSON reader supports "all text mode", but, despite the name, this mode 
> only works for scalars (primitives) such as numbers. It does not work for 
> structured types such as objects or arrays: such types are always parsed into 
> Drill structures (which causes the conflict described in DRILL-7598).
> Instead, we need a feature to read an entire JSON value, including structure, 
> as a JSON string.
> This feature would work best when the user can parse some parts of a JSON 
> input file into a relational structure and read others as JSON. (This is the 
> use case the user on the mailing list faced.) So, we need a way to do that.
> Drill has a "provided schema" feature, which, at present, is used only for 
> text files (and recently with limited support in Avro.) We are working on a 
> project to add such support for JSON.
> Perhaps we can leverage this feature to allow the JSON reader to read chunks 
> of JSON as text which can be manipulated by those future JSON functions. In 
> the example, column "c" would be read as JSON text; Drill would not attempt 
> to parse it into a relational structure.
> As it turns out, the "new" JSON reader we're working on originally had a 
> feature to do just that, but we took it out because we were not sure it was 
> needed. Sounds like we should restore it as part of our "provided schema" 
> support. It could work this way: if you CREATE SCHEMA with column "c" as 
> VARCHAR (maybe with a hint to read as JSON), the JSON parser would read the 
> entire nested structure as JSON without trying to parse it into a relational 
> structure.
> This ticket asks to build the concept:
>  * Allow a `CREATE SCHEMA` option (to be designed) to designate a JSON field 
> to be read as JSON.
>  * Implement the "read column as JSON" feature in the new EVF-based JSON 
> reader.
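A minimal sketch of the intended behavior (a Python stand-in for the EVF-based reader; the "json" schema hint is hypothetical): columns designated in the provided schema are re-serialized to JSON text instead of being parsed into a nested structure.

```python
import json

def read_row(json_line, schema):
    """Parse one JSON record, keeping schema-designated columns as raw JSON text.

    `schema` maps column names to a type string; "json" (a hypothetical hint)
    means: do not flatten; return the value re-serialized as a JSON string.
    """
    record = json.loads(json_line)
    row = {}
    for name, value in record.items():
        if schema.get(name) == "json":
            row[name] = json.dumps(value)  # keep nested structure as text
        else:
            row[name] = value              # normal relational parsing
    return row

row = read_row('{"a": 1, "c": {"x": [1, 2]}}', {"a": "int", "c": "json"})
# row["a"] == 1 and row["c"] == '{"x": [1, 2]}'
```

In the ticket's example, column "c" would come back as a VARCHAR holding JSON text, which later JSON functions could then manipulate.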





[jira] [Updated] (DRILL-7526) Assertion Error when only type is used with schema in table function

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7526?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7526:
---
Target Version/s: 1.19.0

> Assertion Error when only type is used with schema in table function
> 
>
> Key: DRILL-7526
> URL: https://issues.apache.org/jira/browse/DRILL-7526
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Arina Ielchiieva
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: 1.18.0
>
>
> {{org.apache.drill.TestSchemaWithTableFunction}}
> {noformat}
>   @Test
>   public void testWithTypeAndSchema() {
> String query = "select Year from 
> table(dfs.`store/text/data/cars.csvh`(type=> 'text', " +
>   "schema=>'inline=(`Year` int)')) where Make = 'Ford'";
> queryBuilder().sql(query).print();
>   }
> {noformat}
> {noformat}
> Caused by: java.lang.AssertionError: BOOLEAN
>   at 
> org.apache.calcite.sql.type.SqlTypeExplicitPrecedenceList.compareTypePrecedence(SqlTypeExplicitPrecedenceList.java:140)
>   at org.apache.calcite.sql.SqlUtil.bestMatch(SqlUtil.java:687)
>   at 
> org.apache.calcite.sql.SqlUtil.filterRoutinesByTypePrecedence(SqlUtil.java:656)
>   at 
> org.apache.calcite.sql.SqlUtil.lookupSubjectRoutines(SqlUtil.java:515)
>   at org.apache.calcite.sql.SqlUtil.lookupRoutine(SqlUtil.java:435)
>   at org.apache.calcite.sql.SqlFunction.deriveType(SqlFunction.java:240)
>   at org.apache.calcite.sql.SqlFunction.deriveType(SqlFunction.java:218)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:5640)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl$DeriveTypeVisitor.visit(SqlValidatorImpl.java:5627)
>   at org.apache.calcite.sql.SqlCall.accept(SqlCall.java:139)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.deriveTypeImpl(SqlValidatorImpl.java:1692)
>   at 
> org.apache.calcite.sql.validate.ProcedureNamespace.validateImpl(ProcedureNamespace.java:53)
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:1009)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:969)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3129)
>   at 
> org.apache.drill.exec.planner.sql.conversion.DrillValidator.validateFrom(DrillValidator.java:63)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateFrom(SqlValidatorImpl.java:3111)
>   at 
> org.apache.drill.exec.planner.sql.conversion.DrillValidator.validateFrom(DrillValidator.java:63)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateSelect(SqlValidatorImpl.java:3383)
>   at 
> org.apache.calcite.sql.validate.SelectNamespace.validateImpl(SelectNamespace.java:60)
>   at 
> org.apache.calcite.sql.validate.AbstractNamespace.validate(AbstractNamespace.java:84)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateNamespace(SqlValidatorImpl.java:1009)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateQuery(SqlValidatorImpl.java:969)
>   at org.apache.calcite.sql.SqlSelect.validate(SqlSelect.java:216)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validateScopedExpression(SqlValidatorImpl.java:944)
>   at 
> org.apache.calcite.sql.validate.SqlValidatorImpl.validate(SqlValidatorImpl.java:651)
>   at 
> org.apache.drill.exec.planner.sql.conversion.SqlConverter.validate(SqlConverter.java:189)
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateNode(DefaultSqlHandler.java:648)
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.validateAndConvert(DefaultSqlHandler.java:196)
>   at 
> org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:170)
>   at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:283)
>   at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPhysicalPlan(DrillSqlWorker.java:163)
>   at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:128)
>   at 
> org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:93)
>   at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:590)
>   at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:275)
>   ... 1 more
> {noformat}
> Note: when other format options are used or schema is used alone, everything 
> works fine.
> See test examples: 
> 

[jira] [Updated] (DRILL-7787) Apache drill failed to start

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7787:
---
Target Version/s: 1.19.0

> Apache drill failed to start
> 
>
> Key: DRILL-7787
> URL: https://issues.apache.org/jira/browse/DRILL-7787
> Project: Apache Drill
>  Issue Type: Bug
>Reporter: Om Prasad Surapu
>Priority: Major
> Fix For: 1.18.0
>
>
> Hi Team,
> I have an Apache Drill cluster set up with apache-drill-1.17.0, started in 
> distributed mode (with ZooKeeper). Drill started and no issues were reported.
>  
> I have installed apache-drill-1.18.0 to pick up the fix for DRILL-7786, but 
> Drill failed to start with the exception below. I have tried ZooKeeper 
> versions 3.5.8 and 3.4.11. Could you help me fix this issue?
> Exception in thread "main" 
> org.apache.drill.exec.exception.DrillbitStartupException: Failure during 
> initial startup of Drillbit.
>  at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:588)
>  at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:554)
>  at org.apache.drill.exec.server.Drillbit.main(Drillbit.java:550)
> Caused by: org.apache.drill.common.exceptions.DrillRuntimeException: unable 
> to put 
>  at 
> org.apache.drill.exec.coord.zk.ZookeeperClient.putIfAbsent(ZookeeperClient.java:326)
>  at 
> org.apache.drill.exec.store.sys.store.ZookeeperPersistentStore.putIfAbsent(ZookeeperPersistentStore.java:119)
>  at 
> org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.prepareStores(RemoteFunctionRegistry.java:201)
>  at 
> org.apache.drill.exec.expr.fn.registry.RemoteFunctionRegistry.init(RemoteFunctionRegistry.java:108)
>  at org.apache.drill.exec.server.Drillbit.run(Drillbit.java:233)
>  at org.apache.drill.exec.server.Drillbit.start(Drillbit.java:584)
>  ... 2 more
> Caused by: org.apache.zookeeper.KeeperException$UnimplementedException: 
> KeeperErrorCode = Unimplemented for /drill/udf/registry
>  at org.apache.zookeeper.KeeperException.create(KeeperException.java:106)
>  at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
>  at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:1637)
>  at 
> org.apache.curator.framework.imps.CreateBuilderImpl$17.call(CreateBuilderImpl.java:1180)
>  at 
> org.apache.curator.framework.imps.CreateBuilderImpl$17.call(CreateBuilderImpl.java:1156)
>  at 
> org.apache.curator.connection.StandardConnectionHandlingPolicy.callWithRetry(StandardConnectionHandlingPolicy.java:67)
>  at org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:81)
>  at 
> org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:1153)
>  at 
> org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:607)
>  at 
> org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:597)
>  at 
> org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:51)
>  at 
> org.apache.drill.exec.coord.zk.ZookeeperClient.putIfAbsent(ZookeeperClient.java:318)
>  ... 7 more





[jira] [Updated] (DRILL-7192) Drill limits rows when autoLimit is disabled

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7192?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7192:
---
Target Version/s: 1.19.0

> Drill limits rows when autoLimit is disabled
> 
>
> Key: DRILL-7192
> URL: https://issues.apache.org/jira/browse/DRILL-7192
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: 1.18.0
>
>
> In DRILL-7048 was implemented autoLimit for JDBC and rest clients.
> *Steps to reproduce the issue:*
>  1. Check that autoLimit is disabled; if not, disable it and restart Drill.
>  2. Submit any query and verify that the row count is correct. For example,
> {code:sql}
> SELECT * FROM cp.`employee.json`;
> {code}
> returns 1,155 rows
>  3. Enable autoLimit for the SqlLine client:
> {code:sql}
> !set rowLimit 10
> {code}
> 4. Submit the same query and verify that the result has 10 rows.
>  5. Disable autoLimit:
> {code:sql}
> !set rowLimit 0
> {code}
> 6. Submit the same query; this time, *it returns 10 rows instead of 
> 1,155*.
> The correct row count is returned only after creating a new connection.
> The same issue is also observed for the SQuirreL SQL client; for Postgres, 
> for example, it works correctly.





[jira] [Updated] (DRILL-7558) Generalize filter push-down planner phase

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7558?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7558:
---
Target Version/s: 1.19.0

> Generalize filter push-down planner phase
> -
>
> Key: DRILL-7558
> URL: https://issues.apache.org/jira/browse/DRILL-7558
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.18.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.18.0
>
>
> DRILL-7458 provides a base framework for storage plugins, including a 
> simplified filter push-down mechanism. [~volodymyr] notes that it may be 
> *too* simple:
> {quote}
> What about the case when this rule was applied for one filter, but the 
> planner at some point pushed another filter above the scan? For example, if 
> we have such a case:
> {code}
> Filter(a=2)
>   Join(t1.b=t2.b, type=inner)
>     Filter(b=3)
>       Scan(t1)
>     Scan(t2)
> {code}
> Filter b=3 will be pushed into the scan, and the planner will then push 
> filter a=2 down above the scan:
> {code}
> Join(t1.b=t2.b, type=inner)
>   Filter(a=2)
>     Scan(t1, b=3)
>   Scan(t2)
> {code}
> In this case, checking whether the filter was pushed is not enough.
> {quote}
> Drill divides planning into a number of *phases*, each defined by a set of 
> *rules*. Most storage plugins perform filter push-down during the physical 
> planning stage. However, by this point, Drill has already decided on the 
> degree of parallelism: it is too late to use filter push-down to set the 
> degree of parallelism. Yet, if using something like a REST API, we want to 
> use filters to help us shard the query (that is, to set the degree of 
> parallelism.)
>  
> DRILL-7458 performs filter push-down at *logical* planning time to work 
> around the above limitation. (In Drill, there are three different phases that 
> could be considered the logical phase, depending on which planning options 
> are set to control Calcite.)
> [~volodymyr] points out that the logical plan phase may be wrong because 
> it will perform rewrites of the type he cited.
> Thus, we need to research where to insert filter push down. It must come:
> * After rewrites of the kind described above.
> * After join equivalence computations. (See DRILL-7556.)
> * Before the decision is made about the number of minor fragments.
> The goal of this ticket is to either:
> * Research to identify an existing phase which satisfies these requirements, 
> or
> * Create a new phase.
> Due to the way Calcite works, it is not a good idea to have a single phase 
> handle two tasks that depend on one another. That is, we cannot combine 
> filter push-down with the phase that defines the filters, nor can we add 
> filter push-down to the phase that chooses parallelism.
> Background: Calcite is a rule-based query planner inspired by 
> [Volcano|https://paperhub.s3.amazonaws.com/dace52a42c07f7f8348b08dc2b186061.pdf].
> The above issue is a flaw with rule-based planners and was identified as 
> early as the [Cascades query framework 
> paper|https://www.csd.uoc.gr/~hy460/pdf/CascadesFrameworkForQueryOptimization.pdf]
>  which was the follow-up to Volcano.





[jira] [Updated] (DRILL-7712) Fix issues after ZK upgrade

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7712?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7712:
---
Target Version/s: 1.19.0

> Fix issues after ZK upgrade
> ---
>
> Key: DRILL-7712
> URL: https://issues.apache.org/jira/browse/DRILL-7712
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.18.0
>Reporter: Arina Ielchiieva
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: 1.18.0
>
>
> Warnings during jdbc-all build (absent when building with Mapr profile):
> {noformat}
> netty-transport-native-epoll-4.1.45.Final.jar, 
> netty-transport-native-epoll-4.0.48.Final-linux-x86_64.jar define 46 
> overlapping classes: 
>   - io.netty.channel.epoll.AbstractEpollStreamChannel$2
>   - io.netty.channel.epoll.AbstractEpollServerChannel$EpollServerSocketUnsafe
>   - io.netty.channel.epoll.EpollDatagramChannel
>   - io.netty.channel.epoll.AbstractEpollStreamChannel$SpliceInChannelTask
>   - io.netty.channel.epoll.NativeDatagramPacketArray
>   - io.netty.channel.epoll.EpollSocketChannelConfig
>   - io.netty.channel.epoll.EpollTcpInfo
>   - io.netty.channel.epoll.EpollEventArray
>   - io.netty.channel.epoll.EpollEventLoop
>   - io.netty.channel.epoll.EpollSocketChannel
>   - 36 more...
> netty-transport-native-unix-common-4.1.45.Final.jar, 
> netty-transport-native-epoll-4.0.48.Final-linux-x86_64.jar define 15 
> overlapping classes: 
>   - io.netty.channel.unix.Errors$NativeConnectException
>   - io.netty.channel.unix.ServerDomainSocketChannel
>   - io.netty.channel.unix.DomainSocketAddress
>   - io.netty.channel.unix.Socket
>   - io.netty.channel.unix.NativeInetAddress
>   - io.netty.channel.unix.DomainSocketChannelConfig
>   - io.netty.channel.unix.Errors$NativeIoException
>   - io.netty.channel.unix.DomainSocketReadMode
>   - io.netty.channel.unix.ErrorsStaticallyReferencedJniMethods
>   - io.netty.channel.unix.UnixChannel
>   - 5 more...
> maven-shade-plugin has detected that some class files are
> present in two or more JARs. When this happens, only one
> single version of the class is copied to the uber jar.
> Usually this is not harmful and you can skip these warnings,
> otherwise try to manually exclude artifacts based on
> mvn dependency:tree -Ddetail=true and the above output.
> See http://maven.apache.org/plugins/maven-shade-plugin/
> {noformat}
> Additional warning build with Mapr profile:
> {noformat}
> The following patterns were never triggered in this artifact inclusion filter:
> o  'org.apache.zookeeper:zookeeper-jute'
> {noformat}
> NPEs in tests (though tests do not fail):
> {noformat}
> [INFO] Running org.apache.drill.exec.coord.zk.TestZookeeperClient
> java.lang.NullPointerException
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:269)
>   at 
> org.apache.zookeeper.server.ZKDatabase.fastForwardDataBase(ZKDatabase.java:251)
>   at 
> org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:583)
>   at 
> org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:546)
>   at 
> org.apache.zookeeper.server.NIOServerCnxnFactory.shutdown(NIOServerCnxnFactory.java:
> {noformat}
> {noformat}
> [INFO] Running org.apache.drill.exec.coord.zk.TestEphemeralStore
> java.lang.NullPointerException
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:269)
>   at org.apache.zookeepe
> {noformat}
> {noformat}
> [INFO] Running org.apache.drill.yarn.zk.TestAmRegistration
> java.lang.NullPointerException
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:269)
>   at 
> org.apache.zookeeper.server.ZKDatabase.fastForwardDataBase(ZKDatabase.java:251)
>   at 
> org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:583)
>   at 
> org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:546)
>   at 
> org.apache.zookeeper.server.NIOServerCnxnFactory.shutdown(NIOServerCnxnFactory.java:929)
>   at org.apache.curator.t
> {noformat}
> {noformat}
> org.apache.drill.yarn.client.TestCommandLineOptions
> java.lang.NullPointerException
>   at 
> org.apache.zookeeper.server.persistence.FileTxnSnapLog.fastForwardFromEdits(FileTxnSnapLog.java:269)
>   at 
> org.apache.zookeeper.server.ZKDatabase.fastForwardDataBase(ZKDatabase.java:251)
>   at 
> org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:583)
>   at 
> org.apache.zookeeper.server.ZooKeeperServer.shutdown(ZooKeeperServer.java:546)
>   at org.apac
> {noformat}




[jira] [Updated] (DRILL-7531) Convert format plugins to EVF

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7531?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7531:
---
Target Version/s: 1.19.0

> Convert format plugins to EVF
> -
>
> Key: DRILL-7531
> URL: https://issues.apache.org/jira/browse/DRILL-7531
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Arina Ielchiieva
>Priority: Major
> Fix For: 1.18.0
>
>
> This is umbrella Jira to track down process of converting format plugins to 
> EVF.





[jira] [Updated] (DRILL-7556) Generalize the "Base" storage plugin filter push down mechanism

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7556?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7556:
---
Target Version/s: 1.19.0

> Generalize the "Base" storage plugin filter push down mechanism
> ---
>
> Key: DRILL-7556
> URL: https://issues.apache.org/jira/browse/DRILL-7556
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.18.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.18.0
>
>
> DRILL-7458 adds a Base framework for storage plugins which includes a 
> simplified representation of filters that can be pushed down into Drill. It 
> makes the assumption that plugins can generally only handle filters of the 
> form:
> {code}
> column relop constant
> {code}
> For example, {{`foo` < 10}} or {{`bar` = "Fred"}}. (The code "flips" 
> expressions of the form {{constant relop column}}.)
> [~volodymyr] suggests this is too narrow and suggests two additional cases:
> {code}
> column-expr relop constant
> fn(column) = constant
> {code}
> Examples:
> {code:sql}
> foo + 10 = 20
> substr(bar, 2, 6) = 'Fred'
> {code}
> The first case should be handled by a general expression rewriter: simplify 
> constant expressions:
> {code:sql}
> foo + 10 = 20 --> foo = 10
> {code}
> Then filter push-down need only handle the simplified expression, rather 
> than every push-down mechanism needing to do the simplification itself.
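A minimal sketch of that constant-folding rewrite (Python, with a tiny tuple AST; Drill would do this on Calcite expression trees, and the helper name is hypothetical): move the constant term across the comparison so push-down sees a plain `column relop constant`.

```python
def simplify(expr):
    """Rewrite (col + c1) relop c2  ->  col relop (c2 - c1).

    `expr` is a tuple AST (op, left, right); columns are strings and
    constants are numbers. Only the one pattern needed here is handled.
    """
    op, left, right = expr
    if (isinstance(left, tuple) and left[0] == "+"
            and isinstance(left[2], (int, float))
            and isinstance(right, (int, float))):
        _, col, c1 = left
        return (op, col, right - c1)  # fold the constant to the right side
    return expr

# foo + 10 = 20  ->  foo = 10
assert simplify(("=", ("+", "foo", 10), 20)) == ("=", "foo", 10)
```

With the rewrite done centrally, every plugin's push-down logic only ever sees the already-simplified form.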
> For this ticket, we wish to handle the second case: any expression that 
> contains a single column associated with the target table. Provide a new 
> push-down node to handle the non-relop case so that simple plugins can simply 
> ignore such expressions, but more complex plugins (such as Parquet) can 
> optionally handle them.
> A second improvement is to handle the more complex case: two or more columns, 
> all of which come from the same target table. For example:
> {code:sql}
> foo + bar = 20
> {code}
> Where both {{foo}} and {{bar}} are from the same table. It would be a very 
> sophisticated plugin indeed (maybe the JDBC storage plugin) that could handle 
> this case, but the option should be available.
> As part of this work, we must handle join-equivalent columns:
> {code:sql}
> SELECT ... FROM t1, t2
>   WHERE t1.a = t2.b
>   AND t1.a = 20
> {code}
> If the plugin for table {{t2}} can handle filter push-down, then the 
> expression {{t1.a = 20}} is join-equivalent to {{t2.b = 20}}.
> It is not clear if the Drill logical plan already handles join equivalence. 
> If not, it should be added. If so, the filter push-down mechanism should add 
> documentation that describes how the mechanism works.
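A minimal sketch of the join-equivalence substitution (Python, with an illustrative representation of filters and join conditions): given an equi-join pair and a filter on one side, derive the equivalent filter on the other side so it can be pushed to that table's scan.

```python
def derive_equivalent_filters(join_pairs, filters):
    """For each filter (col, op, const), derive (other_col, op, const) for
    every equi-join pair that equates col with other_col."""
    derived = []
    for col, op, const in filters:
        for a, b in join_pairs:
            if col == a:
                derived.append((b, op, const))
            elif col == b:
                derived.append((a, op, const))
    return derived

# t1.a = t2.b and t1.a = 20  ->  t2.b = 20 is pushable to t2's scan
assert derive_equivalent_filters([("t1.a", "t2.b")],
                                 [("t1.a", "=", 20)]) == [("t2.b", "=", 20)]
```

Note this derivation is only sound for inner equi-joins; outer joins would need the null-handling caveats worked out first.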





[jira] [Updated] (DRILL-7671) Fix builds for cdh and hdp profiles

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7671?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7671:
---
Target Version/s: 1.19.0

> Fix builds for cdh and hdp profiles
> ---
>
> Key: DRILL-7671
> URL: https://issues.apache.org/jira/browse/DRILL-7671
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: 1.18.0
>
>
> The cdh and hdp profiles use obsolete versions of Hadoop and other libraries, 
> so attempting to build the project with these profiles fails with compilation 
> errors.





[jira] [Updated] (DRILL-7621) Refactor ExecConstants and PlannerSettings constant classes

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7621?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7621:
---
Target Version/s: 1.19.0

> Refactor ExecConstants and PlannerSettings constant classes
> ---
>
> Key: DRILL-7621
> URL: https://issues.apache.org/jira/browse/DRILL-7621
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Igor Guzenko
>Assignee: Igor Guzenko
>Priority: Major
> Fix For: 1.18.0
>
>
> According to [the 
> discussion|http://mail-archives.apache.org/mod_mbox/drill-dev/202003.mbox/%3CBCB4CFC2-8BC5-43C6-8BD4-956F66F6D0D3%40gmail.com%3E],
>  it makes sense to split the classes into multiple constant interfaces and 
> get rid of validator constants. Then the validator instances won't be used 
> for getting option values; the general approach will be to get a 
> type-specific option value by string key from the config instance.





[jira] [Updated] (DRILL-7270) Fix non-https dependency urls and add checksum checks

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7270:
---
Target Version/s: 1.19.0

> Fix non-https dependency urls and add checksum checks
> -
>
> Key: DRILL-7270
> URL: https://issues.apache.org/jira/browse/DRILL-7270
> Project: Apache Drill
>  Issue Type: Task
>  Components: Security
>Affects Versions: 1.16.0
>Reporter: Arina Ielchiieva
>Assignee: Bohdan Kazydub
>Priority: Major
> Fix For: 1.18.0
>
>
> Review any build scripts and configurations for insecure urls and make 
> appropriate fixes to use secure urls.
> Projects like Lucene keep checksum whitelists of all their build 
> dependencies, and you may wish to consider that as a protection against 
> threats beyond just MITM.





[jira] [Updated] (DRILL-7133) Duplicate Corrupt PCAP Functionality in PCAP-NG Plugin

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7133:
---
Target Version/s: 1.19.0

> Duplicate Corrupt PCAP Functionality in PCAP-NG Plugin
> --
>
> Key: DRILL-7133
> URL: https://issues.apache.org/jira/browse/DRILL-7133
> Project: Apache Drill
>  Issue Type: Improvement
>Reporter: Charles Givre
>Assignee: Charles Givre
>Priority: Major
> Fix For: 1.18.0
>
>
> There was a JIRA (https://issues.apache.org/jira/browse/DRILL-7032) that 
> resulted in some improvements to the PCAP format plugin: it converted the 
> TCP flags to boolean format and also added an {{is_corrupt}} boolean field. 
> This field allows users to look for packets that are corrupt. 
> Unfortunately, this functionality was not duplicated in the PCAP-NG format 
> plugin, so this JIRA proposes to do that.





[jira] [Updated] (DRILL-7366) Improve Null Handling for UDFs with Complex Output

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7366?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7366:
---
Target Version/s: 1.19.0

> Improve Null Handling for UDFs with Complex Output
> --
>
> Key: DRILL-7366
> URL: https://issues.apache.org/jira/browse/DRILL-7366
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.16.0
>Reporter: Charles Givre
>Priority: Major
> Fix For: 1.18.0
>
>
> If a UDF has a complex field (Map or List) as output, Drill does not allow 
> the UDF to have nullable input, which creates additional complexity when 
> writing these kinds of UDFs. 
> I therefore would like to propose that two options be added to the 
> FunctionTemplate for null handling: {{EMPTY_LIST_IF_NULL}} and 
> {{EMPTY_MAP_IF_NULL}}, which would simplify UDF creation. I'm envisioning 
> that if either of these options were selected and the UDF receives any null 
> value as input, the UDF will return either an empty map or list.
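A minimal sketch of the proposed semantics (a Python stand-in; the option names come from the proposal above, while the wrapper itself is hypothetical): a null-handling mode that short-circuits the UDF body and returns an empty complex value when any input is null.

```python
EMPTY_LIST_IF_NULL = "empty_list_if_null"
EMPTY_MAP_IF_NULL = "empty_map_if_null"

def with_null_handling(mode):
    """Wrap a UDF so null inputs yield an empty list/map, instead of
    requiring the UDF body to handle nulls itself."""
    empty = {EMPTY_LIST_IF_NULL: list, EMPTY_MAP_IF_NULL: dict}[mode]
    def wrap(udf):
        def wrapped(*args):
            if any(a is None for a in args):
                return empty()  # short-circuit: never call the body on nulls
            return udf(*args)
        return wrapped
    return wrap

@with_null_handling(EMPTY_MAP_IF_NULL)
def parse_pairs(text):
    # UDF body can assume non-null input
    return dict(kv.split("=") for kv in text.split(","))

# parse_pairs("a=1,b=2") == {"a": "1", "b": "2"}; parse_pairs(None) == {}
```

The appeal is the same as Drill's existing NULL_IF_NULL handling for scalars: the UDF author never writes null checks.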





[jira] [Updated] (DRILL-7325) Many operators do not set container record count

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7325?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7325:
---
Target Version/s: 1.19.0

> Many operators do not set container record count
> 
>
> Key: DRILL-7325
> URL: https://issues.apache.org/jira/browse/DRILL-7325
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.16.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.18.0
>
>
> See DRILL-7324. The following are problems found because some operators fail 
> to set the record count for their containers.
> h4. Scan
> TestComplexTypeReader, on cluster setup, using the PojoRecordReader:
> ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors 
> from ScanBatch
> ScanBatch: Container record count not set
> Reason: ScanBatch never sets the record count of its container (this is a 
> generic issue, not specific to the PojoRecordReader).
> h4. Filter
> {{TestComplexTypeReader.testNonExistentFieldConverting()}}:
> {noformat}
> ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors 
> from FilterRecordBatch
> FilterRecordBatch: Container record count not set
> {noformat}
> h4. Hash Join
> {{TestComplexTypeReader.test_array()}}:
> {noformat}
> ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors 
> from HashJoinBatch
> HashJoinBatch: Container record count not set
> {noformat}
> Occurs on the first batch in which the hash join returns {{OK_NEW_SCHEMA}} 
> with no records.
> h4. Project
> {{TestCsvWithHeaders.testEmptyFile()}} (when the text reader returned empty, 
> schema-only batches):
> {noformat}
> ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors 
> from ProjectRecordBatch
> ProjectRecordBatch: Container record count not set
> {noformat}
> Occurs in {{ProjectRecordBatch.handleNullInput()}}: it sets up the schema but 
> does not set the value count to 0.
> h4. Unordered Receiver
> {{TestCsvWithSchema.testMultiFileSchema()}}:
> {noformat}
> ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors 
> from UnorderedReceiverBatch
> UnorderedReceiverBatch: Container record count not set
> {noformat}
> The problem is that {{RecordBatchLoader.load()}} does not set the container 
> record count.
> h4. Streaming Aggregate
> {{TestJsonReader.testSumWithTypeCase()}}:
> {noformat}
> ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors 
> from StreamingAggBatch
> StreamingAggBatch: Container record count not set
> {noformat}
> The problem is that {{StreamingAggBatch.buildSchema()}} does not set the 
> container record count to 0.
> h4. Limit
> {{TestJsonReader.testDrill_1419()}}:
> {noformat}
> ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors 
> from LimitRecordBatch
> LimitRecordBatch: Container record count not set
> {noformat}
> None of the paths in {{LimitRecordBatch.innerNext()}} set the container 
> record count.
> h4. Union All
> {{TestJsonReader.testKvgenWithUnionAll()}}:
> {noformat}
> ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors 
> from UnionAllRecordBatch
> UnionAllRecordBatch: Container record count not set
> {noformat}
> When {{UnionAllRecordBatch}} calls 
> {{VectorAccessibleUtilities.setValueCount()}}, it did not also set the 
> container count.
> h4. Hash Aggregate
> {{TestJsonReader.drill_4479()}}:
> {noformat}
> ERROR o.a.d.e.p.i.validate.BatchValidator - Found one or more vector errors 
> from HashAggBatch
> HashAggBatch: Container record count not set
> {noformat}
> The problem is that {{HashAggBatch.buildSchema()}} does not set the 
> container record count to 0 for the first, empty batch sent for 
> {{OK_NEW_SCHEMA}}.
> h4. And Many More
> It turns out that most operators fail to set one of the many row count 
> variables somewhere in their code path: maybe in the schema setup path, maybe 
> when building a batch along one of the many paths that operators follow. 
> Further, we have multiple row counts that must be set:
> * Values in each vector ({{setValueCount()}}),
> * Row count in the container ({{setRecordCount()}}), which must be the same 
> as the vector value count.
> * Row count in the operator (batch), which is the (possibly filtered) count 
> of records presented to downstream operators. It must be less than or equal 
> to the container row count (except for an SV4.)
> * The SV2 record count, which is the number of entries in the SV2 and must be 
> the same as the batch row count (and less than or equal to the container row 
> count.)
> * The SV2 actual batch record count, which must be the same as the container 
> row count.
> * The SV4 record count, which must be the same as the batch record count. 
> With an SV4, the batch consists of 
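The invariants in the list above can be captured in a small self-contained sketch. The class below is illustrative only (it is not Drill's actual API); it models the relationships a batch validator would check among the counts:

```java
// Minimal model of the row-count invariants described above.
// BatchCounts is a hypothetical class, not part of Drill's API.
class BatchCounts {
    final int vectorValueCount;   // values in each vector (setValueCount())
    final int containerRowCount;  // container record count (setRecordCount())
    final int batchRowCount;      // (possibly filtered) count seen downstream
    final Integer sv2Count;       // SV2 entry count, or null if no SV2

    BatchCounts(int vectorValueCount, int containerRowCount,
                int batchRowCount, Integer sv2Count) {
        this.vectorValueCount = vectorValueCount;
        this.containerRowCount = containerRowCount;
        this.batchRowCount = batchRowCount;
        this.sv2Count = sv2Count;
    }

    /** Returns true if the counts are mutually consistent. */
    boolean isValid() {
        // Vector value count and container row count must agree.
        if (vectorValueCount != containerRowCount) return false;
        // Batch row count must not exceed the container row count.
        if (batchRowCount > containerRowCount) return false;
        // An SV2's entry count must equal the batch row count.
        if (sv2Count != null && sv2Count != batchRowCount) return false;
        return true;
    }
}
```

Note that an unset container count (the errors reported above) violates the first check even when every vector's value count was set correctly.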

[jira] [Updated] (DRILL-7525) Convert SequenceFiles to EVF

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7525?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7525:
---
Target Version/s: 1.19.0

> Convert SequenceFiles to EVF
> 
>
> Key: DRILL-7525
> URL: https://issues.apache.org/jira/browse/DRILL-7525
> Project: Apache Drill
>  Issue Type: Sub-task
>Affects Versions: 1.17.0
>Reporter: Arina Ielchiieva
>Assignee: Arina Ielchiieva
>Priority: Major
> Fix For: 1.18.0
>
>
> Convert SequenceFiles to EVF



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7557) Revise "Base" storage plugin filter-push down listener with a builder

2020-09-06 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7557?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7557:
---
Target Version/s: 1.19.0

> Revise "Base" storage plugin filter-push down listener with a builder
> --
>
> Key: DRILL-7557
> URL: https://issues.apache.org/jira/browse/DRILL-7557
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.18.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.18.0
>
>
> DRILL-7458 introduces a base framework for storage plugins and includes a 
> simplified mechanism for filter push down. Part of that mechanism includes a 
> "listener", with the bulk of the work done in a single method:
> {code:java}
> Pair> transform(GroupScan groupScan,
>   List> andTerms, Pair DisjunctionFilterSpec> orTerm);
> {code}
> Reviewers correctly pointed out that this method might be a bit too complex.
> The listener pattern pretty much forced the present design. To improve it, 
> we'd want to use a different design; maybe some kind of builder which might:
> * Accept the CNF and DNF terms via dedicated methods.
> * Perform a processing step.
> * Provide a number of methods to communicate the results, such as 1) whether 
> a new group scan is needed, 2) any CNF terms to retain, and 3) any DNF terms 
> to retain.
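A builder along those lines might look like the following sketch. All names and the toy push-down rule here are hypothetical, not the actual DRILL-7557 API; the point is the shape: dedicated term-accepting methods, one processing step, and separate result accessors in place of the single {{transform()}} call:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a filter push-down builder replacing the single
// listener transform() method; none of these names are Drill's.
class FilterPushDownBuilder {
    private final List<String> andTerms = new ArrayList<>();     // CNF terms
    private final List<String> orTerms = new ArrayList<>();      // DNF terms
    private final List<String> retainedAnd = new ArrayList<>();
    private boolean needsNewScan;

    // Accept CNF and DNF terms via dedicated methods.
    FilterPushDownBuilder addAndTerm(String term) { andTerms.add(term); return this; }
    FilterPushDownBuilder addOrTerm(String term)  { orTerms.add(term);  return this; }

    // Perform the processing step: decide what can be pushed down.
    // Toy rule: terms naming a "col" are pushed into the scan; others retained.
    FilterPushDownBuilder process() {
        for (String t : andTerms) {
            if (t.startsWith("col")) {
                needsNewScan = true;   // pushed term implies a new group scan
            } else {
                retainedAnd.add(t);    // Drill must still evaluate this term
            }
        }
        return this;
    }

    // Communicate the results through dedicated accessors.
    boolean needsNewGroupScan()     { return needsNewScan; }
    List<String> retainedAndTerms() { return retainedAnd; }
}
```

Compared with the one-shot {{transform()}}, each result accessor answers a single question, which keeps the caller's logic flat.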



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7785) Some hive tables fail with UndeclaredThrowableException

2020-08-31 Thread Abhishek Girish (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17187936#comment-17187936
 ] 

Abhishek Girish commented on DRILL-7785:


Thanks [~volodymyr] and [~angozhiy]! 

> Some hive tables fail with UndeclaredThrowableException
> ---
>
> Key: DRILL-7785
> URL: https://issues.apache.org/jira/browse/DRILL-7785
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow, Storage - Hive
>Affects Versions: 1.18.0
>Reporter: Abhishek Girish
>Assignee: Vova Vysotskyi
>Priority: Major
>
> Query: 
> {code}
> Functional/hive/hive_storage/fileformats/orc/transactional/orc_table_clustered_bucketed.sql
> select * from hive_orc_transactional.orc_table_clustered_bucketed
> {code}
> Exception:
> {code}
> java.sql.SQLException: EXECUTION_ERROR ERROR: 
> java.lang.reflect.UndeclaredThrowableException
> Failed to setup reader: HiveDefaultRecordReader
> Fragment: 0:0
> [Error Id: 323434cc-7bd2-4551-94d4-a5925f6a66af on drill80:31010]
>   (org.apache.drill.common.exceptions.ExecutionSetupException) 
> java.lang.reflect.UndeclaredThrowableException
> 
> org.apache.drill.common.exceptions.ExecutionSetupException.fromThrowable():30
> 
> org.apache.drill.exec.store.hive.readers.HiveDefaultRecordReader.setup():257
> org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():331
> org.apache.drill.exec.physical.impl.ScanBatch.internalNext():227
> org.apache.drill.exec.physical.impl.ScanBatch.next():298
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():111
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():85
> org.apache.drill.exec.record.AbstractRecordBatch.next():170
> org.apache.drill.exec.physical.impl.BaseRootExec.next():103
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
> org.apache.drill.exec.physical.impl.BaseRootExec.next():93
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():323
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():310
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():422
> org.apache.hadoop.security.UserGroupInformation.doAs():1669
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():310
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748
>   Caused By (java.util.concurrent.ExecutionException) 
> java.lang.reflect.UndeclaredThrowableException
> 
> org.apache.drill.shaded.guava.com.google.common.util.concurrent.AbstractFuture.getDoneValue():553
> 
> org.apache.drill.shaded.guava.com.google.common.util.concurrent.AbstractFuture.get():534
> 
> org.apache.drill.shaded.guava.com.google.common.util.concurrent.FluentFuture$TrustedFuture.get():88
> 
> org.apache.drill.exec.store.hive.readers.HiveDefaultRecordReader.setup():252
> org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():331
> org.apache.drill.exec.physical.impl.ScanBatch.internalNext():227
> org.apache.drill.exec.physical.impl.ScanBatch.next():298
> org.apache.drill.exec.record.AbstractRecordBatch.next():119
> org.apache.drill.exec.record.AbstractRecordBatch.next():111
> org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59
> 
> org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():85
> org.apache.drill.exec.record.AbstractRecordBatch.next():170
> org.apache.drill.exec.physical.impl.BaseRootExec.next():103
> 
> org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
> org.apache.drill.exec.physical.impl.BaseRootExec.next():93
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():323
> org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():310
> java.security.AccessController.doPrivileged():-2
> javax.security.auth.Subject.doAs():422
> org.apache.hadoop.security.UserGroupInformation.doAs():1669
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():310
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1149
> java.util.concurrent.ThreadPoolExecutor$Worker.run():624
> java.lang.Thread.run():748
>   Caused By (java.lang.reflect.UndeclaredThrowableException) null
> org.apache.hadoop.security.UserGroupInformation.doAs():1687
> org.apache.drill.exec.ops.OperatorContextImpl$1.call():101
> 
> 

[jira] [Commented] (DRILL-7785) Some hive tables fail with UndeclaredThrowableException

2020-08-30 Thread Abhishek Girish (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17187410#comment-17187410
 ] 

Abhishek Girish commented on DRILL-7785:


I observed it for a few other Hive test cases, and modified the title accordingly.

> Some hive tables fail with UndeclaredThrowableException
> ---
>
> Key: DRILL-7785
> URL: https://issues.apache.org/jira/browse/DRILL-7785
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow, Storage - Hive
>Affects Versions: 1.18.0
>Reporter: Abhishek Girish
>Assignee: Vova Vysotskyi
>Priority: Major
>

[jira] [Updated] (DRILL-7785) Some hive tables fail with UndeclaredThrowableException

2020-08-30 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7785:
---
Summary: Some hive tables fail with UndeclaredThrowableException  (was: 
Hive Clustered Bucketed ORC transactional table fails with 
UndeclaredThrowableException)

> Some hive tables fail with UndeclaredThrowableException
> ---
>
> Key: DRILL-7785
> URL: https://issues.apache.org/jira/browse/DRILL-7785
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow, Storage - Hive
>Affects Versions: 1.18.0
>Reporter: Abhishek Girish
>Assignee: Vova Vysotskyi
>Priority: Major
>

[jira] [Commented] (DRILL-7785) Hive Clustered Bucketed ORC transactional table fails with UndeclaredThrowableException

2020-08-30 Thread Abhishek Girish (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17187333#comment-17187333
 ] 

Abhishek Girish commented on DRILL-7785:


[~agozhiy], can you please help check whether this is seen on previous releases 
of Drill? That can help determine whether this is a regression in 1.18.0, and 
maybe also help Vova with a repro setup.

> Hive Clustered Bucketed ORC transactional table fails with 
> UndeclaredThrowableException
> ---
>
> Key: DRILL-7785
> URL: https://issues.apache.org/jira/browse/DRILL-7785
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow, Storage - Hive
>Affects Versions: 1.18.0
>Reporter: Abhishek Girish
>Assignee: Vova Vysotskyi
>Priority: Major
>

[jira] [Commented] (DRILL-7785) Hive Clustered Bucketed ORC transactional table fails with UndeclaredThrowableException

2020-08-30 Thread Abhishek Girish (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17187332#comment-17187332
 ] 

Abhishek Girish commented on DRILL-7785:


Hive version: Hive 2.3.3-mapr-1808
MapR version: 6.1.0.20180926230239.GA

Data: Generated using {code}Datasources/hive/execHive.sh 
framework/resources/Datasources/hive_storage/orc/transactional.ddl{code} 
Link: 
https://github.com/mapr/drill-test-framework/blob/master/framework/resources/Datasources/hive_storage/orc/transactional.ddl


> Hive Clustered Bucketed ORC transactional table fails with 
> UndeclaredThrowableException
> ---
>
> Key: DRILL-7785
> URL: https://issues.apache.org/jira/browse/DRILL-7785
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Flow, Storage - Hive
>Affects Versions: 1.18.0
>Reporter: Abhishek Girish
>Assignee: Vova Vysotskyi
>Priority: Major
>

[jira] [Created] (DRILL-7785) Hive Clustered Bucketed ORC transactional table fails with UndeclaredThrowableException

2020-08-30 Thread Abhishek Girish (Jira)
Abhishek Girish created DRILL-7785:
--

 Summary: Hive Clustered Bucketed ORC transactional table fails 
with UndeclaredThrowableException
 Key: DRILL-7785
 URL: https://issues.apache.org/jira/browse/DRILL-7785
 Project: Apache Drill
  Issue Type: Bug
  Components: Execution - Flow, Storage - Hive
Affects Versions: 1.18.0
Reporter: Abhishek Girish
Assignee: Vova Vysotskyi


Query: 
{code}
Functional/hive/hive_storage/fileformats/orc/transactional/orc_table_clustered_bucketed.sql
select * from hive_orc_transactional.orc_table_clustered_bucketed
{code}

[jira] [Commented] (DRILL-7785) Hive Clustered Bucketed ORC transactional table fails with UndeclaredThrowableException

2020-08-30 Thread Abhishek Girish (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17187325#comment-17187325
 ] 

Abhishek Girish commented on DRILL-7785:


Another query:
{code}
Functional/hive/hive_storage/fileformats/orc/transactional/orc_table_partitioned_clustered_bucketed.sql
select * from hive_orc_transactional.orc_table_partitioned_clustered_bucketed

Exception:

java.sql.SQLException: EXECUTION_ERROR ERROR: 
java.lang.reflect.UndeclaredThrowableException

Failed to setup reader: HiveDefaultRecordReader
Fragment: 0:0

[Error Id: 7428d4f1-9968-481a-b0a3-5613d060448d on drill83:31010]

  (org.apache.drill.common.exceptions.ExecutionSetupException) 
java.lang.reflect.UndeclaredThrowableException

org.apache.drill.common.exceptions.ExecutionSetupException.fromThrowable():30
org.apache.drill.exec.store.hive.readers.HiveDefaultRecordReader.setup():257
org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():331
org.apache.drill.exec.physical.impl.ScanBatch.internalNext():227
org.apache.drill.exec.physical.impl.ScanBatch.next():298
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():111
org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():85
org.apache.drill.exec.record.AbstractRecordBatch.next():170
org.apache.drill.exec.physical.impl.BaseRootExec.next():103
org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
org.apache.drill.exec.physical.impl.BaseRootExec.next():93
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():323
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():310
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():422
org.apache.hadoop.security.UserGroupInformation.doAs():1669
org.apache.drill.exec.work.fragment.FragmentExecutor.run():310
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1149
java.util.concurrent.ThreadPoolExecutor$Worker.run():624
java.lang.Thread.run():748
  Caused By (java.util.concurrent.ExecutionException) java.lang.reflect.UndeclaredThrowableException
org.apache.drill.shaded.guava.com.google.common.util.concurrent.AbstractFuture.getDoneValue():553
org.apache.drill.shaded.guava.com.google.common.util.concurrent.AbstractFuture.get():534
org.apache.drill.shaded.guava.com.google.common.util.concurrent.FluentFuture$TrustedFuture.get():88
org.apache.drill.exec.store.hive.readers.HiveDefaultRecordReader.setup():252
org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas():331
org.apache.drill.exec.physical.impl.ScanBatch.internalNext():227
org.apache.drill.exec.physical.impl.ScanBatch.next():298
org.apache.drill.exec.record.AbstractRecordBatch.next():119
org.apache.drill.exec.record.AbstractRecordBatch.next():111
org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext():59
org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext():85
org.apache.drill.exec.record.AbstractRecordBatch.next():170
org.apache.drill.exec.physical.impl.BaseRootExec.next():103
org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext():81
org.apache.drill.exec.physical.impl.BaseRootExec.next():93
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():323
org.apache.drill.exec.work.fragment.FragmentExecutor$1.run():310
java.security.AccessController.doPrivileged():-2
javax.security.auth.Subject.doAs():422
org.apache.hadoop.security.UserGroupInformation.doAs():1669
org.apache.drill.exec.work.fragment.FragmentExecutor.run():310
org.apache.drill.common.SelfCleaningRunnable.run():38
java.util.concurrent.ThreadPoolExecutor.runWorker():1149
java.util.concurrent.ThreadPoolExecutor$Worker.run():624
java.lang.Thread.run():748
  Caused By (java.lang.reflect.UndeclaredThrowableException) null
org.apache.hadoop.security.UserGroupInformation.doAs():1687
org.apache.drill.exec.ops.OperatorContextImpl$1.call():101
org.apache.drill.shaded.guava.com.google.common.util.concurrent.TrustedListenableFutureTask$TrustedFutureInterruptibleTask.runInterruptibly():125
org.apache.drill.shaded.guava.com.google.common.util.concurrent.InterruptibleTask.run():69
org.apache.drill.shaded.guava.com.google.common.util.concurrent.TrustedListenableFutureTask.run():78
java.util.concurrent.ThreadPoolExecutor.runWorker():1149
java.util.concurrent.ThreadPoolExecutor$Worker.run():624
java.lang.Thread.run():748
  Caused By (org.apache.drill.common.exceptions.ExecutionSetupException) Failed to get o.a.hadoop.mapred.RecordReader from Hive InputFormat
{code}


[jira] [Commented] (DRILL-7784) Add Abhishek's GPG Keys

2020-08-24 Thread Abhishek Girish (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17183547#comment-17183547
 ] 

Abhishek Girish commented on DRILL-7784:


- Public GPG Keys signed by Vova
- Uploaded to https://dist.apache.org/repos/dist/release/drill/KEYS by Vova

> Add Abhishek's GPG Keys
> ---
>
> Key: DRILL-7784
> URL: https://issues.apache.org/jira/browse/DRILL-7784
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (DRILL-7784) Add Abhishek's GPG Keys

2020-08-24 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish closed DRILL-7784.
--

> Add Abhishek's GPG Keys
> ---
>
> Key: DRILL-7784
> URL: https://issues.apache.org/jira/browse/DRILL-7784
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
>






[jira] [Resolved] (DRILL-7784) Add Abhishek's GPG Keys

2020-08-24 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish resolved DRILL-7784.

Resolution: Fixed

> Add Abhishek's GPG Keys
> ---
>
> Key: DRILL-7784
> URL: https://issues.apache.org/jira/browse/DRILL-7784
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
>






[jira] [Created] (DRILL-7784) Add Abhishek's GPG Keys

2020-08-24 Thread Abhishek Girish (Jira)
Abhishek Girish created DRILL-7784:
--

 Summary: Add Abhishek's GPG Keys
 Key: DRILL-7784
 URL: https://issues.apache.org/jira/browse/DRILL-7784
 Project: Apache Drill
  Issue Type: Sub-task
Reporter: Abhishek Girish
Assignee: Abhishek Girish








[jira] [Updated] (DRILL-7782) Update the copyright year in NOTICE

2020-08-24 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7782:
---
Parent: DRILL-7783
Issue Type: Sub-task  (was: Task)

> Update the copyright year in NOTICE
> ---
>
> Key: DRILL-7782
> URL: https://issues.apache.org/jira/browse/DRILL-7782
> Project: Apache Drill
>  Issue Type: Sub-task
>  Components: Tools, Build & Test
>Affects Versions: 1.18.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Copyright year in NOTICE.txt file is until 2019, we should update it to 2020.





[jira] [Updated] (DRILL-7783) Apache Drill 1.18.0 Release Activities

2020-08-24 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7783:
---
Description: 
Source - https://github.com/apache/drill/blob/master/docs/dev/Release.md

1.18.0 Dashboard: 
https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12335942

Kanban Board - 
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=185

  was:
Source - https://github.com/parthchandra/drill/wiki/Drill-Release-Process

1.17.0 Dashboard - 
https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12333934

Kanban Board - 
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=185


> Apache Drill 1.18.0 Release Activities
> --
>
> Key: DRILL-7783
> URL: https://issues.apache.org/jira/browse/DRILL-7783
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.18.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
> Fix For: 1.18.0
>
>
> Source - https://github.com/apache/drill/blob/master/docs/dev/Release.md
> 1.18.0 Dashboard: 
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12335942
> Kanban Board - 
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=185





[jira] [Assigned] (DRILL-7783) Apache Drill 1.18.0 Release Activities

2020-08-24 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish reassigned DRILL-7783:
--

Assignee: Abhishek Girish  (was: Vova Vysotskyi)

> Apache Drill 1.18.0 Release Activities
> --
>
> Key: DRILL-7783
> URL: https://issues.apache.org/jira/browse/DRILL-7783
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.17.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
> Fix For: 1.17.0
>
>
> Source - https://github.com/parthchandra/drill/wiki/Drill-Release-Process
> 1.17.0 Dashboard - 
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12333934
> Kanban Board - 
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=185





[jira] [Updated] (DRILL-7783) Apache Drill 1.18.0 Release Activities

2020-08-24 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7783:
---
Fix Version/s: (was: 1.17.0)
   1.18.0

> Apache Drill 1.18.0 Release Activities
> --
>
> Key: DRILL-7783
> URL: https://issues.apache.org/jira/browse/DRILL-7783
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.17.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
> Fix For: 1.18.0
>
>
> Source - https://github.com/parthchandra/drill/wiki/Drill-Release-Process
> 1.17.0 Dashboard - 
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12333934
> Kanban Board - 
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=185





[jira] [Updated] (DRILL-7783) Apache Drill 1.18.0 Release Activities

2020-08-24 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7783:
---
Affects Version/s: (was: 1.17.0)
   1.18.0

> Apache Drill 1.18.0 Release Activities
> --
>
> Key: DRILL-7783
> URL: https://issues.apache.org/jira/browse/DRILL-7783
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.18.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
> Fix For: 1.18.0
>
>
> Source - https://github.com/parthchandra/drill/wiki/Drill-Release-Process
> 1.17.0 Dashboard - 
> https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12333934
> Kanban Board - 
> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=185





[jira] [Created] (DRILL-7783) Apache Drill 1.18.0 Release Activities

2020-08-24 Thread Abhishek Girish (Jira)
Abhishek Girish created DRILL-7783:
--

 Summary: Apache Drill 1.18.0 Release Activities
 Key: DRILL-7783
 URL: https://issues.apache.org/jira/browse/DRILL-7783
 Project: Apache Drill
  Issue Type: Task
Affects Versions: 1.17.0
Reporter: Abhishek Girish
Assignee: Vova Vysotskyi
 Fix For: 1.17.0


Source - https://github.com/parthchandra/drill/wiki/Drill-Release-Process

1.17.0 Dashboard - 
https://issues.apache.org/jira/secure/Dashboard.jspa?selectPageId=12333934

Kanban Board - 
https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=185





[jira] [Closed] (DRILL-7782) Update the copyright year in NOTICE

2020-08-24 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish closed DRILL-7782.
--

> Update the copyright year in NOTICE
> ---
>
> Key: DRILL-7782
> URL: https://issues.apache.org/jira/browse/DRILL-7782
> Project: Apache Drill
>  Issue Type: Task
>  Components: Tools, Build & Test
>Affects Versions: 1.18.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Copyright year in NOTICE.txt file is until 2019, we should update it to 2020.





[jira] [Resolved] (DRILL-7782) Update the copyright year in NOTICE

2020-08-24 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish resolved DRILL-7782.

Resolution: Fixed

> Update the copyright year in NOTICE
> ---
>
> Key: DRILL-7782
> URL: https://issues.apache.org/jira/browse/DRILL-7782
> Project: Apache Drill
>  Issue Type: Task
>  Components: Tools, Build & Test
>Affects Versions: 1.18.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Copyright year in NOTICE.txt file is until 2019, we should update it to 2020.





[jira] [Updated] (DRILL-7782) Update the copyright year in NOTICE

2020-08-24 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7782:
---
Reviewer: Vova Vysotskyi
  Labels: ready-to-commit  (was: )

> Update the copyright year in NOTICE
> ---
>
> Key: DRILL-7782
> URL: https://issues.apache.org/jira/browse/DRILL-7782
> Project: Apache Drill
>  Issue Type: Task
>  Components: Tools, Build & Test
>Affects Versions: 1.18.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Copyright year in NOTICE.txt file is until 2019, we should update it to 2020.





[jira] [Updated] (DRILL-7782) Update the copyright year in NOTICE

2020-08-24 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7782:
---
Summary: Update the copyright year in NOTICE  (was: Update the copyright 
year in NOTICE.txt file)

> Update the copyright year in NOTICE
> ---
>
> Key: DRILL-7782
> URL: https://issues.apache.org/jira/browse/DRILL-7782
> Project: Apache Drill
>  Issue Type: Task
>  Components: Tools, Build & Test
>Affects Versions: 1.18.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
> Fix For: 1.18.0
>
>
> Copyright year in NOTICE.txt file is until 2019, we should update it to 2020.





[jira] [Updated] (DRILL-7782) Update the copyright year in NOTICE.txt file

2020-08-24 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7782:
---
Fix Version/s: (was: 1.16.0)
   1.18.0

> Update the copyright year in NOTICE.txt file
> 
>
> Key: DRILL-7782
> URL: https://issues.apache.org/jira/browse/DRILL-7782
> Project: Apache Drill
>  Issue Type: Task
>  Components: Tools, Build & Test
>Affects Versions: 1.16.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.18.0
>
>
> Copyright year in NOTICE.txt file is until 2018, we should update it to 2019.





[jira] [Updated] (DRILL-7782) Update the copyright year in NOTICE.txt file

2020-08-24 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7782:
---
Description: Copyright year in NOTICE.txt file is until 2019, we should 
update it to 2020.  (was: Copyright year in NOTICE.txt file is until 2018, we 
should update it to 2019.)

> Update the copyright year in NOTICE.txt file
> 
>
> Key: DRILL-7782
> URL: https://issues.apache.org/jira/browse/DRILL-7782
> Project: Apache Drill
>  Issue Type: Task
>  Components: Tools, Build & Test
>Affects Versions: 1.18.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
> Fix For: 1.18.0
>
>
> Copyright year in NOTICE.txt file is until 2019, we should update it to 2020.





[jira] [Updated] (DRILL-7782) Update the copyright year in NOTICE.txt file

2020-08-24 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7782:
---
Affects Version/s: (was: 1.16.0)
   1.18.0

> Update the copyright year in NOTICE.txt file
> 
>
> Key: DRILL-7782
> URL: https://issues.apache.org/jira/browse/DRILL-7782
> Project: Apache Drill
>  Issue Type: Task
>  Components: Tools, Build & Test
>Affects Versions: 1.18.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
> Fix For: 1.18.0
>
>
> Copyright year in NOTICE.txt file is until 2018, we should update it to 2019.





[jira] [Updated] (DRILL-7782) Update the copyright year in NOTICE.txt file

2020-08-24 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7782:
---
Reviewer:   (was: Boaz Ben-Zvi)

> Update the copyright year in NOTICE.txt file
> 
>
> Key: DRILL-7782
> URL: https://issues.apache.org/jira/browse/DRILL-7782
> Project: Apache Drill
>  Issue Type: Task
>  Components: Tools, Build & Test
>Affects Versions: 1.16.0
>Reporter: Abhishek Girish
>Assignee: Sorabh Hamirwasia
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> Copyright year in NOTICE.txt file is until 2018, we should update it to 2019.





[jira] [Assigned] (DRILL-7782) Update the copyright year in NOTICE.txt file

2020-08-24 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish reassigned DRILL-7782:
--

Assignee: Abhishek Girish  (was: Sorabh Hamirwasia)

> Update the copyright year in NOTICE.txt file
> 
>
> Key: DRILL-7782
> URL: https://issues.apache.org/jira/browse/DRILL-7782
> Project: Apache Drill
>  Issue Type: Task
>  Components: Tools, Build & Test
>Affects Versions: 1.16.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.16.0
>
>
> Copyright year in NOTICE.txt file is until 2018, we should update it to 2019.





[jira] [Updated] (DRILL-7782) Update the copyright year in NOTICE.txt file

2020-08-24 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7782:
---
Labels:   (was: ready-to-commit)

> Update the copyright year in NOTICE.txt file
> 
>
> Key: DRILL-7782
> URL: https://issues.apache.org/jira/browse/DRILL-7782
> Project: Apache Drill
>  Issue Type: Task
>  Components: Tools, Build & Test
>Affects Versions: 1.16.0
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
> Fix For: 1.18.0
>
>
> Copyright year in NOTICE.txt file is until 2018, we should update it to 2019.





[jira] [Created] (DRILL-7782) Update the copyright year in NOTICE.txt file

2020-08-24 Thread Abhishek Girish (Jira)
Abhishek Girish created DRILL-7782:
--

 Summary: Update the copyright year in NOTICE.txt file
 Key: DRILL-7782
 URL: https://issues.apache.org/jira/browse/DRILL-7782
 Project: Apache Drill
  Issue Type: Task
  Components: Tools, Build & Test
Affects Versions: 1.16.0
Reporter: Abhishek Girish
Assignee: Sorabh Hamirwasia
 Fix For: 1.16.0


Copyright year in NOTICE.txt file is until 2018, we should update it to 2019.





[jira] [Created] (DRILL-7660) Modify Drill Dockerfiles

2020-03-24 Thread Abhishek Girish (Jira)
Abhishek Girish created DRILL-7660:
--

 Summary: Modify Drill Dockerfiles
 Key: DRILL-7660
 URL: https://issues.apache.org/jira/browse/DRILL-7660
 Project: Apache Drill
  Issue Type: Sub-task
Reporter: Abhishek Girish
Assignee: Abhishek Girish








[jira] [Created] (DRILL-7659) Add support for Helm Charts based deployment on Kubernetes

2020-03-24 Thread Abhishek Girish (Jira)
Abhishek Girish created DRILL-7659:
--

 Summary: Add support for Helm Charts based deployment on Kubernetes
 Key: DRILL-7659
 URL: https://issues.apache.org/jira/browse/DRILL-7659
 Project: Apache Drill
  Issue Type: Sub-task
Reporter: Abhishek Girish
Assignee: Abhishek Girish








[jira] [Updated] (DRILL-6851) Create Drill Operator for Kubernetes

2020-03-24 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-6851?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-6851:
---
Parent: DRILL-6598
Issue Type: Sub-task  (was: Task)

> Create Drill Operator for Kubernetes
> 
>
> Key: DRILL-6851
> URL: https://issues.apache.org/jira/browse/DRILL-6851
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Abhishek Girish
>Assignee: Abhishek Girish
>Priority: Major
>
> This task is to track creating an initial version of the Drill Operator for 
> Kubernetes. I'll shortly update the JIRA on background, details on Operator, 
> and what's planned for the first version. 





[jira] [Commented] (DRILL-7563) Docker & Kubernetes Drill server container

2020-03-24 Thread Abhishek Girish (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066097#comment-17066097
 ] 

Abhishek Girish commented on DRILL-7563:


[~cgivre], once we get my changes into the Drill GitHub repo, I think we can work 
on the docs / tutorials. I also have some material that can help with it. 

> Docker & Kubernetes Drill server container
> --
>
> Key: DRILL-7563
> URL: https://issues.apache.org/jira/browse/DRILL-7563
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.17.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>
> Drill provides two Docker containers:
> * [Build Drill from 
> sources|https://github.com/apache/drill/blob/master/Dockerfile]
> * [Run Drill in interactive embedded 
> mode|https://github.com/apache/drill/blob/master/distribution/Dockerfile]
> User feedback suggests that these are not quite the right solutions to run 
> Drill in a K8s (or OpenShift) cluster. In addition, we need a container to 
> run a Drill server. This ticket summarizes the tasks involved.
> h4. Container Image
> The container image should:
> * Start with the OpenJDK base image with minimal extra packages.
> * Download and install an official Drill release.
> We may then want to provide two derived images:
> The Drillbit image which:
> * Configures Drill for production and as needed in the following steps.
> * Provides entry points for the Drillbit and for Sqlline
> * Exposes Drill's four ports
> * Accept as parameters things like the ZK host IP(s).
> The Sqlline image, meant to be run in interactive mode (like the current 
> embedded image) and which:
> * Accept as parameters the ZK host IP(s).
> Both should be published to the official Drill DockerHub account: 
> https://hub.docker.com/r/apache/drill
> h4. Runtime Environment
> Drill has very few dependencies, but it must have a running ZK.
> * Start a [ZK container|https://hub.docker.com/_/zookeeper/].
> * A place to store logs, which can be in the container by default, stored on 
> the host file system via a volume mount.
> * Access to a data source, which can be configured via a storage plugin 
> stored in ZK.
> * Ensure graceful shutdown integration with the Docker shutdown mechanism.
> h4. Running Drill in Docker
> Users must run at least one Drillbit, and may run more. Users may want to run 
> Sqlline.
> * The Drillbit container requires, at a minimum, the IP address of the ZK 
> instance(s).
> * The Sqlline container requires only the ZK instances, from which it can 
> find the Drillbit.
> Users will want to customize some parts of Drill: at least memory, perhaps any 
> of the other options. Provide a way to pass this information into the 
> container to avoid the need to rebuild the container to change configuration.
> h4. Running Drill in K8s
> The containers should be usable in "plain" Docker. Today, however, many 
> people use K8s to orchestrate Docker. Thus, the Drillbit (but probably not 
> the Sqlline) container should be designed to work with K8s. An example set of 
> K8s YAML files should illustrate:
> * Create a host-mount file system to capture Drill logs and query profiles.
> * Optionally write Drill logs to stdout, to be captured by {{fluentd}} or 
> similar tools.
> * Pass Drill configuration (both HOCON and environment) as config maps.
> * Pass ZK as an environment variable (the value of which would, one presumes, 
> come from some kind of service discovery system.)
> The result is that the user should be able to manually tinker with the YAML 
> files, then use {{kubeadm}} to launch, monitor and stop Drill. The user sets 
> cluster size manually by launching the desired number of Drill pods.
> h4. Helm Chart for Drill
> The next step is to wrap the YAML files in a Helm chart, with parameters 
> exposed for the config options noted above.
> h4. Drill Operator for K8s
>  
> Full K8s integration will require an operator to manage the Drill cluster. 
> K8s operators are often written in Go, though doing so is not necessary. 
> Drill already includes Drill-on-YARN which is, essentially, a "YARN operator." 
> Repurpose this code to work with K8s as the target cluster manager rather 
> than YARN. Reuse the same operations from DoY: configure, start, resize and 
> stop a cluster.
> h4. Security
> The above steps provide an "MVP": minimum viable project - it will run Drill 
> with standard options in the various environments. Users who choose to run 
> Drill in production will likely require additional security settings. Enable 
> SSL? Control ingress? We need to understand what is needed, what Drill 
> offers, and how to enable Drill's security features in a containerized 
> environment.
> h4. Production Deployments
> With Docker and K8s the old maxim "the devil is in the details" is true 

[jira] [Commented] (DRILL-7563) Docker & Kubernetes Drill server container

2020-03-24 Thread Abhishek Girish (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17066086#comment-17066086
 ] 

Abhishek Girish commented on DRILL-7563:


I've added support for auto-scaling and have tested that it works well. Please 
see: https://github.com/Agirish/drill-helm-charts#autoscaling-drill-clusters

I have a script to test this: 
https://github.com/Agirish/drill-helm-charts/blob/master/scripts/runCPULoadTest.sh
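In Helm charts, autoscaling like this is typically driven from chart values. A 
minimal sketch of the kind of values override one might pass is below; the key 
names are illustrative assumptions, not necessarily those used by the linked 
charts:

```yaml
# Hypothetical values.yaml override; key names are assumptions for illustration.
drill:
  count: 2            # initial number of Drillbit pods
  autoscale:
    enabled: true     # render a HorizontalPodAutoscaler for the Drillbit pods
    maxCount: 6       # upper bound on Drillbit pods
    cpuThreshold: 75  # target CPU utilization (%) that triggers scale-out
```

Applied with something like `helm install drill ./drill -f values-override.yaml`.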

> Docker & Kubernetes Drill server container
> --
>
> Key: DRILL-7563
> URL: https://issues.apache.org/jira/browse/DRILL-7563
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.17.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>
> Drill provides two Docker containers:
> * [Build Drill from 
> sources|https://github.com/apache/drill/blob/master/Dockerfile]
> * [Run Drill in interactive embedded 
> mode|https://github.com/apache/drill/blob/master/distribution/Dockerfile]
> User feedback suggests that these are not quite the right solutions to run 
> Drill in a K8s (or OpenShift) cluster. In addition, we need a container to 
> run a Drill server. This ticket summarizes the tasks involved.
> h4. Container Image
> The container image should:
> * Start with the OpenJDK base image with minimal extra packages.
> * Download and install an official Drill release.
> We may then want to provide two derived images:
> The Drillbit image which:
> * Configures Drill for production and as needed in the following steps.
> * Provides entry points for the Drillbit and for Sqlline
> * Exposes Drill's four ports
> * Accept as parameters things like the ZK host IP(s).
> The Sqlline image, meant to be run in interactive mode (like the current 
> embedded image) and which:
> * Accept as parameters the ZK host IP(s).
> Both should be published to the official Drill DockerHub account: 
> https://hub.docker.com/r/apache/drill
> h4. Runtime Environment
> Drill has very few dependencies, but it must have a running ZK.
> * Start a [ZK container|https://hub.docker.com/_/zookeeper/].
> * A place to store logs, which can be in the container by default, stored on 
> the host file system via a volume mount.
> * Access to a data source, which can be configured via a storage plugin 
> stored in ZK.
> * Ensure graceful shutdown integration with the Docker shutdown mechanism.
> h4. Running Drill in Docker
> Users must run at least one Drillbit, and may run more. Users may want to run 
> Sqlline.
> * The Drillbit container requires, at a minimum, the IP address of the ZK 
> instance(s).
> * The Sqlline container requires only the ZK instances, from which it can 
> find the Drillbit.
> Users will want to customize some parts of Drill: at least memory, perhaps any 
> of the other options. Provide a way to pass this information into the 
> container to avoid the need to rebuild the container to change configuration.
> h4. Running Drill in K8s
> The containers should be usable in "plain" Docker. Today, however, many 
> people use K8s to orchestrate Docker. Thus, the Drillbit (but probably not 
> the Sqlline) container should be designed to work with K8s. An example set of 
> K8s YAML files should illustrate:
> * Create a host-mount file system to capture Drill logs and query profiles.
> * Optionally write Drill logs to stdout, to be captured by {{fluentd}} or 
> similar tools.
> * Pass Drill configuration (both HOCON and environment) as config maps.
> * Pass ZK as an environment variable (the value of which would, one presumes, 
> come from some kind of service discovery system.)
> The result is that the user should be able to manually tinker with the YAML 
> files, then use {{kubeadm}} to launch, monitor and stop Drill. The user sets 
> cluster size manually by launching the desired number of Drill pods.
> h4. Helm Chart for Drill
> The next step is to wrap the YAML files in a Helm chart, with parameters 
> exposed for the config options noted above.
> h4. Drill Operator for K8s
>  
> Full K8s integration will require an operator to manage the Drill cluster. 
> K8s operators are often written in Go, though doing so is not necessary. 
> Drill already includes Drill-on-YARN which is, essentially, a "YARN operator." 
> Repurpose this code to work with K8s as the target cluster manager rather 
> than YARN. Reuse the same operations from DoY: configure, start, resize and 
> stop a cluster.
> h4. Security
> The above steps provide an "MVP": minimum viable project - it will run Drill 
> with standard options in the various environments. Users who choose to run 
> Drill in production will likely require additional security settings. Enable 
> SSL? Control ingress? We need to understand what is needed, what Drill 
> offers, and how to enable Drill's security features in a containerized 

[jira] [Commented] (DRILL-7563) Docker & Kubernetes Drill server container

2020-03-20 Thread Abhishek Girish (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17063721#comment-17063721
 ] 

Abhishek Girish commented on DRILL-7563:


Update: I've added support for overriding Drill configurations (drill-env.sh 
and drill-override.conf) via conf files uploaded to configMaps. Details in the 
repo.

> Docker & Kubernetes Drill server container
> --
>
> Key: DRILL-7563
> URL: https://issues.apache.org/jira/browse/DRILL-7563
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.17.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>
> Drill provides two Docker containers:
> * [Build Drill from 
> sources|https://github.com/apache/drill/blob/master/Dockerfile]
> * [Run Drill in interactive embedded 
> mode|https://github.com/apache/drill/blob/master/distribution/Dockerfile]
> User feedback suggests that these are not quite the right solutions to run 
> Drill in a K8s (or OpenShift) cluster. In addition, we need a container to 
> run a Drill server. This ticket summarizes the tasks involved.
> h4. Container Image
> The container image should:
> * Start with the OpenJDK base image with minimal extra packages.
> * Download and install an official Drill release.
> We may then want to provide two derived images:
> The Drillbit image which:
> * Configures Drill for production and as needed in the following steps.
> * Provides entry points for the Drillbit and for Sqlline.
> * Exposes Drill's four ports.
> * Accepts as parameters things like the ZK host IP(s).
> The Sqlline image, meant to be run in interactive mode (like the current 
> embedded image) and which:
> * Accepts as parameters the ZK host IP(s).
> Both should be published to the official Drill DockerHub account: 
> https://hub.docker.com/r/apache/drill
> h4. Runtime Environment
> Drill has very few dependencies, but it must have a running ZK.
> * Start a [ZK container|https://hub.docker.com/_/zookeeper/].
> * A place to store logs, which can live in the container by default or be 
> stored on the host file system via a volume mount.
> * Access to a data source, which can be configured via a storage plugin 
> stored in ZK.
> * Ensure graceful shutdown integration with the Docker shutdown mechanism.
> h4. Running Drill in Docker
> Users must run at least one Drillbit, and may run more. Users may want to run 
> Sqlline.
> * The Drillbit container requires, at a minimum, the IP address of the ZK 
> instance(s).
> * The Sqlline container requires only the ZK instances, from which it can 
> find the Drillbit.
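The two containers described above can be exercised with plain Docker. A minimal sketch follows; the network and container names (drill-net, zk, drillbit) and the in-image script paths are assumptions for illustration, and apache/drill is the DockerHub repository mentioned below. With DRY_RUN left at its default of 1 the script only prints the commands, so the flow can be reviewed without a Docker daemon:

```shell
# Sketch only: names and in-image paths are assumptions, not official images.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi
}

run docker network create drill-net
# ZK is Drill's one hard dependency.
run docker run -d --name zk --network drill-net zookeeper:3.5.8
# Drillbit container: needs the ZK address; expose the web UI and user ports.
run docker run -d --name drillbit --network drill-net -p 8047:8047 -p 31010:31010 apache/drill /opt/drill/bin/drillbit.sh run
# Sqlline container: interactive, finds the Drillbit via ZK.
run docker run -it --network drill-net apache/drill /opt/drill/bin/sqlline -u jdbc:drill:zk=zk:2181
```

Setting DRY_RUN=0 would execute the commands instead of printing them.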
> Users will want to customize some parts of Drill: at least memory, perhaps any 
> of the other options. Provide a way to pass this information into the 
> container to avoid the need to rebuild the container to change configuration.
> h4. Running Drill in K8s
> The containers should be usable in "plain" Docker. Today, however, many 
> people use K8s to orchestrate Docker. Thus, the Drillbit (but probably not 
> the Sqlline) container should be designed to work with K8s. An example set of 
> K8s YAML files should illustrate:
> * Create a host-mount file system to capture Drill logs and query profiles.
> * Optionally write Drill logs to stdout, to be captured by {{fluentd}} or 
> similar tools.
> * Pass Drill configuration (both HOCON and environment) as config maps.
> * Pass ZK as an environment variable (the value of which would, one presumes, 
> come from some kind of service discovery system.)
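The config-map and environment-variable items above can be sketched as a manifest. All resource and key names here (drill-config, the zk-0.zk-hs host) are illustrative assumptions, not an official Drill chart; the two HOCON keys themselves (drill.exec.cluster-id, drill.exec.zk.connect) are real drill-override.conf settings:

```shell
# Write a sketch ConfigMap manifest; resource names are hypothetical.
cat > drill-config.yaml <<'EOF'
apiVersion: v1
kind: ConfigMap
metadata:
  name: drill-config
data:
  drill-override.conf: |
    drill.exec.cluster-id: "drillbits1"
    drill.exec.zk.connect: "zk-0.zk-hs:2181"
EOF
# A Drillbit pod spec would mount this ConfigMap as /opt/drill/conf files and
# take the ZK quorum from an env var filled in by the service-discovery layer.
grep 'kind:' drill-config.yaml
```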
> The result is that the user should be able to manually tinker with the YAML 
> files, then use {{kubectl}} to launch, monitor and stop Drill. The user sets 
> cluster size manually by launching the desired number of Drill pods.
> h4. Helm Chart for Drill
> The next step is to wrap the YAML files in a Helm chart, with parameters 
> exposed for the config options noted above.
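A Helm chart wrapping those YAML files would expose the noted options as values. The key names below are assumptions for illustration, not the schema of an actual published Drill chart:

```shell
# Sketch of a hypothetical values.yaml for a Drill chart.
cat > values.yaml <<'EOF'
drill:
  count: 2  # number of Drillbit pods
  memory:
    heap: 4G
    directMemory: 8G
  zk:
    connect: zk-0.zk-hs:2181
image:
  repository: apache/drill
  tag: latest
EOF
# Both memory settings use the nG shorthand.
grep -c 'G$' values.yaml
```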
> h4. Drill Operator for K8s
>  
> Full K8s integration will require an operator to manage the Drill cluster. 
> K8s operators are often written in Go, though doing so is not necessary. 
> Drill already includes Drill-on-YARN which is, essentially, a "YARN operator." 
> Repurpose this code to work with K8s as the target cluster manager rather 
> than YARN. Reuse the same operations from DoY: configure, start, resize and 
> stop a cluster.
> h4. Security
> The above steps provide an "MVP" (minimum viable product): it will run Drill 
> with standard options in the various environments. Users who choose to run 
> Drill in production will likely require additional security settings. Enable 
> SSL? Control ingress? We need to understand what is needed, what Drill 
> offers, and how to enable Drill's security features in a containerized 
> environment.
> h4. Production Deployments
> With Docker and K8s the old maxim "the devil is in the details" 

[jira] [Commented] (DRILL-7563) Docker & Kubernetes Drill server container

2020-03-20 Thread Abhishek Girish (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7563?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17063608#comment-17063608
 ] 

Abhishek Girish commented on DRILL-7563:


Hey [~Paul.Rogers], [~arina], and all, sorry I missed this JIRA + comments in 
it.

I would like to share this GitHub repository of mine: 
https://github.com/Agirish/drill-helm-charts . Kindly take a look when you have 
some time.

It has support for deploying Apache Drill clusters on Kubernetes using the Helm 
Charts approach. It's functional out of the box (I've deployed on a standard GKE 
Kubernetes cluster) and I'm frequently adding fixes, improvements and new 
features. There are a few things missing - such as support for passing Drill 
configuration files as ConfigMaps, which I'm working on (though the most common 
configs are already exposed for customization). 
The documentation has all the basic details on repo structure & usage, and I'm 
working on adding more information to it. The deployment requires custom 
Dockerfiles, and I have included those in the repo as well. 

I also have an alternate implementation: a new native Kubernetes Operator for 
Apache Drill, which can provide more flexibility and power. But for now, I'd 
like to focus on getting the above Helm Charts approach fully feature complete, 
reviewed and committed, so that we can ship it in an upcoming release.

Please feel free to submit corrections / enhancements as PRs (or file GitHub 
issues with comments) to the repo above until this is part of the official 
Drill repo: https://github.com/Agirish/drill-helm-charts/issues

> Docker & Kubernetes Drill server container
> --
>
> Key: DRILL-7563
> URL: https://issues.apache.org/jira/browse/DRILL-7563
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.17.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
>

[jira] [Closed] (DRILL-6813) Inner-Join between two tar.gz files throwing error- One or more nodes ran out of memory while executing the query

2020-03-20 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-6813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish closed DRILL-6813.
--

> Inner-Join between two tar.gz files throwing error- One or more nodes ran out 
> of memory while executing the query
> -
>
> Key: DRILL-6813
> URL: https://issues.apache.org/jira/browse/DRILL-6813
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.14.0
>Reporter: Ashish Pancholi
>Assignee: Abhishek Girish
>Priority: Major
>
> I am using Apache Drill version 1.14 on a Windows system with 8 GB of RAM, 
> and running Drill using the command:
> {code:java}
> sqlline.bat -u "jdbc:drill:zk=local"{code}
> I am trying to execute a join query on two compressed and archived 
> CSV files.
> Query:
> {code:java}
> SELECT * FROM 
> dfs.`C:\Users\admin\Desktop\DRILL_FILES\csvFileParquet\TBL_MOREDATA-20180924181406.tar.gz`
>  AS Table0 INNER JOIN 
> dfs.`C:\Users\admin\Desktop\DRILL_FILES\csvFileParquet\TBL_MOREDATA1-20180924181406.tar.gz`
>  AS Table1 ON Table0.columns[0]=Table1.columns[0]{code}
> But an out-of-memory error occurred:
> {code:java}
> org.apache.drill.common.exceptions.UserRemoteException: RESOURCE ERROR: One 
> or more nodes ran out of memory while executing the query. Unable to allocate 
> buffer of size 131072 (rounded from 86104) due to memory limit (630194176). 
> Current allocation: 630108434 Fragment 0:0 [Error Id: 
> 585c0644-5fd5-446e-b9b3-d48e0771eb2a on DESKTOP-SM3E3KM:31010]{code}
> To resolve the issue, I tried to update the config\drill-env.sh file, but the 
> issue remains the same. It looks like my changes to the script are not taking 
> effect: even though I set the DIRECT MEMORY beyond the system memory (RAM), 
> Drill starts up peacefully every time, without complaining that the memory 
> limit is exceeded.
> {code:java}
> export DRILLBIT_MAX_PROC_MEM=12G
> export DRILL_HEAP=2G
> export DRILL_MAX_DIRECT_MEMORY=10G{code}
> whereas my system's main memory is only 8 GB.
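The behavior described is actually consistent with the settings taking effect: the JVM treats max direct memory as a cap, not an upfront reservation, so a 10G cap on an 8 GB machine can start peacefully and fail only once allocations exceed what the OS can supply. A quick sanity check on the figures from this report (values in GB; the suggested split at the end is an assumption, not official tuning guidance):

```shell
# Settings from the report vs. physical RAM, in GB.
ram=8; heap=2; direct=10
if [ $((heap + direct)) -gt "$ram" ]; then
  echo "oversubscribed: ${heap}G heap + ${direct}G direct > ${ram}G RAM"
else
  echo "fits"
fi
# A workable split on an 8 GB machine would be roughly 2G heap + 4G direct,
# leaving headroom for the OS.
```

This prints "oversubscribed: 2G heap + 10G direct > 8G RAM" for the reported settings.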
> Please help me to resolve the out of memory error. I had even run the below 
> queries, in order to follow the troubleshooting instructions but the issue 
> remains the same.
> {code:java}
> alter session set `planner.enable_hashagg` = false; 
> alter session set `planner.enable_hashjoin` = false;
> alter session set planner.width.max_per_node=3; 
> alter system set planner.width.max_per_query = 100;{code}
>  
>  
>  
>  
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Resolved] (DRILL-6813) Inner-Join between two tar.gz files throwing error- One or more nodes ran out of memory while executing the query

2020-03-20 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-6813?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish resolved DRILL-6813.

Resolution: Information Provided

> Inner-Join between two tar.gz files throwing error- One or more nodes ran out 
> of memory while executing the query
> -
>
> Key: DRILL-6813
> URL: https://issues.apache.org/jira/browse/DRILL-6813
> Project: Apache Drill
>  Issue Type: Bug
>Affects Versions: 1.14.0
>Reporter: Ashish Pancholi
>Assignee: Abhishek Girish
>Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7496) Update Dockerfile to publish release images under Apache Docker Hub

2019-12-26 Thread Abhishek Girish (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17003726#comment-17003726
 ] 

Abhishek Girish commented on DRILL-7496:


Hey Vova,

I've played around with this. I think there was an issue with getting it to 
work because of our Dockerfile. Let me revisit that and share details over here.

> Update Dockerfile to publish release images under Apache Docker Hub
> ---
>
> Key: DRILL-7496
> URL: https://issues.apache.org/jira/browse/DRILL-7496
> Project: Apache Drill
>  Issue Type: Task
>Affects Versions: 1.17.0
>Reporter: Vova Vysotskyi
>Assignee: Vova Vysotskyi
>Priority: Major
> Fix For: 1.18.0
>
>
> Update the Dockerfile to use the built release archive by default for images, 
> and publish images under the Apache Docker Hub account.
> Once the Dockerfile is updated, we should ask INFRA to add a hook that 
> publishes release images automatically once a release tag is created. This 
> Jira may serve as an example: https://issues.apache.org/jira/browse/INFRA-13838.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Closed] (DRILL-7405) Build fails due to inaccessible apache-drill on S3 storage

2019-10-21 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish closed DRILL-7405.
--

> Build fails due to inaccessible apache-drill on S3 storage
> --
>
> Key: DRILL-7405
> URL: https://issues.apache.org/jira/browse/DRILL-7405
> Project: Apache Drill
>  Issue Type: Task
>  Components: Tools, Build  Test
>Affects Versions: 1.16.0
>Reporter: Boaz Ben-Zvi
>Assignee: Abhishek Girish
>Priority: Critical
>  Labels: ready-to-commit
> Fix For: 1.17.0
>
>
>   A new clean build (e.g. after deleting the ~/.m2 local repository) would 
> fail now due to:  
> Access denied to: 
> http://apache-drill.s3.amazonaws.com
>  
> (e.g., for the test data  sf-0.01_tpc-h_parquet_typed.tgz )
> A new publicly available storage place is needed, plus appropriate changes in 
> Drill to get to these resources.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Comment Edited] (DRILL-7405) Build fails due to inaccessible apache-drill on S3 storage

2019-10-17 Thread Abhishek Girish (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954179#comment-16954179
 ] 

Abhishek Girish edited comment on DRILL-7405 at 10/18/19 1:12 AM:
--

Switching priority to Critical - as the S3 link will only be available for a 
short period. 

I have a PR [1] open - it moves the files to GitHub as they are just a few MB 
in size. [~shamirwasia]/[~sorabh] can you please take a look?

[1] https://github.com/apache/drill/pull/1874 


was (Author: agirish):
Switching priority to Critical - as the S3 link will only be available for a 
short period. 

I have a PR open - it moves the files to GitHub as they are just a few MB in 
size.

https://github.com/apache/drill/pull/1874

[~shamirwasia]/[~sorabh] can you please take a look?

> Build fails due to inaccessible apache-drill on S3 storage
> --
>
> Key: DRILL-7405
> URL: https://issues.apache.org/jira/browse/DRILL-7405
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build  Test
>Affects Versions: 1.16.0
>Reporter: Boaz Ben-Zvi
>Assignee: Abhishek Girish
>Priority: Blocker
> Fix For: 1.17.0
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7405) Build fails due to inaccessible apache-drill on S3 storage

2019-10-17 Thread Abhishek Girish (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954179#comment-16954179
 ] 

Abhishek Girish commented on DRILL-7405:


Switching priority to Critical - as the S3 link will only be available for a 
short period. 

I have a PR open - it moves the files to GitHub as they are just a few MB in 
size.

https://github.com/apache/drill/pull/1874

[~shamirwasia]/[~sorabh] can you please take a look?

> Build fails due to inaccessible apache-drill on S3 storage
> --
>
> Key: DRILL-7405
> URL: https://issues.apache.org/jira/browse/DRILL-7405
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build  Test
>Affects Versions: 1.16.0
>Reporter: Boaz Ben-Zvi
>Assignee: Abhishek Girish
>Priority: Blocker
> Fix For: 1.17.0
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7405) Build fails due to inaccessible apache-drill on S3 storage

2019-10-17 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7405:
---
Priority: Critical  (was: Blocker)

> Build fails due to inaccessible apache-drill on S3 storage
> --
>
> Key: DRILL-7405
> URL: https://issues.apache.org/jira/browse/DRILL-7405
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build  Test
>Affects Versions: 1.16.0
>Reporter: Boaz Ben-Zvi
>Assignee: Abhishek Girish
>Priority: Critical
> Fix For: 1.17.0
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7405) Build fails due to inaccessible apache-drill on S3 storage

2019-10-17 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7405:
---
Reviewer: Sorabh Hamirwasia
Priority: Blocker  (was: Minor)

> Build fails due to inaccessible apache-drill on S3 storage
> --
>
> Key: DRILL-7405
> URL: https://issues.apache.org/jira/browse/DRILL-7405
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build  Test
>Affects Versions: 1.16.0
>Reporter: Boaz Ben-Zvi
>Assignee: Abhishek Girish
>Priority: Blocker
> Fix For: 1.17.0
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7405) Build fails due to inaccessible apache-drill on S3 storage

2019-10-16 Thread Abhishek Girish (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16953225#comment-16953225
 ] 

Abhishek Girish commented on DRILL-7405:


Changing priority as it's no longer blocking builds. Will keep the JIRA open to 
find a long-term solution.

> Build fails due to inaccessible apache-drill on S3 storage
> --
>
> Key: DRILL-7405
> URL: https://issues.apache.org/jira/browse/DRILL-7405
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build  Test
>Affects Versions: 1.16.0
>Reporter: Boaz Ben-Zvi
>Assignee: Abhishek Girish
>Priority: Minor
> Fix For: 1.17.0
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7405) Build fails due to inaccessible apache-drill on S3 storage

2019-10-16 Thread Abhishek Girish (Jira)


 [ 
https://issues.apache.org/jira/browse/DRILL-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7405:
---
Priority: Minor  (was: Blocker)

> Build fails due to inaccessible apache-drill on S3 storage
> --
>
> Key: DRILL-7405
> URL: https://issues.apache.org/jira/browse/DRILL-7405
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build  Test
>Affects Versions: 1.16.0
>Reporter: Boaz Ben-Zvi
>Assignee: Abhishek Girish
>Priority: Minor
> Fix For: 1.17.0
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Commented] (DRILL-7405) Build fails due to inaccessible apache-drill on S3 storage

2019-10-16 Thread Abhishek Girish (Jira)


[ 
https://issues.apache.org/jira/browse/DRILL-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16953220#comment-16953220
 ] 

Abhishek Girish commented on DRILL-7405:


Hey [~ben-zvi], this works for me. I am able to download the below file:

http://apache-drill.s3.amazonaws.com/files/sf-0.01_tpc-h_parquet_typed.tgz

Could you please try again?

> Build fails due to inaccessible apache-drill on S3 storage
> --
>
> Key: DRILL-7405
> URL: https://issues.apache.org/jira/browse/DRILL-7405
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Tools, Build  Test
>Affects Versions: 1.16.0
>Reporter: Boaz Ben-Zvi
>Assignee: Abhishek Girish
>Priority: Blocker
> Fix For: 1.17.0
>
>



--
This message was sent by Atlassian Jira
(v8.3.4#803005)


[jira] [Updated] (DRILL-7166) Count(*) queries with wildcards in table name are reading metadata cache and returning wrong results

2019-04-11 Thread Abhishek Girish (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-7166?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Abhishek Girish updated DRILL-7166:
---
Summary: Count(*) queries with wildcards in table name are reading metadata 
cache and returning wrong results  (was: Tests doing count(* ) with wildcards 
in table name are querying metadata cache and returning wrong results)

> Count(*) queries with wildcards in table name are reading metadata cache and 
> returning wrong results
> 
>
> Key: DRILL-7166
> URL: https://issues.apache.org/jira/browse/DRILL-7166
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.16.0
>Reporter: Abhishek Girish
>Assignee: Venkata Jyothsna Donapati
>Priority: Blocker
> Fix For: 1.16.0
>
>
> Tests:
> {code}
> Functional/metadata_caching/data/drill4376_1.q
> Functional/metadata_caching/data/drill4376_2.q
> Functional/metadata_caching/data/drill4376_3.q
> Functional/metadata_caching/data/drill4376_4.q
> Functional/metadata_caching/data/drill4376_5.q
> Functional/metadata_caching/data/drill4376_6.q
> Functional/metadata_caching/data/drill4376_8.q
> {code}
> Example pattern of queries:
> {code}
> select count(*) from `lineitem_hierarchical_intint/*8*/3*`;
> {code}
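The table-name wildcards in these queries are directory globs over the partition layout. A small shell-level sketch of the intended semantics, using a made-up layout mimicking lineitem_hierarchical_intint (the correct count must cover only the matched subtree, whereas the bug makes the metadata cache return the full-table count):

```shell
# Hypothetical year/month partition directories.
mkdir -p t/1995/3 t/1997/1 t/1998/3 t/1998/12
# The pattern *8*/3* from drill4376_8.q selects only 1998/3
# (1998 is the only dir containing an 8; inside it, only "3" starts with 3).
( cd t && ls -d *8*/3* )
```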



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-7166) Tests doing count(* ) with wildcards in table name are querying metadata cache and returning wrong results

2019-04-10 Thread Abhishek Girish (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16814809#comment-16814809
 ] 

Abhishek Girish commented on DRILL-7166:


{code}
Query: Functional/metadata_caching/data/drill4376_6.q
select count(*) from `lineitem_hierarchical_intint/*/1*`


 Expected number of rows: 1
Actual number of rows from Drill: 1
 Number of matching rows: 0
  Number of rows missing: 1
   Number of rows unexpected: 1

These rows are not expected (first 10):
70175

These rows are missing (first 10):
19775 (1 occurence(s))

Query: Functional/metadata_caching/data/drill4376_8.q
select count(*) from `lineitem_hierarchical_intint/*8*/3*`


 Expected number of rows: 1
Actual number of rows from Drill: 1
 Number of matching rows: 0
  Number of rows missing: 1
   Number of rows unexpected: 1

These rows are not expected (first 10):
70175

These rows are missing (first 10):
3600 (1 occurence(s))

Query: Functional/metadata_caching/data/drill4376_3.q
select count(*) from `lineitem_hierarchical_intint/1**2`


 Expected number of rows: 1
Actual number of rows from Drill: 1
 Number of matching rows: 0
  Number of rows missing: 1
   Number of rows unexpected: 1

These rows are not expected (first 10):
70175

These rows are missing (first 10):
20175 (1 occurence(s))

Query: Functional/metadata_caching/data/drill4376_2.q
select count(*) from `lineitem_hierarchical_intint/19*4`


 Expected number of rows: 1
Actual number of rows from Drill: 1
 Number of matching rows: 0
  Number of rows missing: 1
   Number of rows unexpected: 1

These rows are not expected (first 10):
70175

These rows are missing (first 10):
2 (1 occurrence(s))

Query: Functional/metadata_caching/data/drill4376_1.q
select count(*) from `lineitem_hierarchical_intint/199*`


 Expected number of rows: 1
Actual number of rows from Drill: 1
 Number of matching rows: 0
  Number of rows missing: 1
   Number of rows unexpected: 1

These rows are not expected (first 10):
70175

These rows are missing (first 10):
3 (1 occurrence(s))

Query: Functional/metadata_caching/data/drill4376_5.q
select count(*) from `lineitem_hierarchical_intint/*/1`


 Expected number of rows: 1
Actual number of rows from Drill: 1
 Number of matching rows: 0
  Number of rows missing: 1
   Number of rows unexpected: 1

These rows are not expected (first 10):
70175

These rows are missing (first 10):
6300 (1 occurrence(s))

Query: Functional/metadata_caching/data/drill4376_4.q
select count(*) from `lineitem_hierarchical_intint/*8*`


 Expected number of rows: 1
Actual number of rows from Drill: 1
 Number of matching rows: 0
  Number of rows missing: 1
   Number of rows unexpected: 1

These rows are not expected (first 10):
70175

These rows are missing (first 10):
40175 (1 occurrence(s))
{code}
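The failures above all follow one pattern: the table name embeds a glob over the partition directory hierarchy (e.g. `lineitem_hierarchical_intint/*8*/3*` should select only year directories containing "8" and, within them, quarter directories starting with "3"), but Drill answers the `count(*)` from the metadata cache as if the wildcard selected everything. As a minimal sketch of the *expected* glob semantics (not Drill's actual implementation; the directory names and helper are illustrative), segment-by-segment matching looks like this:

```python
# Sketch of the expected glob semantics for wildcarded table paths.
# Hypothetical layout: lineitem_hierarchical_intint/<year>/<quarter>.
from fnmatch import fnmatch

def matches(path: str, pattern: str) -> bool:
    """Match each path segment against the corresponding pattern segment."""
    path_parts = path.split("/")
    pat_parts = pattern.split("/")
    # A pattern with fewer/more segments than the path matches nothing.
    if len(path_parts) != len(pat_parts):
        return False
    return all(fnmatch(p, q) for p, q in zip(path_parts, pat_parts))

dirs = ["1992/1", "1993/3", "1998/3", "1998/4"]

# Pattern from drill4376_8.q: only directories like 1998/3 qualify.
print([d for d in dirs if matches(d, "*8*/3*")])  # ['1998/3']
```

A correct `count(*)` should only aggregate rows from the directories the pattern selects; the bug report shows the cached-metadata path instead returning the full-table count (70175) for every pattern.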

> Tests doing count(*) with wildcards in table name are querying metadata 
> cache and returning wrong results
> --
>
> Key: DRILL-7166
> URL: https://issues.apache.org/jira/browse/DRILL-7166
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Metadata
>Affects Versions: 1.16.0
>Reporter: Abhishek Girish
>Assignee: Pritesh Maker
>Priority: Critical
> Fix For: 1.16.0
>
>
> Tests:
> {code}
> Functional/metadata_caching/data/drill4376_1.q
> Functional/metadata_caching/data/drill4376_2.q
> Functional/metadata_caching/data/drill4376_3.q
> Functional/metadata_caching/data/drill4376_4.q
> Functional/metadata_caching/data/drill4376_5.q
> Functional/metadata_caching/data/drill4376_6.q
> Functional/metadata_caching/data/drill4376_8.q
> {code}
> Example pattern of queries:
> {code}
> select count(*) from `lineitem_hierarchical_intint/*8*/3*`;
> {code}





[jira] [Created] (DRILL-7166) Tests doing count(*) with wildcards in table name are querying metadata cache and returning wrong results

2019-04-10 Thread Abhishek Girish (JIRA)
Abhishek Girish created DRILL-7166:
--

 Summary: Tests doing count(*) with wildcards in table name are 
querying metadata cache and returning wrong results
 Key: DRILL-7166
 URL: https://issues.apache.org/jira/browse/DRILL-7166
 Project: Apache Drill
  Issue Type: Bug
  Components: Metadata
Affects Versions: 1.16.0
Reporter: Abhishek Girish
Assignee: Pritesh Maker
 Fix For: 1.16.0


Tests:
{code}
Functional/metadata_caching/data/drill4376_1.q
Functional/metadata_caching/data/drill4376_2.q
Functional/metadata_caching/data/drill4376_3.q
Functional/metadata_caching/data/drill4376_4.q
Functional/metadata_caching/data/drill4376_5.q
Functional/metadata_caching/data/drill4376_6.q
Functional/metadata_caching/data/drill4376_8.q
{code}
Example pattern of queries:
{code}
select count(*) from `lineitem_hierarchical_intint/*8*/3*`;
{code}




