[GitHub] drill issue #1011: Drill 1170: Drill-on-YARN

2018-02-22 Thread ilooner
Github user ilooner commented on the issue:

https://github.com/apache/drill/pull/1011
  
@paul-rogers You need to add this dependency to your drill-yarn pom.xml

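```
<dependency>
  <groupId>org.apache.drill</groupId>
  <artifactId>drill-common</artifactId>
  <version>${project.version}</version>
  <classifier>tests</classifier>
  <scope>test</scope>
</dependency>
```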


---


[jira] [Created] (DRILL-6184) Add batch sizing information to query profile

2018-02-22 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6184:
---

 Summary: Add batch sizing information to query profile
 Key: DRILL-6184
 URL: https://issues.apache.org/jira/browse/DRILL-6184
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - Flow
Affects Versions: 1.12.0
Reporter: Padma Penumarthy
Assignee: Padma Penumarthy
 Fix For: 1.13.0


For debugging, we need batch sizing information for each operator in the query 
profile.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[GitHub] drill pull request #1129: DRILL-6180: Use System Option "output_batch_size" ...

2018-02-22 Thread ppadma
GitHub user ppadma opened a pull request:

https://github.com/apache/drill/pull/1129

DRILL-6180: Use System Option "output_batch_size" for External Sort

External Sort has a boot-time configuration option for output batch size, 
"drill.exec.sort.external.spill.merge_batch_size", which defaults to 16M.
To make batch sizing configuration uniform across all operators, change 
it to use the newly added system option 
"drill.exec.memory.operator.output_batch_size". 
That option had a default value of 32M; this change sets it to 16M.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ppadma/drill DRILL-6180

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1129.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1129


commit 5121663c1fac618d0374667c97c20570197b7455
Author: Padma Penumarthy 
Date:   2018-02-23T00:41:47Z

DRILL-6180: Use System Option "output_batch_size" for External Sort




---


[GitHub] drill issue #1105: DRILL-6125: Fix possible memory leak when query is cancel...

2018-02-22 Thread priteshm
Github user priteshm commented on the issue:

https://github.com/apache/drill/pull/1105
  
@arina-ielchiieva is this bug ready to commit?


---


Re: [DISCUSS] 1.13.0 release

2018-02-22 Thread Parth Chandra
Bit of a tepid response from dev; but Aman's approval is all the
encouragement I need to roll out a release :)

Thoughts on pending PRs?




On Thu, Feb 22, 2018 at 9:54 PM, Aman Sinha  wrote:

> Agreed...it would be good to get the ball rolling on the 1.13.0 release.
> Among other things, this release
> has the long pending Calcite rebase changes and the sooner we get it out
> for users, the better.
>
> Thanks for volunteering !
>
> -Aman
>
> On Wed, Feb 21, 2018 at 9:03 PM, Parth Chandra  wrote:
>
> > Hello Drillers,
> >
> >   I feel we might benefit from an early release for 1.13.0. We took longer
> > to do the previous release so it would be nice to bring the release train
> > back on track.
> >
> >   I'll volunteer (!) to manage the release :)
> >
> >   What do you guys think?
> >
> >   If we are in agreement on starting the release cycle and there are any
> > issues on which work is in progress, that you feel we *must* include in
> > the release, please post in reply to this thread. Let's at least get a head
> > start on closing pending PRs since these are usually what delays
> > releases.
> >
> > Thanks
> >
> > Parth
> >
>


[jira] [Created] (DRILL-6183) Default value for parameter 'planner.width.max_per_node'

2018-02-22 Thread Arjun (JIRA)
Arjun created DRILL-6183:


 Summary: Default value for parameter 'planner.width.max_per_node'
 Key: DRILL-6183
 URL: https://issues.apache.org/jira/browse/DRILL-6183
 Project: Apache Drill
  Issue Type: Bug
  Components:  Server
Affects Versions: 1.12.0
 Environment: Drill 1.12
Reporter: Arjun


The default value for the configuration parameter 'planner.width.max_per_node' 
is shown as 0 in Drill 1.12. In previous versions, the default was set to 70% 
of the total cores on the drillbit node. This could be confusing for users 
upgrading from previous versions (for example, whether 0 means an unlimited 
value).
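{code:java}
0: jdbc:drill:drillbit=localhost> select * from sys.options where name like '%planner.width%';
+-----------------------------+------+------------------+-------------+---------+---------+------------+----------+-----------+
| name                        | kind | accessibleScopes | optionScope | status  | num_val | string_val | bool_val | float_val |
+-----------------------------+------+------------------+-------------+---------+---------+------------+----------+-----------+
| planner.width.max_per_node  | LONG | ALL              | BOOT        | DEFAULT | 0       | null       | null     | null      |
| planner.width.max_per_query | LONG | ALL              | BOOT        | DEFAULT | 1000    | null       | null     | null      |
+-----------------------------+------+------------------+-------------+---------+---------+------------+----------+-----------+
2 rows selected (0.913 seconds)
0: jdbc:drill:drillbit=localhost>
{code}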
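
To make the question concrete, here is a minimal sketch of one possible reading of the value 0: that it means "not set" and the width is then derived from the core count using the 70% rule mentioned above. This is an assumption for illustration, not confirmed Drill behavior; the function name, the `load_factor` parameter, and the rounding are all hypothetical.

```python
import os


def effective_width_per_node(configured, cores=None, load_factor=0.70):
    """Return the per-node parallelization width under the assumed rule.

    Assumption (not confirmed by the report): a configured value of 0 means
    "derive the width from the core count", using load_factor * cores.
    Any positive configured value is used as-is.
    """
    if cores is None:
        # Fall back to the local machine's core count.
        cores = os.cpu_count() or 1
    if configured > 0:
        return configured
    # Rounding to the nearest integer is an illustrative choice.
    return max(1, round(cores * load_factor))


if __name__ == "__main__":
    for configured in (0, 4):
        print(configured, effective_width_per_node(configured, cores=8))
```

Under this reading, 0 would not mean "unlimited"; it would simply defer to a computed default. Whether Drill 1.12 actually behaves this way is exactly what the issue asks to have clarified.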





[jira] [Created] (DRILL-6182) Doc bug on parameter 'drill.exec.spill.fs'

2018-02-22 Thread Satoshi Yamada (JIRA)
Satoshi Yamada created DRILL-6182:
-

 Summary: Doc bug on parameter 'drill.exec.spill.fs'
 Key: DRILL-6182
 URL: https://issues.apache.org/jira/browse/DRILL-6182
 Project: Apache Drill
  Issue Type: Bug
  Components: Documentation
Reporter: Satoshi Yamada


Parameter 'drill.exe.spill.fs' should be 'drill.exec.spill.fs' (with a "c" 
after "exe").

Observed in the documents below.

[https://drill.apache.org/docs/start-up-options/]

[https://drill.apache.org/docs/sort-based-and-hash-based-memory-constrained-operators/]

 





[jira] [Created] (DRILL-6181) CTAS should support writing nested structures (nested lists) to parquet.

2018-02-22 Thread Khurram Faraaz (JIRA)
Khurram Faraaz created DRILL-6181:
-

 Summary: CTAS should support writing nested structures (nested 
lists) to parquet.
 Key: DRILL-6181
 URL: https://issues.apache.org/jira/browse/DRILL-6181
 Project: Apache Drill
  Issue Type: Bug
  Components: Storage - Parquet
Affects Versions: 1.12.0
Reporter: Khurram Faraaz


Both Parquet and Hive support writing nested structures into parquet

https://issues.apache.org/jira/browse/HIVE-8909
https://issues.apache.org/jira/browse/PARQUET-113

A CTAS from Drill fails when one of the projected columns contains a nested 
list of lists.

JSON data used in the test; note that "arr" is a nested list of lists:
 
{noformat} 
[root@qa102-45 ~]# cat jsonToParquet_02.json
{"id":"123","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
{"id":"3","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
{"id":"13","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
{"id":"12","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
{"id":"2","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
{"id":"1","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
{"id":"230","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
{"id":"1230","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
{"id":"1123","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
{"id":"2123","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
{"id":"1523","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}
[root@qa102-45 ~]#
{noformat}

CTAS fails with UnsupportedOperationException on Drill 1.12.0-mapr commit id 
bb07ebbb9ba8742f44689f8bd8efb5853c5edea0

{noformat}
0: jdbc:drill:schema=dfs.tmp> CREATE TABLE tbl_prq_from_json_02 as select id, arr from `jsonToParquet_02.json`;
Error: SYSTEM ERROR: UnsupportedOperationException: Unsupported type LIST

Fragment 0:0

[Error Id: 7e5b3c2d-9cf1-4e87-96c8-e7e7e8055ddf on qa102-45.qa.lab:31010] (state=,code=0)
{noformat}
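
As a rough aid for reproducing the failure, the sketch below scans a JSON-lines record and reports which columns contain a list nested directly inside another list, the shape that triggers the error above. It is a standalone illustration using only the Python standard library; the function names are hypothetical and it does not use any Drill API.

```python
import json


def has_nested_list(value):
    """Return True if value contains a list that holds another list."""
    if isinstance(value, list):
        return any(
            isinstance(item, list) or has_nested_list(item) for item in value
        )
    if isinstance(value, dict):
        return any(has_nested_list(v) for v in value.values())
    return False


def columns_with_nested_lists(json_line):
    """Return the top-level column names whose values contain nested lists."""
    record = json.loads(json_line)
    return [name for name, value in record.items() if has_nested_list(value)]


if __name__ == "__main__":
    line = '{"id":"123","arr":[[1,2,3,4],[5,6,7,8,9,10],[11,12,13,14,15]]}'
    print(columns_with_nested_lists(line))  # prints ['arr']
```

Running this over the test file before a CTAS would flag "arr" as the column to exclude or flatten until nested lists are supported by the Parquet writer.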

Stack trace from drillbit.log

{noformat}
2018-02-22 09:56:54,368 [2570fb99-62da-a516-2c1f-0381e21723ae:frag:0:0] ERROR o.a.d.e.w.fragment.FragmentExecutor - SYSTEM ERROR: UnsupportedOperationException: Unsupported type LIST

Fragment 0:0

[Error Id: 7e5b3c2d-9cf1-4e87-96c8-e7e7e8055ddf on qa102-45.qa.lab:31010]
org.apache.drill.common.exceptions.UserException: SYSTEM ERROR: UnsupportedOperationException: Unsupported type LIST

Fragment 0:0

[Error Id: 7e5b3c2d-9cf1-4e87-96c8-e7e7e8055ddf on qa102-45.qa.lab:31010]
 at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:586) ~[drill-common-1.12.0-mapr.jar:1.12.0-mapr]
 at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:301) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
 at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:160) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
 at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:267) [drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
 at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.12.0-mapr.jar:1.12.0-mapr]
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_161]
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_161]
 at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161]
Caused by: java.lang.UnsupportedOperationException: Unsupported type LIST
 at org.apache.drill.exec.store.parquet.ParquetRecordWriter.getType(ParquetRecordWriter.java:253) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
 at org.apache.drill.exec.store.parquet.ParquetRecordWriter.newSchema(ParquetRecordWriter.java:205) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
 at org.apache.drill.exec.store.parquet.ParquetRecordWriter.updateSchema(ParquetRecordWriter.java:190) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
 at org.apache.drill.exec.physical.impl.WriterRecordBatch.setupNewSchema(WriterRecordBatch.java:157) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
 at org.apache.drill.exec.physical.impl.WriterRecordBatch.innerNext(WriterRecordBatch.java:103) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
 at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:164) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
 at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:119) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
 at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:109) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
 at org.apache.drill.exec.record.AbstractSingleRecordBatch.innerNext(AbstractSingleRecordBatch.java:51) ~[drill-java-exec-1.12.0-mapr.jar:1.12.0-mapr]
 at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:134)
 

[jira] [Created] (DRILL-6180) Use System Option "output_batch_size" for External Sort

2018-02-22 Thread Padma Penumarthy (JIRA)
Padma Penumarthy created DRILL-6180:
---

 Summary: Use System Option "output_batch_size" for External Sort
 Key: DRILL-6180
 URL: https://issues.apache.org/jira/browse/DRILL-6180
 Project: Apache Drill
  Issue Type: Improvement
  Components: Execution - Flow
Affects Versions: 1.12.0
Reporter: Padma Penumarthy
Assignee: Padma Penumarthy
 Fix For: 1.13.0


External Sort has a boot-time configuration option for output batch size, 
"drill.exec.sort.external.spill.merge_batch_size", which defaults to 16M.

To make batch sizing configuration uniform across all operators, change this to 
use the newly added system option 
"drill.exec.memory.operator.output_batch_size". This option has a default value 
of 32M.

So, what are the implications if the default is changed to 32M for external sort?

Instead, should we change the output batch size default to 16M for all 
operators?
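
One way to reason about the two candidate defaults is to estimate how many rows fit in an output batch for a given average row width. The sketch below is back-of-the-envelope arithmetic only; `rows_per_batch`, the 1 KB row width, and the `max_rows` cap are illustrative assumptions and are not taken from the Drill code.

```python
MB = 1024 * 1024


def rows_per_batch(batch_bytes, row_width_bytes, max_rows=65536):
    """Estimate how many rows fit in one output batch.

    max_rows is an assumed upper bound on rows per batch, included only to
    show that very narrow rows stop benefiting from a larger byte budget.
    """
    if row_width_bytes <= 0:
        raise ValueError("row_width_bytes must be positive")
    return max(1, min(max_rows, batch_bytes // row_width_bytes))


if __name__ == "__main__":
    for size_mb in (16, 32):
        # Assumed average row width of 1 KB, purely for illustration.
        print(size_mb, rows_per_batch(size_mb * MB, 1024))
```

With these assumed numbers, doubling the budget from 16M to 32M doubles the rows per batch for wide rows, while narrow rows hit the row-count cap either way; the memory held per batch is what changes most.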

 

 





[GitHub] drill issue #1101: DRILL-6032: Made the batch sizing for HashAgg more accura...

2018-02-22 Thread priteshm
Github user priteshm commented on the issue:

https://github.com/apache/drill/pull/1101
  
@Ben-Zvi can you please do a final review?


---


[GitHub] drill pull request #1128: Added usage for graceful_stop in drillbit.sh

2018-02-22 Thread dvjyothsna
GitHub user dvjyothsna opened a pull request:

https://github.com/apache/drill/pull/1128

Added usage for graceful_stop in drillbit.sh



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dvjyothsna/drill DRILL-6040

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1128.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1128


commit 9a64d3017b6f0c77d2d77d0911a825bdcefb7cb4
Author: dvjyothsna 
Date:   2018-02-22T19:30:50Z

Added usage for graceful_stop in drillbit.sh




---


[GitHub] drill pull request #1127: DRILL-6021:Show shutdown button when authenticatio...

2018-02-22 Thread dvjyothsna
GitHub user dvjyothsna opened a pull request:

https://github.com/apache/drill/pull/1127

DRILL-6021:Show shutdown button when authentication is not enabled

Display the shutdown button when authentication is not enabled, since the 
user is the admin by default in that case.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/dvjyothsna/drill DRILL-6021

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1127.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1127


commit 788f65083e78795e6bf71a6d22150d9268169484
Author: Jyothsna Donapati 
Date:   2018-02-22T19:17:48Z

DRILL-6021:Show shutdown button when authentication is not enabled




---


[GitHub] drill pull request #1126: DRILL-6179: Added pcapng-format support

2018-02-22 Thread Vlad-Storona
GitHub user Vlad-Storona opened a pull request:

https://github.com/apache/drill/pull/1126

DRILL-6179: Added pcapng-format support

See DRILL-6179 for details.
https://issues.apache.org/jira/browse/DRILL-6179

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/mapr-demos/drill pcapng_dev

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1126.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1126


commit 043988c0a12bca3288f8ac49384ba6d1584fa159
Author: Vlad Storona 
Date:   2018-01-30T12:55:04Z

DRILL-6179: Added pcapng-format support




---


Re: [DISCUSS] 1.13.0 release

2018-02-22 Thread Aman Sinha
Agreed...it would be good to get the ball rolling on the 1.13.0 release.
Among other things, this release
has the long pending Calcite rebase changes and the sooner we get it out
for users, the better.

Thanks for volunteering !

-Aman

On Wed, Feb 21, 2018 at 9:03 PM, Parth Chandra  wrote:

> Hello Drillers,
>
>   I feel we might benefit from an early release for 1.13.0. We took longer
> to do the previous release so it would be nice to bring the release train
> back on track.
>
>   I'll volunteer (!) to manage the release :)
>
>   What do you guys think?
>
>   If we are in agreement on starting the release cycle and there are any
> issues on which work is in progress, that you feel we *must* include in the
> release, please post in reply to this thread. Let's at least get a head
> start on closing pending PRs since these are usually what delays releases.
>
> Thanks
>
> Parth
>


[jira] [Created] (DRILL-6179) Added pcapng-format support

2018-02-22 Thread Vlad (JIRA)
Vlad created DRILL-6179:
---

 Summary: Added pcapng-format support
 Key: DRILL-6179
 URL: https://issues.apache.org/jira/browse/DRILL-6179
 Project: Apache Drill
  Issue Type: New Feature
Reporter: Vlad
Assignee: Vlad


The _PCAP Next Generation Dump File Format_ (or pcapng for short) [1] is an 
attempt to overcome the limitations of the currently widely used (but limited) 
libpcap format.

As a first step, it is desirable to query and filter by source and destination 
IP and port, by source and destination MAC address, or by protocol. Beyond 
that, however, it would be very useful to be able to group packets by TCP 
session and eventually to look at packet contents.

Initial work is available at  
https://github.com/mapr-demos/drill/tree/pcapng_dev

[1] https://pcapng.github.io/pcapng/
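
For readers unfamiliar with the two formats, the sketch below shows how a reader might tell pcapng from classic libpcap by the first bytes of a file. It relies on the pcapng Section Header Block type (0x0A0D0D0A), the byte-order magic (0x1A2B3C4D) at offset 8, and the classic pcap magic (0xA1B2C3D4). The function name is hypothetical, this is not the code from the linked branch, and it ignores other pcap variants such as the nanosecond-resolution magic.

```python
import struct

PCAPNG_SHB_TYPE = 0x0A0D0D0A          # Section Header Block type
PCAPNG_BYTE_ORDER_MAGIC = 0x1A2B3C4D  # stored at offset 8 of the block
PCAP_MAGICS = {0xA1B2C3D4, 0xD4C3B2A1}  # classic pcap, either byte order


def detect_capture_format(header):
    """Classify a capture file from its first bytes (12 bytes is enough)."""
    if len(header) < 4:
        return "unknown"
    (first_word,) = struct.unpack("<I", header[:4])
    # The block type is a byte palindrome, so it reads the same in either
    # byte order; the byte-order magic then reveals the endianness.
    if first_word == PCAPNG_SHB_TYPE:
        if len(header) >= 12:
            if struct.unpack("<I", header[8:12])[0] == PCAPNG_BYTE_ORDER_MAGIC:
                return "pcapng (little-endian)"
            if struct.unpack(">I", header[8:12])[0] == PCAPNG_BYTE_ORDER_MAGIC:
                return "pcapng (big-endian)"
        return "pcapng"
    if first_word in PCAP_MAGICS:
        return "pcap"
    return "unknown"


if __name__ == "__main__":
    sample = struct.pack("<III", PCAPNG_SHB_TYPE, 28, PCAPNG_BYTE_ORDER_MAGIC)
    print(detect_capture_format(sample))
```

A format plugin would need this kind of check before parsing blocks, since every later field in a section is interpreted using the detected byte order.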

 


