[jira] [Commented] (DRILL-6364) WebUI does not cleanly handle shutdown and state toggling when Drillbits go on and offline
[ https://issues.apache.org/jira/browse/DRILL-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457253#comment-16457253 ] ASF GitHub Bot commented on DRILL-6364: --- Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/1241 @sohami / @arina-ielchiieva can you review this? The change is not extensive and fairly straightforward. > WebUI does not cleanly handle shutdown and state toggling when Drillbits go > on and offline > -- > > Key: DRILL-6364 > URL: https://issues.apache.org/jira/browse/DRILL-6364 > Project: Apache Drill > Issue Type: Bug > Components: Web Server >Reporter: Kunal Khatua >Assignee: Kunal Khatua >Priority: Major > Fix For: 1.14.0 > > > When the webpage is loaded the first time, the shutdown button is enabled by > default, which might not be correct, since scenarios like HTTPS, etc does not > support this for remote bits. (i.e the user needs to navigate to that node's > UI for shutting it down). > Similarly, when a previously unseen Drillbit comes online, the node will not > be rendered until the page is refreshed by the user. > Lastly, if the node from whom the UI page was served goes down, the status > update for the rest of the cluster is not updated. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6364) WebUI does not cleanly handle shutdown and state toggling when Drillbits go on and offline
[ https://issues.apache.org/jira/browse/DRILL-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457251#comment-16457251 ] ASF GitHub Bot commented on DRILL-6364: --- Github user kkhatua commented on the issue: https://github.com/apache/drill/pull/1241 Screenshot of the WebUI after the UI node `kk127` goes down. The UI's JavaScript logic queries the other Drillbits in the list (in this case, `kk128`) and finds two previously unseen Drillbits, `kk130` and `kk129`, which are listed in the order in which they were discovered in the cluster. State changes are marked correctly, with the shutdown buttons disabled. An orange refresh button near the Drillbit count prompts the user to refresh the page. Alternatively, one of the other nodes can be used to pop out a new WebUI. ![image](https://user-images.githubusercontent.com/4335237/39389539-681fed40-4a3e-11e8-92f7-6d5e717e0881.png)
[jira] [Commented] (DRILL-6364) WebUI does not cleanly handle shutdown and state toggling when Drillbits go on and offline
[ https://issues.apache.org/jira/browse/DRILL-6364?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457232#comment-16457232 ] ASF GitHub Bot commented on DRILL-6364: --- GitHub user kkhatua opened a pull request: https://github.com/apache/drill/pull/1241

DRILL-6364: Handle Cluster Info in WebUI when existing/new bits restart

As a follow-up to DRILL-6289, the following improvements have been made:

1. When loading the page for the first time, the WebUI enables the shutdown button without actually checking the state of the Drillbits. The ideal behaviour is to keep the button disabled until the state is verified.
   **[Done]** _If a Drillbit is confirmed down (i.e. not in the `/state` response), it is marked as OFFLINE and its button is disabled._
2. When the current Drillbit is shut down, the WebUI no longer has access to the cluster state. The ideal behaviour is to fetch the state from any of the other Drillbits and update the status.
   **[Done]** _With the current Drillbit down, the other bits are asked for cluster state info and the page is updated accordingly._
3. When a new, previously unseen Drillbit comes up, the WebUI never renders it because the table is statically generated during the first page load. The ideal behaviour is to append to the table when a new node is discovered.
   **[Done]** _The new Drillbit info is injected and a prompt appears to refresh the page to re-populate any missing info. This also works with feature (2) mentioned above._

The only Java code change was to have the state response carry the address and http-port as a tuple, instead of the user-port (which seems to be never used).

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kkhatua/drill DRILL-6364

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/drill/pull/1241.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1241

commit ab3e8619c6259803eb362be290a3a3605839a194
Author: Kunal Khatua
Date: 2018-04-27T23:27:45Z

    DRILL-6364: Handle Cluster Info in WebUI when existing/new bits restart
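Items (1) to (3) of the pull request can be sketched as two small steps: ask the known Drillbits one by one until some node returns the cluster state, then merge that response into the table that was rendered at page load. The real WebUI logic is JavaScript and is not shown in this thread; the Java below is only an illustration. The class name, the `fetcher` function, the host-name keys, and the `ONLINE` state string are assumptions made for the example (only `OFFLINE` and the `kk…` host names come from the messages above).

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical helper, not Drill code: models the WebUI's view of the cluster.
final class ClusterStateFallback {

  // Ask each known Drillbit in turn and return the first state response received.
  static Optional<Map<String, String>> fetchState(
      List<String> knownBits,
      Function<String, Optional<Map<String, String>>> fetcher) {
    for (String bit : knownBits) {
      Optional<Map<String, String>> state = fetcher.apply(bit);
      if (state.isPresent()) {
        return state;
      }
    }
    return Optional.empty();
  }

  // Known bits missing from the response become OFFLINE; previously unseen bits
  // are appended in the order in which the response lists them.
  static Map<String, String> merge(List<String> knownBits, Map<String, String> response) {
    Map<String, String> table = new LinkedHashMap<>();
    for (String bit : knownBits) {
      table.put(bit, response.getOrDefault(bit, "OFFLINE"));
    }
    for (Map.Entry<String, String> e : response.entrySet()) {
      table.putIfAbsent(e.getKey(), e.getValue());
    }
    return table;
  }

  public static void main(String[] args) {
    List<String> known = Arrays.asList("kk127", "kk128");
    Map<String, String> fromKk128 = new LinkedHashMap<>();
    fromKk128.put("kk128", "ONLINE");
    fromKk128.put("kk130", "ONLINE");
    fromKk128.put("kk129", "ONLINE");
    // kk127 (the node that served the page) is down, so only kk128 answers.
    Function<String, Optional<Map<String, String>>> fetcher =
        bit -> bit.equals("kk128")
            ? Optional.of(fromKk128)
            : Optional.<Map<String, String>>empty();
    Map<String, String> state = fetchState(known, fetcher).orElseGet(LinkedHashMap::new);
    System.out.println(merge(known, state));
  }
}
```

An insertion-ordered map keeps the rows rendered at page load in place and appends new nodes after them, which matches the "append to the table on discovery" behaviour described in item (3).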
[jira] [Created] (DRILL-6364) WebUI does not cleanly handle shutdown and state toggling when Drillbits go on and offline
Kunal Khatua created DRILL-6364:
-----------------------------------

             Summary: WebUI does not cleanly handle shutdown and state toggling when Drillbits go on and offline
                 Key: DRILL-6364
                 URL: https://issues.apache.org/jira/browse/DRILL-6364
             Project: Apache Drill
          Issue Type: Bug
          Components: Web Server
            Reporter: Kunal Khatua
            Assignee: Kunal Khatua
             Fix For: 1.14.0

When the webpage is loaded for the first time, the shutdown button is enabled by default. This might not be correct, since some scenarios (HTTPS, for example) do not support shutting down remote bits; the user needs to navigate to that node's UI to shut it down.
Similarly, when a previously unseen Drillbit comes online, the node is not rendered until the user refreshes the page.
Lastly, if the node that served the UI page goes down, the status of the rest of the cluster is no longer updated.
[jira] [Commented] (DRILL-6242) Output format for nested date, time, timestamp values in an object hierarchy
[ https://issues.apache.org/jira/browse/DRILL-6242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457203#comment-16457203 ] ASF GitHub Bot commented on DRILL-6242: --- Github user paul-rogers commented on the issue: https://github.com/apache/drill/pull/1184 Just a quick reminder that the current "JSON Map" returned for a map column in JDBC was very likely done so that calling `toString()` in `sqlline` produces something like this: `{"c":"foo"}`. I realize this is a very obscure point; but worth keeping in mind to avoid bugs from `sqlline` users...

> Output format for nested date, time, timestamp values in an object hierarchy
> ----------------------------------------------------------------------------
>
>                 Key: DRILL-6242
>                 URL: https://issues.apache.org/jira/browse/DRILL-6242
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Data Types
>    Affects Versions: 1.12.0
>            Reporter: Jiang Wu
>            Assignee: Jiang Wu
>            Priority: Major
>             Fix For: 1.14.0
>
> Some storages (mapr db, mongo db, etc.) have hierarchical objects that contain nested fields of date, time, timestamp types. When a query returns these objects, the output for the nested date, time, and timestamp fields shows the internal object (org.joda.time.DateTime) rather than the logical data value.
> For example, suppose that in MongoDB we have a single object that looks like this:
> {code:java}
> > db.test.findOne();
> {
>     "_id" : ObjectId("5aa8487d470dd39a635a12f5"),
>     "name" : "orange",
>     "context" : {
>         "date" : ISODate("2018-03-13T21:52:54.940Z"),
>         "user" : "jack"
>     }
> }
> {code}
> Then connect Drill to the above MongoDB storage, and run the following query within Drill:
> {code:java}
> > select t.context.`date`, t.context from test t;
> +-------------+----------+
> |   EXPR$0    | context  |
> +-------------+----------+
> | 2018-03-13  | {"date":{"dayOfYear":72,"year":2018,"dayOfMonth":13,"dayOfWeek":2,"era":1,"millisOfDay":78774940,"weekOfWeekyear":11,"weekyear":2018,"monthOfYear":3,"yearOfEra":2018,"yearOfCentury":18,"centuryOfEra":20,"millisOfSecond":940,"secondOfMinute":54,"secondOfDay":78774,"minuteOfHour":52,"minuteOfDay":1312,"hourOfDay":21,"zone":{"fixed":true,"id":"UTC"},"millis":1520977974940,"chronology":{"zone":{"fixed":true,"id":"UTC"}},"afterNow":false,"beforeNow":true,"equalNow":false},"user":"jack"} |
> {code}
> We can see from the above output that when the date field is retrieved as a top level column, Drill outputs a logical date value. But when the same field is within an object hierarchy, Drill outputs the internal object used to hold the date value.
> The expected output is the same display whether the date field is shown as a top level column or within an object hierarchy:
> {code:java}
> > select t.context.`date`, t.context from test t;
> +-------------+----------+
> |   EXPR$0    | context  |
> +-------------+----------+
> | 2018-03-13  | {"date":"2018-03-13","user":"jack"} |
> {code}
[jira] [Commented] (DRILL-5917) Ban org.json:json library in Drill
[ https://issues.apache.org/jira/browse/DRILL-5917?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457196#comment-16457196 ] Vlad Rozov commented on DRILL-5917: --- ? > Ban org.json:json library in Drill > -- > > Key: DRILL-5917 > URL: https://issues.apache.org/jira/browse/DRILL-5917 > Project: Apache Drill > Issue Type: Task >Affects Versions: 1.11.0 >Reporter: Arina Ielchiieva >Assignee: Vlad Rozov >Priority: Major > Labels: ready-to-commit > Fix For: 1.12.0 > > Attachments: image.png > > > Apache Drill has dependencies on json.org lib indirectly from two libraries: > com.mapr.hadoop:maprfs:jar:5.2.1-mapr > com.mapr.fs:mapr-hbase:jar:5.2.1-mapr > {noformat} > [INFO] org.apache.drill.contrib:drill-format-mapr:jar:1.12.0-SNAPSHOT > [INFO] +- com.mapr.hadoop:maprfs:jar:5.2.1-mapr:compile > [INFO] | \- org.json:json:jar:20080701:compile > [INFO] \- com.mapr.fs:mapr-hbase:jar:5.2.1-mapr:compile > [INFO]\- (org.json:json:jar:20080701:compile - omitted for duplicate) > {noformat} > Need to make sure we won't have any dependencies from these libs to > org.json:json lib and ban this lib in main pom.xml file. > Issue is critical since Apache release won't happen until we make sure > org.json:json lib is not used (https://www.apache.org/legal/resolved.html). -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-5917) Ban org.json:json library in Drill
[ https://issues.apache.org/jira/browse/DRILL-5917?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] David Capwell updated DRILL-5917: - Attachment: image.png
[jira] [Commented] (DRILL-6242) Output format for nested date, time, timestamp values in an object hierarchy
[ https://issues.apache.org/jira/browse/DRILL-6242?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457182#comment-16457182 ] ASF GitHub Bot commented on DRILL-6242: --- Github user parthchandra commented on the issue: https://github.com/apache/drill/pull/1184

```
What do you mean by "Json representation"?
```
Sorry, my mistake, got all tangled up.

```
we may want to further translate the Local [Date|Time|DateTime] objects inside the Map|List to java.sql.[Date|Time|Timestamp] upon access. But to do that inside the SqlAccessor, you would need to deep copy the Map|List and build another version with the date|time translated into java.sql.date|time.
```
That is what I thought you wanted to get to. If the current state is something you can work with, then great. I can review the final changes once you're done and merge them as well. Let's move the other discussion to another thread or JIRA.
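The deep copy mentioned in the quoted passage can be sketched as a recursive walk over the nested `Map`/`List` value. This is not the SqlAccessor code from the pull request. The class and method names are hypothetical, and the sketch assumes the nested date values are `java.time.LocalDate` instances and handles only that one type.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: builds a translated copy and leaves the input untouched.
final class NestedDateCopy {

  static Object translate(Object value) {
    if (value instanceof LocalDate) {
      // java.sql.Date.valueOf(LocalDate) is part of the JDK since Java 8.
      return java.sql.Date.valueOf((LocalDate) value);
    }
    if (value instanceof Map) {
      Map<String, Object> copy = new LinkedHashMap<>();
      for (Map.Entry<?, ?> e : ((Map<?, ?>) value).entrySet()) {
        copy.put(String.valueOf(e.getKey()), translate(e.getValue()));
      }
      return copy;
    }
    if (value instanceof List) {
      List<Object> copy = new ArrayList<>();
      for (Object item : (List<?>) value) {
        copy.add(translate(item));
      }
      return copy;
    }
    return value; // Strings, numbers, etc. are shared, not copied.
  }

  public static void main(String[] args) {
    Map<String, Object> context = new LinkedHashMap<>();
    context.put("date", LocalDate.of(2018, 3, 13));
    context.put("user", "jack");
    System.out.println(translate(context));
  }
}
```

The cost the comment points at is visible here: every access that needs `java.sql` types pays for a full copy of the container, which is one reason to leave the translation out of the accessor.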
[jira] [Commented] (DRILL-6348) Unordered Receiver does not report its memory usage
[ https://issues.apache.org/jira/browse/DRILL-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457120#comment-16457120 ] ASF GitHub Bot commented on DRILL-6348: --- Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1237 IMO, it will be good to understand what other operators do as well. For example what Project or Filter operators do. Do they take ownership of incoming batches? And if they do, when is the ownership taken? I do not suggest that we change how Sender and Receiver control **all** aspects of communication, at least not as part of this JIRA/PR. The difference in my and your approach is whether or not UnorderedReceiver and other receivers are pass-through operators. My view is that receivers are not pass-through operators and they are buffering operators as they receive batches from the network and buffer them before downstream operators are ready to consume those batches. In your view, receivers are pass-through operators that get batches from fragment queue or some other queue and pass them to downstream. As there is no wait and no processing between getting a batch from fragment queue and passing it to the next operator, I don't see why a receiver needs to take the ownership. > Unordered Receiver does not report its memory usage > --- > > Key: DRILL-6348 > URL: https://issues.apache.org/jira/browse/DRILL-6348 > Project: Apache Drill > Issue Type: Task > Components: Execution - Flow >Reporter: salim achouche >Assignee: salim achouche >Priority: Major > Fix For: 1.14.0 > > > The Drill Profile functionality doesn't show any memory usage for the > Unordered Receiver operator. This is problematic when analyzing OOM > conditions since we cannot account for all of a query memory usage. This Jira > is to fix memory reporting for the Unordered Receiver operator. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6348) Unordered Receiver does not report its memory usage
[ https://issues.apache.org/jira/browse/DRILL-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457069#comment-16457069 ] ASF GitHub Bot commented on DRILL-6348: --- Github user sachouche commented on a diff in the pull request: https://github.com/apache/drill/pull/1237#discussion_r184807153

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/unorderedreceiver/UnorderedReceiverBatch.java ---
@@ -149,25 +149,32 @@ private RawFragmentBatch getNextBatch() throws IOException {
     }
   }

+  private RawFragmentBatch getNextNotEmptyBatch() throws IOException {
+    RawFragmentBatch batch;
+    try {
+      stats.startWait();
--- End diff --

I see; I will then fix any such occurrences when opportunity presents itself as I have seen both patterns in the Drill code base.
[jira] [Commented] (DRILL-6348) Unordered Receiver does not report its memory usage
[ https://issues.apache.org/jira/browse/DRILL-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16457056#comment-16457056 ] ASF GitHub Bot commented on DRILL-6348: --- Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1237#discussion_r184804819

--- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/unorderedreceiver/UnorderedReceiverBatch.java ---
@@ -149,25 +149,32 @@ private RawFragmentBatch getNextBatch() throws IOException {
     }
   }

+  private RawFragmentBatch getNextNotEmptyBatch() throws IOException {
+    RawFragmentBatch batch;
+    try {
+      stats.startWait();
--- End diff --

It may throw `AssertionError` now, and other exceptions may be added in the future.
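The review comment that started this exchange is not part of this archive, but the replies suggest it is about whether `stats.startWait()` belongs inside or outside the `try` block. A minimal sketch of the "outside" pattern follows. `WaitStats` and `timed` are hypothetical stand-ins, not Drill's operator stats class, and the state check in `startWait()` is an assumption added to show why the call might throw.

```java
import java.util.function.Supplier;

// Hypothetical stand-in for an operator-stats object with a wait timer.
final class WaitStats {
  private long waitStart = -1;
  private long totalWaitNanos;

  void startWait() {
    if (waitStart != -1) {
      throw new IllegalStateException("startWait() called while already waiting");
    }
    waitStart = System.nanoTime();
  }

  void stopWait() {
    if (waitStart == -1) {
      throw new IllegalStateException("stopWait() called without startWait()");
    }
    totalWaitNanos += System.nanoTime() - waitStart;
    waitStart = -1;
  }

  boolean isWaiting() {
    return waitStart != -1;
  }

  // startWait() sits before the try: if it throws, the finally block (and so
  // stopWait()) never runs against a timer that was not started.
  static <T> T timed(WaitStats stats, Supplier<T> fetch) {
    stats.startWait();
    try {
      return fetch.get();
    } finally {
      stats.stopWait();
    }
  }

  public static void main(String[] args) {
    WaitStats stats = new WaitStats();
    System.out.println(timed(stats, () -> "batch"));
    System.out.println("still waiting: " + stats.isWaiting());
  }
}
```

With the call inside the `try`, a failure in `startWait()` would still trigger `stopWait()`, which could then fail in turn and mask the original exception.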
[jira] [Updated] (DRILL-6307) Handle empty batches in record batch sizer correctly
[ https://issues.apache.org/jira/browse/DRILL-6307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Padma Penumarthy updated DRILL-6307: Labels: ready-to-commit (was: )

> Handle empty batches in record batch sizer correctly
> ----------------------------------------------------
>
>                 Key: DRILL-6307
>                 URL: https://issues.apache.org/jira/browse/DRILL-6307
>             Project: Apache Drill
>          Issue Type: Bug
>          Components: Execution - Flow
>    Affects Versions: 1.13.0
>            Reporter: Padma Penumarthy
>            Assignee: Padma Penumarthy
>            Priority: Major
>              Labels: ready-to-commit
>             Fix For: 1.14.0
>
> When we get an empty batch, the record batch sizer calculates the row width as zero. In that case, we do not do accounting and memory allocation correctly for outgoing batches.
> For example, in merge join, for a left outer join, if the right side batch is empty, we still have to include the right side columns as null in the outgoing batch.
> Say the first batch is empty. Then, for the outgoing batch, we allocate empty vectors with zero capacity. When we read the next batch with data, we will end up going through the realloc loop. If we use a right side row width of 0 in the outgoing row width calculation, the number of rows we calculate will be higher, and later, when we get a non-empty batch, we might exceed the memory limits.
> One possible workaround/solution: allocate memory based on the std size for an empty input batch, and use the allocation width as the width of the batch in the number-of-rows calculation.
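The arithmetic behind the proposed workaround can be shown with a small sketch. The names, the 16 MiB budget, and the 64-byte widths below are made up for illustration and do not come from Drill's record batch sizer. The point is only that a width of zero for the empty side inflates the row count.

```java
// Hypothetical illustration of the row-count calculation for an outgoing join batch.
final class OutgoingRowCount {

  // Fall back to a standard (allocation) width when the observed width is zero,
  // which is what an empty input batch reports.
  static int effectiveWidth(int observedWidth, int stdWidth) {
    return observedWidth > 0 ? observedWidth : stdWidth;
  }

  // Rows that fit in the budget given the combined row width of both sides.
  static int rowsThatFit(long budgetBytes, int leftWidth, int rightWidth, int rightStdWidth) {
    int width = leftWidth + effectiveWidth(rightWidth, rightStdWidth);
    if (width <= 0) {
      throw new IllegalArgumentException("row width must be positive");
    }
    return (int) Math.max(1L, budgetBytes / width);
  }

  public static void main(String[] args) {
    long budget = 16L * 1024 * 1024;
    // Right side empty: with the fallback, rows are sized for 64 + 64 bytes.
    System.out.println(rowsThatFit(budget, 64, 0, 64));
    // Without the fallback the divisor would be 64 only, i.e. twice as many rows.
    System.out.println(budget / 64);
  }
}
```

In this example the fallback halves the row count, so a later non-empty right batch of the assumed width still fits in the same budget.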
[jira] [Commented] (DRILL-6348) Unordered Receiver does not report its memory usage
[ https://issues.apache.org/jira/browse/DRILL-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456910#comment-16456910 ] ASF GitHub Bot commented on DRILL-6348: --- Github user sachouche commented on the issue: https://github.com/apache/drill/pull/1237 That was not my intention as my current change aimed at describing the system the way it is. @parthchandra, any feedback?
[jira] [Commented] (DRILL-6348) Unordered Receiver does not report its memory usage
[ https://issues.apache.org/jira/browse/DRILL-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456899#comment-16456899 ] ASF GitHub Bot commented on DRILL-6348: --- Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1237 @sachouche I'd suggest moving the discussion to dev list as the topic of the batch ownership is beyond PR review (code changes).
[jira] [Commented] (DRILL-6327) Update unary operators to handle IterOutcome.EMIT
[ https://issues.apache.org/jira/browse/DRILL-6327?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456879#comment-16456879 ] ASF GitHub Bot commented on DRILL-6327: --- Github user parthchandra commented on the issue: https://github.com/apache/drill/pull/1240 +1. Very nicely done. > Update unary operators to handle IterOutcome.EMIT > - > > Key: DRILL-6327 > URL: https://issues.apache.org/jira/browse/DRILL-6327 > Project: Apache Drill > Issue Type: Task >Reporter: Parth Chandra >Assignee: Sorabh Hamirwasia >Priority: Major > > IterOutcome.EMIT is a new state introduced by the Lateral Join > implementation. All operators need to be updated to handle it. > This Jira is to track the subtask of updating the unary operators (derived > from AbstractSingleRecordBatch). > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6348) Unordered Receiver does not report its memory usage
[ https://issues.apache.org/jira/browse/DRILL-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456839#comment-16456839 ] ASF GitHub Bot commented on DRILL-6348: --- Github user sachouche commented on the issue: https://github.com/apache/drill/pull/1237

@vrozov,

**What are we trying to solve / improve**
- Drill is currently not properly reporting memory held in a Fragment's receive queues
- This makes it hard to analyze OOM conditions

This is what I want to see:
- Every operator reporting on the resources it is currently using (needed)
- Fragment held resources (other than the ones already reported by the child operators)
- Drillbit level (metadata caches, web-server, ..)
- I am ok to incrementally reach this goal

**Data Exchange Logistic**
- Ideally, the data exchange fabric should be decoupled from the Drill Receive / Send operators
- The fabric should be handling all the aspects of pre-fetch / pressuring and so forth
- It will tune to the speed of producers / consumers when writing / reading data from it
- This infrastructure should have its own resource management and reporting capabilities

**Operator based Reporting**
- Receive and Send operators shall not worry about batches they didn't consume yet
- Doing so is counterproductive, as the Data Exchange fabric will interpret a "drain" operation as the operator "needing" more data.
- For example, the merge-receiver should not be managing the receive queues; it should only advertise the pattern of data consumption and let the exchange fabric figure out the rest.

The main difference between the two approaches is that you are advocating for Receive and Send operators to control all aspects of communication, whereas I am advocating for decoupling such aspects.
[jira] [Updated] (DRILL-6202) Deprecate usage of IndexOutOfBoundsException to re-alloc vectors
[ https://issues.apache.org/jira/browse/DRILL-6202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Vlad Rozov updated DRILL-6202: -- Labels: ready-to-commit (was: ) > Deprecate usage of IndexOutOfBoundsException to re-alloc vectors > > > Key: DRILL-6202 > URL: https://issues.apache.org/jira/browse/DRILL-6202 > Project: Apache Drill > Issue Type: Bug >Reporter: Vlad Rozov >Assignee: Vlad Rozov >Priority: Major > Labels: ready-to-commit > Fix For: 1.14.0 > > > As bounds checking may be enabled or disabled, using > IndexOutOfBoundsException to resize vectors is unreliable. It works only when > bounds checking is enabled. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
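The alternative to exception-driven resizing is an explicit capacity check before each write. The toy vector below is a sketch of that idea only and is not Drill's value vector code. Note that a plain Java array is always bounds checked, so this example cannot reproduce the failure mode described in the issue; it only shows the check-then-grow pattern that does not depend on bounds checking being enabled.

```java
import java.util.Arrays;

// Hypothetical toy vector: grows by doubling when a write would exceed capacity.
final class GrowableIntVector {
  private int[] data = new int[4];

  // Explicit capacity check; no IndexOutOfBoundsException is used for control flow.
  void setSafe(int index, int value) {
    while (index >= data.length) {
      data = Arrays.copyOf(data, data.length * 2);
    }
    data[index] = value;
  }

  int get(int index) {
    return data[index];
  }

  int capacity() {
    return data.length;
  }

  public static void main(String[] args) {
    GrowableIntVector v = new GrowableIntVector();
    v.setSafe(9, 42);
    System.out.println(v.get(9) + " stored, capacity " + v.capacity());
  }
}
```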
[jira] [Commented] (DRILL-6348) Unordered Receiver does not report its memory usage
[ https://issues.apache.org/jira/browse/DRILL-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456760#comment-16456760 ] ASF GitHub Bot commented on DRILL-6348: --- Github user sachouche commented on a diff in the pull request: https://github.com/apache/drill/pull/1237#discussion_r184747702

--- Diff: exec/memory/base/src/main/java/org/apache/drill/exec/memory/AllocationManager.java ---
@@ -253,10 +261,12 @@ public boolean transferBalance(final BufferLedger target) {
           target.historicalLog.recordEvent("incoming(from %s)", owningLedger.allocator.name);
         }

-        boolean overlimit = target.allocator.forceAllocate(size);
+        // Release first to handle the case where the current and target allocators were part of the same
+        // parent / child tree.
         allocator.releaseBytes(size);
+        boolean allocationFit = target.allocator.forceAllocate(size);
--- End diff --

- The change of order is an optimization for a parent / child relationship: if we don't release first, we could unnecessarily go over the memory budget (double counting).
- The force-alloc() / free() failures should never happen under normal conditions; when they do, the best thing to do is to exit. I still prefer not to promote the target allocator until it is 100% successful.
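The double counting mentioned in the first bullet can be reproduced with a toy allocator tree in which every allocation is also charged to the parent. `ToyAllocator` is a hypothetical model that borrows the method names `forceAllocate` and `releaseBytes` from the diff; it is not Drill's allocator, and the assumption that `forceAllocate` returns `true` when the allocation fits is taken from the renamed variable `allocationFit`.

```java
// Hypothetical model: each allocator charges its own counter and its parent's.
final class ToyAllocator {
  private final ToyAllocator parent;
  private final long limit;
  private long allocated;

  ToyAllocator(ToyAllocator parent, long limit) {
    this.parent = parent;
    this.limit = limit;
  }

  // Always records the bytes; returns true only if every level stays within its limit.
  boolean forceAllocate(long size) {
    allocated += size;
    boolean parentFits = parent == null || parent.forceAllocate(size);
    return parentFits && allocated <= limit;
  }

  void releaseBytes(long size) {
    allocated -= size;
    if (parent != null) {
      parent.releaseBytes(size);
    }
  }

  long getAllocated() {
    return allocated;
  }

  // Old order: the shared parent briefly holds the bytes twice.
  static boolean transferAllocateFirst(ToyAllocator from, ToyAllocator to, long size) {
    boolean fit = to.forceAllocate(size);
    from.releaseBytes(size);
    return fit;
  }

  // New order: release first, so the shared parent never sees the bytes twice.
  static boolean transferReleaseFirst(ToyAllocator from, ToyAllocator to, long size) {
    from.releaseBytes(size);
    return to.forceAllocate(size);
  }

  public static void main(String[] args) {
    ToyAllocator root = new ToyAllocator(null, 100);
    ToyAllocator a = new ToyAllocator(root, 100);
    ToyAllocator b = new ToyAllocator(root, 100);
    a.forceAllocate(80);
    System.out.println("allocate first fits: " + transferAllocateFirst(a, b, 80));
    System.out.println("release first fits:  " + transferReleaseFirst(b, a, 80));
  }
}
```

Both transfers leave the same final balances; only the reported fit differs, because the allocate-first order pushes the shared parent to 160 of its 100-byte limit for a moment.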
[jira] [Commented] (DRILL-6348) Unordered Receiver does not report its memory usage
[ https://issues.apache.org/jira/browse/DRILL-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456757#comment-16456757 ] ASF GitHub Bot commented on DRILL-6348: --- Github user sachouche commented on a diff in the pull request: https://github.com/apache/drill/pull/1237#discussion_r184727914 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/unorderedreceiver/UnorderedReceiverBatch.java --- @@ -149,25 +149,32 @@ private RawFragmentBatch getNextBatch() throws IOException { } } + private RawFragmentBatch getNextNotEmptyBatch() throws IOException { +RawFragmentBatch batch; +try { + stats.startWait(); --- End diff -- OK, good point; I have seen both practices used within the Drill code. That said, I don't think this is a big deal, since I don't see startWait() failing: it merely invokes nano time. > Unordered Receiver does not report its memory usage > --- > > Key: DRILL-6348 > URL: https://issues.apache.org/jira/browse/DRILL-6348 > Project: Apache Drill > Issue Type: Task > Components: Execution - Flow >Reporter: salim achouche >Assignee: salim achouche >Priority: Major > Fix For: 1.14.0 > > > The Drill Profile functionality doesn't show any memory usage for the > Unordered Receiver operator. This is problematic when analyzing OOM > conditions since we cannot account for all of a query memory usage. This Jira > is to fix memory reporting for the Unordered Receiver operator. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
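The "both practices" mentioned above presumably refer to calling the start method inside versus before the try block; the review comment being answered is not part of this message, so that reading is an assumption. A minimal sketch of the "start before try, stop in finally" variant follows. `WaitStats` and `WaitPatternDemo` are hypothetical stand-ins, not Drill's operator stats class.

```java
import java.util.function.Supplier;

/** Hypothetical wait-time tracker; NOT Drill's operator stats class. */
final class WaitStats {
  private boolean waiting;
  private long waitStart;
  private long totalWaitNanos;

  void startWait() {
    waitStart = System.nanoTime();
    waiting = true;
  }

  void stopWait() {
    if (!waiting) {
      throw new IllegalStateException("stopWait() called without startWait()");
    }
    totalWaitNanos += System.nanoTime() - waitStart;
    waiting = false;
  }

  boolean isWaiting() { return waiting; }

  long totalWaitNanos() { return totalWaitNanos; }
}

final class WaitPatternDemo {
  /** Runs the work while charging its duration as wait time. */
  static <T> T timed(WaitStats stats, Supplier<T> work) {
    // Started before the try block: if startWait() itself threw,
    // the finally block would not run stopWait() on a timer that never started.
    stats.startWait();
    try {
      return work.get();
    } finally {
      stats.stopWait(); // always paired with the successful startWait() above
    }
  }
}
```

When the start call cannot realistically fail, as the comment argues for a call that only reads the clock, the two placements behave the same in practice.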
[jira] [Commented] (DRILL-6348) Unordered Receiver does not report its memory usage
[ https://issues.apache.org/jira/browse/DRILL-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456758#comment-16456758 ] ASF GitHub Bot commented on DRILL-6348: --- Github user sachouche commented on a diff in the pull request: https://github.com/apache/drill/pull/1237#discussion_r184730050 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/record/RawFragmentBatch.java --- @@ -77,4 +83,46 @@ public long getByteCount() { public boolean isAckSent() { return ackSent.get(); } + + /** + * Transfer ownership of this DrillBuf to the target allocator. This is done for better memory + * accounting (that is, the operator should be charged with the body's Drillbuf memory). + * + * NOTES - + * + * This operation is a NOOP when a) the current allocator (associated with the DrillBuf) is not the + * owning allocator or b) the target allocator is already the owner + * When transfer happens, a new RawFragmentBatch instance is allocated; this is done for proper + * DrillBuf reference count accounting + * The RPC handling code caches a reference to this RawFragmentBatch object instance; release() + * calls should be routed to the previous DrillBuf + * + * + * @param targetAllocator target allocator + * @return a new {@link RawFragmentBatch} object instance on success (where the buffer ownership has + * been switched to the target allocator); otherwise this operation is a NOOP (current instance + * returned) + */ + public RawFragmentBatch transferBodyOwnership(BufferAllocator targetAllocator) { +if (body == null) { + return this; // NOOP +} + +if (!body.getLedger().isOwningLedger() + || body.getLedger().isOwner(targetAllocator)) { + + return this; +} + +int writerIndex = body.writerIndex(); +TransferResult transferResult = body.transferOwnership(targetAllocator); + +// Set the index and increment reference count +transferResult.buffer.writerIndex(writerIndex); + +// Clear the current Drillbuffer since caller will perform release() 
on the new one +body.release(); + +return new RawFragmentBatch(getHeader(), transferResult.buffer, getSender(), false); --- End diff -- We can take up such an enhancement as part of another JIRA, since any changes within the RPC layer have to be thoroughly tested. > Unordered Receiver does not report its memory usage > --- > > Key: DRILL-6348 > URL: https://issues.apache.org/jira/browse/DRILL-6348 > Project: Apache Drill > Issue Type: Task > Components: Execution - Flow >Reporter: salim achouche >Assignee: salim achouche >Priority: Major > Fix For: 1.14.0 > > > The Drill Profile functionality doesn't show any memory usage for the > Unordered Receiver operator. This is problematic when analyzing OOM > conditions since we cannot account for all of a query memory usage. This Jira > is to fix memory reporting for the Unordered Receiver operator. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6348) Unordered Receiver does not report its memory usage
[ https://issues.apache.org/jira/browse/DRILL-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456759#comment-16456759 ] ASF GitHub Bot commented on DRILL-6348: --- Github user sachouche commented on a diff in the pull request: https://github.com/apache/drill/pull/1237#discussion_r184728292 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/unorderedreceiver/UnorderedReceiverBatch.java --- @@ -149,25 +149,32 @@ private RawFragmentBatch getNextBatch() throws IOException { } } + private RawFragmentBatch getNextNotEmptyBatch() throws IOException { +RawFragmentBatch batch; +try { + stats.startWait(); + batch = getNextBatch(); + + // skip over empty batches. we do this since these are basically control messages. + while (batch != null && batch.getHeader().getDef().getRecordCount() == 0 --- End diff -- Ignore this comment as I thought you were releasing the returned batch. > Unordered Receiver does not report its memory usage > --- > > Key: DRILL-6348 > URL: https://issues.apache.org/jira/browse/DRILL-6348 > Project: Apache Drill > Issue Type: Task > Components: Execution - Flow >Reporter: salim achouche >Assignee: salim achouche >Priority: Major > Fix For: 1.14.0 > > > The Drill Profile functionality doesn't show any memory usage for the > Unordered Receiver operator. This is problematic when analyzing OOM > conditions since we cannot account for all of a query memory usage. This Jira > is to fix memory reporting for the Unordered Receiver operator. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6272) Remove binary jars files from source distribution
[ https://issues.apache.org/jira/browse/DRILL-6272?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456657#comment-16456657 ] ASF GitHub Bot commented on DRILL-6272: --- Github user arina-ielchiieva commented on the issue: https://github.com/apache/drill/pull/1225 @vrozov the PR now contains two commits: 1. jmockit and mockito upgrade (DRILL-6363); 2. maven-embedder usage for unit tests (using the latest version, as you suggested) (DRILL-6272). Please review. > Remove binary jars files from source distribution > - > > Key: DRILL-6272 > URL: https://issues.apache.org/jira/browse/DRILL-6272 > Project: Apache Drill > Issue Type: Task >Reporter: Vlad Rozov >Assignee: Arina Ielchiieva >Priority: Critical > Fix For: 1.14.0 > > > Per [~vrozov] the source distribution contains binary jar files under > exec/java-exec/src/test/resources/jars -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6348) Unordered Receiver does not report its memory usage
[ https://issues.apache.org/jira/browse/DRILL-6348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456626#comment-16456626 ] ASF GitHub Bot commented on DRILL-6348: --- Github user sachouche commented on a diff in the pull request: https://github.com/apache/drill/pull/1237#discussion_r184726839 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/physical/impl/unorderedreceiver/UnorderedReceiverBatch.java --- @@ -201,6 +208,11 @@ public IterOutcome next() { context.getExecutorState().fail(ex); return IterOutcome.STOP; } finally { + + if (batch != null) { +batch.release(); +batch = null; --- End diff -- The point of this pattern is that if you would like to continue using this object then be prepared to know what can and what cannot be used. > Unordered Receiver does not report its memory usage > --- > > Key: DRILL-6348 > URL: https://issues.apache.org/jira/browse/DRILL-6348 > Project: Apache Drill > Issue Type: Task > Components: Execution - Flow >Reporter: salim achouche >Assignee: salim achouche >Priority: Major > Fix For: 1.14.0 > > > The Drill Profile functionality doesn't show any memory usage for the > Unordered Receiver operator. This is problematic when analyzing OOM > conditions since we cannot account for all of a query memory usage. This Jira > is to fix memory reporting for the Unordered Receiver operator. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6281) Refactor TimedRunnable
[ https://issues.apache.org/jira/browse/DRILL-6281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456620#comment-16456620 ] ASF GitHub Bot commented on DRILL-6281: --- Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1238#discussion_r184724657 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/TimedCallable.java --- @@ -0,0 +1,266 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store; + +import java.io.IOException; +import java.util.List; +import java.util.Objects; +import java.util.concurrent.Callable; +import java.util.concurrent.CancellationException; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.ExecutorService; +import java.util.concurrent.Executors; +import java.util.concurrent.Future; +import java.util.concurrent.RejectedExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.function.Consumer; +import java.util.function.Function; +import java.util.stream.Collectors; + +import org.apache.drill.common.exceptions.UserException; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import com.google.common.base.Preconditions; +import com.google.common.base.Stopwatch; +import com.google.common.util.concurrent.MoreExecutors; +import com.google.common.util.concurrent.ThreadFactoryBuilder; + +/** + * Class used to allow parallel executions of tasks in a simplified way. Also maintains and reports timings of task completion. + * TODO: look at switching to fork join. + * @param The time value that will be returned when the task is executed. 
+ */ +public abstract class TimedCallable implements Callable { + private static final Logger logger = LoggerFactory.getLogger(TimedCallable.class); + + private static long TIMEOUT_PER_RUNNABLE_IN_MSECS = 15000; + + private volatile long startTime = 0; + private volatile long executionTime = -1; + + private static class FutureMapper implements Function, V> { +int count; +Throwable throwable = null; + +private void setThrowable(Throwable t) { + if (throwable == null) { +throwable = t; + } else { +throwable.addSuppressed(t); + } +} + +@Override +public V apply(Future future) { + Preconditions.checkState(future.isDone()); + if (!future.isCancelled()) { +try { + count++; + return future.get(); +} catch (InterruptedException e) { + // there is no wait as we are getting result from the completed/done future + logger.error("Unexpected exception", e); + throw UserException.internalError(e) + .message("Unexpected exception") + .build(logger); +} catch (ExecutionException e) { + setThrowable(e.getCause()); +} + } else { +setThrowable(new CancellationException()); + } + return null; +} + } + + private static class Statistics implements Consumer> { +final long start = System.nanoTime(); +final Stopwatch watch = Stopwatch.createStarted(); +long totalExecution; +long maxExecution; +int count; +int startedCount; +private int doneCount; +// measure thread creation times +long earliestStart; +long latestStart; +long totalStart; + +@Override +public void accept(TimedCallable task) { + count++; + long threadStart = task.getStartTime(TimeUnit.NANOSECONDS) - start; + if (threadStart >= 0) { +startedCount++; +earliestStart = Math.min(earliestStart, threadStart); +latestStart = Math.max(latestStart, threadStart); +totalSt
[jira] [Commented] (DRILL-5797) Use more often the new parquet reader
[ https://issues.apache.org/jira/browse/DRILL-5797?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456529#comment-16456529 ] Oleksandr Kalinin commented on DRILL-5797: -- Just for the record, further debugging shows how a complex column sneaks into ReadState: (1) `ParquetRecordReader.setup()` triggers ParquetSchema buildSchema/loadParquetSchema for column mapping (2) `ParquetSchema.loadParquetSchema()` uses `ParquetSchema.fieldSelected()` for column matching (3) fieldSelected() takes a MaterializedField as an argument and uses its getName() method for column name comparison. For column B.A it returns A. (4) As a result, column B.A of the file gets positively matched to column A and is added to selectedColumnMetadata in the ParquetSchema, which is then passed to ReadState > Use more often the new parquet reader > - > > Key: DRILL-5797 > URL: https://issues.apache.org/jira/browse/DRILL-5797 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Parquet >Reporter: Damien Profeta >Assignee: Oleksandr Kalinin >Priority: Major > Fix For: 1.14.0 > > > The choice of using the regular parquet reader of the optimized one is based > of what type of columns is in the file. But the columns that are read by the > query doesn't matter. We can increase a little bit the cases where the > optimized reader is used by checking is the projected column are simple or > not. > This is an optimization waiting for the fast parquet reader to handle complex > structure. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
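The mismatch described in steps (3) and (4) can be reproduced in isolation. The following is a minimal sketch, not Drill's `ParquetSchema.fieldSelected()`: `ColumnMatchDemo` and its methods are hypothetical, and the sketch assumes only what the comment states, namely that the comparison sees the leaf name (`A`) rather than the full path (`B.A`).

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical illustration of leaf-name vs. full-path column matching. */
final class ColumnMatchDemo {
  /** Returns the last segment of a dotted column path, e.g. "B.A" -> "A". */
  static String leafName(String path) {
    int dot = path.lastIndexOf('.');
    return dot < 0 ? path : path.substring(dot + 1);
  }

  /** Matching on the leaf name only: nested column B.A is mistaken for top-level column A. */
  static boolean matchesByLeafName(String fileColumnPath, List<String> projected) {
    String leaf = leafName(fileColumnPath);
    return projected.stream().anyMatch(p -> p.equalsIgnoreCase(leaf));
  }

  /** Matching on the full path: B.A matches only a projection of B.A. */
  static boolean matchesByFullPath(String fileColumnPath, List<String> projected) {
    return projected.stream().anyMatch(p -> p.equalsIgnoreCase(fileColumnPath));
  }

  public static void main(String[] args) {
    List<String> projected = Arrays.asList("A");
    System.out.println(matchesByLeafName("B.A", projected));
    System.out.println(matchesByFullPath("B.A", projected));
  }
}
```

Comparing full paths is one way to keep a nested column out of the selected set; whether that is the right fix for the reader itself is not something this sketch can establish.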
[jira] [Updated] (DRILL-6363) Upgrade jmockit and mockito libs
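[ https://issues.apache.org/jira/browse/DRILL-6363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6363: Description: Change Jmockit from {noformat} <dependency> <groupId>com.googlecode.jmockit</groupId> <artifactId>jmockit</artifactId> <version>1.3</version> <scope>test</scope> </dependency> {noformat} to {noformat} <dependency> <groupId>org.jmockit</groupId> <artifactId>jmockit</artifactId> <version>1.39</version> <scope>test</scope> </dependency> {noformat} Change Mockito core version from 1.9.5 to 2.18.3. was: JMOCKIT {noformat} <dependency> <groupId>com.googlecode.jmockit</groupId> <artifactId>jmockit</artifactId> <version>1.3</version> <scope>test</scope> </dependency> {noformat} to {noformat} <dependency> <groupId>org.jmockit</groupId> <artifactId>jmockit</artifactId> <version>1.39</version> <scope>test</scope> </dependency> {noformat} Change Mockito core version from 1.9.5 to 2.18.3. > Upgrade jmockit and mockito libs > > > Key: DRILL-6363 > URL: https://issues.apache.org/jira/browse/DRILL-6363 > Project: Apache Drill > Issue Type: Task >Affects Versions: 1.13.0 >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.14.0 > > > Change Jmockit from > {noformat} > <dependency> > <groupId>com.googlecode.jmockit</groupId> > <artifactId>jmockit</artifactId> > <version>1.3</version> > <scope>test</scope> > </dependency> > {noformat} > to > {noformat} > <dependency> > <groupId>org.jmockit</groupId> > <artifactId>jmockit</artifactId> > <version>1.39</version> > <scope>test</scope> > </dependency> > {noformat} > Change Mockito core version from 1.9.5 to 2.18.3. -- This message was sent by Atlassian JIRA (v7.6.3#76005)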
[jira] [Updated] (DRILL-6363) Upgrade jmockit and mockito libs
[ https://issues.apache.org/jira/browse/DRILL-6363?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6363: Description: JMOCKIT {noformat} <dependency> <groupId>com.googlecode.jmockit</groupId> <artifactId>jmockit</artifactId> <version>1.3</version> <scope>test</scope> </dependency> {noformat} to {noformat} <dependency> <groupId>org.jmockit</groupId> <artifactId>jmockit</artifactId> <version>1.39</version> <scope>test</scope> </dependency> {noformat} Change Mockito core version from 1.9.5 to 2.18.3. > Upgrade jmockit and mockito libs > > > Key: DRILL-6363 > URL: https://issues.apache.org/jira/browse/DRILL-6363 > Project: Apache Drill > Issue Type: Task >Affects Versions: 1.13.0 >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Major > Fix For: 1.14.0 > > > JMOCKIT > {noformat} > <dependency> > <groupId>com.googlecode.jmockit</groupId> > <artifactId>jmockit</artifactId> > <version>1.3</version> > <scope>test</scope> > </dependency> > {noformat} > to > {noformat} > <dependency> > <groupId>org.jmockit</groupId> > <artifactId>jmockit</artifactId> > <version>1.39</version> > <scope>test</scope> > </dependency> > {noformat} > Change Mockito core version from 1.9.5 to 2.18.3. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-6363) Upgrade jmockit and mockito libs
Arina Ielchiieva created DRILL-6363: --- Summary: Upgrade jmockit and mockito libs Key: DRILL-6363 URL: https://issues.apache.org/jira/browse/DRILL-6363 Project: Apache Drill Issue Type: Task Affects Versions: 1.13.0 Reporter: Arina Ielchiieva Assignee: Arina Ielchiieva Fix For: 1.14.0 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6281) Refactor TimedRunnable
[ https://issues.apache.org/jira/browse/DRILL-6281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456515#comment-16456515 ] ASF GitHub Bot commented on DRILL-6281: --- Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1238#discussion_r184701824 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/TimedCallable.java --- @@ -0,0 +1,266 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store; + +import java.io.IOException; +import java.util.List; +import java.util.Objects; +import java.util.concurrent.Callable; +import java.util.concurrent.CancellationException; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.ExecutorService; +import java.util.concurrent.Executors; +import java.util.concurrent.Future; +import java.util.concurrent.RejectedExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.function.Consumer; +import java.util.function.Function; +import java.util.stream.Collectors; + +import org.apache.drill.common.exceptions.UserException; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import com.google.common.base.Preconditions; +import com.google.common.base.Stopwatch; +import com.google.common.util.concurrent.MoreExecutors; +import com.google.common.util.concurrent.ThreadFactoryBuilder; + +/** + * Class used to allow parallel executions of tasks in a simplified way. Also maintains and reports timings of task completion. + * TODO: look at switching to fork join. + * @param The time value that will be returned when the task is executed. 
+ */ +public abstract class TimedCallable implements Callable { + private static final Logger logger = LoggerFactory.getLogger(TimedCallable.class); + + private static long TIMEOUT_PER_RUNNABLE_IN_MSECS = 15000; + + private volatile long startTime = 0; + private volatile long executionTime = -1; + + private static class FutureMapper implements Function, V> { +int count; +Throwable throwable = null; + +private void setThrowable(Throwable t) { + if (throwable == null) { +throwable = t; + } else { +throwable.addSuppressed(t); + } +} + +@Override +public V apply(Future future) { + Preconditions.checkState(future.isDone()); + if (!future.isCancelled()) { +try { + count++; + return future.get(); +} catch (InterruptedException e) { + // there is no wait as we are getting result from the completed/done future + logger.error("Unexpected exception", e); + throw UserException.internalError(e) + .message("Unexpected exception") + .build(logger); +} catch (ExecutionException e) { + setThrowable(e.getCause()); +} + } else { +setThrowable(new CancellationException()); + } + return null; +} + } + + private static class Statistics implements Consumer> { +final long start = System.nanoTime(); +final Stopwatch watch = Stopwatch.createStarted(); +long totalExecution; +long maxExecution; +int count; +int startedCount; +private int doneCount; +// measure thread creation times +long earliestStart; +long latestStart; +long totalStart; + +@Override +public void accept(TimedCallable task) { + count++; + long threadStart = task.getStartTime(TimeUnit.NANOSECONDS) - start; + if (threadStart >= 0) { +startedCount++; +earliestStart = Math.min(earliestStart, threadStart); +latestStart = Math.max(latestStart, threadStart); +
[jira] [Commented] (DRILL-6281) Refactor TimedRunnable
[ https://issues.apache.org/jira/browse/DRILL-6281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456481#comment-16456481 ] ASF GitHub Bot commented on DRILL-6281: --- Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1238#discussion_r184694930 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/TimedCallable.java --- @@ -0,0 +1,266 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store; + +import java.io.IOException; +import java.util.List; +import java.util.Objects; +import java.util.concurrent.Callable; +import java.util.concurrent.CancellationException; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.ExecutorService; +import java.util.concurrent.Executors; +import java.util.concurrent.Future; +import java.util.concurrent.RejectedExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.function.Consumer; +import java.util.function.Function; +import java.util.stream.Collectors; + +import org.apache.drill.common.exceptions.UserException; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import com.google.common.base.Preconditions; +import com.google.common.base.Stopwatch; +import com.google.common.util.concurrent.MoreExecutors; +import com.google.common.util.concurrent.ThreadFactoryBuilder; + +/** + * Class used to allow parallel executions of tasks in a simplified way. Also maintains and reports timings of task completion. + * TODO: look at switching to fork join. + * @param The time value that will be returned when the task is executed. 
+ */ +public abstract class TimedCallable implements Callable { + private static final Logger logger = LoggerFactory.getLogger(TimedCallable.class); + + private static long TIMEOUT_PER_RUNNABLE_IN_MSECS = 15000; + + private volatile long startTime = 0; + private volatile long executionTime = -1; + + private static class FutureMapper implements Function, V> { +int count; +Throwable throwable = null; + +private void setThrowable(Throwable t) { + if (throwable == null) { +throwable = t; + } else { +throwable.addSuppressed(t); + } +} + +@Override +public V apply(Future future) { + Preconditions.checkState(future.isDone()); + if (!future.isCancelled()) { +try { + count++; + return future.get(); +} catch (InterruptedException e) { + // there is no wait as we are getting result from the completed/done future + logger.error("Unexpected exception", e); + throw UserException.internalError(e) + .message("Unexpected exception") + .build(logger); +} catch (ExecutionException e) { + setThrowable(e.getCause()); +} + } else { +setThrowable(new CancellationException()); + } + return null; +} + } + + private static class Statistics implements Consumer> { +final long start = System.nanoTime(); +final Stopwatch watch = Stopwatch.createStarted(); +long totalExecution; +long maxExecution; +int count; +int startedCount; +private int doneCount; +// measure thread creation times +long earliestStart; +long latestStart; +long totalStart; + +@Override +public void accept(TimedCallable task) { + count++; + long threadStart = task.getStartTime(TimeUnit.NANOSECONDS) - start; + if (threadStart >= 0) { +startedCount++; +earliestStart = Math.min(earliestStart, threadStart); +latestStart = Math.max(latestStart, threadStart); +totalSt
[jira] [Commented] (DRILL-6281) Refactor TimedRunnable
[ https://issues.apache.org/jira/browse/DRILL-6281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456472#comment-16456472 ] ASF GitHub Bot commented on DRILL-6281: --- Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1238#discussion_r184693553 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/TimedCallable.java --- @@ -0,0 +1,266 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store; + +import java.io.IOException; +import java.util.List; +import java.util.Objects; +import java.util.concurrent.Callable; +import java.util.concurrent.CancellationException; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.ExecutorService; +import java.util.concurrent.Executors; +import java.util.concurrent.Future; +import java.util.concurrent.RejectedExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.function.Consumer; +import java.util.function.Function; +import java.util.stream.Collectors; + +import org.apache.drill.common.exceptions.UserException; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import com.google.common.base.Preconditions; +import com.google.common.base.Stopwatch; +import com.google.common.util.concurrent.MoreExecutors; +import com.google.common.util.concurrent.ThreadFactoryBuilder; + +/** + * Class used to allow parallel executions of tasks in a simplified way. Also maintains and reports timings of task completion. + * TODO: look at switching to fork join. + * @param The time value that will be returned when the task is executed. 
+ */ +public abstract class TimedCallable implements Callable { + private static final Logger logger = LoggerFactory.getLogger(TimedCallable.class); + + private static long TIMEOUT_PER_RUNNABLE_IN_MSECS = 15000; + + private volatile long startTime = 0; + private volatile long executionTime = -1; + + private static class FutureMapper implements Function, V> { +int count; +Throwable throwable = null; + +private void setThrowable(Throwable t) { + if (throwable == null) { +throwable = t; + } else { +throwable.addSuppressed(t); + } +} + +@Override +public V apply(Future future) { + Preconditions.checkState(future.isDone()); + if (!future.isCancelled()) { +try { + count++; + return future.get(); +} catch (InterruptedException e) { + // there is no wait as we are getting result from the completed/done future + logger.error("Unexpected exception", e); + throw UserException.internalError(e) + .message("Unexpected exception") + .build(logger); +} catch (ExecutionException e) { + setThrowable(e.getCause()); +} + } else { +setThrowable(new CancellationException()); + } + return null; +} + } + + private static class Statistics implements Consumer> { +final long start = System.nanoTime(); +final Stopwatch watch = Stopwatch.createStarted(); +long totalExecution; +long maxExecution; +int count; +int startedCount; +private int doneCount; +// measure thread creation times +long earliestStart; +long latestStart; +long totalStart; + +@Override +public void accept(TimedCallable task) { + count++; + long threadStart = task.getStartTime(TimeUnit.NANOSECONDS) - start; + if (threadStart >= 0) { +startedCount++; +earliestStart = Math.min(earliestStart, threadStart); +latestStart = Math.max(latestStart, threadStart); +
[jira] [Commented] (DRILL-6281) Refactor TimedRunnable
[ https://issues.apache.org/jira/browse/DRILL-6281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456468#comment-16456468 ] ASF GitHub Bot commented on DRILL-6281: --- Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1238#discussion_r184693216 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/TimedCallable.java --- @@ -0,0 +1,258 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store; + +import java.io.IOException; +import java.util.List; +import java.util.Objects; +import java.util.concurrent.Callable; +import java.util.concurrent.CancellationException; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.ExecutorService; +import java.util.concurrent.Executors; +import java.util.concurrent.Future; +import java.util.concurrent.RejectedExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.function.Consumer; +import java.util.function.Function; +import java.util.stream.Collectors; + +import org.apache.drill.common.exceptions.UserException; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import com.google.common.base.Preconditions; +import com.google.common.base.Stopwatch; +import com.google.common.util.concurrent.MoreExecutors; +import com.google.common.util.concurrent.ThreadFactoryBuilder; + +/** + * Class used to allow parallel executions of tasks in a simplified way. Also maintains and reports timings of task completion. + * TODO: look at switching to fork join. + * @param <V> The time value that will be returned when the task is executed. 
+ */ +public abstract class TimedCallable<V> implements Callable<V> { + private static final Logger logger = LoggerFactory.getLogger(TimedCallable.class); + + private static long TIMEOUT_PER_RUNNABLE_IN_MSECS = 15000; + + private volatile long startTime = 0; + private volatile long executionTime = -1; + + private static class FutureMapper<V> implements Function<Future<V>, V> { +int count; +Throwable throwable = null; + +private void setThrowable(Throwable t) { + if (throwable == null) { +throwable = t; + } else { +throwable.addSuppressed(t); + } +} + +@Override +public V apply(Future<V> future) { + Preconditions.checkState(future.isDone()); + if (!future.isCancelled()) { +try { + count++; + return future.get(); +} catch (InterruptedException e) { + // there is no wait as we are getting result from the completed/done future + logger.error("Unexpected exception", e); + throw UserException.internalError(e) + .message("Unexpected exception") + .build(logger); +} catch (ExecutionException e) { + setThrowable(e.getCause()); +} + } else { +setThrowable(new CancellationException()); + } + return null; +} + } + + private static class Statistics<V> implements Consumer<TimedCallable<V>> { +final long start = System.nanoTime(); +final Stopwatch watch = Stopwatch.createStarted(); +long totalExecution = 0; +long maxExecution = 0; +int startedCount = 0; +private int doneCount = 0; +// measure thread creation times +long earliestStart = Long.MAX_VALUE; +long latestStart = 0; +long totalStart = 0; + +@Override +public void accept(TimedCallable<V> task) { + long threadStart = task.getStartTime(TimeUnit.NANOSECONDS) - start; + if (threadStart >= 0) { +startedCount++; +earliestStart = Math.min(earliestStart, threadStart); +latestStart = Math.max(latestStart, threadStart); +
[jira] [Commented] (DRILL-6281) Refactor TimedRunnable
[ https://issues.apache.org/jira/browse/DRILL-6281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456461#comment-16456461 ] ASF GitHub Bot commented on DRILL-6281: --- Github user vrozov commented on a diff in the pull request: https://github.com/apache/drill/pull/1238#discussion_r184691926 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/TimedCallable.java --- @@ -0,0 +1,258 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store; + +import java.io.IOException; +import java.util.List; +import java.util.Objects; +import java.util.concurrent.Callable; +import java.util.concurrent.CancellationException; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.ExecutorService; +import java.util.concurrent.Executors; +import java.util.concurrent.Future; +import java.util.concurrent.RejectedExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.function.Consumer; +import java.util.function.Function; +import java.util.stream.Collectors; + +import org.apache.drill.common.exceptions.UserException; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import com.google.common.base.Preconditions; +import com.google.common.base.Stopwatch; +import com.google.common.util.concurrent.MoreExecutors; +import com.google.common.util.concurrent.ThreadFactoryBuilder; + +/** + * Class used to allow parallel executions of tasks in a simplified way. Also maintains and reports timings of task completion. + * TODO: look at switching to fork join. + * @param <V> The time value that will be returned when the task is executed. 
+ */ +public abstract class TimedCallable<V> implements Callable<V> { + private static final Logger logger = LoggerFactory.getLogger(TimedCallable.class); + + private static long TIMEOUT_PER_RUNNABLE_IN_MSECS = 15000; + + private volatile long startTime = 0; + private volatile long executionTime = -1; + + private static class FutureMapper<V> implements Function<Future<V>, V> { +int count; +Throwable throwable = null; + +private void setThrowable(Throwable t) { + if (throwable == null) { +throwable = t; + } else { +throwable.addSuppressed(t); + } +} + +@Override +public V apply(Future<V> future) { + Preconditions.checkState(future.isDone()); + if (!future.isCancelled()) { +try { + count++; + return future.get(); +} catch (InterruptedException e) { + // there is no wait as we are getting result from the completed/done future + logger.error("Unexpected exception", e); + throw UserException.internalError(e) + .message("Unexpected exception") + .build(logger); +} catch (ExecutionException e) { + setThrowable(e.getCause()); +} + } else { +setThrowable(new CancellationException()); + } + return null; +} + } + + private static class Statistics<V> implements Consumer<TimedCallable<V>> { +final long start = System.nanoTime(); +final Stopwatch watch = Stopwatch.createStarted(); +long totalExecution = 0; +long maxExecution = 0; +int startedCount = 0; +private int doneCount = 0; +// measure thread creation times +long earliestStart = Long.MAX_VALUE; +long latestStart = 0; +long totalStart = 0; + +@Override +public void accept(TimedCallable<V> task) { + long threadStart = task.getStartTime(TimeUnit.NANOSECONDS) - start; + if (threadStart >= 0) { +startedCount++; +earliestStart = Math.min(earliestStart, threadStart); +latestStart = Math.max(latestStart, threadStart); +totalS
[jira] [Commented] (DRILL-6331) Parquet filter pushdown does not support the native hive reader
[ https://issues.apache.org/jira/browse/DRILL-6331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456452#comment-16456452 ] ASF GitHub Bot commented on DRILL-6331: --- Github user vrozov commented on the issue: https://github.com/apache/drill/pull/1214 When moving files around, please preserve the history of modifications done to the file. > Parquet filter pushdown does not support the native hive reader > --- > > Key: DRILL-6331 > URL: https://issues.apache.org/jira/browse/DRILL-6331 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Hive >Affects Versions: 1.13.0 >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Major > Labels: doc-impacting, ready-to-commit > Fix For: 1.14.0 > > > Initially, HiveDrillNativeParquetGroupScan was based mainly on HiveScan; the > core difference between them was > that HiveDrillNativeParquetScanBatchCreator created ParquetRecordReader > instead of HiveReader. > This allowed reading Hive parquet files with the Drill native parquet reader but > did not expose Hive data to Drill optimizations > such as filter push down, limit push down, and count to direct scan. > Hive code had to be refactored to use the same interfaces as > ParquetGroupScan in order to be exposed to such optimizations. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6345) Add LOG10 function implementation
[ https://issues.apache.org/jira/browse/DRILL-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Volodymyr Vysotskyi updated DRILL-6345: --- Labels: ready-to-commit (was: ) > Add LOG10 function implementation > - > > Key: DRILL-6345 > URL: https://issues.apache.org/jira/browse/DRILL-6345 > Project: Apache Drill > Issue Type: Improvement > Components: Functions - Drill >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach >Priority: Major > Labels: ready-to-commit > Fix For: 1.14.0 > > > Add LOG10 function implementation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6331) Parquet filter pushdown does not support the native hive reader
[ https://issues.apache.org/jira/browse/DRILL-6331?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456171#comment-16456171 ] ASF GitHub Bot commented on DRILL-6331: --- Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/1214 > Parquet filter pushdown does not support the native hive reader > --- > > Key: DRILL-6331 > URL: https://issues.apache.org/jira/browse/DRILL-6331 > Project: Apache Drill > Issue Type: Improvement > Components: Storage - Hive >Affects Versions: 1.13.0 >Reporter: Arina Ielchiieva >Assignee: Arina Ielchiieva >Priority: Major > Labels: doc-impacting, ready-to-commit > Fix For: 1.14.0 > > > Initially, HiveDrillNativeParquetGroupScan was based mainly on HiveScan; the > core difference between them was > that HiveDrillNativeParquetScanBatchCreator created ParquetRecordReader > instead of HiveReader. > This allowed reading Hive parquet files with the Drill native parquet reader but > did not expose Hive data to Drill optimizations > such as filter push down, limit push down, and count to direct scan. > Hive code had to be refactored to use the same interfaces as > ParquetGroupScan in order to be exposed to such optimizations. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6342) Parquet filter pushdown doesn't work in case of filtering fields inside arrays of complex fields
[ https://issues.apache.org/jira/browse/DRILL-6342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456170#comment-16456170 ] ASF GitHub Bot commented on DRILL-6342: --- Github user asfgit closed the pull request at: https://github.com/apache/drill/pull/1231 > Parquet filter pushdown doesn't work in case of filtering fields inside > arrays of complex fields > > > Key: DRILL-6342 > URL: https://issues.apache.org/jira/browse/DRILL-6342 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.14.0 >Reporter: Anton Gozhiy >Assignee: Arina Ielchiieva >Priority: Major > Labels: ready-to-commit > Fix For: 1.14.0 > > Attachments: Complex_data.tar.gz > > > *Data:* > The Complex_data data set is attached. > *Query:* > {code:sql} > explain plan for select * from dfs.tmp.`Complex_data` t where > t.list_of_complex_fields[2].nested_field is true > {code} > *Expected result:* > numFiles=2 > Statistics of the file that shouldn't be scanned: > {noformat} > list_of_complex_fields: > .nested_field: BOOLEAN UNCOMPRESSED DO:0 FPO:497 > SZ:41/41/1.00 VC:3 ENC:PLAIN,RLE ST:[min: false, max: false, num_nulls: 0] > {noformat} > *Actual result:* > numFiles=3 > I.e., filter pushdown does not work. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
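The expected result in the DRILL-6342 report follows from the column statistics: a file whose nested_field has min: false and max: false cannot contain a row where the field is true, so it can be skipped. The following is only a minimal sketch of that reasoning, not Drill's implementation; the class names IsTruePruningSketch and BooleanColumnStats, the file names, and the statistics of the two files that are kept are all hypothetical. Only the last entry (min: false, max: false, num_nulls: 0, VC:3) is taken from the report.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of statistics-based pruning for "col IS TRUE"; not a Drill class.
final class IsTruePruningSketch {

  // Hypothetical holder for the per-file statistics of one BOOLEAN column.
  static final class BooleanColumnStats {
    final String file;
    final boolean min;
    final boolean max;
    final long numNulls;
    final long valueCount;

    BooleanColumnStats(String file, boolean min, boolean max, long numNulls, long valueCount) {
      this.file = file;
      this.min = min;
      this.max = max;
      this.numNulls = numNulls;
      this.valueCount = valueCount;
    }
  }

  // A file can be skipped when no value can be true: either every value
  // is null, or the maximum is false (for booleans, false < true).
  static boolean canSkip(BooleanColumnStats stats) {
    boolean allNull = stats.numNulls == stats.valueCount;
    return allNull || !stats.max;
  }

  // Returns the names of the files that still have to be scanned.
  static List<String> filesToScan(List<BooleanColumnStats> all) {
    List<String> kept = new ArrayList<>();
    for (BooleanColumnStats stats : all) {
      if (!canSkip(stats)) {
        kept.add(stats.file);
      }
    }
    return kept;
  }

  public static void main(String[] args) {
    List<BooleanColumnStats> files = Arrays.asList(
        new BooleanColumnStats("a.parquet", false, true, 0, 3),   // assumed stats
        new BooleanColumnStats("b.parquet", true, true, 0, 3),    // assumed stats
        new BooleanColumnStats("c.parquet", false, false, 0, 3)); // stats from the report
    System.out.println("numFiles=" + filesToScan(files).size()); // prints numFiles=2
  }
}
```

With these inputs, only the file whose maximum is false is dropped, which matches the numFiles=2 that the report expects.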
[jira] [Updated] (DRILL-6360) Document the typeof() function
[ https://issues.apache.org/jira/browse/DRILL-6360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6360: Fix Version/s: 1.14.0 > Document the typeof() function > -- > > Key: DRILL-6360 > URL: https://issues.apache.org/jira/browse/DRILL-6360 > Project: Apache Drill > Issue Type: Improvement > Components: Documentation >Affects Versions: 1.13.0 >Reporter: Paul Rogers >Assignee: Bridget Bevens >Priority: Minor > Labels: doc-impacting > Fix For: 1.14.0 > > > Drill has a {{typeof()}} function that returns the data type (but not mode) > of a column. It was discussed on the dev list recently. However, a search of > the Drill web site, and a scan by hand, failed to turn up documentation about > the function. > As a general suggestion, it would be great to have an alphabetical list of all > functions so we don't have to hunt all over the site to find which functions > are available. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6360) Document the typeof() function
[ https://issues.apache.org/jira/browse/DRILL-6360?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6360: Labels: doc-impacting (was: ) > Document the typeof() function > -- > > Key: DRILL-6360 > URL: https://issues.apache.org/jira/browse/DRILL-6360 > Project: Apache Drill > Issue Type: Improvement > Components: Documentation >Affects Versions: 1.13.0 >Reporter: Paul Rogers >Assignee: Bridget Bevens >Priority: Minor > Labels: doc-impacting > Fix For: 1.14.0 > > > Drill has a {{typeof()}} function that returns the data type (but not mode) > of a column. It was discussed on the dev list recently. However, a search of > the Drill web site, and a scan by hand, failed to turn up documentation about > the function. > As a general suggestion, it would be great to have an alphabetical list of all > functions so we don't have to hunt all over the site to find which functions > are available. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6281) Refactor TimedRunnable
[ https://issues.apache.org/jira/browse/DRILL-6281?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456106#comment-16456106 ] ASF GitHub Bot commented on DRILL-6281: --- Github user arina-ielchiieva commented on a diff in the pull request: https://github.com/apache/drill/pull/1238#discussion_r184622525 --- Diff: exec/java-exec/src/main/java/org/apache/drill/exec/store/TimedCallable.java --- @@ -0,0 +1,258 @@ +/* + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.drill.exec.store; + +import java.io.IOException; +import java.util.List; +import java.util.Objects; +import java.util.concurrent.Callable; +import java.util.concurrent.CancellationException; +import java.util.concurrent.ExecutionException; +import java.util.concurrent.ExecutorService; +import java.util.concurrent.Executors; +import java.util.concurrent.Future; +import java.util.concurrent.RejectedExecutionException; +import java.util.concurrent.TimeUnit; +import java.util.function.Consumer; +import java.util.function.Function; +import java.util.stream.Collectors; + +import org.apache.drill.common.exceptions.UserException; + +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import com.google.common.base.Preconditions; +import com.google.common.base.Stopwatch; +import com.google.common.util.concurrent.MoreExecutors; +import com.google.common.util.concurrent.ThreadFactoryBuilder; + +/** + * Class used to allow parallel executions of tasks in a simplified way. Also maintains and reports timings of task completion. + * TODO: look at switching to fork join. + * @param <V> The time value that will be returned when the task is executed. 
+ */ +public abstract class TimedCallable<V> implements Callable<V> { + private static final Logger logger = LoggerFactory.getLogger(TimedCallable.class); + + private static long TIMEOUT_PER_RUNNABLE_IN_MSECS = 15000; + + private volatile long startTime = 0; + private volatile long executionTime = -1; + + private static class FutureMapper<V> implements Function<Future<V>, V> { +int count; +Throwable throwable = null; + +private void setThrowable(Throwable t) { + if (throwable == null) { +throwable = t; + } else { +throwable.addSuppressed(t); + } +} + +@Override +public V apply(Future<V> future) { + Preconditions.checkState(future.isDone()); + if (!future.isCancelled()) { +try { + count++; + return future.get(); +} catch (InterruptedException e) { + // there is no wait as we are getting result from the completed/done future + logger.error("Unexpected exception", e); + throw UserException.internalError(e) + .message("Unexpected exception") + .build(logger); +} catch (ExecutionException e) { + setThrowable(e.getCause()); +} + } else { +setThrowable(new CancellationException()); + } + return null; +} + } + + private static class Statistics<V> implements Consumer<TimedCallable<V>> { +final long start = System.nanoTime(); +final Stopwatch watch = Stopwatch.createStarted(); +long totalExecution = 0; +long maxExecution = 0; +int startedCount = 0; +private int doneCount = 0; +// measure thread creation times +long earliestStart = Long.MAX_VALUE; +long latestStart = 0; +long totalStart = 0; + +@Override +public void accept(TimedCallable<V> task) { + long threadStart = task.getStartTime(TimeUnit.NANOSECONDS) - start; + if (threadStart >= 0) { +startedCount++; +earliestStart = Math.min(earliestStart, threadStart); +latestStart = Math.max(latestStart, threadStart); +
[jira] [Updated] (DRILL-6281) Refactor TimedRunnable
[ https://issues.apache.org/jira/browse/DRILL-6281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6281: Labels: (was: ready-to-commit) > Refactor TimedRunnable > -- > > Key: DRILL-6281 > URL: https://issues.apache.org/jira/browse/DRILL-6281 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Vlad Rozov >Assignee: Vlad Rozov >Priority: Major > Fix For: 1.14.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6281) Refactor TimedRunnable
[ https://issues.apache.org/jira/browse/DRILL-6281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6281: Labels: ready-to-commit (was: ) > Refactor TimedRunnable > -- > > Key: DRILL-6281 > URL: https://issues.apache.org/jira/browse/DRILL-6281 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Vlad Rozov >Assignee: Vlad Rozov >Priority: Major > Labels: ready-to-commit > Fix For: 1.14.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (DRILL-6281) Refactor TimedRunnable
[ https://issues.apache.org/jira/browse/DRILL-6281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Arina Ielchiieva updated DRILL-6281: Fix Version/s: 1.14.0 > Refactor TimedRunnable > -- > > Key: DRILL-6281 > URL: https://issues.apache.org/jira/browse/DRILL-6281 > Project: Apache Drill > Issue Type: Sub-task >Reporter: Vlad Rozov >Assignee: Vlad Rozov >Priority: Major > Labels: ready-to-commit > Fix For: 1.14.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (DRILL-6345) Add LOG10 function implementation
[ https://issues.apache.org/jira/browse/DRILL-6345?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16456102#comment-16456102 ] ASF GitHub Bot commented on DRILL-6345: --- Github user vladimirtkach commented on the issue: https://github.com/apache/drill/pull/1230 @vvysotskyi made changes according to your remarks > Add LOG10 function implementation > - > > Key: DRILL-6345 > URL: https://issues.apache.org/jira/browse/DRILL-6345 > Project: Apache Drill > Issue Type: Improvement > Components: Functions - Drill >Reporter: Volodymyr Tkach >Assignee: Volodymyr Tkach >Priority: Major > Fix For: 1.14.0 > > > Add LOG10 function implementation. -- This message was sent by Atlassian JIRA (v7.6.3#76005)