Re: [VOTE] Release Apache Drill 1.18.0 - RC0
Verified checksums and signatures for the binary and source tarballs and for the jars published to the Maven repo.
Ran all unit tests on Ubuntu with JDK 8 using the source tarball.
Ran Drill in embedded mode on Ubuntu, submitted several queries, and verified that profiles are displayed correctly.
Checked the JDBC driver using the SQuirreL SQL client and a custom Java client, and ensured that it works correctly with the custom authenticator.

+1 (binding)

Kind regards,
Volodymyr Vysotskyi

On Mon, Aug 31, 2020 at 1:37 PM Volodymyr Vysotskyi wrote:

> Hi all,
>
> I have looked into the DRILL-7785, and the problem is not in Drill, so it
> is not a blocker for the release.
> For more details please refer to my comment
> <https://issues.apache.org/jira/browse/DRILL-7785?focusedCommentId=17187629&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17187629>
> on this ticket.
>
> Kind regards,
> Volodymyr Vysotskyi
>
> On Mon, Aug 31, 2020 at 4:26 AM Abhishek Girish wrote:
>
>> Yup we can certainly include it if RC0 fails. So far I’m inclined to not
>> consider it a blocker. I’ve requested Vova and Anton to take a look.
>>
>> So folks, please continue to test the candidate.
>>
>> On Sun, Aug 30, 2020 at 6:16 PM Charles Givre wrote:
>>
>>> Ok. Are you looking to include DRILL-7785? I don't think it's a blocker,
>>> but if we find anything with RC0... let's make sure we get it in.
>>>
>>> -- C
>>>
>>> On Aug 30, 2020, at 9:14 PM, Abhishek Girish wrote:
>>>
>>>> Hey Charles,
>>>>
>>>> I would have liked to. We did get one of the PRs merged after the master
>>>> branch was closed as I hadn't made enough progress with the release yet.
>>>> But that’s not the case now.
>>>>
>>>> Unless DRILL-7781 is a release blocker, we should probably skip it. So far,
>>>> a lot of effort has gone into getting RC0 ready. So I'm hoping to get this
>>>> closed asap.
>>>>
>>>> Regards,
>>>> Abhishek
>>>>
>>>> On Sun, Aug 30, 2020 at 6:07 PM Charles Givre wrote:
>>>>
>>>>> HI Abhishek,
>>>>>
>>>>> Can we merge DRILL-7781? We really shouldn't ship something with a simple
>>>>> bug like this.
>>>>>
>>>>> -- C
>>>>>
>>>>> On Aug 30, 2020, at 8:40 PM, Abhishek Girish wrote:
>>>>>
>>>>>> Advanced tests from [5] are also complete. All 7500+ tests passed, except
>>>>>> for a few relating to known resource issues (drillbit connectivity / OOM
>>>>>> /...). Plus a few with the same symptoms as DRILL-7785.
>>>>>>
>>>>>> On Sun, Aug 30, 2020 at 2:17 PM Abhishek Girish wrote:
>>>>>>
>>>>>>> Wanted to share an update on some of the testing I've done from my side:
>>>>>>>
>>>>>>> All Functional tests from [5] (plus private Customer tests) are complete.
>>>>>>> 10,000+ tests have passed. However, I did see an issue with Hive ORC tables
>>>>>>> (DRILL-7785).
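For anyone repeating the checksum part of the verification above, here is a minimal sketch in Java. It assumes the release publishes a SHA-512 digest next to each artifact and that the published value is just the hex digest; neither detail is stated in this thread. `ChecksumCheck`, `sha512Hex` and `matches` are hypothetical names, not part of Drill or its release scripts, and signature (GPG) verification is not covered here.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/** Hypothetical helper for illustration; not part of Drill or its release tooling. */
class ChecksumCheck {

  /** Streams the file through SHA-512 and returns the digest as lowercase hex. */
  static String sha512Hex(Path file) throws IOException, NoSuchAlgorithmException {
    MessageDigest digest = MessageDigest.getInstance("SHA-512");
    try (InputStream in = Files.newInputStream(file)) {
      byte[] buffer = new byte[8192];
      int read;
      while ((read = in.read(buffer)) != -1) {
        digest.update(buffer, 0, read);
      }
    }
    StringBuilder hex = new StringBuilder();
    for (byte b : digest.digest()) {
      hex.append(String.format("%02x", b));
    }
    return hex.toString();
  }

  /**
   * Compares the computed digest with the published one.
   * Assumption: the published text holds only the hex digest
   * (case and surrounding whitespace are ignored).
   */
  static boolean matches(Path file, String published)
      throws IOException, NoSuchAlgorithmException {
    String expected = published.replaceAll("\\s+", "").toLowerCase();
    return sha512Hex(file).equals(expected);
  }
}
```

A caller would read the published digest file into a string and pass it to `matches` together with the downloaded tarball; a `false` result means the artifact should not be trusted.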
Re: [VOTE] Release Apache Drill 1.18.0 - RC0
Hi all,

I have looked into the DRILL-7785, and the problem is not in Drill, so it is not a blocker for the release.
For more details please refer to my comment
<https://issues.apache.org/jira/browse/DRILL-7785?focusedCommentId=17187629&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-17187629>
on this ticket.

Kind regards,
Volodymyr Vysotskyi

On Mon, Aug 31, 2020 at 4:26 AM Abhishek Girish wrote:

> Yup we can certainly include it if RC0 fails. So far I’m inclined to not
> consider it a blocker. I’ve requested Vova and Anton to take a look.
>
> So folks, please continue to test the candidate.
>
> On Sun, Aug 30, 2020 at 6:16 PM Charles Givre wrote:
>
>> Ok. Are you looking to include DRILL-7785? I don't think it's a blocker,
>> but if we find anything with RC0... let's make sure we get it in.
>>
>> -- C
>>
>> On Aug 30, 2020, at 9:14 PM, Abhishek Girish wrote:
>>
>>> Hey Charles,
>>>
>>> I would have liked to. We did get one of the PRs merged after the master
>>> branch was closed as I hadn't made enough progress with the release yet.
>>> But that’s not the case now.
>>>
>>> Unless DRILL-7781 is a release blocker, we should probably skip it. So far,
>>> a lot of effort has gone into getting RC0 ready. So I'm hoping to get this
>>> closed asap.
>>>
>>> Regards,
>>> Abhishek
>>>
>>> On Sun, Aug 30, 2020 at 6:07 PM Charles Givre wrote:
>>>
>>>> HI Abhishek,
>>>>
>>>> Can we merge DRILL-7781? We really shouldn't ship something with a simple
>>>> bug like this.
>>>>
>>>> -- C
>>>>
>>>> On Aug 30, 2020, at 8:40 PM, Abhishek Girish wrote:
>>>>
>>>>> Advanced tests from [5] are also complete. All 7500+ tests passed, except
>>>>> for a few relating to known resource issues (drillbit connectivity / OOM
>>>>> /...). Plus a few with the same symptoms as DRILL-7785.
>>>>>
>>>>> On Sun, Aug 30, 2020 at 2:17 PM Abhishek Girish wrote:
>>>>>
>>>>>> Wanted to share an update on some of the testing I've done from my side:
>>>>>>
>>>>>> All Functional tests from [5] (plus private Customer tests) are complete.
>>>>>> 10,000+ tests have passed. However, I did see an issue with Hive ORC tables
>>>>>> (DRILL-7785). Need to investigate if it's a blocker for the release.
>>>>>>
>>>>>> Of course, all unit tests (part of the AD repo) - for both default and
>>>>>> 'mapr' profiles are also successful.
>>>>>>
>>>>>> [5] https://github.com/mapr/drill-test-framework
>>>>>>
>>>>>> On Sun, Aug 30, 2020 at 10:14 AM Abhishek Girish <agir...@apache.org> wrote:
>>>>>>
>>>>>>> Hi all,
Re: [DISCUSS] Drill 1.18.0 release
Hi Abhishek, Charles, I don't think that this PR is ready to be merged, or will be ready in a couple of days since the issue I have pointed would require large effort to fix. I propose to continue with the release process for now and merge this PR after the release when it will be ready. Kind regards, Volodymyr Vysotskyi On Mon, Aug 24, 2020 at 7:07 AM Abhishek Girish wrote: > Looks like Vova has already added his comments and we are waiting on > Bowen's responses. I don't mind waiting a day or two to see if this can > indeed be completed. I'll watch the PR for now. > > -Abhishek > > > On Sun, Aug 23, 2020 at 11:47 AM Charles Givre wrote: > > > HI Abhishek, > > Thanks for taking the lead on this. I'd really like to see if we can get > > DRILL-7745 (https://github.com/apache/drill/pull/2084 < > > https://github.com/apache/drill/pull/2084>) into this release. We are > > very close to it being done, but there are some unanswered questions from > > Vova. If you could take a look that would be great as well. > > Thanks, > > -- C > > > > > > > On Aug 21, 2020, at 4:58 PM, Abhishek Girish > wrote: > > > > > > Hey folks, > > > > > > Having discussed with Vova, I propose creating a branch on Monday > > morning. > > > And holding off commits to master from Sunday evening. The plan is to > get > > > the release out before the end of the month. > > > > > > On a side note, Vova is leaving HPE CV team end of this month. They > may > > > reduce his involvement with Drill (But I do hope he stays active with > the > > > project). So it's even more important for us to get this release out on > > > time. > > > > > > Please share if any concerns. > > > > > > Thanks, > > > Abhishek > > > > > > On Thu, Jul 2, 2020 at 10:07 AM Abhishek Girish > > wrote: > > > > > >> Update: Due to some work + personal reasons, I've been unable to make > > much > > >> progress with the release. > > >> > > >> Charles, I'm trying to get the K8S work in, and that's partly the > reason > > >> for the delay. 
> > >> > > >> I'm still committed to getting this out soon. Thanks for all your > > >> patience. > > >> > > >> Regards, > > >> Abhishek > > >> > > >> On Thu, Jun 18, 2020 at 8:08 AM Charles Givre > wrote: > > >> > > >>> Hi Abhishek, > > >>> > > >>> > > >>> Do you think you'll be able to get the K8s work into this release? > > >>> > > >>> > > >>> Thanks, > > >>> > > >>> > > >>> -- C > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > >>> > > >>>> On Jun 18, 2020, at 11:07 AM, Abhishek Girish > > >>> wrote: > > >>> > > >>> > > >>>> > > >>> > > >>> > > >>>> Hello all, > > >>> > > >>> > > >>>> > > >>> > > >>> > > >>>> I'm thinking of cutting a branch on Jun 21. Please get your commits > in > > >>> by > > >>> > > >>> > > >>>> then. Hopefully we can roll out a release on or before Jun 30. > > >>> > > >>> > > >>>> > > >>> > > >>> > > >>>> Regards, > > >>> > > >>> > > >>>> Abhishek > > >>> > > >>> > > >>>> > > >>> > > >>> > > >>>> On Sun, Jun 14, 2020 at 6:58 AM Volodymyr Vysotskyi < > > >>> volody...@apache.org> > > >>> > > >>> > > >>>> wrote: > > >>> > > >>> > > >>>> > > >>> > > >>> > > >>>>> Hi Abhishek, > > >>> > > >>> > > >>>>> > > >>> > > >>> > > >>>>> For now, I'm not sure that I'll be able to resolve DRILL-7523 > > >>> > > >>> > > >>>>> <https://issues.apache.org/jira/brows
Re: Calcite Question for you
Hi Charles, I'm not sure that it can be addressed in Drill only. Kind regards, Volodymyr Vysotskyi On Thu, Jun 25, 2020 at 5:10 AM Charles Givre wrote: > Hi Ted, > That probably would solve the issue for the user. However what's happening > is that certain JDBC DBs (Oracle) are returning data types that Drill does > not support and the queries are failing with a validation error. IMHO, > there should be some way to either tell Drill that this weird field is a > specific datatype or something so that the user can access the data. > It seemed as if Vova proposed a solution to Calcite but it was never > adopted. I'm wondering if there's some way that we can either extend that > class, or somehow catch the exception which is being thrown and fix the > issue. > Best, > -- C > > > > > On Jun 24, 2020, at 6:20 PM, Ted Dunning wrote: > > > > Charles, > > > > Do you think that suggesting using a non-materialized view would help in > > this case? > > > > On Wed, Jun 24, 2020 at 2:31 PM Charles Givre wrote: > > > >> Hi Vova, > >> I have a Calcite question for you. I’m helping someone debug an issue > >> they’re having connecting Drill to an OracleDB and they’re running into > the > >> issue you reported here: > >> https://issues.apache.org/jira/browse/CALCITE-3533 < > >> https://issues.apache.org/jira/browse/CALCITE-3533> > >> > >> I’m wondering if there’s a way that we could fix this on the Drill side > >> since it doesn’t look like Calcite is going to fix this. What do you > think? > >> Best, > >> — C > >
Re: [DISCUSS] Drill 1.18.0 release
Hi Abhishek, For now, I'm not sure that I'll be able to resolve DRILL-7523 <https://issues.apache.org/jira/browse/DRILL-7523> soon since there are still a lot of unresolved issues. So please do not block the release because of this ticket. Kind regards, Volodymyr Vysotskyi On Thu, Jun 4, 2020 at 5:47 AM Charles Givre wrote: > HI Abhishek, > Thank you for volunteering to be release manager. Regarding the release, > I'd like to see the PR for the Druid storage plugin committed. It's pretty > close to being done, I'd guess another week or so. There are a few other > PRs that Paul and I were working on which would be good to include. > > The recently submitted IPFS plugin would be nice to include, but we'll see > how rapidly that progresses. Again, thanks for your help with this. > -- C > > > > > On May 27, 2020, at 7:27 AM, Volodymyr Vysotskyi > wrote: > > > > Hi Abhishek, > > > > Thanks for starting this discussion and for being a volunteer for the > > upcoming release! > > > > I want to include DRILL-7523 > > <https://issues.apache.org/jira/browse/DRILL-7523> into this release - > it > > is based on Ihor's work for updating Calcite to 1.22. > > I'll try to complete it in a couple of weeks, but this ticket is not a > > blocking one for the release. > > > > Kind regards, > > Volodymyr Vysotskyi > > > > > > On Tue, May 26, 2020 at 9:35 PM Abhishek Girish > wrote: > > > >> Hey everyone, > >> > >> It's been over 5 months since the last release, so it is about time we > >> begin to discuss the next one. I am volunteering to be the release > manager. > >> Vova has graciously offered me his full support with the release process > >> and I look forward to working with you all to get this release out in > time. > >> > >> We already have substantial changes in master (~130+ commits) and many > are > >> in the pipeline. If there are any specific issues that you feel we > *must* > >> include in the release, please let us know by replying to this thread. 
> >> Based on that we can define the release cut off date. > >> > >> Regards, > >> Abhishek > >> > >
Re: [DISCUSS] Drill 1.18.0 release
Hi Abhishek, Thanks for starting this discussion and for being a volunteer for the upcoming release! I want to include DRILL-7523 <https://issues.apache.org/jira/browse/DRILL-7523> into this release - it is based on Ihor's work for updating Calcite to 1.22. I'll try to complete it in a couple of weeks, but this ticket is not a blocking one for the release. Kind regards, Volodymyr Vysotskyi On Tue, May 26, 2020 at 9:35 PM Abhishek Girish wrote: > Hey everyone, > > It's been over 5 months since the last release, so it is about time we > begin to discuss the next one. I am volunteering to be the release manager. > Vova has graciously offered me his full support with the release process > and I look forward to working with you all to get this release out in time. > > We already have substantial changes in master (~130+ commits) and many are > in the pipeline. If there are any specific issues that you feel we *must* > include in the release, please let us know by replying to this thread. > Based on that we can define the release cut off date. > > Regards, > Abhishek >
[DISCUSS] Using GitHub Actions for CI
Hi all,

I want to discuss using GitHub Actions for running Drill unit tests.

Currently, we use Travis to build the project and run a *partial* test suite for every pull request and for new commits pushed to the master branch. We also have a configuration for CircleCI, which allows running more unit tests for user repositories, including jobs for JDK 8 and 11-13. CircleCI is not set up for Apache Drill itself since INFRA can't allow write access for third parties [1].

GitHub Actions provides more resources for running jobs and has softer limits than the CI services mentioned above (for example, it allows 20 concurrent jobs with a time limit of 6 hours [2], compared to 50 minutes for Travis). So GitHub Actions could be used as the single CI and would be able to run the *full* test suite.

Here is the Jira for this with the latest status: https://issues.apache.org/jira/browse/DRILL-7543

Are there any thoughts or objections regarding moving to GitHub Actions?

[1] https://issues.apache.org/jira/browse/INFRA-17133
[2] https://help.github.com/en/actions/automating-your-workflow-with-github-actions/about-github-actions#usage-limits

Kind regards,
Volodymyr Vysotskyi
Re: [ANNOUNCE] New PMC member: Bohdan Kazydub
Congrats, Bohdan! Kind regards, Volodymyr Vysotskyi On Wed, Jan 29, 2020 at 8:39 PM Paul Rogers wrote: > Congratulations Bohdan, well deserved! > - Paul > > > > On Wednesday, January 29, 2020, 09:41:21 AM PST, Arina Ielchiieva < > ar...@apache.org> wrote: > > I am pleased to announce that Drill PMC invited Bohdan Kazydub to the PMC > and > he has accepted the invitation. > > Congratulations Bohdan and welcome! > > - Arina > (on behalf of Drill PMC) >
Re: Slack Channel
+1 for asking to send to the Apache Drill mailing list. The advantage of mailing list usage is that its history of replies is preserved and may be found in mailing list archives or through googling, so it becomes possible for others with the same problem to find the solution or refer such users to the mailing thread where the specific problem was discussed. Kind regards, Volodymyr Vysotskyi On Thu, Jan 23, 2020 at 9:54 AM Arina Yelchiyeva wrote: > Charles, I don’t think Slack channel is that popular among Drill devs. > I guess best recommendation is to ask to send email to user mailing list. > Maybe some automatic reply can be configured. > > Kind regards, > Arina > > > On Jan 22, 2020, at 9:33 PM, Charles Givre wrote: > > > > Hey Drill Devs > > There are two pending questions on the Drill slack channel, one relating > to Hive and the other relating to complex data in Drill. Could you guys > take a look? > > Thx, > > -- C > >
Re: About integration of drill and arrow
Hi Paul,

Thanks for summarizing, it looks even better than my previous letters.

Answering Igor's question regarding conversion for join, I imagined it in the following way. Let's look at a simple example first:

         Join
        /    \
DrillScan    Convert operator (Arrow -> Drill)
                 |
             ArrowEvfScan

So the EVF API may be used in the Convert operator to read the row set from Arrow vectors and populate a row set in Drill format. These conversion operators may be inserted at planning time using trait set logic (similar to how sort or distribution operators are inserted where required).

Kind regards,
Volodymyr Vysotskyi

On Sun, Jan 12, 2020 at 11:50 PM Paul Rogers wrote:

> Hi Volodymyr,
>
> You made a number of excellent points that we should remember as we
> continue our discussion. If I may paraphrase:
>
> 1. A conversion of our internal data layout will be complex. We can't
> expect to do it in a single step. Some readers may never convert. For a
> while, at least in a development branch, possibly in master, we must be
> able to run the two systems together.
>
> 2. Running two systems together requires conversion between formats occur
> at some point in the DAG.
>
> 3. We have discussed an internal API to isolate operators from the details
> of the internal data layout. The column accessors (the core of EVF) are
> helpful for operators that work with column values, but not for bulk
> operations (such as exchanges, maybe Flatten, maybe Implicit Join, etc.)
> Specialized operators (the MapR readers, possibly Parquet) may require
> operations that do not yet appear in the column accessors. (The same can
> probably be said for code gen which does some pretty unusual things.)
>
> 4. Before we tackle the full conversion, we should try Arrow (or whatever
> option we choose) in selected scenarios to verify the benefit we believe we
> will receive.
>
> You also suggest a strategy to address the above requirements. Again, to
> paraphrase:
>
> 1. Evaluate internal data layout alternatives to learn their advantages,
> and to identify what a common API might need beyond what we have today.
>
> 2. Pick a data layout based on the facts discovered above. Let's call this
> the "next gen" option.
>
> 3. Enhance/develop the required internal APIs.
>
> 4. Develop a conversion operator that we can insert between value
> vector-based operators and next gen-based operators.
>
> 5. Convert operators step-by-step, testing each for performance and
> functionality. Build on the APIs from step 3 and the conversion "shims"
> from step 4.
>
> Is this an accurate summary of your comments?
>
> Thanks,
> - Paul
>
> On Friday, January 10, 2020, 9:55:49 AM PST, Volodymyr Vysotskyi <volody...@apache.org> wrote:
>
> Hi Paul and Igor,
>
> It is great that the discussion has affected high-level questions of the
> effort and benefits of moving to the Arrow.
> The main arguments of moving to Arrow for me were possible performance
> improvements (perhaps with Gandiva usage) and significant codebase
> improvements (perhaps with the additional bug fixes) compared to current
> Drill's vectors code.
>
> In my previous letter, I concentrated on the case when we will agree to
> completely move to the Arrow and proposed the way, which in my opinion
> would be more optimal and splittable.
>
> I understand your concerns about having mixed two systems.
> I don't propose to recommend Drill users to enable Arrow usage while a lot
> of conversions between batches would be happening.
> But without trying to use Arrow classes step-by-step, we risk to end up
> with dozens of unresolved Arrow-related issues,
> a lot of changes in unmerged branches and merge conflicts after every
> new commit into the master branch.
>
> Regarding unnecessary complexity connected with adapting Arrow and Drill
> vectors to work together, this conversion may be done at EVF API level, no
> need to dive into the vectors implementation details and issues of each
> side, we just need to extend EVF API if required, and just be sure that
> both implementations still works correctly with that API.
> Even with your strategy, we should adopt EVF to work with Arrow, I just
> propose to do it at the beginning and additionally preserve EVF with Drill
> value vectors.
>
> Also, when we will be able to switch between Arrow and Drill even for some
> operator kinds, we would be able to estimate how the operator performance
> was changed and make the decision to continue the integration having real
> numbers.
> And for the case, if we wouldn't be satisfied with the performance changes,
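The planning-time insertion of Convert operators described at the top of this message can be illustrated with a toy tree rewrite. This is only a sketch under stated assumptions: `Op`, `Format` and `insertConverters` are hypothetical names, not Drill's or Calcite's planner API, and each operator is assumed to consume the same batch format it produces. In a real planner this would more likely be expressed through traits and converter rules than through a hand-written tree walk.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical names throughout; this is not Drill's or Calcite's planner API. */
class ConvertInsertionSketch {

  enum Format { DRILL, ARROW }

  /** A toy physical operator: the batch format it produces and its inputs. */
  static class Op {
    final String name;
    final Format format;
    final List<Op> inputs;

    Op(String name, Format format, Op... inputs) {
      this.name = name;
      this.format = format;
      this.inputs = new ArrayList<>(Arrays.asList(inputs));
    }

    @Override
    public String toString() {
      return inputs.isEmpty() ? name : name + inputs;
    }
  }

  /**
   * Walks the tree bottom-up and wraps every input whose format differs
   * from its parent's in a Convert operator producing the parent's format.
   * Assumption: an operator consumes the same format it produces.
   */
  static Op insertConverters(Op op) {
    for (int i = 0; i < op.inputs.size(); i++) {
      Op input = insertConverters(op.inputs.get(i));
      if (input.format != op.format) {
        String label = "Convert(" + input.format + "->" + op.format + ")";
        input = new Op(label, op.format, input);
      }
      op.inputs.set(i, input);
    }
    return op;
  }

  public static void main(String[] args) {
    // The join example from the message: one Drill-format scan, one Arrow-format scan.
    Op plan = new Op("Join", Format.DRILL,
        new Op("DrillScan", Format.DRILL),
        new Op("ArrowEvfScan", Format.ARROW));
    System.out.println(insertConverters(plan));
    // prints: Join[DrillScan, Convert(ARROW->DRILL)[ArrowEvfScan]]
  }
}
```

Only the Arrow-side input gets a converter, which matches the tree drawn in the message; inputs already in the parent's format are left untouched.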
Re: About integration of drill and arrow
Hi Paul and Igor, It is great that the discussion has affected high-level questions of the effort and benefits of moving to the Arrow. The main arguments of moving to Arrow for me were possible performance improvements (perhaps with Gandiva usage) and significant codebase improvements (perhaps with the additional bug fixes) compared to current Drill's vectors code. In my previous letter, I concentrated on the case when we will agree to completely move to the Arrow and proposed the way, which in my opinion would be more optimal and splittable. I understand your concerns about having mixed two systems. I don't propose to recommend Drill users to enable Arrow usage while a lot of conversions between batches would be happening. But without trying to use Arrow classes step-by-step, we risk to end up with dozens of unresolved Arrow-related issues, a lot of changes in unmerged branches and merge conflicts after every new commit into the master branch. Regarding unnecessary complexity connected with adapting Arrow and Drill vectors to work together, this conversion may be done at EVF API level, no need to dive into the vectors implementation details and issues of each side, we just need to extend EVF API if required, and just be sure that both implementations still works correctly with that API. Even with your strategy, we should adopt EVF to work with Arrow, I just propose to do it at the beginning and additionally preserve EVF with Drill value vectors. Also, when we will be able to switch between Arrow and Drill even for some operator kinds, we would be able to estimate how the operator performance was changed and make the decision to continue the integration having real numbers. And for the case, if we wouldn't be satisfied with the performance changes, we may stop integrating before rewriting all the project code. But in this case, we would have minimum changes, enough for just using Arrow as a data source and data sink for easier integration with other projects. 
I think this is a required minimum we should provide independently on the decision about deeper integration. Kind regards, Volodymyr Vysotskyi On Fri, Jan 10, 2020 at 1:47 PM Igor Guzenko wrote: > Hello Drill Developers and Drill Users, > > This discussion started as migration to Arrow but uncovered questions of > strategical plans for moving towards Apache Drill 2.0. > Below are my personal thoughts of what we, as developers, should do to > offer Drill users better experience: > > 1. High performant bulk insertions into as many data sources as possible. > There is a whole bunch of different tools for data pipelining to use... > But why people who know SQL should spend time learning something new for > simply moving data between tools? > > 2. Improve the efficiency of memory management (EVF, resource management, > improved costs planning using meta store, etc.). Since we're dealing with > big data alongside other tools installed on data nodes we should utilize > memory very economically and effectively. > > 3. Make integration with all other tools and formats as stable as possible. > The high amount of bugs in the area tells that we have lots to improve. > Every user is happy when he gets a tool and it simply works as expected. > Also, analyze user requirements and provide integration with new most > popular tools. Querying high variety of > data sources were and still one of the biggest selling points. > > 4. Make code highly extensible and extremely friendly for contributions. No > one would want to spend years of learning to make a contribution. This is > why I want to see a lot of modules that are highly cohesive and define > clear APIs for interaction with each other. This is also about paying old > technical debts related to fat JDBC client, copy of web server in Drill on > YARN, mixing everything in exec module, etc. > > 5. Focus on performance improvements of every component, from query > planning to execution. 
> > These are my thoughts from developer's perspective. Since I'm just > developer from Ukraine and far far away from Drill users, I believe that > Charles Givre is the one who can build a strong Drill user community and > collect their requirements for us. > > > What relates to Volodymyr's suggestion about adapting Arrow and Drill > vectors to work together (the same step is required to implement an Arrow > client, suggested by Paul). > I'm totally against the idea because it brings a huge amount of unnecessary > complexity just to uncover small insides into the integration. First is > that this is against the whole idea of Arrow since the main idea of Arrow > is to provide unified columnar memory layout between different tools > without any data conversions. But the step exactly requires data > conversions, at least our nullability vector and their validity bitmaps are > n
Instructions for updating Drill Website
Hi all,

I have updated the instructions for updating the Apache Drill website, so you shouldn't face the issues I ran into when I was attempting to make some changes there. The instructions are in the *README.md* file in the *gh-pages* branch: https://github.com/apache/drill/blob/gh-pages/README.md

Also, it would be good if devs whose changes are doc-impacting updated the site documentation, especially when adding new plugins, functionality, etc. But don't forget to specify that the functionality is available only starting from a specific Drill version.

Kind regards,
Volodymyr Vysotskyi
Official Apache Drill Docker Images
Hi all,

Some time ago we introduced Docker images for Drill and published them under a custom repository. Now we have an official Docker repository for Apache Drill at https://hub.docker.com/r/apache/drill.

All images from our previous repository were pushed there, and a DockerHub Automated Build was set up for the master branch, which publishes images with the master tag after the master branch is updated. Feel free to test it, now even on the current master branch!

For instructions on how to run Drill on Docker, please refer to https://drill.apache.org/docs/running-drill-on-docker/.

Kind regards,
Volodymyr Vysotskyi
Re: About integration of drill and arrow
Hi all, Glad to see that this discussion became active again! I have some comments regarding the steps for moving from Drill Vectors to Arrow Vectors. No doubt that using EVF for all operators and readers instead of value vectors will simplify things a lot. But considering the target goal - integration with Arrow, it may be the main show-stopper for it. There may be some operators which would be hard to adapt to use EVF, for example, I think Flatten operator will be among them since its implementation deeply connected with value vectors. Also, it requires moving all storage and format plugins to EVF, which also may be problematic, for example, some plugins like MaprDB have specific features, and it should be considered when moving to EVF. Some other plugins are so obsolete, that I'm not sure that they still work and that someone still uses it, so except moving to EVF, they should be resurrected to verify that they weren't broken more than before. This is a huge piece of work, and only after that, we will proceed with the next step - integrating Arrow to EVF and then handling new Arrow-related issues for all the operators and readers at the same time. I propose to update these steps a little bit. 1. I agree that at first, we should extract EVF-related classes into a separate module. 2. But as the next step, I propose to extract EVF API which doesn't depend on the vector implementation (Drill vectors, or Arrow ones). 3. After that, introduce module with Arrow which also implements this EVF API. 4. Introduce transformers that will be able to convert from Drill vectors into Arrow vectors and vice versa. These transformers may be implemented to work using EVF abstractions instead of operating with specific vector implementations. 5.1. At this point, we can introduce Arrow connectors to fetch the data in Arrow format or return it in such a format using transformers from step 4. 5.2. 
Also, at this point, we may start rewriting operators to EVF and switching EVF implementation from the EVF based on Drill Vectors to the implementation which uses Arrow Vectors. Or switching implementations for existing EVF-based format plugins and fix newly discovered issues in Arrow. Since at this point we will have operators which use Arrow format and operators which use Drill Vectors format, we should insert operators that transform one vector format to another introduced in step 4 between every pair of operators which returns batches in a different format. I know, that such an approach requires some additional work, like introducing transformers from step 4 and may cause some performance degradations for the case when format transformation is complex for some types and when we still have sequences of operators with different formats. But with this approach, transitioning to Arrow wouldn't be blocked until everything is moved to EVF and it would be possible to transmit step-by-step, and Drill still will be able to switch between formats if it would be required. Kind regards, Volodymyr Vysotskyi On Thu, Jan 9, 2020 at 2:45 PM Igor Guzenko wrote: > Hi Paul, > > Though I have very limited knowledge about Arrow at the moment, I can > highlight a few advantages of trying it: > 1. Allows fixing all the long-standing nullability issues and provide > better integration for storage plugins like Hive. >https://jira.apache.org/jira/browse/DRILL-1344 >https://jira.apache.org/jira/browse/DRILL-3831 >https://jira.apache.org/jira/browse/DRILL-4824 >https://jira.apache.org/jira/browse/DRILL-7255 >https://jira.apache.org/jira/browse/DRILL-7366 > 2. Some work was done by community to implement optimized Arrow readers for > Parquet and other formats We could try to adopt and check whether we > can benefit from them. > 3. Since Arrow is under active development we could try their newest > features, like Flight which promises improved data transfers over the > network. 
> > Thanks, > Igor > On Wed, Jan 8, 2020 at 11:55 PM Paul Rogers > wrote: > > > Hi Igor, > > > > Before diving into design issues, it may be worthwhile to think about the > > premise: should Drill adopt Arrow as its internal memory layout? This is > > the question that the team has wrestled with since Arrow was launched. > > Arrow has three parts. Let's think about each. > > > > First is a direct memory layout. The approach you suggest will let us > work > > with the Arrow memory format. Use EVF to access vectors, then the > > underlying vectors can be swapped from Drill to Arrow. But, what is the > > advantage of using Arrow? The arrow layout isn't better than Drill's; it > is > > just different. Adopting the Arrow memory layout by itself provides > little > > benefit, but bit cost. This is one reason the team has been so reluctant
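To make steps 2 to 4 of the proposal at the top of this message more concrete, here is a minimal Java sketch. It is only an illustration under stated assumptions: `IntColumn`, `DrillStyleIntColumn`, `ArrowStyleIntColumn` and `ColumnTransformer` are hypothetical names invented for this example, not classes from Drill, EVF or Arrow. The two implementations are simplified stand-ins that only loosely mimic a per-row "is set" flag and a validity bitmap.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.BitSet;
import java.util.List;

// Hypothetical stand-in for a vector-agnostic API (step 2): callers see only
// this interface and never the underlying vector classes.
interface IntColumn {
  int rowCount();
  boolean isNull(int row);
  int get(int row);
  void append(Integer value); // a null argument appends a SQL NULL
}

// Simplified stand-in for one vector format: values plus a per-row "is set" flag.
final class DrillStyleIntColumn implements IntColumn {
  private final List<Integer> values = new ArrayList<>();
  private final List<Byte> bits = new ArrayList<>();

  public int rowCount() { return values.size(); }
  public boolean isNull(int row) { return bits.get(row) == 0; }
  public int get(int row) { return values.get(row); }
  public void append(Integer value) {
    values.add(value == null ? 0 : value);
    bits.add((byte) (value == null ? 0 : 1));
  }
}

// Simplified stand-in for a second format (step 3): values plus a validity bitmap.
final class ArrowStyleIntColumn implements IntColumn {
  private final BitSet validity = new BitSet();
  private int[] values = new int[4];
  private int count;

  public int rowCount() { return count; }
  public boolean isNull(int row) { return !validity.get(row); }
  public int get(int row) { return values[row]; }
  public void append(Integer value) {
    if (count == values.length) {
      values = Arrays.copyOf(values, count * 2); // grow the backing array
    }
    if (value != null) {
      values[count] = value;
      validity.set(count);
    }
    count++;
  }
}

// Transformer (step 4): written purely against the abstraction, so the same
// code converts in either direction.
final class ColumnTransformer {
  static void copy(IntColumn from, IntColumn to) {
    for (int row = 0; row < from.rowCount(); row++) {
      Integer value = from.isNull(row) ? null : Integer.valueOf(from.get(row));
      to.append(value);
    }
  }
}

class TransformerSketch {
  public static void main(String[] args) {
    IntColumn source = new DrillStyleIntColumn();
    source.append(1);
    source.append(null);
    source.append(3);

    IntColumn target = new ArrowStyleIntColumn();
    ColumnTransformer.copy(source, target);
    System.out.println("rows=" + target.rowCount() + ", row 1 null=" + target.isNull(1));
  }
}
```

Because the transformer touches only the interface, step 5.2 could in principle place such a conversion between any two operators whose batch formats differ. A real implementation would work on whole batches and many types, and would carry the conversion cost mentioned in the message.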
Re: [VOTE] Release Apache Drill 1.17.0 - RC2
Hi all, Since the release is finished, the master branch is opened for new commits. Kind regards, Volodymyr Vysotskyi On Sat, Dec 28, 2019 at 10:56 AM Paul Rogers wrote: > Hi Arina, > > Thanks much to everyone who helped with the release; it is quite a bit of > work. Coordinating on doing the release builds sounds like a good plan. A > huge THANK YOU to Vova for gathering the release instructions! > > Thanks, > - Paul > > > > On Friday, December 27, 2019, 11:06:44 PM PST, Arina Yelchiyeva < > arina.yelchiy...@gmail.com> wrote: > > If release manager does not have access to the test cluster to run all > tests, before starting the release he can ask, those who have, to run tests > on the commit on which release candidate will be prepared. Other than that, > all the remaining steps does not require any special environment setup. > Vova is going to add release instruction and scripts to the Drill project > soon. > > Kind regards, > Arina > >
[ANNOUNCE] Apache Drill 1.17.0 Released
On behalf of the Apache Drill community, I am happy to announce the release of Apache Drill 1.17.0.

Drill is an Apache open-source SQL query engine for Big Data exploration. Drill is designed from the ground up to support high-performance analysis on the semi-structured and rapidly evolving data coming from modern Big Data applications, while still providing the familiarity and ecosystem of ANSI SQL, the industry-standard query language. Drill provides plug-and-play integration with existing Apache Hive and Apache HBase deployments.

For information about Apache Drill, and to get involved, visit the project website [1].

A total of 200 JIRAs were resolved in this release of Drill, with the following new features and improvements [2]:

- Hive complex types support (DRILL-7251, DRILL-7252, DRILL-7253, DRILL-7254)
- ESRI Shapefile (shp) (DRILL-4303) and Excel (DRILL-7177) format plugins support
- Drill Metastore support (DRILL-7272, DRILL-7273, DRILL-7357)
- Upgrade to HADOOP-3.2 (DRILL-6540)
- Schema Provision using File / Table Function (DRILL-6835)
- Parquet runtime row group pruning (DRILL-7062)
- User-Agent UDFs (DRILL-7343)
- Canonical Map support (DRILL-7096)
- Kafka storage plugin improvements (DRILL-6739, DRILL-6723, DRILL-7164, DRILL-7290, DRILL-7388)

For the full list, please see the release notes [3].

The binary and source artifacts are available here [4].

Thanks to everyone in the community who contributed to this release!

1. https://drill.apache.org/
2. https://drill.apache.org/blog/2019/12/26/drill-1.17-released/
3. https://drill.apache.org/docs/apache-drill-1-17-0-release-notes/
4. https://drill.apache.org/download/

Kind regards,
Volodymyr Vysotskyi
[RESULT] [VOTE] Release Apache Drill 1.17.0 - RC2
The vote passes. Thanks to everyone who has tested the release candidate and given their comments and votes. Final tally: 3x +1 (binding): Arina, Charles, Vova 2x +1 (non-binding): Denys, Holger No 0s or -1s. I'll start the process of pushing the release artifacts and send an announcement once they have propagated. Kind regards, Volodymyr Vysotskyi
Re: [VOTE] Release Apache Drill 1.17.0 - RC2
Voting ends now. Thanks to everybody who voted! I'll post the results soon. Kind regards, Volodymyr Vysotskyi On Wed, Dec 25, 2019 at 4:47 PM wrote: > + Installed from binary tar archive > + RC files have been replaced with stable versions > + JDBC connect with custom authenticator > + Basic queries (CSV, text manipulation, join, etc.) > + Logged in to web UI, reviewed profiles & logs, submitted trivial queries > > LGTM > +1 (non-binding) > > Thx & BR > Holger >
[VOTE] Release Apache Drill 1.17.0 - RC2
Hi all, I'd like to propose the third release candidate (RC2) of Apache Drill, version 1.17.0. Changes since the previous release candidate: fixed the show-stopper DRILL-7494 <https://issues.apache.org/jira/browse/DRILL-7494>. The release candidate covers a total of 205 resolved JIRAs [1]. Thanks to everyone who contributed to this release. The tarball artifacts are hosted at [2] and the Maven artifacts are hosted at [3]. This release candidate is based on commit 2eb6bbe0501cb6553106e63dc1f2810ff10ae375 located at [4]. Please download and try out the release. The vote ends at 5 PM UTC (9 AM PST, 7 PM EET, 10:30 PM IST), December 25, 2019. [ ] +1 [ ] +0 [ ] -1 Here's my vote: +1 [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820=12344870 [2] http://home.apache.org/~volodymyr/drill/releases/1.17.0/rc2/ [3] https://repository.apache.org/content/repositories/orgapachedrill-1077/ [4] https://github.com/vvysotskyi/drill/commits/drill-1.17.0 Kind regards, Volodymyr Vysotskyi
Re: [CANCEL] [VOTE] Release Apache Drill 1.17.0 - RC1
Hi all, The regression found while validating the second release candidate (Apache Drill 1.17.0 - RC1) has been fixed, so I'll start preparing a new release candidate soon. Kind regards, Volodymyr Vysotskyi On Fri, Dec 20, 2019 at 9:21 PM Volodymyr Vysotskyi wrote: > Hi all, > > The vote for Apache Drill 1.17.0 - RC1 was canceled due to DRILL-7494 > <https://issues.apache.org/jira/browse/DRILL-7494>. > > Thanks to all who voted. A new release will be tagged as 1.17.0 - RC2 and > will be available for voting soon. > > Kind regards, > Volodymyr Vysotskyi >
[CANCEL] [VOTE] Release Apache Drill 1.17.0 - RC1
Hi all, The vote for Apache Drill 1.17.0 - RC1 was canceled due to DRILL-7494 <https://issues.apache.org/jira/browse/DRILL-7494>. Thanks to all who voted. A new release will be tagged as 1.17.0 - RC2 and will be available for voting soon. Kind regards, Volodymyr Vysotskyi
Re: [VOTE] Release Apache Drill 1.17.0 - RC1
Hi all, Thank you all a lot for verifying the release! Looks like this issue mentioned by Holger with a custom authenticator is a blocker for this release - I was able to reproduce it with simple java client on Apache Drill 1.17.0, but with 1.16 it works fine. It was caused by DRILL-6540. I'll submit the fix for this soon. This issue sinks this release candidate. I'll prepare a new RC when this one is fixed and merged. Kind regards, Volodymyr Vysotskyi On Fri, Dec 20, 2019 at 6:43 PM Bohdan Kazydub wrote: > Checked queries on Parquet files containing complex types. > > +1. > > On Fri, Dec 20, 2019 at 6:37 PM Arina Yelchiyeva < > arina.yelchiy...@gmail.com> > wrote: > > > Hi Holder, > > > > Thanks for participation in release verification. > > > > Only regressions or some really significant issues can be a release > > blockers. > > As Anton mentioned, since this rc jars are present in previous Drill > > versions, it is not a regression. > > I believe Drill has dependency to this libraries (maybe transitive) and I > > am not sure we can remove them, maybe version can be upgraded. > > I would be good if you could check and file a Jira if issue is still > > actual. > > > > Kind regards, > > Arina > > > > > On Dec 20, 2019, at 6:18 PM, Anton Gozhiy wrote: > > > > > > Hi Holger, > > > These RC files were present in Drill 1.16.0, so this is not a > regression. > > > And about JDBC connection problem, could you please file a JIRA with > more > > > details? > > > > > > On Fri, Dec 20, 2019 at 6:02 PM wrote: > > > > > >> - Tested custom authenticator with JDBC connect to Drill, and was > unable > > >> to connect (connection hangs, ACL disabled, couldn't spend a deeper > look > > >> into it). 
> > >> + Custom authenticator login with local sqlline has been successful > > >> - Found RC files for 3rd party jars in the binary archive, which > > shouldn't > > >> be there in a release version (from my perspective): > > >> > > >> apache-drill-1.17.0/jars/3rdparty/kerb-client-1.0.0-RC2.jar > > >> apache-drill-1.17.0/jars/3rdparty/kerby-config-1.0.0-RC2.jar > > >> apache-drill-1.17.0/jars/3rdparty/kerb-common-1.0.0-RC2.jar > > >> apache-drill-1.17.0/jars/3rdparty/kerb-crypto-1.0.0-RC2.jar > > >> apache-drill-1.17.0/jars/3rdparty/kerb-util-1.0.0-RC2.jar > > >> apache-drill-1.17.0/jars/3rdparty/kerb-core-1.0.0-RC2.jar > > >> apache-drill-1.17.0/jars/3rdparty/kerby-asn1-1.0.0-RC2.jar > > >> apache-drill-1.17.0/jars/3rdparty/kerby-pkix-1.0.0-RC2.jar > > >> apache-drill-1.17.0/jars/3rdparty/kerby-util-1.0.0-RC2.jar > > >> apache-drill-1.17.0/jars/3rdparty/kerb-simplekdc-1.0.0-RC2.jar > > >> apache-drill-1.17.0/jars/3rdparty/kerb-server-1.0.0-RC2.jar > > >> apache-drill-1.17.0/jars/3rdparty/kerb-identity-1.0.0-RC2.jar > > >> apache-drill-1.17.0/jars/3rdparty/kerb-admin-1.0.0-RC2.jar > > >> > > >> From me: > > >> -1 (non-binding) > > >> > > >> BR > > >> Holger > > >> > > > > > > > > > -- > > > Sincerely, Anton Gozhiy > > > anton5...@gmail.com > > > > >
[VOTE] Release Apache Drill 1.17.0 - RC1
Hi all, I'd like to propose the second release candidate (RC1) of Apache Drill, version 1.17.0. Changes since the previous release candidate: fixed the following show-stoppers: DRILL-7484 <https://issues.apache.org/jira/browse/DRILL-7484>, DRILL-7485 <https://issues.apache.org/jira/browse/DRILL-7485>, DRILL-6332 <https://issues.apache.org/jira/browse/DRILL-6332>, DRILL-7472 <https://issues.apache.org/jira/browse/DRILL-7472>, DRILL-7474 <https://issues.apache.org/jira/browse/DRILL-7474>, DRILL-7476 <https://issues.apache.org/jira/browse/DRILL-7476>, DRILL-7481 <https://issues.apache.org/jira/browse/DRILL-7481>, DRILL-7482 <https://issues.apache.org/jira/browse/DRILL-7482>, DRILL-7473 <https://issues.apache.org/jira/browse/DRILL-7473>, DRILL-7479 <https://issues.apache.org/jira/browse/DRILL-7479>, DRILL-7483 <https://issues.apache.org/jira/browse/DRILL-7483>, DRILL-7486 <https://issues.apache.org/jira/browse/DRILL-7486>, and DRILL-7470 <https://issues.apache.org/jira/browse/DRILL-7470>. The release candidate covers a total of 203 resolved JIRAs [1]. Thanks to everyone who contributed to this release. The tarball artifacts are hosted at [2] and the Maven artifacts are hosted at [3]. This release candidate is based on commit d65d2b4cc2aa5ea7a59cd40d0ad57a1e4639ae12 located at [4]. Please download and try out the release. The vote ends at 5 PM UTC (9 AM PST, 7 PM EET, 10:30 PM IST), December 20, 2019. [ ] +1 [ ] +0 [ ] -1 Here's my vote: +1 [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820=12344870 [2] http://home.apache.org/~volodymyr/drill/releases/1.17.0/rc1/ [3] https://repository.apache.org/content/repositories/orgapachedrill-1076/ [4] https://github.com/vvysotskyi/drill/commits/drill-1.17.0 Kind regards, Volodymyr Vysotskyi
Re: [CANCEL] [VOTE] Release Apache Drill 1.17.0 - rc0
Hi Drillers, I have merged all pull requests for release-blocking issues and will start preparing new RC soon. Also, thank you all for so active and high-quality work, we have merged 12(!) pull requests since the last week. Kind regards, Volodymyr Vysotskyi On Mon, Dec 16, 2019 at 5:11 PM Volodymyr Vysotskyi wrote: > Hi all, > > Here is the current status of the release-blocking issues and issues which > will be included in this release: > - DRILL-7484 <https://issues.apache.org/jira/browse/DRILL-7484> - ready > to be merged; > - DRILL-7485 <https://issues.apache.org/jira/browse/DRILL-7485> - ready > to be merged; > - DRILL-6332 <https://issues.apache.org/jira/browse/DRILL-6332> - ready > to be merged; > - DRILL-7472 <https://issues.apache.org/jira/browse/DRILL-7472> - ready > to be merged; > - DRILL-7474 <https://issues.apache.org/jira/browse/DRILL-7474> - ready > to be merged; > - DRILL-7476 <https://issues.apache.org/jira/browse/DRILL-7476> - ready > to be merged; > - DRILL-7481 <https://issues.apache.org/jira/browse/DRILL-7481> - ready > to be merged; > - DRILL-7482 <https://issues.apache.org/jira/browse/DRILL-7482> - ready > to be merged; > - DRILL-7473 <https://issues.apache.org/jira/browse/DRILL-7473> - PR is > created and reviewed, CR comments should be addressed; > - DRILL-7479 <https://issues.apache.org/jira/browse/DRILL-7479> - PR is > created and reviewed, CR comments should be addressed; > - DRILL-7483 <https://issues.apache.org/jira/browse/DRILL-7483> - PR is > created and reviewed, CR comments should be addressed; > > Thanks to all for working on and reviewing fixes for these issues. > Please let me know if there are any additional issues that should be > included in this release, or if there are some troubles with the issues > listed above. > > Kind regards, > Volodymyr Vysotskyi > > > On Fri, Dec 13, 2019 at 5:52 PM Vova Vysotskyi wrote: > >> If the PR will be ready to be merged before other blocking Jiras are >> resolved, we can include it. 
>> >> Kind regards, >> Volodymyr Vysotskyi >> >> >> On Fri, Dec 13, 2019 at 5:42 PM Charles Givre wrote: >> >>> One more thing... >>> Do you think we could include DRILL-6332 ( >>> https://github.com/apache/drill/pull/1931 < >>> https://github.com/apache/drill/pull/1931>)? This seems like a very >>> minor change that could have a significant impact on potential users. >>> >>> I'll get DRILL-7484 and DRILL-7485 cleaned up and done over the weekend. >>> --C >>> >>> > On Dec 13, 2019, at 10:40 AM, Volodymyr Vysotskyi < >>> volody...@apache.org> wrote: >>> > >>> > Hi Charles, >>> > >>> > Yes, I think we can also include this one. Thanks for finding this >>> issue >>> > and submitting the fix! >>> > >>> > Kind regards, >>> > Volodymyr Vysotskyi >>> > >>> > >>> > On Fri, Dec 13, 2019 at 5:13 PM Charles Givre >>> wrote: >>> > >>> >> Hi Volodymyr, >>> >> I'm going to get DRILL-7484 resolved over the weekend, but when I >>> started >>> >> working on it, I found another issue with the PCAP reader that causes >>> >> NPEs. The resolution was a minor change,( >>> >> https://github.com/apache/drill/pull/1932 < >>> >> https://github.com/apache/drill/pull/1932>). Could we please include >>> >> DRILL-7485 as well in the next RC? >>> >> Thanks, >>> >> -- C >>> >> >>> >> >>> >> >>> >>> On Dec 13, 2019, at 9:08 AM, Volodymyr Vysotskyi < >>> volody...@apache.org> >>> >> wrote: >>> >>> >>> >>> Hi all, >>> >>> >>> >>> The vote for Apache Drill 1.17.0 - rc0 was canceled. >>> >>> >>> >>> Thanks to all who voted. 
>>> >>> >>> >>> A new release will be tagged as 1.17.0-rc1 and will be available >>> after >>> >> the >>> >>> following jiras are resolved: >>> >>> - DRILL-7473 <https://issues.apache.org/jira/browse/DRILL-7473> - >>> >> Parquet >>> >>> reader failed to get field of repeated map (Bohdan Kazydub is >>> working on >>> >>> the fix) >>> >>> - DRILL-7484 <https://issues.apache.org/jira/browse/DRILL-7484> - >>> >> Malware >>> >>> found with some antiviruses in the Drill test resources folder >>> (Charles >>> >>> Givre is working on the fix) >>> >>> >>> >>> Kind regards, >>> >>> Volodymyr Vysotskyi >>> >> >>> >> >>> >>>
Re: [CANCEL] [VOTE] Release Apache Drill 1.17.0 - rc0
Hi all, Here is the current status of the release-blocking issues and issues which will be included in this release: - DRILL-7484 <https://issues.apache.org/jira/browse/DRILL-7484> - ready to be merged; - DRILL-7485 <https://issues.apache.org/jira/browse/DRILL-7485> - ready to be merged; - DRILL-6332 <https://issues.apache.org/jira/browse/DRILL-6332> - ready to be merged; - DRILL-7472 <https://issues.apache.org/jira/browse/DRILL-7472> - ready to be merged; - DRILL-7474 <https://issues.apache.org/jira/browse/DRILL-7474> - ready to be merged; - DRILL-7476 <https://issues.apache.org/jira/browse/DRILL-7476> - ready to be merged; - DRILL-7481 <https://issues.apache.org/jira/browse/DRILL-7481> - ready to be merged; - DRILL-7482 <https://issues.apache.org/jira/browse/DRILL-7482> - ready to be merged; - DRILL-7473 <https://issues.apache.org/jira/browse/DRILL-7473> - PR is created and reviewed, CR comments should be addressed; - DRILL-7479 <https://issues.apache.org/jira/browse/DRILL-7479> - PR is created and reviewed, CR comments should be addressed; - DRILL-7483 <https://issues.apache.org/jira/browse/DRILL-7483> - PR is created and reviewed, CR comments should be addressed; Thanks to all for working on and reviewing fixes for these issues. Please let me know if there are any additional issues that should be included in this release, or if there are some troubles with the issues listed above. Kind regards, Volodymyr Vysotskyi On Fri, Dec 13, 2019 at 5:52 PM Vova Vysotskyi wrote: > If the PR will be ready to be merged before other blocking Jiras are > resolved, we can include it. > > Kind regards, > Volodymyr Vysotskyi > > > On Fri, Dec 13, 2019 at 5:42 PM Charles Givre wrote: > >> One more thing... >> Do you think we could include DRILL-6332 ( >> https://github.com/apache/drill/pull/1931 < >> https://github.com/apache/drill/pull/1931>)? This seems like a very >> minor change that could have a significant impact on potential users. 
>> >> I'll get DRILL-7484 and DRILL-7485 cleaned up and done over the weekend. >> --C >> >> > On Dec 13, 2019, at 10:40 AM, Volodymyr Vysotskyi >> wrote: >> > >> > Hi Charles, >> > >> > Yes, I think we can also include this one. Thanks for finding this issue >> > and submitting the fix! >> > >> > Kind regards, >> > Volodymyr Vysotskyi >> > >> > >> > On Fri, Dec 13, 2019 at 5:13 PM Charles Givre wrote: >> > >> >> Hi Volodymyr, >> >> I'm going to get DRILL-7484 resolved over the weekend, but when I >> started >> >> working on it, I found another issue with the PCAP reader that causes >> >> NPEs. The resolution was a minor change,( >> >> https://github.com/apache/drill/pull/1932 < >> >> https://github.com/apache/drill/pull/1932>). Could we please include >> >> DRILL-7485 as well in the next RC? >> >> Thanks, >> >> -- C >> >> >> >> >> >> >> >>> On Dec 13, 2019, at 9:08 AM, Volodymyr Vysotskyi < >> volody...@apache.org> >> >> wrote: >> >>> >> >>> Hi all, >> >>> >> >>> The vote for Apache Drill 1.17.0 - rc0 was canceled. >> >>> >> >>> Thanks to all who voted. >> >>> >> >>> A new release will be tagged as 1.17.0-rc1 and will be available after >> >> the >> >>> following jiras are resolved: >> >>> - DRILL-7473 <https://issues.apache.org/jira/browse/DRILL-7473> - >> >> Parquet >> >>> reader failed to get field of repeated map (Bohdan Kazydub is working >> on >> >>> the fix) >> >>> - DRILL-7484 <https://issues.apache.org/jira/browse/DRILL-7484> - >> >> Malware >> >>> found with some antiviruses in the Drill test resources folder >> (Charles >> >>> Givre is working on the fix) >> >>> >> >>> Kind regards, >> >>> Volodymyr Vysotskyi >> >> >> >> >> >>
Re: [CANCEL] [VOTE] Release Apache Drill 1.17.0 - rc0
Hi Charles, Yes, I think we can also include this one. Thanks for finding this issue and submitting the fix! Kind regards, Volodymyr Vysotskyi On Fri, Dec 13, 2019 at 5:13 PM Charles Givre wrote: > Hi Volodymyr, > I'm going to get DRILL-7484 resolved over the weekend, but when I started > working on it, I found another issue with the PCAP reader that causes > NPEs. The resolution was a minor change,( > https://github.com/apache/drill/pull/1932 < > https://github.com/apache/drill/pull/1932>). Could we please include > DRILL-7485 as well in the next RC? > Thanks, > -- C > > > > > On Dec 13, 2019, at 9:08 AM, Volodymyr Vysotskyi > wrote: > > > > Hi all, > > > > The vote for Apache Drill 1.17.0 - rc0 was canceled. > > > > Thanks to all who voted. > > > > A new release will be tagged as 1.17.0-rc1 and will be available after > the > > following jiras are resolved: > > - DRILL-7473 <https://issues.apache.org/jira/browse/DRILL-7473> - > Parquet > > reader failed to get field of repeated map (Bohdan Kazydub is working on > > the fix) > > - DRILL-7484 <https://issues.apache.org/jira/browse/DRILL-7484> - > Malware > > found with some antiviruses in the Drill test resources folder (Charles > > Givre is working on the fix) > > > > Kind regards, > > Volodymyr Vysotskyi > >
[CANCEL] [VOTE] Release Apache Drill 1.17.0 - rc0
Hi all, The vote for Apache Drill 1.17.0 - rc0 was canceled. Thanks to all who voted. A new release will be tagged as 1.17.0-rc1 and will be available after the following jiras are resolved: - DRILL-7473 <https://issues.apache.org/jira/browse/DRILL-7473> - Parquet reader failed to get field of repeated map (Bohdan Kazydub is working on the fix) - DRILL-7484 <https://issues.apache.org/jira/browse/DRILL-7484> - Malware found with some antiviruses in the Drill test resources folder (Charles Givre is working on the fix) Kind regards, Volodymyr Vysotskyi
[ANNOUNCE] New PMC member: Ihor Guzenko
I am pleased to announce that the Drill PMC has invited Ihor Guzenko to join the PMC and he has accepted the invitation. Congratulations, Ihor, and welcome! - Vova (on behalf of the Drill PMC)
Re: [VOTE] Release Apache Drill 1.17.0 - RC0
Thank you all for pointing out these issues. The vote for Apache Drill 1.17.0 - RC0 is canceled due to DRILL-7473 <https://issues.apache.org/jira/browse/DRILL-7473> and DRILL-7476 <https://issues.apache.org/jira/browse/DRILL-7476>. I'll create a new release candidate when these issues are fixed. Kind regards, Volodymyr Vysotskyi On Tue, Dec 10, 2019 at 3:34 PM Arina Yelchiyeva wrote: > And one more regression: https://issues.apache.org/jira/browse/DRILL-7476 > <https://issues.apache.org/jira/browse/DRILL-7476> > > Kind regards, > Arina > > > On Dec 10, 2019, at 1:59 PM, Igor Guzenko > wrote: > > > > Hi all, > > > > I've found regression with repeated maps in parquet [1]. Also I > personally > > don't like that difference between previous and current tar.gz size is > 124 > > Mb. > > > > My vote: -1. > > > > [1] https://issues.apache.org/jira/browse/DRILL-7473 > > > > Thanks, > > Igor > > > > > > On Mon, Dec 9, 2019 at 2:47 PM Volodymyr Vysotskyi > > > wrote: > > > >> Hi all, > >> > >> I'd like to propose the first release candidate (RC0) of Apache Drill, > >> version 1.17.0. > >> > >> The release candidate covers a total of 190 resolved JIRAs [1]. Thanks > to > >> everyone who contributed to this release. > >> > >> The tarball artifacts are hosted at [2] and the maven artifacts are > hosted > >> at [3]. > >> > >> This release candidate is based on commit > >> 4171eeac876249731ccf86d116455dd8d53c44e9 located at [4]. > >> > >> The vote ends at 13:00 PM UTC (5:00 AM PST, 3:00 PM EET, 6:30 PM IST), > >> December 12, 2019. 
> >> > >> [ ] +1 > >> [ ] +0 > >> [ ] -1 > >> > >> Here's my vote: +1 > >> > >> [1] > >> > >> > https://issues.apache.org/jira/secure/Releashttps://github.com/vvysotskyi/drill/commits/drill-1.17.0eNote.jspa?projectId=12313820=12344870 > >> < > >> > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820=12344870 > >>> > >> [2] http://home.apache.org/~volodymyr/drill/releases/1.17.0/rc0/ > >> [3] > >> https://repository.apache.org/content/repositories/orgapachedrill-1075/ > >> [4] https://github.com/vvysotskyi/drill/commits/drill-1.17.0 > >> > >> Kind regards, > >> Volodymyr Vysotskyi > >> > >
[VOTE] Release Apache Drill 1.17.0 - RC0
Hi all, I'd like to propose the first release candidate (RC0) of Apache Drill, version 1.17.0. The release candidate covers a total of 190 resolved JIRAs [1]. Thanks to everyone who contributed to this release. The tarball artifacts are hosted at [2] and the Maven artifacts are hosted at [3]. This release candidate is based on commit 4171eeac876249731ccf86d116455dd8d53c44e9 located at [4]. The vote ends at 13:00 UTC (5:00 AM PST, 3:00 PM EET, 6:30 PM IST), December 12, 2019. [ ] +1 [ ] +0 [ ] -1 Here's my vote: +1 [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820=12344870 [2] http://home.apache.org/~volodymyr/drill/releases/1.17.0/rc0/ [3] https://repository.apache.org/content/repositories/orgapachedrill-1075/ [4] https://github.com/vvysotskyi/drill/commits/drill-1.17.0 Kind regards, Volodymyr Vysotskyi
Re: [DISCUSS] 1.17.0 release
Hi all, I have merged all ready-to-commit pull requests into the master branch and I'm starting to prepare a release candidate. Note for committers: until the release is finished, please *do not merge any new commits into the master branch*. Kind regards, Volodymyr Vysotskyi On Fri, Nov 29, 2019 at 8:01 PM Volodymyr Vysotskyi wrote: > Hi all > > The release is coming, PR for DRILL-7450 > <https://issues.apache.org/jira/browse/DRILL-7450> created and approved, > I'll merge it with other ready-to-commit PRs (including DRILL-6540 > <https://issues.apache.org/jira/browse/DRILL-6540> since DRILL-7393 is > almost finished). Also, I'll try to fix some release-related issues like > DRILL-7220. > > So I wanted to inform you that the master branch will be frozen for > commits the next week due to the release preparation. I'll send an > additional announcement soon. > > Kind regards, > Volodymyr Vysotskyi > > > On Fri, Nov 22, 2019 at 8:06 PM Volodymyr Vysotskyi > wrote: > >> Hi all, >> >> I need more time to finish fix for DRILL-7450 >> <https://issues.apache.org/jira/browse/DRILL-7450>, so I suppose I'll >> finish it on the next week, and after it is merged, I'll start the release >> preparation. >> >> The good news is that DRILL-6540 >> <https://issues.apache.org/jira/browse/DRILL-6540> finished and all >> required checks were done, so it will be included in the release. >> >> Kind regards, >> Volodymyr Vysotskyi >> >> >> On Mon, Nov 18, 2019 at 1:09 PM Volodymyr Vysotskyi >> wrote: >> >>> Hi Charles, >>> >>> We found a blocking issue for the release: DRILL-7450 >>> <https://issues.apache.org/jira/browse/DRILL-7450>. >>> I need a couple of days to fix it. I'll share updates soon. >>> >>> Kind regards, >>> Volodymyr Vysotskyi >>> >>> >>> On Fri, Nov 15, 2019 at 6:56 PM Charles Givre wrote: >>> >>>> Hi Volodymyr, >>>> I wanted to follow up to see how we are doing with the Drill release >>>> candidate? Are we getting close to a release? 
>>>> Thanks, >>>> -- C >>>> >>>> >>>> > On Nov 4, 2019, at 1:57 PM, Volodymyr Vysotskyi >>>> wrote: >>>> > >>>> > Hello Drillers, >>>> > >>>> > It's about 6 months have passed since the previous release and its >>>> time to >>>> > discuss and start planning for the 1.17.0. >>>> > I volunteer to manage the new release. >>>> > >>>> > We have 6 Jira tickets with "reviewable" status, 2 "in progress" and >>>> 6 open >>>> > tickets [1]. Jira tickets marked as ready to commit will be merged >>>> soon and >>>> > I hope other PRs from this list will be completed before the cut-off >>>> date. >>>> > >>>> > Among these tickets, I want to include DRILL-7273 [2] to the release >>>> (pull >>>> > request is already opened, but CR comments should be addressed). >>>> > >>>> > I would like to propose a preliminary release cut-off date as the >>>> middle of >>>> > the next week (Nov, 13) or the beginning of the week after that (Nov, >>>> 18). >>>> > >>>> > Please let me know if there are any other Jira tickets you working on >>>> which >>>> > should be included in this release. >>>> > >>>> > [1] >>>> > >>>> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=185=DRILL=1425 >>>> > [2] https://issues.apache.org/jira/browse/DRILL-7273 >>>> > >>>> > Kind regards, >>>> > Volodymyr Vysotskyi >>>> >>>>
Re: [DISCUSS] 1.17.0 release
Hi all The release is coming, PR for DRILL-7450 <https://issues.apache.org/jira/browse/DRILL-7450> created and approved, I'll merge it with other ready-to-commit PRs (including DRILL-6540 <https://issues.apache.org/jira/browse/DRILL-6540> since DRILL-7393 is almost finished). Also, I'll try to fix some release-related issues like DRILL-7220. So I wanted to inform you that the master branch will be frozen for commits the next week due to the release preparation. I'll send an additional announcement soon. Kind regards, Volodymyr Vysotskyi On Fri, Nov 22, 2019 at 8:06 PM Volodymyr Vysotskyi wrote: > Hi all, > > I need more time to finish fix for DRILL-7450 > <https://issues.apache.org/jira/browse/DRILL-7450>, so I suppose I'll > finish it on the next week, and after it is merged, I'll start the release > preparation. > > The good news is that DRILL-6540 > <https://issues.apache.org/jira/browse/DRILL-6540> finished and all > required checks were done, so it will be included in the release. > > Kind regards, > Volodymyr Vysotskyi > > > On Mon, Nov 18, 2019 at 1:09 PM Volodymyr Vysotskyi > wrote: > >> Hi Charles, >> >> We found a blocking issue for the release: DRILL-7450 >> <https://issues.apache.org/jira/browse/DRILL-7450>. >> I need a couple of days to fix it. I'll share updates soon. >> >> Kind regards, >> Volodymyr Vysotskyi >> >> >> On Fri, Nov 15, 2019 at 6:56 PM Charles Givre wrote: >> >>> Hi Volodymyr, >>> I wanted to follow up to see how we are doing with the Drill release >>> candidate? Are we getting close to a release? >>> Thanks, >>> -- C >>> >>> >>> > On Nov 4, 2019, at 1:57 PM, Volodymyr Vysotskyi >>> wrote: >>> > >>> > Hello Drillers, >>> > >>> > It's about 6 months have passed since the previous release and its >>> time to >>> > discuss and start planning for the 1.17.0. >>> > I volunteer to manage the new release. >>> > >>> > We have 6 Jira tickets with "reviewable" status, 2 "in progress" and 6 >>> open >>> > tickets [1]. 
Jira tickets marked as ready to commit will be merged >>> soon and >>> > I hope other PRs from this list will be completed before the cut-off >>> date. >>> > >>> > Among these tickets, I want to include DRILL-7273 [2] to the release >>> (pull >>> > request is already opened, but CR comments should be addressed). >>> > >>> > I would like to propose a preliminary release cut-off date as the >>> middle of >>> > the next week (Nov, 13) or the beginning of the week after that (Nov, >>> 18). >>> > >>> > Please let me know if there are any other Jira tickets you working on >>> which >>> > should be included in this release. >>> > >>> > [1] >>> > >>> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=185=DRILL=1425 >>> > [2] https://issues.apache.org/jira/browse/DRILL-7273 >>> > >>> > Kind regards, >>> > Volodymyr Vysotskyi >>> >>>
Re: [DISCUSS] 1.17.0 release
Hi all, I need more time to finish fix for DRILL-7450 <https://issues.apache.org/jira/browse/DRILL-7450>, so I suppose I'll finish it on the next week, and after it is merged, I'll start the release preparation. The good news is that DRILL-6540 <https://issues.apache.org/jira/browse/DRILL-6540> finished and all required checks were done, so it will be included in the release. Kind regards, Volodymyr Vysotskyi On Mon, Nov 18, 2019 at 1:09 PM Volodymyr Vysotskyi wrote: > Hi Charles, > > We found a blocking issue for the release: DRILL-7450 > <https://issues.apache.org/jira/browse/DRILL-7450>. > I need a couple of days to fix it. I'll share updates soon. > > Kind regards, > Volodymyr Vysotskyi > > > On Fri, Nov 15, 2019 at 6:56 PM Charles Givre wrote: > >> Hi Volodymyr, >> I wanted to follow up to see how we are doing with the Drill release >> candidate? Are we getting close to a release? >> Thanks, >> -- C >> >> >> > On Nov 4, 2019, at 1:57 PM, Volodymyr Vysotskyi >> wrote: >> > >> > Hello Drillers, >> > >> > It's about 6 months have passed since the previous release and its time >> to >> > discuss and start planning for the 1.17.0. >> > I volunteer to manage the new release. >> > >> > We have 6 Jira tickets with "reviewable" status, 2 "in progress" and 6 >> open >> > tickets [1]. Jira tickets marked as ready to commit will be merged soon >> and >> > I hope other PRs from this list will be completed before the cut-off >> date. >> > >> > Among these tickets, I want to include DRILL-7273 [2] to the release >> (pull >> > request is already opened, but CR comments should be addressed). >> > >> > I would like to propose a preliminary release cut-off date as the >> middle of >> > the next week (Nov, 13) or the beginning of the week after that (Nov, >> 18). >> > >> > Please let me know if there are any other Jira tickets you working on >> which >> > should be included in this release. 
>> > >> > [1] >> > >> https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=185=DRILL=1425 >> > [2] https://issues.apache.org/jira/browse/DRILL-7273 >> > >> > Kind regards, >> > Volodymyr Vysotskyi >> >>
Re: [DISCUSS] 1.17.0 release
Hi Charles, We found a blocking issue for the release: DRILL-7450 <https://issues.apache.org/jira/browse/DRILL-7450>. I need a couple of days to fix it. I'll share updates soon. Kind regards, Volodymyr Vysotskyi On Fri, Nov 15, 2019 at 6:56 PM Charles Givre wrote: > Hi Volodymyr, > I wanted to follow up to see how we are doing with the Drill release > candidate? Are we getting close to a release? > Thanks, > -- C > > > > On Nov 4, 2019, at 1:57 PM, Volodymyr Vysotskyi > wrote: > > > > Hello Drillers, > > > > It's about 6 months have passed since the previous release and its time > to > > discuss and start planning for the 1.17.0. > > I volunteer to manage the new release. > > > > We have 6 Jira tickets with "reviewable" status, 2 "in progress" and 6 > open > > tickets [1]. Jira tickets marked as ready to commit will be merged soon > and > > I hope other PRs from this list will be completed before the cut-off > date. > > > > Among these tickets, I want to include DRILL-7273 [2] to the release > (pull > > request is already opened, but CR comments should be addressed). > > > > I would like to propose a preliminary release cut-off date as the middle > of > > the next week (Nov, 13) or the beginning of the week after that (Nov, > 18). > > > > Please let me know if there are any other Jira tickets you working on > which > > should be included in this release. > > > > [1] > > > https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=185=DRILL=1425 > > [2] https://issues.apache.org/jira/browse/DRILL-7273 > > > > Kind regards, > > Volodymyr Vysotskyi > >
Re: Draft ASF Board Report: 2.0
+1, thanks for making changes. Kind regards, Volodymyr Vysotskyi On Mon, Nov 11, 2019 at 12:49 PM Igor Guzenko wrote: > +1, thanks for the report. > > On Mon, Nov 11, 2019 at 12:30 PM Arina Yelchiyeva < > arina.yelchiy...@gmail.com> wrote: > > > +1 > > > > > On Nov 8, 2019, at 3:32 PM, Charles Givre wrote: > > > > > > All, > > > Below is the draft with some updates. If anyone has anything else, > > please get them to me over the weekend. > > > Thanks! > > > -- C > > > > > > ## Description: > > > The mission of Drill is the creation and maintenance of software > related > > to > > > Schema-free SQL Query Engine for Apache Hadoop, NoSQL and Cloud Storage > > > > > > ## Issues: > > > There are no issues requiring board attention at this time. > > > > > > ## Membership Data: > > > Apache Drill was founded 2014-11-18 (5 years ago) > > > There are currently 55 committers and 24 PMC members in this project. > > > The Committer-to-PMC ratio is roughly 7:3. > > > > > > Community changes, past quarter: > > > - No new PMC members. Last addition was Sorabh Hamirwasia on > 2019-04-04. > > > - No new committers. Last addition was Anton Gozhiy on 2019-07-22. > > > > > > ## Project Activity: > > > - Drill 1.16 was released on 2019-05-02. > > > - Drill 1.17 was delayed until end of November. 
> > > > > > ### Next Release > > > The next release of Drill (1.17) resolved many issues and added a lot > of > > new > > > functionality including: > > > - Enhanced Drill metastore > > > - Hive complex types support (arrays, structs, union) > > > - Canonical Map support > > > - Schema provisioning via table function > > > - Empty parquet files read / write support > > > - Run-time row group pruning > > > - Numerous enhancements and upgrades to Drill with Hive > > > - Format plugin for Excel Files > > > - Format plugin for ESRI Shape Files > > > - Add Variable Argument UDFs > > > - Add UDF to parse user agent strings > > > > > > ### Future Functionality in Development > > > There are a number of enhancements for which there are active PRs or > > discussions > > > on the various boards. > > > - Integration between Apache Drill and Apache Daffodil (Incubating) > > > - Storage plugin for Apache Druid > > > - Upgrading Drill to use Hadoop v. 3.0 > > > - Format plugin for HDF5 > > > > > > > > > ## Community Health: > > > Drill seems to be recovering from the acquisition of Drill's major > > backer, MapR. > > > > > > ### Development Activity > > > - 96 issues opened in JIRA (1% increase from last quarter) > > > - 85 issues closed in JIRA (28% increase from last quarter) > > > - 55 commits in past quarter (14% increase from last quarter) > > > - 15 contributors from last quarter (25% increase) > > > - 53 PRs opened on GitHub (no change from last quarter) > > > - 63 PRs closed on GitHub (no change from last quarter) > > > > > > ### Email Lists > > > - dev@drill.apache.org <mailto:dev@drill.apache.org> > > > - 46% increase in traffic in past quarter (1574 compared to 1073) > > > > > > - iss...@drill.apache.org <mailto:iss...@drill.apache.org> > > > - 47% increase in traffic in past quarter (2027 compared to 1377) > > > > > > - us...@drill.apache.org <mailto:us...@drill.apache.org> > > > - 27% decrease in traffic in past quarter (116 compared to 157) > > > > >
Re: [DRAFT] Drill Board Report: Comments due by 2019-11-08 1200
Hi Charles, Could you please add a couple of items to the list of upcoming features, for example, I think that we should mention the following improvements: - Hive complex types support (arrays, structs, union) - Canonical Map support - Schema provisioning via table function - Empty parquet files read / write support - Run-time row group pruning Please refer to the upcoming release notes to point other significant improvements: https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820=12344870 . Kind regards, Volodymyr Vysotskyi On Wed, Nov 6, 2019 at 4:14 PM Charles Givre wrote: > Hello all, > Here is the proposed Drill board report. Could everyone please take a > look and add comments by noon thursday? I'd like to highlight both recently > submitted and other issues in progress. I put the ones that I knew, but > I'd like to highlight the work on the metastore, anything else that people > would like highlighted. > Thanks! > -- C > > > ## Description: > The mission of Drill is the creation and maintenance of software related > to > Schema-free SQL Query Engine for Apache Hadoop, NoSQL and Cloud Storage > > ## Issues: > There are no issues requiring board attention at this time. > > ## Membership Data: > Apache Drill was founded 2014-11-18 (5 years ago) > There are currently 55 committers and 24 PMC members in this project. > The Committer-to-PMC ratio is roughly 7:3. > > Community changes, past quarter: > - No new PMC members. Last addition was Sorabh Hamirwasia on 2019-04-04. > - No new committers. Last addition was Anton Gozhiy on 2019-07-22. > > ## Project Activity: > - Drill 1.16 was released on 2019-05-02. > - Drill 1.17 was delayed until end of November. 
> > ### Next Release > The next release of Drill (1.17) resolved many issues and added a lot of > new > functionality including: > - Enhanced Drill metastore > - Numerous enhancements and upgrades to Drill with Hive > - Format plugin for Excel Files > - Format plugin for ESRI Shape Files > - Add Variable Argument UDFs > - Add UDF to parse user agent strings > > ### Future Functionality in Development > There are a number of enhancements for which there are active PRs or > discussions > on the various boards. > - Integration between Apache Drill and Apache Daffodil (Incubating) > - Storage plugin for Apache Druid > - Upgrading Drill to use Hadoop v. 3.0 > - Format plugin for HDF5 > > > ## Community Health: > Drill seems to be recovering from the collapse of Drill's major backer > MapR. > > ### Development Activity > - 96 issues opened in JIRA (1% increase from last quarter) > - 85 issues closed in JIRA (28% increase from last quarter) > - 55 commits in past quarter (14% increase from last quarter) > - 15 contributors from last quarter (25% increase) > - 53 PRs opened on GitHub (no change from last quarter) > - 63 PRs closed on GitHub (no change from last quarter) > > ### Email Lists > - dev@drill.apache.org <mailto:dev@drill.apache.org> > - 46% increase in traffic in past quarter (1574 compared to 1073) > > - iss...@drill.apache.org <mailto:iss...@drill.apache.org> > - 47% increase in traffic in past quarter (2027 compared to 1377) > > - us...@drill.apache.org <mailto:us...@drill.apache.org> > - 27% decrease in traffic in past quarter (116 compared to 157)
Re: [DISCUSS] 1.17.0 release
I agree with you that this Jira is an important one, but I don't think that PR for DRILL-6540 will be included in this release since the PR itself is not completed and there are a lot of things which should be checked before merging this PR. I think it is better to have a stable version with known limitations than an unchecked new one. But we definitely should include it to the next 1.18.0 release. Kind regards, Volodymyr Vysotskyi On Tue, Nov 5, 2019 at 3:46 PM Charles Givre wrote: > One other question... > Are there any parts of Dril-6540 that we could include in version 1.17? > IMHO, this is an important one. > > Regards, > -- C > > > On Nov 4, 2019, at 2:08 PM, Vova Vysotskyi wrote: > > > > Hi Charles, > > > > Thanks for pointing to this Jira. I'm not sure that we can update most of > > the libraries pointed there considering current project dependencies. > > I'll add a comment to this Jira ticket with my thoughts. > > > > Kind regards, > > Volodymyr Vysotskyi > > > > > > On Mon, Nov 4, 2019 at 8:59 PM Charles Givre wrote: > > > >> Hi Volodymyr > >> I'd like to see if we can get some or all of DRILL-7416 in as well. > This > >> is a security update which is important IMHO. > >> Thanks, > >> -- C > >> > >>> On Nov 4, 2019, at 1:57 PM, Volodymyr Vysotskyi > >> wrote: > >>> > >>> Hello Drillers, > >>> > >>> It's about 6 months have passed since the previous release and its time > >> to > >>> discuss and start planning for the 1.17.0. > >>> I volunteer to manage the new release. > >>> > >>> We have 6 Jira tickets with "reviewable" status, 2 "in progress" and 6 > >> open > >>> tickets [1]. Jira tickets marked as ready to commit will be merged soon > >> and > >>> I hope other PRs from this list will be completed before the cut-off > >> date. > >>> > >>> Among these tickets, I want to include DRILL-7273 [2] to the release > >> (pull > >>> request is already opened, but CR comments should be addressed). 
> >>> > >>> I would like to propose a preliminary release cut-off date as the > middle > >> of > >>> the next week (Nov, 13) or the beginning of the week after that (Nov, > >> 18). > >>> > >>> Please let me know if there are any other Jira tickets you working on > >> which > >>> should be included in this release. > >>> > >>> [1] > >>> > >> > https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=185=DRILL=1425 > >>> [2] https://issues.apache.org/jira/browse/DRILL-7273 > >>> > >>> Kind regards, > >>> Volodymyr Vysotskyi > >> > >> > >
[DISCUSS] 1.17.0 release
Hello Drillers, About 6 months have passed since the previous release, and it's time to discuss and start planning for 1.17.0. I volunteer to manage the new release. We have 6 Jira tickets with "reviewable" status, 2 "in progress" and 6 open tickets [1]. Jira tickets marked as ready to commit will be merged soon, and I hope the other PRs from this list will be completed before the cut-off date. Among these tickets, I want to include DRILL-7273 [2] in the release (the pull request is already open, but CR comments still need to be addressed). I would like to propose a preliminary release cut-off date of the middle of next week (Nov 13) or the beginning of the week after that (Nov 18). Please let me know if there are any other Jira tickets you are working on which should be included in this release. [1] https://issues.apache.org/jira/secure/RapidBoard.jspa?rapidView=185=DRILL=1425 [2] https://issues.apache.org/jira/browse/DRILL-7273 Kind regards, Volodymyr Vysotskyi
Re: Next Release?
Hi all, I'm going to open a PR for DRILL-7273 <https://issues.apache.org/jira/browse/DRILL-7273> next week; it is the last thing blocking the release. Sorry for one more delay. I'm tentatively planning to start the release process on October 28, so please let me know if there is something else that should be included in the release, and make sure that the features you are working on and want to include are finished before that time. Kind regards, Volodymyr Vysotskyi On Wed, Sep 25, 2019 at 11:30 AM Volodymyr Vysotskyi wrote: > Thanks for moving this topic forward, yes, I think mid-October is > achievable date. > I'll start a pre-release discussion when will wrap up major things for > metastore. > > Kind regards, > Volodymyr Vysotskyi > > > On Mon, Sep 23, 2019 at 4:46 PM Charles Givre wrote: > >> That sounds good to me. It seems to me that there are several PRs which >> are relatively simple and could be cleared off the board as well. >> -- C >> >> >> > On Sep 23, 2019, at 8:41 AM, Arina Yelchiyeva < >> arina.yelchiy...@gmail.com> wrote: >> > >> > Metastore work was aimed to be included in this release, since delivery >> date was shifted due to larger scope of work than expected, we did not push >> for the release until it’s done but I think mid October is achievable due >> date. Volodymyr any thoughts? >> > >> > Kind regards, >> > Arina >> > >> >> On 23 Sep 2019, at 15:09, Charles Givre wrote: >> >> >> >> Hello All, >> >> I wanted to ask if we can start thinking about our next release? I >> seem to recall that there was discussion around a new release around >> mid-September which clearly didn't happen. So... What if we were to shoot >> for mid-October? >> >> -- C >> >>
Re: Next Release?
Thanks for moving this topic forward. Yes, I think mid-October is an achievable date. I'll start a pre-release discussion once we wrap up the major things for the metastore. Kind regards, Volodymyr Vysotskyi On Mon, Sep 23, 2019 at 4:46 PM Charles Givre wrote: > That sounds good to me. It seems to me that there are several PRs which > are relatively simple and could be cleared off the board as well. > -- C > > > > On Sep 23, 2019, at 8:41 AM, Arina Yelchiyeva < > arina.yelchiy...@gmail.com> wrote: > > > > Metastore work was aimed to be included in this release, since delivery > date was shifted due to larger scope of work than expected, we did not push > for the release until it’s done but I think mid October is achievable due > date. Volodymyr any thoughts? > > > > Kind regards, > > Arina > > > >> On 23 Sep 2019, at 15:09, Charles Givre wrote: > >> > >> Hello All, > >> I wanted to ask if we can start thinking about our next release? I > seem to recall that there was discussion around a new release around > mid-September which clearly didn't happen. So... What if we were to shoot > for mid-October? > >> -- C > >
[jira] [Created] (DRILL-7376) Drill ignores Hive schema for MaprDB tables when group scan has star column
Volodymyr Vysotskyi created DRILL-7376: -- Summary: Drill ignores Hive schema for MaprDB tables when group scan has star column Key: DRILL-7376 URL: https://issues.apache.org/jira/browse/DRILL-7376 Project: Apache Drill Issue Type: Bug Affects Versions: 1.17.0 Reporter: Volodymyr Vysotskyi Assignee: Volodymyr Vysotskyi Fix For: 1.17.0 For the case when group scan has star column, fix for DRILL-7369 doesn't work and hive schema is ignored. -- This message was sent by Atlassian Jira (v8.3.2#803003)
[jira] [Created] (DRILL-7372) MethodAnalyzer consumes too many memory
Volodymyr Vysotskyi created DRILL-7372:
--
Summary: MethodAnalyzer consumes too many memory
Key: DRILL-7372
URL: https://issues.apache.org/jira/browse/DRILL-7372
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.16.0
Reporter: Volodymyr Vysotskyi
Assignee: Volodymyr Vysotskyi
Fix For: 1.17.0

DRILL-6524 added logic for determining whether a variable is assigned in a conditional block, to prevent incorrect scalar replacement in such cases. For some queries, however, this logic consumes too much memory, for example:
{code:sql}
SELECT *
FROM cp.`employee.json`
WHERE employee_id+0 < employee_id OR employee_id+1 < employee_id
AND employee_id+2 < employee_id OR employee_id+3 < employee_id
AND employee_id+4 < employee_id OR employee_id+5 < employee_id
AND employee_id+6 < employee_id OR employee_id+7 < employee_id
AND employee_id+8 < employee_id OR employee_id+9 < employee_id
AND employee_id+10 < employee_id OR employee_id+11 < employee_id
AND employee_id+12 < employee_id OR employee_id+13 < employee_id
AND employee_id+14 < employee_id OR employee_id+15 < employee_id
AND employee_id+16 < employee_id OR employee_id+17 < employee_id
AND employee_id+18 < employee_id OR employee_id+19 < employee_id
AND employee_id+20 < employee_id OR employee_id+21 < employee_id
AND employee_id+22 < employee_id OR employee_id+23 < employee_id
AND employee_id+24 < employee_id OR employee_id+25 < employee_id
AND employee_id+26 < employee_id OR employee_id+27 < employee_id
AND employee_id+28 < employee_id OR employee_id+29 < employee_id
AND employee_id+30 < employee_id OR employee_id+31 < employee_id
AND employee_id+32 < employee_id OR employee_id+33 < employee_id
AND employee_id+34 < employee_id OR employee_id+35 < employee_id
AND employee_id+36 < employee_id OR employee_id+37 < employee_id
AND employee_id+38 < employee_id OR employee_id+39 < employee_id
AND employee_id+40 < employee_id OR employee_id+41 < employee_id
AND employee_id+42 < employee_id OR employee_id+43 < employee_id
AND employee_id+44 < employee_id OR employee_id+45 < employee_id
AND employee_id+46 < employee_id OR employee_id+47 < employee_id
AND employee_id+48 < employee_id OR employee_id+49 < employee_id
AND TRUE;
{code}
For this query, Drill consumes more than 6 GB of memory.

One of the things to fix is to replace {{Deque> localVariablesSet;}} with {{Deque}}, which will reduce memory usage significantly. Additionally, it should be investigated why these objects cannot be collected by GC.
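To illustrate why the container choice can matter for this kind of analysis, here is a minimal sketch. It rests on assumptions: the generic parameters in the description above appear to have been lost by the mail archive, so the sketch simply assumes that each conditional block tracks a set of local variable slots and stores it as a `java.util.BitSet`. The class `AssignedLocalsTracker` and all of its methods are hypothetical and are not taken from Drill's `MethodAnalyzer`.

```java
import java.util.ArrayDeque;
import java.util.BitSet;
import java.util.Deque;

/**
 * Hypothetical sketch, not Drill code: tracks which local variable slots
 * were assigned inside nested conditional blocks, one BitSet per block.
 */
class AssignedLocalsTracker {
  // One BitSet per open conditional block; bit i set means slot i was assigned there.
  private final Deque<BitSet> localVariablesSet = new ArrayDeque<>();

  void enterConditionalBlock() {
    localVariablesSet.push(new BitSet());
  }

  void exitConditionalBlock() {
    localVariablesSet.pop();
  }

  /** Records an assignment in the innermost open block; ignored outside any block. */
  void markAssigned(int localSlot) {
    BitSet current = localVariablesSet.peek();
    if (current != null) {
      current.set(localSlot);
    }
  }

  /** Returns true if the slot was assigned in any currently open conditional block. */
  boolean isAssignedInConditionalBlock(int localSlot) {
    for (BitSet block : localVariablesSet) {
      if (block.get(localSlot)) {
        return true;
      }
    }
    return false;
  }

  public static void main(String[] args) {
    AssignedLocalsTracker tracker = new AssignedLocalsTracker();
    tracker.enterConditionalBlock();
    tracker.markAssigned(3);
    System.out.println(tracker.isAssignedInConditionalBlock(3));
    tracker.exitConditionalBlock();
    System.out.println(tracker.isAssignedInConditionalBlock(3));
  }
}
```

A `BitSet` keeps one bit per slot in a backing `long[]`, whereas a hash-based set of boxed integers needs an entry object per element, so the per-block footprint is usually much smaller for small, dense slot indices. Whether this matches the actual fix is not confirmed by the message.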
[jira] [Created] (DRILL-7369) Schema for MaprDB tables is not used for the case when some field are queried
Volodymyr Vysotskyi created DRILL-7369:
--
Summary: Schema for MaprDB tables is not used for the case when some field are queried
Key: DRILL-7369
URL: https://issues.apache.org/jira/browse/DRILL-7369
Project: Apache Drill
Issue Type: Bug
Components: Storage - MapRDB
Affects Versions: 1.17.0
Reporter: Volodymyr Vysotskyi
Assignee: Volodymyr Vysotskyi
Fix For: 1.17.0

DRILL-7313 allowed using the Hive schema for the MaprDB native reader when a field was empty. However, the current code creates additional fields from the schema only when the incoming map is empty. The aim of this Jira is to allow using the provided schema when some fields are already present.
Re: Build failed in Jenkins: drill-scm #1050
Hi all, I have created INFRA-18945 <https://issues.apache.org/jira/browse/INFRA-18945> to address this failure. Kind regards, Volodymyr Vysotskyi On Tue, Aug 27, 2019 at 1:27 PM Apache Jenkins Server < jenk...@builds.apache.org> wrote: > See < > https://builds.apache.org/job/drill-scm/1050/display/redirect?page=changes > > > > Changes: > > [arina.yelchiyeva] DRILL-7156: Support empty Parquet files creation > > [arina.yelchiyeva] DRILL-7356: Introduce session options for the Drill > Metastore > > [arina.yelchiyeva] DRILL-7326: Support repeated lists for CTAS parquet > format > > [arina.yelchiyeva] DRILL-7339: Iceberg commit upgrade and Metastore tests > categorization > > [arina.yelchiyeva] DRILL-7222: Visualize estimated and actual row counts > for a query > > [arina.yelchiyeva] DRILL-7353: Wrong driver class is written to the > java.sql.Driver > > -- > Started by an SCM change > [EnvInject] - Loading node environment variables. > Building remotely on H24 (ubuntu xenial) in workspace < > https://builds.apache.org/job/drill-scm/ws/> > No credentials specified > Cloning the remote Git repository > Cloning repository https://git-wip-us.apache.org/repos/asf/drill.git > > git init <https://builds.apache.org/job/drill-scm/ws/> # timeout=10 > Fetching upstream changes from > https://git-wip-us.apache.org/repos/asf/drill.git > > git --version # timeout=10 > > git fetch --tags --progress > https://git-wip-us.apache.org/repos/asf/drill.git > +refs/heads/*:refs/remotes/origin/* > > git config remote.origin.url > https://git-wip-us.apache.org/repos/asf/drill.git # timeout=10 > > git config --add remote.origin.fetch > +refs/heads/*:refs/remotes/origin/* # timeout=10 > > git config remote.origin.url > https://git-wip-us.apache.org/repos/asf/drill.git # timeout=10 > Fetching upstream changes from > https://git-wip-us.apache.org/repos/asf/drill.git > > git fetch --tags --progress > https://git-wip-us.apache.org/repos/asf/drill.git > +refs/heads/*:refs/remotes/origin/* > > git 
rev-parse refs/remotes/origin/master^{commit} # timeout=10 > > git rev-parse refs/remotes/origin/origin/master^{commit} # timeout=10 > Checking out Revision 31a41995c3f708894cc77bad3b27ce72203c423c > (refs/remotes/origin/master) > > git config core.sparsecheckout # timeout=10 > > git checkout -f 31a41995c3f708894cc77bad3b27ce72203c423c > Commit message: "DRILL-7353: Wrong driver class is written to the > java.sql.Driver" > > git rev-list --no-walk 9c62bf1a91f611bdefa6f3a99e9dfbdf9b622413 # > timeout=10 > FATAL: Couldn't find any executable in > /home/jenkins/tools/maven/apache-maven-3.5.2 > Build step 'Invoke top-level Maven targets' marked build as failure > Archiving artifacts >
[jira] [Created] (DRILL-7356) Introduce new session option for enabling metastore
Volodymyr Vysotskyi created DRILL-7356: -- Summary: Introduce new session option for enabling metastore Key: DRILL-7356 URL: https://issues.apache.org/jira/browse/DRILL-7356 Project: Apache Drill Issue Type: Task Affects Versions: 1.17.0 Reporter: Volodymyr Vysotskyi Assignee: Volodymyr Vysotskyi Fix For: 1.17.0 Introduce a new session option for enabling metastore: {{metastore.enabled}}. It will be used during the next steps of development. -- This message was sent by Atlassian Jira (v8.3.2#803003)
Re: [ANNOUNCE] New PMC Chair of Apache Drill
Congratulations, Charles! And thanks Arina for your effort to grow the community and make the project much better! It will be hard to keep the bar so high, but for now, I know who tops my chart of PMC chairs :) Kind regards, Volodymyr Vysotskyi On Thu, Aug 22, 2019 at 10:39 AM Igor Guzenko wrote: > Congratulations, Charles! Great job! > > On Thu, Aug 22, 2019 at 10:28 AM Arina Ielchiieva > wrote: > > > Hi all, > > > > It has been a honor to serve as Drill Chair during the past year but it's > > high time for the new one... > > > > I am very pleased to announce that the Drill PMC has voted to elect > Charles > > Givre as the new PMC chair of Apache Drill. He has also been approved > > unanimously by the Apache Board in last board meeting. > > > > Congratulations, Charles! > > > > Kind regards, > > Arina > > >
[jira] [Created] (DRILL-7352) Introduce new checkstyle rules to make code style more consistent
Volodymyr Vysotskyi created DRILL-7352:
--
Summary: Introduce new checkstyle rules to make code style more consistent
Key: DRILL-7352
URL: https://issues.apache.org/jira/browse/DRILL-7352
Project: Apache Drill
Issue Type: Task
Reporter: Volodymyr Vysotskyi

List of rules to be enabled:
* [LeftCurly|https://checkstyle.sourceforge.io/config_blocks.html#LeftCurly] - force placement of a left curly brace at the end of the line.
* [RightCurly|https://checkstyle.sourceforge.io/config_blocks.html#RightCurly] - force placement of a right curly brace
* [NewlineAtEndOfFile|https://checkstyle.sourceforge.io/config_misc.html#NewlineAtEndOfFile]
* [UnnecessaryParentheses|https://checkstyle.sourceforge.io/config_coding.html#UnnecessaryParentheses]
* [MethodParamPad|https://checkstyle.sourceforge.io/config_whitespace.html#MethodParamPad]

and others

-- This message was sent by Atlassian JIRA (v7.6.14#76016)
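For readers unfamiliar with these checks, the following small class shows formatting that should satisfy each of them, assuming the checks run with their default options (the ticket does not say how they will be configured). The class `CheckstyleSample` and its `clamp` method are made up purely for illustration.

```java
/**
 * Illustrative only: formatting expected to pass the listed checks
 * when they are used with default settings.
 */
class CheckstyleSample {

  // MethodParamPad: no whitespace between the method name and '('.
  static int clamp(int value, int min, int max) {
    // LeftCurly: '{' is placed at the end of the line that opens the block.
    if (value < min) {
      return min;
    } else if (value > max) { // RightCurly: '}' stays on the same line as 'else'.
      return max;
    }
    // UnnecessaryParentheses: 'return value;' rather than 'return (value);'.
    return value;
  }

  public static void main(String[] args) {
    System.out.println(clamp(15, 0, 10));
  }
}
// NewlineAtEndOfFile: the source file must end with a line terminator.
```

By contrast, writing `clamp (15, 0, 10)`, putting `else` on its own line after `}`, or returning `(value)` would each be reported by one of the checks under those default options.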
[jira] [Created] (DRILL-7350) Move RowSet related classes from test folder
Volodymyr Vysotskyi created DRILL-7350: -- Summary: Move RowSet related classes from test folder Key: DRILL-7350 URL: https://issues.apache.org/jira/browse/DRILL-7350 Project: Apache Drill Issue Type: Task Reporter: Volodymyr Vysotskyi Assignee: Volodymyr Vysotskyi Fix For: 1.17.0 Move RowSet related classes from test folder to main to be able to use them for Metastore. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Resolved] (DRILL-7345) Strange Behavior for UDFs with ComplexWriter Output
[ https://issues.apache.org/jira/browse/DRILL-7345?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Volodymyr Vysotskyi resolved DRILL-7345. Resolution: Not A Problem Assignee: Volodymyr Vysotskyi Resolving this Jira since the described behavior is working as intended. > Strange Behavior for UDFs with ComplexWriter Output > --- > > Key: DRILL-7345 > URL: https://issues.apache.org/jira/browse/DRILL-7345 > Project: Apache Drill > Issue Type: Bug >Affects Versions: 1.17.0 >Reporter: Charles Givre > Assignee: Volodymyr Vysotskyi >Priority: Minor > > I wrote some UDFs recently and noticed some strange behavior when debugging > them. > This behavior only occurs when there is ComplexWriter as output. > Basically, if the input to the UDF is nullable, Drill doesn't recognize the > UDF at all. I've found that the only way to get Drill to recognize UDFs that > have ComplexWriters as output is: > * Use a non-nullable holder as input > * Remove the null setting completely from the function parameters. > This approach has a drawback in that if the function receives a null value, > it will throw an error and halt execution. My preference would be to allow > null handling, but I've not figured out how to make that happen. > Note: This behavior ONLY occurs when using a ComplexWriter as output. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
Re: August Apache Drill board report
The report looks good, +1 Kind regards, Volodymyr Vysotskyi On Fri, Aug 9, 2019 at 4:12 PM Arina Ielchiieva wrote: > Thanks everybody for the feedback, made changes according to the feedbacks > and submitted the report. > Final report draft: > > ## Description: > - Drill is a Schema-free SQL Query Engine for Hadoop, NoSQL and Cloud >Storage. > > ## Issues: > - There are no issues requiring board attention at this time. > > ## Activity: > - Drill User Meetup was held on May 22, 2019. > - Drill 1.17.0 release is planned in the end of August / beginning of > September. > > ## Health report: > - Development activity is almost 50% down due to acquisition of one of the > main Drill vendors. > - Activity on the dev and user mailing lists is slightly down compared to > previous periods. > - Four committers were added in the last period. > > ## PMC changes: > > - Currently 24 PMC members. > - No new PMC members added in the last 3 months > - Last PMC addition was Sorabh Hamirwasia on Fri Apr 05 2019 > > ## Committer base changes: > > - Currently 55 committers. 
> - New commmitters: > - Anton Gozhiy was added as a committer on Mon Jul 22 2019 > - Bohdan Kazydub was added as a committer on Mon Jul 15 2019 > - Igor Guzenko was added as a committer on Mon Jul 22 2019 > - Venkata Jyothsna Donapati was added as a committer on Mon May 13 2019 > > ## Releases: > > - Last release was 1.16.0 on Thu May 02 2019 > > ## Mailing list activity: > > - dev@drill.apache.org: > - 403 subscribers (down -5 in the last 3 months): > - 1156 emails sent to list ( in previous quarter) > > - iss...@drill.apache.org: > - 17 subscribers (up 0 in the last 3 months): > - 1496 emails sent to list (2315 in previous quarter) > > - u...@drill.apache.org: > - 575 subscribers (down -6 in the last 3 months): > - 157 emails sent to list (230 in previous quarter) > > ## JIRA activity: > > - 96 JIRA tickets created in the last 3 months > - 68 JIRA tickets closed/resolved in the last 3 months > > > On Thu, Aug 8, 2019 at 9:38 PM Aman Sinha wrote: > > > Thanks for putting this together Arina. One minor comment is that for a > > future release do we need to mention the feature set ? > > Typically we would enumerate those in the next board report after the > > release has happened. > > > > Aman > > > > On Thu, Aug 8, 2019 at 10:00 AM Sorabh Hamirwasia > > wrote: > > > > > Hi Arina, > > > Overall report looks good. One minor thing: > > > - Drill User Meetup was be held on May 22, 2019. > > > > > > > > > Thanks, > > > > > > Sorabh > > > > > > On Thu, Aug 8, 2019 at 7:05 AM Arina Ielchiieva > > wrote: > > > > > > > Hi all, > > > > > > > > please take a look at the draft board report for the last quarter and > > let > > > > me know if you have any comments. > > > > > > > > Thanks, > > > > Arina > > > > > > > > = > > > > > > > > ## Description: > > > > - Drill is a Schema-free SQL Query Engine for Hadoop, NoSQL and > Cloud > > > >Storage. > > > > > > > > ## Issues: > > > > - There are no issues requiring board attention at this time. 
> > > > > > > > ## Activity: > > > > - Drill User Meetup was be held on May 22, 2019. > > > > - Drill 1.17.0 release is planned in the end of August / beginning > > > > September, > > > >it will include the following improvements: > > > >- Drill Metastore implementation based on Iceberg tables and > > > integration > > > >- Hive arrays / structs support > > > >- Canonical Map support > > > >- Vararg UDFs support > > > >- Run-time row group pruning > > > >- Schema provisioning via table function > > > >- Empty parquet files read / write support > > > > > > > > ## Health report: > > > > - Development activity is almost 50% down due to acquisition one of > > the > > > > main Drill vendors. > > > > - Activity on the dev and user mailing lists is slightly down > compared > > > to > > > > previous periods. > > > > - Four committers were added in the last period. > > > > > > > > ## PMC changes: > > > > > > > > - Currently 24 PMC members. > > > >
[jira] [Resolved] (DRILL-7321) split function doesn't work without from
[ https://issues.apache.org/jira/browse/DRILL-7321?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Volodymyr Vysotskyi resolved DRILL-7321.
Resolution: Fixed
Fix Version/s: 1.17.0

Fixed in the scope of DRILL-7337.

> split function doesn't work without from
>
> Key: DRILL-7321
> URL: https://issues.apache.org/jira/browse/DRILL-7321
> Project: Apache Drill
> Issue Type: Bug
> Components: Functions - Drill
> Affects Versions: 1.16.0
> Reporter: benj
> Assignee: Volodymyr Vysotskyi
> Priority: Minor
> Fix For: 1.17.0
>
> {code:java}
> SELECT upper('foo') AS a /* OK */;
> +-----+
> | a   |
> +-----+
> | foo |
> +-----+
> {code}
> but
> {code:java}
> SELECT split('foo,bar,buz',',') AS a /* NOK */;
> Error: PLAN ERROR: Failure while materializing expression in constant expression evaluator [SPLIT('foo,bar,buz', ',')]. Errors:
> Error in expression at index -1. Error: Only ProjectRecordBatch could have complex writer function. You are using complex writer function split in a non-project operation!. Full expression: --UNKNOWN EXPRESSION--.{code}
> Note that
> {code:java}
> SELECT split(a,',') AS a FROM (SELECT 'foo,bar,buz' AS a) /* OK */;
> +---------------------+
> | a                   |
> +---------------------+
> | ["foo","bar","buz"] |
> +---------------------+
> {code}
[jira] [Created] (DRILL-7340) Filter is not pushed to JDBC database when several databases are used in the query
Volodymyr Vysotskyi created DRILL-7340:
--
Summary: Filter is not pushed to JDBC database when several databases are used in the query
Key: DRILL-7340
URL: https://issues.apache.org/jira/browse/DRILL-7340
Project: Apache Drill
Issue Type: Bug
Components: Storage - JDBC
Affects Versions: 1.16.0
Reporter: Volodymyr Vysotskyi
Fix For: Future

When several databases are used in a query, some rules are not added to the rule set for one of the conventions. This is observed in queries similar to the following:
{code:sql}
select * from mysql.`drill_mysql_test`.person t1
INNER JOIN h2.drill_h2_test.person t2 on t1.person_id = t2.person_id
where t1.first_name = 'first_name_1' and t2.last_name = 'last_name_1'
{code}
The plan for this query is the following:
{noformat}
00-00    Screen
00-01      Project(person_id=[$0], first_name=[$1], last_name=[$2], address=[$3], city=[$4], state=[$5], zip=[$6], json=[$7], bigint_field=[$8], smallint_field=[$9], numeric_field=[$10], boolean_field=[$11], double_field=[$12], float_field=[$13], real_field=[$14], time_field=[$15], timestamp_field=[$16], date_field=[$17], datetime_field=[$18], year_field=[$19], text_field=[$20], tiny_text_field=[$21], medium_text_field=[$22], long_text_field=[$23], blob_field=[$24], bit_field=[$25], enum_field=[$26], PERSON_ID0=[$27], FIRST_NAME0=[$28], LAST_NAME0=[$29], ADDRESS0=[$30], CITY0=[$31], STATE0=[$32], ZIP0=[$33], JSON0=[$34], BIGINT_FIELD0=[$35], SMALLINT_FIELD0=[$36], NUMERIC_FIELD0=[$37], BOOLEAN_FIELD0=[$38], DOUBLE_FIELD0=[$39], FLOAT_FIELD0=[$40], REAL_FIELD0=[$41], TIME_FIELD0=[$42], TIMESTAMP_FIELD0=[$43], DATE_FIELD0=[$44], CLOB_FIELD=[$45])
00-02        HashJoin(condition=[=($0, $27)], joinType=[inner], semi-join: =[false])
00-03          Project(PERSON_ID0=[$0], FIRST_NAME0=[$1], LAST_NAME0=[$2], ADDRESS0=[$3], CITY0=[$4], STATE0=[$5], ZIP0=[$6], JSON0=[$7], BIGINT_FIELD0=[$8], SMALLINT_FIELD0=[$9], NUMERIC_FIELD0=[$10], BOOLEAN_FIELD0=[$11], DOUBLE_FIELD0=[$12], FLOAT_FIELD0=[$13], REAL_FIELD0=[$14], TIME_FIELD0=[$15], TIMESTAMP_FIELD0=[$16], DATE_FIELD0=[$17], CLOB_FIELD=[$18])
00-05            SelectionVectorRemover
00-06              Filter(condition=[=($2, 'last_name_1')])
00-07                Jdbc(sql=[SELECT * FROM "TMP"."DRILL_H2_TEST"."PERSON" ])
00-04          Jdbc(sql=[SELECT * FROM `drill_mysql_test`.`person` WHERE `first_name` = 'first_name_1' ])
{noformat}
{{DrillJdbcFilterRule}} wasn't applied for the H2 convention, so the Filter wasn't pushed to the H2 database.

This issue may be fixed by specifying {{JdbcConvention}} in the rule descriptions of Drill's {{DrillJdbcFilterRule}} and {{DrillJdbcProjectRule}}; the other rules should be fixed in Calcite in the scope of CALCITE-3115.
[jira] [Created] (DRILL-7337) Add vararg UDFs support
Volodymyr Vysotskyi created DRILL-7337:
--
Summary: Add vararg UDFs support
Key: DRILL-7337
URL: https://issues.apache.org/jira/browse/DRILL-7337
Project: Apache Drill
Issue Type: Sub-task
Affects Versions: 1.16.0
Reporter: Volodymyr Vysotskyi
Assignee: Volodymyr Vysotskyi
Fix For: 1.17.0

The aim of this Jira is to add support for vararg UDFs, to simplify UDF creation when a function must accept a varying number of arguments.

h2. Requirements for vararg UDFs:
* It should be possible to register vararg UDFs with the same name, but with different argument types;
* Only vararg UDFs with a single variable-length argument placed after all other arguments should be allowed;
* A vararg UDF should have lower priority than a regular one when both are suitable;
* Besides simple functions, vararg support should be added to aggregate functions.

h2. Implementation details
The lifecycle of a UDF is the following:
* The UDF is validated in the {{FunctionConverter}} class, and if there is no problem (the UDF has the required fields with the required types, the required annotations, etc.), it is converted to a {{DrillFuncHolder}} to be registered in the function registry. Also, the corresponding {{SqlFunction}} instances are created based on the {{DrillFuncHolder}} to be used in Calcite;
* When a query uses this UDF, Calcite validates that a UDF with the required name, number of arguments and argument types exists (for Drill, argument types are not checked at this stage);
* After Calcite has found the required {{SqlFunction}} instance, it uses Drill to find the required {{DrillFuncHolder}}. All the work of determining the most suitable function is done in {{FunctionResolver}} and in {{TypeCastRules.getCost()}};
* At the execution stage, the {{DrillFuncHolder}} is found again using the {{FunctionCall}} instance;
* The {{DrillFuncHolder}} is used for code generation.

Considering these steps, the first thing to be done to add support for vararg UDFs is updating the logic in {{FunctionConverter}} to allow registering vararg UDFs, taking into account the requirements declared above.
Calcite uses {{SqlOperandTypeChecker}} to verify the number of arguments, so Drill should provide its own implementation for vararg UDFs to be able to use them.
To determine whether a UDF is vararg, a new {{isVarArg}} property will be added to {{FunctionTemplate}}.
The {{TypeCastRules.getCost()}} method should be updated to be able to find vararg UDFs and to prioritize regular UDFs.
Code generation logic should be updated to handle vararg UDFs. Generated code for a vararg argument will look like the following:
{code:java}
NullableVarCharHolder[] inputs = new NullableVarCharHolder[3];
inputs[0] = out14;
inputs[1] = out19;
inputs[2] = out24;
{code}
To create a custom vararg UDF, the new {{isVarArg}} property should be set to {{true}} in {{FunctionTemplate}}. After that, the required vararg input should be declared as an array.
Here is an example of a vararg UDF:
{code:java}
import io.netty.buffer.DrillBuf;
import javax.inject.Inject;
import org.apache.drill.exec.expr.DrillSimpleFunc;
import org.apache.drill.exec.expr.annotations.FunctionTemplate;
import org.apache.drill.exec.expr.annotations.Output;
import org.apache.drill.exec.expr.annotations.Param;
import org.apache.drill.exec.expr.holders.VarCharHolder;

@FunctionTemplate(name = "concat_varchar",
                  isVarArg = true,
                  scope = FunctionTemplate.FunctionScope.SIMPLE)
public class VarCharConcatFunction implements DrillSimpleFunc {
  // The vararg input is declared as an array of value holders.
  @Param VarCharHolder[] inputs;
  @Output VarCharHolder out;
  @Inject DrillBuf buffer;

  @Override
  public void setup() {
  }

  @Override
  public void eval() {
    // Compute the total length of all inputs to size the output buffer.
    int length = 0;
    for (VarCharHolder input : inputs) {
      length += input.end - input.start;
    }

    out.buffer = buffer = buffer.reallocIfNeeded(length);
    out.start = out.end = 0;

    // Copy the bytes of every input into the output buffer, one after another.
    for (VarCharHolder input : inputs) {
      for (int id = input.start; id < input.end; id++) {
        out.buffer.setByte(out.end++, input.buffer.getByte(id));
      }
    }
  }
}
{code}

h2. Limitations connected with VarArg UDFs:
* The null handling specified in {{FunctionTemplate}} does not affect vararg parameters, i.e. the user should add UDFs with non-nullable and nullable value holder vararg fields;
* For value holder vararg fields, VarArg UDFs support only vararg arguments of the same type, including nullability. If the vararg field is a FieldReader, all the responsibility for handling types and nullability of input vararg fields is placed on the UDF implementation;
* Scalar replacement does not happen for vararg arguments;
* The UDF implementation should consider the case when the vararg field is empty.

-- This message was sent by Atlassian JIRA (v7.6.14#76016)
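The priority requirement above (a regular UDF wins over a vararg one when both are suitable) can be illustrated with a small self-contained sketch. It is only a model of cost-based resolution under stated assumptions: {{FuncSignature}}, {{VarArgResolutionSketch}} and the cost values 0 and 1 are hypothetical names and numbers chosen for illustration, not Drill's {{FunctionResolver}} or {{TypeCastRules.getCost()}}, and implicit casts are ignored.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical model of a registered function; this is NOT Drill's DrillFuncHolder. */
final class FuncSignature {
  final String label;
  final List<String> fixedTypes; // types of the leading regular parameters
  final String varArgType;       // element type of the trailing vararg parameter, or null

  FuncSignature(String label, List<String> fixedTypes, String varArgType) {
    this.label = label;
    this.fixedTypes = fixedTypes;
    this.varArgType = varArgType;
  }

  /** Returns a cost (lower is better), or -1 when the call does not match. */
  int cost(List<String> argTypes) {
    if (varArgType == null) {
      // Regular function: the argument list must match exactly.
      return fixedTypes.equals(argTypes) ? 0 : -1;
    }
    if (argTypes.size() < fixedTypes.size()
        || !fixedTypes.equals(argTypes.subList(0, fixedTypes.size()))) {
      return -1;
    }
    // All remaining arguments must have the vararg element type (possibly none at all).
    for (String type : argTypes.subList(fixedTypes.size(), argTypes.size())) {
      if (!varArgType.equals(type)) {
        return -1;
      }
    }
    // Assumed penalty: a vararg match costs more, so an exact regular match wins.
    return 1;
  }
}

final class VarArgResolutionSketch {
  /** Picks the matching function with the lowest cost, or null when nothing matches. */
  static FuncSignature resolve(List<FuncSignature> registry, List<String> argTypes) {
    FuncSignature best = null;
    int bestCost = Integer.MAX_VALUE;
    for (FuncSignature candidate : registry) {
      int cost = candidate.cost(argTypes);
      if (cost >= 0 && cost < bestCost) {
        best = candidate;
        bestCost = cost;
      }
    }
    return best;
  }

  public static void main(String[] args) {
    List<FuncSignature> registry = Arrays.asList(
        new FuncSignature("concat(VARCHAR, VARCHAR)", Arrays.asList("VARCHAR", "VARCHAR"), null),
        new FuncSignature("concat(VARCHAR...)", Collections.<String>emptyList(), "VARCHAR"));

    // Both functions are suitable for two arguments; the regular one is chosen.
    System.out.println(resolve(registry, Arrays.asList("VARCHAR", "VARCHAR")).label);
    // Only the vararg function accepts three arguments.
    System.out.println(resolve(registry, Arrays.asList("VARCHAR", "VARCHAR", "VARCHAR")).label);
  }
}
```

In this model a call with no vararg arguments still matches the vararg function, which mirrors the last limitation listed above: the UDF implementation has to cope with an empty vararg field.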
[ANNOUNCE] New Committer: Anton Gozhyi
The Project Management Committee (PMC) for Apache Drill has invited Anton Gozhyi to become a committer, and we are pleased to announce that he has accepted. Anton Gozhyi has been contributing to Drill for more than a year and a half. He has made significant contributions in QA, including reporting non-trivial issues and working on the automation of Drill tests. All the issues reported by Anton have a clear description of the problem, steps to reproduce and expected behavior. Besides his QA contributions, Anton has made high-quality fixes to Drill. Welcome Anton, and thank you for your contributions! - Volodymyr (on behalf of Drill PMC)
Re: [DISCUSS]: Adding Issue Label Bot to Drill Github
Hi Charles,

Superset uses GitHub issues for tracking bugs, features, improvements, etc. So using an issue label bot is justified there, since it helps to highlight different kinds of issues. But Drill uses Jira for these purposes, so I'm not sure that we should add this to Drill.

Kind regards,
Volodymyr Vysotskyi

On Sat, Jul 27, 2019 at 12:23 AM Charles Givre wrote:
> All,
> I've been following the Apache Superset project and they use an issue
> label bot (https://github.com/marketplace/issue-label-bot) which, as the name
> implies, adds issue labels such as "Question", "Size" etc. automatically.
> What does the community think about adding this to Drill?
> Thanks,
> -- C
Re: [ANNOUNCE] New Committer: Bohdan Kazydub
Congratulations, Bohdan! Thanks for your contributions! Kind regards, Volodymyr Vysotskyi On Thu, Jul 18, 2019 at 4:55 PM Kunal Khatua wrote: > Congratulations, Bohdan! > On 7/16/2019 2:21:14 PM, Robert Hou wrote: > Congratulations, Bohdan. Thanks for contributing to Drill! > > --Robert > > On Tue, Jul 16, 2019 at 11:50 AM hanu mapr wrote: > > > Congratulations Bohdan! > > > > On Tue, Jul 16, 2019 at 9:30 AM Gautam Parai wrote: > > > > > Congratulations Bohdan! > > > > > > Gautam > > > > > > On Mon, Jul 15, 2019 at 11:53 PM Bohdan Kazydub > > bohdan.kazy...@gmail.com> > > > wrote: > > > > > > > Thank you all for your support! > > > > > > > > On Tue, Jul 16, 2019 at 4:16 AM weijie tong > > > > wrote: > > > > > > > > > Congrats Bohdan! > > > > > > > > > > On Tue, Jul 16, 2019 at 12:54 AM Vitalii Diravka > > > > > > > > wrote: > > > > > > > > > > > Congrats Bohdan! Well deserved! > > > > > > > > > > > > Kind regards > > > > > > Vitalii > > > > > > > > > > > > > > > > > > On Mon, Jul 15, 2019 at 6:48 PM Paul Rogers > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > Congrats Bohdan! > > > > > > > - Paul > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Monday, July 15, 2019, 01:08:04 AM PDT, Arina Ielchiieva > > > > > > > ar...@apache.org> wrote: > > > > > > > > > > > > > > The Project Management Committee (PMC) for Apache Drill has > > > invited > > > > > > Bohdan > > > > > > > Kazydub to become a committer, and we are pleased to announce > > that > > > he > > > > > has > > > > > > > accepted. > > > > > > > > > > > > > > Bohdan has been contributing into Drill for more than a year. > His > > > > > > > contributions include > > > > > > > logging and various functions handling improvements, planning > > > > > > optimizations > > > > > > > and S3 improvements / fixes. His recent work includes Calcite > > 1.19 > > > / > > > > > 1.20 > > > > > > > [DRILL-7200] and implementation of canonical Map > > [DRILL-7096]. 
> > > > > > > > > > > > > > Welcome Bohdan, and thank you for your contributions! > > > > > > > > > > > > > > - Arina > > > > > > > (on behalf of the Apache Drill PMC) > > > > > > > > > > > > > > > > > > > > > > > > > > > >
Re: [ANNOUNCE] New Committer: Igor Guzenko
Congratulations, Ihor! Thanks for your contributions! Kind regards, Volodymyr Vysotskyi On Mon, Jul 22, 2019 at 5:02 PM Arina Ielchiieva wrote: > The Project Management Committee (PMC) for Apache Drill has invited Igor > Guzenko to become a committer, and we are pleased to announce that he has > accepted. > > Igor has been contributing into Drill for 9 months and made a number of > significant contributions, including cross join syntax support, Hive views > support, as well as improving performance for Hive show schema and unit > tests. Currently he is working on supporting Hive complex types > [DRILL-3290]. He already added support for list type and working on struct > and canonical map. > > Welcome Igor, and thank you for your contributions! > > - Arina > (on behalf of the Apache Drill PMC) >
[jira] [Resolved] (DRILL-4123) DirectoryExplorers should refer to fully qualified variable names
[ https://issues.apache.org/jira/browse/DRILL-4123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Volodymyr Vysotskyi resolved DRILL-4123. Resolution: Fixed Fixed in DRILL-3944. > DirectoryExplorers should refer to fully qualified variable names > - > > Key: DRILL-4123 > URL: https://issues.apache.org/jira/browse/DRILL-4123 > Project: Apache Drill > Issue Type: Bug >Reporter: Hanifi Gunes >Assignee: Hanifi Gunes >Priority: Major > > Execution fails with {code}CompileException: Line 75, Column 70: Unknown > variable or type "FILE_SEPARATOR"{code} in case a directory explorer is used > in a projection. Also FILE_SEPARATOR should not be platform dependent. -- This message was sent by Atlassian JIRA (v7.6.14#76016)
[jira] [Resolved] (DRILL-3676) Group by ordinal number of an output column results in parse error
[ https://issues.apache.org/jira/browse/DRILL-3676?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Volodymyr Vysotskyi resolved DRILL-3676. Resolution: Fixed Fix Version/s: (was: Future) 1.17.0 Fixed in DRILL-1248. > Group by ordinal number of an output column results in parse error > -- > > Key: DRILL-3676 > URL: https://issues.apache.org/jira/browse/DRILL-3676 > Project: Apache Drill > Issue Type: Bug > Components: Query Planning Optimization >Affects Versions: 1.2.0 >Reporter: Khurram Faraaz >Priority: Major > Fix For: 1.17.0 > > > Group by number results in parse error. > {code} > 0: jdbc:drill:schema=dfs.tmp> select sub_q.col1 from (select col1 from > FEWRWSPQQ_101) sub_q group by 1; > Error: PARSE ERROR: At line 1, column 8: Expression 'q.col1' is not being > grouped > [Error Id: 0eedafd9-372e-4610-b7a8-d97e26458d58 on centos-02.qa.lab:31010] > (state=,code=0) > {code} > When we use the column name instead of the number, the query compiles and > returns results. 
> {code} > 0: jdbc:drill:schema=dfs.tmp> select col1 from (select col1 from > FEWRWSPQQ_101) group by col1; > +--+ > | col1 | > +--+ > | 65534| > | 1000 | > | -1 | > | 0| > | 1| > | 13 | > | 17 | > | 23 | > | 1000 | > | 999 | > | 30 | > | 25 | > | 1001 | > | -65535 | > | 5000 | > | 3000 | > | 200 | > | 197 | > | 4611686018427387903 | > | 9223372036854775806 | > | 9223372036854775807 | > | 92233720385475807| > +--+ > 22 rows selected (0.218 seconds) > {code} > {code} > 0: jdbc:drill:schema=dfs.tmp> select sub_query.col1 from (select col1 from > FEWRWSPQQ_101) sub_query group by sub_query.col1; > +--+ > | col1 | > +--+ > | 65534| > | 1000 | > | -1 | > | 0| > | 1| > | 13 | > | 17 | > | 23 | > | 1000 | > | 999 | > | 30 | > | 25 | > | 1001 | > | -65535 | > | 5000 | > | 3000 | > | 200 | > | 197 | > | 4611686018427387903 | > | 9223372036854775806 | > | 9223372036854775807 | > | 92233720385475807| > +--+ > 22 rows selected (0.177 seconds) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (DRILL-3664) CAST integer zero , one to boolean false , true
[ https://issues.apache.org/jira/browse/DRILL-3664?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Volodymyr Vysotskyi resolved DRILL-3664. Resolution: Fixed Fixed in DRILL-4674 > CAST integer zero , one to boolean false , true > --- > > Key: DRILL-3664 > URL: https://issues.apache.org/jira/browse/DRILL-3664 > Project: Apache Drill > Issue Type: Bug > Components: SQL Parser >Affects Versions: 1.2.0 >Reporter: Khurram Faraaz >Priority: Major > Fix For: Future > > > We should be able to cast (zero) 0 to false and (one) 1 to true, currently we > report a parse error when an explicit cast is used in query. > col7 is of type Boolean in the below input parquet file. > {code} > 0: jdbc:drill:schema=dfs.tmp> select col7 from FEWRWSPQQ_101 where col7 IN > (cast(0 as boolean),cast(1 as boolean)); > Error: PARSE ERROR: From line 1, column 47 to line 1, column 64: Cast > function cannot convert value of type INTEGER to type BOOLEAN > [Error Id: d751945f-8a0f-4369-ae9e-c42504f6d978 on centos-04.qa.lab:31010] > (state=,code=0) > {code} > Without explicit cast we see SchemaChangeException. > {code} > 0: jdbc:drill:schema=dfs.tmp> select col7 from FEWRWSPQQ_101 where col7 IN > (0,1); > Error: SYSTEM ERROR: SchemaChangeException: Failure while trying to > materialize incoming schema. Errors: > > Error in expression at index -1. Error: Missing function implementation: > [castINT(BIT-OPTIONAL)]. Full expression: --UNKNOWN EXPRESSION--. > Error in expression at index -1. Error: Missing function implementation: > [castINT(BIT-OPTIONAL)]. Full expression: --UNKNOWN EXPRESSION--.. > Fragment 0:0 > [Error Id: ecf51dae-62c5-40d7-b0f5-3b9bf9fd3377 on centos-04.qa.lab:31010] > (state=,code=0) > {code} > Postgres results for the same query. 
> {code} > postgres=# select col7 from FEWRWSPQQ_101 where col7 IN (cast(0 as > boolean),cast(1 as boolean)); > col7 > -- > f > t > f > t > f > t > f > t > f > t > f > t > f > t > f > t > f > t > f > t > f > t > (22 rows) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7317) Close ClassLoaders used for udf jars uploading when closing FunctionImplementationRegistry
Volodymyr Vysotskyi created DRILL-7317: -- Summary: Close ClassLoaders used for udf jars uploading when closing FunctionImplementationRegistry Key: DRILL-7317 URL: https://issues.apache.org/jira/browse/DRILL-7317 Project: Apache Drill Issue Type: Task Reporter: Volodymyr Vysotskyi Assignee: Volodymyr Vysotskyi Fix For: 1.17.0 When {{FunctionImplementationRegistry}} is closed, the class loaders which were used for uploading jars stay open and do not release the file descriptors for these jars. The proposal is to unregister the jars, closing their {{ClassLoader}} instances, when {{FunctionImplementationRegistry}} is closed. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
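As an illustration of the proposal, here is a minimal sketch of a registry that closes its per-jar class loaders when it is closed itself. {{JarRegistrySketch}} and its methods are hypothetical names used only for this example and do not reflect the actual {{FunctionImplementationRegistry}} API; the only library behavior relied upon is {{URLClassLoader.close()}} from the JDK, which releases the jar files opened by the loader.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical stand-in for a registry that loads UDF jars; not Drill's actual class. */
final class JarRegistrySketch implements AutoCloseable {
  private final Map<String, URLClassLoader> loaders = new ConcurrentHashMap<>();

  /** Creates a dedicated class loader for the jar and remembers it by jar name. */
  void register(String jarName, URL jarUrl) throws IOException {
    URLClassLoader previous = loaders.put(
        jarName, new URLClassLoader(new URL[] {jarUrl}, getClass().getClassLoader()));
    if (previous != null) {
      previous.close(); // do not leak the loader of a jar that is being replaced
    }
  }

  /** Forgets the jar and closes its class loader, releasing the underlying jar file. */
  void unregister(String jarName) throws IOException {
    URLClassLoader loader = loaders.remove(jarName);
    if (loader != null) {
      loader.close();
    }
  }

  int openLoaders() {
    return loaders.size();
  }

  /** Unregisters every remaining jar; keeps going if one loader fails to close. */
  @Override
  public void close() throws IOException {
    IOException first = null;
    for (String jarName : new ArrayList<>(loaders.keySet())) {
      try {
        unregister(jarName);
      } catch (IOException e) {
        if (first == null) {
          first = e;
        } else {
          first.addSuppressed(e);
        }
      }
    }
    if (first != null) {
      throw first;
    }
  }
}
```

Because the sketch implements {{AutoCloseable}}, it can be used in a try-with-resources statement, so the loaders are closed even when the surrounding code fails.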
[jira] [Created] (DRILL-7316) Move classes from org.apache.drill.metastore into org.apache.drill.exec.metastore package in java-exec module
Volodymyr Vysotskyi created DRILL-7316: -- Summary: Move classes from org.apache.drill.metastore into org.apache.drill.exec.metastore package in java-exec module Key: DRILL-7316 URL: https://issues.apache.org/jira/browse/DRILL-7316 Project: Apache Drill Issue Type: Task Reporter: Volodymyr Vysotskyi Assignee: Volodymyr Vysotskyi Fix For: 1.17.0 Move classes from {{org.apache.drill.metastore}} into {{org.apache.drill.exec.metastore}} package in {{java-exec}} module -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7315) Revise precision and scale order in the method arguments
Volodymyr Vysotskyi created DRILL-7315: -- Summary: Revise precision and scale order in the method arguments Key: DRILL-7315 URL: https://issues.apache.org/jira/browse/DRILL-7315 Project: Apache Drill Issue Type: Task Affects Versions: 1.16.0 Reporter: Volodymyr Vysotskyi Assignee: Volodymyr Vysotskyi Fix For: 1.17.0 The current code has different variations of scale and precision orderings in the method arguments. The goal for this Jira is to make it more consistent. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7313) Use Hive schema for MaprDB native reader when field was empty
Volodymyr Vysotskyi created DRILL-7313: -- Summary: Use Hive schema for MaprDB native reader when field was empty Key: DRILL-7313 URL: https://issues.apache.org/jira/browse/DRILL-7313 Project: Apache Drill Issue Type: Task Affects Versions: 1.16.0 Reporter: Volodymyr Vysotskyi Assignee: Volodymyr Vysotskyi Fix For: 1.17.0 Currently, when an external Hive MaprDB table is queried using the Hive plugin with {{store.hive.maprdb_json.optimize_scan_with_native_reader}} enabled, some queries may fail due to a soft schema change, though Hive knows the actual data types. For example, when a table has several fields and one of them has only a few non-null values, queries that group by this field will fail due to a schema change. The goal of this Jira is to use the types from Hive when a non-existing field is created, which avoids such issues. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7310) Move schema-related classes from exec module to be able to use them in metastore module
Volodymyr Vysotskyi created DRILL-7310: -- Summary: Move schema-related classes from exec module to be able to use them in metastore module Key: DRILL-7310 URL: https://issues.apache.org/jira/browse/DRILL-7310 Project: Apache Drill Issue Type: Sub-task Reporter: Volodymyr Vysotskyi Assignee: Volodymyr Vysotskyi Fix For: 1.17.0 Currently, most of the schema-related classes are placed in the {{exec}} module, but some of them need to be used in the {{metastore}} module. The {{metastore}} module does not depend on the {{exec}} module. The solution is to move these classes from {{exec}} into another module which {{metastore}} depends on, so they will be accessible to {{metastore}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[DISCUSS] New approach for using Drill-Calcite fork
Hi all,

Currently, the Calcite fork with Drill-specific commits is placed in https://github.com/mapr/incubator-calcite. Though it is a public repository, in most cases it is difficult to grant write access to it.

Another, more frequent problem is deploying new Drill-Calcite versions to the maven repository (currently they are deployed to http://repository.mapr.com/nexus/content/repositories/drill-optiq/). Only a few people have write access to it, and there is no way to provide it to more people, in particular committers and PMC members.

To resolve these problems, I propose to create a personal repository (I can create it, here is a test version: https://github.com/vvysotskyi/drill-calcite) and add all committers and PMC members to it as collaborators.

To resolve the problem with deploys, I propose to use https://jitpack.io, so a newer version will be deployed automatically when it is required. One minor thing should be mentioned: the groupId of the specific GitHub repo will be used instead of the *org.apache.calcite* groupId.

The following rules will be used to clarify the process:
- Push changes to the repository (including the commit that bumps up the version)
- Create a new tag for the bumped-up version (the tag name should match the version, for example, *1.18.0-drill-r2*) and push it to the remote repo
- Bump up the version in Drill

Are there any objections or ideas on this?

Kind regards,
Volodymyr Vysotskyi
[jira] [Created] (DRILL-7297) Query hangs in planning stage when Error is thrown
Volodymyr Vysotskyi created DRILL-7297:
--
Summary: Query hangs in planning stage when Error is thrown
Key: DRILL-7297
URL: https://issues.apache.org/jira/browse/DRILL-7297
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.16.0
Reporter: Volodymyr Vysotskyi

A query hangs in the planning stage when an Error (other than OOM or AssertionError) is thrown during query planning. After the query is canceled, it stays in the Cancellation Requested state.
Such an error may be thrown because of a mistake in the code, including UDF code. Since users may provide custom UDFs, Drill should be able to handle such cases as well.

Steps to reproduce this issue:
1. Create a UDF which throws an Error in either the {{eval()}} or the {{setup()}} method (instructions on how to create a custom UDF may be found [here|https://drill.apache.org/docs/tutorial-develop-a-simple-function/]).
2. Register the custom UDF which throws the error (instructions are [here|https://drill.apache.org/docs/adding-custom-functions-to-drill-introduction/]).
3. Run a query with this UDF.
After submitting the query, the following stack trace is printed:
{noformat}
Exception in thread "drill-executor-1" java.lang.Error
	at org.apache.drill.contrib.function.FunctionExample.setup(FunctionExample.java:19)
	at org.apache.drill.exec.expr.fn.interpreter.InterpreterEvaluator.evaluateFunction(InterpreterEvaluator.java:139)
	at org.apache.drill.exec.expr.fn.interpreter.InterpreterEvaluator$EvalVisitor.visitFunctionHolderExpression(InterpreterEvaluator.java:355)
	at org.apache.drill.exec.expr.fn.interpreter.InterpreterEvaluator$EvalVisitor.visitFunctionHolderExpression(InterpreterEvaluator.java:204)
	at org.apache.drill.common.expression.FunctionHolderExpression.accept(FunctionHolderExpression.java:53)
	at org.apache.drill.exec.expr.fn.interpreter.InterpreterEvaluator.evaluateConstantExpr(InterpreterEvaluator.java:70)
	at org.apache.drill.exec.planner.logical.DrillConstExecutor.reduce(DrillConstExecutor.java:152)
	at org.apache.calcite.rel.rules.ReduceExpressionsRule.reduceExpressionsInternal(ReduceExpressionsRule.java:620)
	at org.apache.calcite.rel.rules.ReduceExpressionsRule.reduceExpressions(ReduceExpressionsRule.java:541)
	at org.apache.calcite.rel.rules.ReduceExpressionsRule$ProjectReduceExpressionsRule.onMatch(ReduceExpressionsRule.java:288)
	at org.apache.calcite.plan.volcano.VolcanoRuleCall.onMatch(VolcanoRuleCall.java:212)
	at org.apache.calcite.plan.volcano.VolcanoPlanner.findBestExp(VolcanoPlanner.java:643)
	at org.apache.calcite.tools.Programs$RuleSetProgram.run(Programs.java:339)
	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:430)
	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.transform(DefaultSqlHandler.java:370)
	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToRawDrel(DefaultSqlHandler.java:250)
	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.convertToDrel(DefaultSqlHandler.java:319)
	at org.apache.drill.exec.planner.sql.handlers.DefaultSqlHandler.getPlan(DefaultSqlHandler.java:177)
	at org.apache.drill.exec.planner.sql.DrillSqlWorker.getQueryPlan(DrillSqlWorker.java:226)
	at org.apache.drill.exec.planner.sql.DrillSqlWorker.convertPlan(DrillSqlWorker.java:124)
	at org.apache.drill.exec.planner.sql.DrillSqlWorker.getPlan(DrillSqlWorker.java:90)
	at org.apache.drill.exec.work.foreman.Foreman.runSQL(Foreman.java:593)
	at org.apache.drill.exec.work.foreman.Foreman.run(Foreman.java:276)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
{noformat}
4. Check that the query is still in the in-progress state, then cancel the query.

-- This message was sent by Atlassian JIRA (v7.6.3#76005)
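The general failure mode behind this report can be sketched in plain Java: if a task running on an executor thread only handles {{Exception}}, a thrown {{Error}} escapes it and whatever state the task was tracking is never moved to a terminal value. The sketch below is not the Foreman code and not the fix that was applied in Drill; {{PlanningTaskSketch}} and its {{State}} enum are hypothetical, and the sketch simply assumes that catching {{Throwable}} and recording the failure is acceptable for the errors in question.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical task wrapper; it only models the state handling, not Drill's Foreman. */
final class PlanningTaskSketch implements Runnable {
  enum State { RUNNING, COMPLETED, FAILED }

  private final Runnable planning;
  private final AtomicReference<State> state = new AtomicReference<>(State.RUNNING);
  private volatile Throwable failure;

  PlanningTaskSketch(Runnable planning) {
    this.planning = planning;
  }

  State state() {
    return state.get();
  }

  Throwable failure() {
    return failure;
  }

  @Override
  public void run() {
    try {
      planning.run();
      state.set(State.COMPLETED);
    } catch (Throwable t) {
      // Catching Throwable (not just Exception) also covers java.lang.Error,
      // so the task always reaches a terminal state instead of staying RUNNING.
      failure = t;
      state.set(State.FAILED);
    }
  }

  public static void main(String[] args) throws InterruptedException {
    ExecutorService executor = Executors.newSingleThreadExecutor();
    // Simulates a UDF whose setup() throws an Error during planning.
    PlanningTaskSketch task = new PlanningTaskSketch(() -> {
      throw new Error("thrown from UDF setup()");
    });
    executor.execute(task);
    executor.shutdown();
    executor.awaitTermination(5, TimeUnit.SECONDS);
    System.out.println(task.state() + ": " + task.failure());
  }
}
```

Whether fatal errors such as OutOfMemoryError should be rethrown after the state is recorded is a separate design decision that this sketch leaves open.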
[jira] [Created] (DRILL-7294) Prevent generating java beans using protostuff to avoid overriding classes with the same simple name declared as nested in the proto files
Volodymyr Vysotskyi created DRILL-7294: -- Summary: Prevent generating java beans using protostuff to avoid overriding classes with the same simple name declared as nested in the proto files Key: DRILL-7294 URL: https://issues.apache.org/jira/browse/DRILL-7294 Project: Apache Drill Issue Type: Bug Affects Versions: 1.16.0 Reporter: Volodymyr Vysotskyi Assignee: Volodymyr Vysotskyi Fix For: 1.17.0 Currently, {{protostuff-maven-plugin}} generates java-bean classes from the proto files. But these classes are already generated by protobuf; the only difference is that the protobuf classes are placed in a different package and preserve the nesting of the classes as declared in the proto files. protostuff creates new files for nested classes, which causes problems when several nested classes have the same name: they override each other. For example, here is a Travis failure caused by this problem: https://travis-ci.org/apache/drill/jobs/545013395 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7281) Unable to submit physical plan with maprdb-json-scan operator
Volodymyr Vysotskyi created DRILL-7281: -- Summary: Unable to submit physical plan with maprdb-json-scan operator Key: DRILL-7281 URL: https://issues.apache.org/jira/browse/DRILL-7281 Project: Apache Drill Issue Type: Bug Affects Versions: 1.16.0 Reporter: Volodymyr Vysotskyi Assignee: Volodymyr Vysotskyi When submitting the following plan which corresponds to a simple query on MaprDB table: {code:sql} select * from dfs.`/tmp/nulls` {code} {noformat} { "head" : { "version" : 1, "generator" : { "type" : "ExplainHandler", "info" : "" }, "type" : "APACHE_DRILL_PHYSICAL", "options" : [ { "kind" : "BOOLEAN", "accessibleScopes" : "ALL", "name" : "store.hive.maprdb_json.optimize_scan_with_native_reader", "bool_val" : true, "scope" : "SESSION" }, { "kind" : "BOOLEAN", "accessibleScopes" : "ALL", "name" : "planner.enable_hashagg", "bool_val" : false, "scope" : "SESSION" }, { "kind" : "LONG", "accessibleScopes" : "ALL", "name" : "planner.slice_target", "num_val" : 1, "scope" : "SESSION" } ], "queue" : 0, "hasResourcePlan" : false, "resultMode" : "EXEC" }, "graph" : [ { "pop" : "maprdb-json-scan", "@id" : 2, "userName" : "mapr", "scanSpec" : { "tableName" : "/tmp/nulls", "indexDesc" : null, "startRow" : "", "stopRow" : "", "serializedFilter" : null, "secondaryIndex" : false }, "storage" : { "type" : "file", "connection" : "maprfs:///", "config" : null, "workspaces" : { "tmp" : { "location" : "/tmp", "writable" : true, "defaultInputFormat" : null, "allowAccessOutsideWorkspace" : false }, "root" : { "location" : "/", "writable" : false, "defaultInputFormat" : null, "allowAccessOutsideWorkspace" : false } }, "formats" : { "psv" : { "type" : "text", "extensions" : [ "tbl" ], "delimiter" : "|" }, "csv" : { "type" : "text", "extensions" : [ "csv" ], "delimiter" : "," }, "tsv" : { "type" : "text", "extensions" : [ "tsv" ], "delimiter" : "\t" }, "httpd" : { "type" : "httpd", "logFormat" : "%h %t \"%r\" %>s %b \"%{Referer}i\"" }, "parquet" : { "type" : "parquet" }, "json" : { "type" : 
"json", "extensions" : [ "json" ] }, "pcap" : { "type" : "pcap" }, "pcapng" : { "type" : "pcapng", "extensions" : [ "pcapng" ] }, "avro" : { "type" : "avro" }, "sequencefile" : { "type" : "sequencefile", "extensions" : [ "seq" ] }, "csvh" : { "type" : "text", "extensions" : [ "csvh" ], "extractHeader" : true, "delimiter" : "," }, "image" : { "type" : "image", "extensions" : [ "jpg", "jpeg", "jpe", "tif", "tiff", "dng", "psd", "png", "bmp", "gif", "ico", "pcx", "wav", "wave", "avi&q
Apache Drill Hangout - May 28, 2019
Hi Drillers, We will have our bi-weekly hangout tomorrow May 28th, at 10 AM PST (link: https://meet.google.com/yki-iqdf-tai ). If there are any topics you would like to discuss during the hangout please respond to this email. Kind regards, Volodymyr Vysotskyi
[jira] [Created] (DRILL-7250) Query with CTE fails when its name matches to the table name without access
Volodymyr Vysotskyi created DRILL-7250: -- Summary: Query with CTE fails when its name matches to the table name without access Key: DRILL-7250 URL: https://issues.apache.org/jira/browse/DRILL-7250 Project: Apache Drill Issue Type: Bug Affects Versions: 1.16.0 Reporter: Volodymyr Vysotskyi Assignee: Volodymyr Vysotskyi Fix For: 1.17.0 Suppose impersonation is enabled and there is a {{lineitem}} table with permissions {{750}}, owned by {{user0_1:group0_1}}, to which {{user2_1}} does not have access. The following query: {code:sql} use mini_dfs_plugin.user0_1; with lineitem as (SELECT 1 as a) select * from lineitem {code} submitted by {{user2_1}} fails with the following error: {noformat} java.lang.Exception: org.apache.hadoop.security.AccessControlException: Permission denied: user=user2_1, access=READ_EXECUTE, inode="/user/user0_1/lineitem":user0_1:group0_1:drwxr-x--- at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.check(FSPermissionChecker.java:317) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:229) at org.apache.hadoop.hdfs.server.namenode.FSPermissionChecker.checkPermission(FSPermissionChecker.java:199) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1752) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPermission(FSDirectory.java:1736) at org.apache.hadoop.hdfs.server.namenode.FSDirectory.checkPathAccess(FSDirectory.java:1710) at org.apache.hadoop.hdfs.server.namenode.FSDirStatAndListingOp.getListingInt(FSDirStatAndListingOp.java:70) at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getListing(FSNamesystem.java:4432) at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getListing(NameNodeRpcServer.java:999) at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getListing(ClientNamenodeProtocolServerSideTranslatorPB.java:646) at 
org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java) at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616) at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:982) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2217) at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2213) at java.security.AccessController.doPrivileged(Native Method) at javax.security.auth.Subject.doAs(Subject.java:422) at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746) at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2213) at ...(:0) ~[na:na] at org.apache.drill.exec.util.FileSystemUtil.listRecursive(FileSystemUtil.java:253) ~[classes/:na] at org.apache.drill.exec.util.FileSystemUtil.list(FileSystemUtil.java:208) ~[classes/:na] at org.apache.drill.exec.util.FileSystemUtil.listFiles(FileSystemUtil.java:104) ~[classes/:na] at org.apache.drill.exec.util.DrillFileSystemUtil.listFiles(DrillFileSystemUtil.java:86) ~[classes/:na] at org.apache.drill.exec.store.dfs.FileSelection.minusDirectories(FileSelection.java:178) ~[classes/:na] at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.detectEmptySelection(WorkspaceSchemaFactory.java:669) ~[classes/:na] at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:633) ~[classes/:na] at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.create(WorkspaceSchemaFactory.java:283) ~[classes/:na] at org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.getNewEntry(ExpandingConcurrentMap.java:96) ~[classes/:na] at org.apache.drill.exec.planner.sql.ExpandingConcurrentMap.get(ExpandingConcurrentMap.java:90) ~[classes/:na] at org.apache.drill.exec.store.dfs.WorkspaceSchemaFactory$WorkspaceSchema.getTable(WorkspaceSchemaFactory.java:439) ~[classes/:na] at 
org.apache.calcite.jdbc.SimpleCalciteSchema.getImplicitTable(SimpleCalciteSchema.java:83) ~[calcite-core-1.18.0-drill-r1.jar:1.18.0-drill-r1] at org.apache.calcite.jdbc.CalciteSchema.getTable(CalciteSchema.java:286) ~[calcite-core-1.18.0-drill-r1.jar:1.18.0-drill-r1] at org.apache.calcite.sql.validate.SqlValidatorUtil.getTableEntryFrom(SqlValidatorUtil.java:1046) ~[calcite-core-1.18.0-drill-r1.jar:1.18.0-drill-r1] at org.apache.calcite.sql.validate.SqlValidatorUtil.getTableEntry(SqlValidatorUtil.java:1003) ~[calcite-core-1.18.0-drill-r1.jar:1.18.
[jira] [Resolved] (DRILL-3995) Scalar replacement bug with Common Subexpression Elimination
[ https://issues.apache.org/jira/browse/DRILL-3995?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Volodymyr Vysotskyi resolved DRILL-3995. Resolution: Cannot Reproduce Fix Version/s: 1.17.0 > Scalar replacement bug with Common Subexpression Elimination > > > Key: DRILL-3995 > URL: https://issues.apache.org/jira/browse/DRILL-3995 > Project: Apache Drill > Issue Type: Bug >Reporter: Steven Phillips >Priority: Major > Fix For: 1.17.0 > > > The following query: > {code} > select t1.full_name from cp.`employee.json` t1, cp.`department.json` t2 where > t1.department_id = t2.department_id and t1.position_id = t2.department_id > {code} > fails with the following: > org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: > RuntimeException: Error at instruction 43: Expected an object reference, but > found . setValue(II)V > 0 R I I . . . . : :L0 > 1 R I I . . . . : : LINENUMBER 249 L0 > 2 R I I . . . . : : ICONST_0 > 3 R I I . . . . : I : ISTORE 3 > 4 R I I I . . . : : LCONST_0 > 5 R I I I . . . : J : LSTORE 4 > 6 R I I I J . . : :L1 > 7 R I I I J . . : : LINENUMBER 251 L1 > 8 R I I I J . . : : ALOAD 0 > 9 R I I I J . . : R : GETFIELD > org/apache/drill/exec/test/generated/HashTableGen2$BatchHolder.vv20 : > Lorg/apache/drill/exec/vector/NullableBigIntVector; > 00010 R I I I J . . : R : INVOKEVIRTUAL > org/apache/drill/exec/vector/NullableBigIntVector.getAccessor > ()Lorg/apache/drill/exec/vector/NullableBigIntVector$Accessor; > 00011 R I I I J . . : R : ILOAD 1 > 00012 R I I I J . . : R I : INVOKEVIRTUAL > org/apache/drill/exec/vector/NullableBigIntVector$Accessor.isSet (I)I > 00013 R I I I J . . : I : ISTORE 3 > 00014 R I I I J . . : :L2 > 00015 R I I I J . . : : LINENUMBER 252 L2 > 00016 R I I I J . . : : ILOAD 3 > 00017 R I I I J . . : I : ICONST_1 > 00018 R I I I J . . : I I : IF_ICMPNE L3 > 00019 R I I I J . . : :L4 > 00020 ? : LINENUMBER 253 L4 > 00021 ? : ALOAD 0 > 00022 ? 
: GETFIELD > org/apache/drill/exec/test/generated/HashTableGen2$BatchHolder.vv20 : > Lorg/apache/drill/exec/vector/NullableBigIntVector; > 00023 ? : INVOKEVIRTUAL > org/apache/drill/exec/vector/NullableBigIntVector.getAccessor > ()Lorg/apache/drill/exec/vector/NullableBigIntVector$Accessor; > 00024 ? : ILOAD 1 > 00025 ? : INVOKEVIRTUAL > org/apache/drill/exec/vector/NullableBigIntVector$Accessor.get (I)J > 00026 ? : LSTORE 4 > 00027 R I I I J . . : :L3 > 00028 R I I I J . . : : LINENUMBER 256 L3 > 00029 R I I I J . . : : ILOAD 3 > 00030 R I I I J . . : I : ICONST_0 > 00031 R I I I J . . : I I : IF_ICMPEQ L5 > 00032 R I I I J . . : :L6 > 00033 ? : LINENUMBER 257 L6 > 00034 ? : ALOAD 0 > 00035 ? : GETFIELD > org/apache/drill/exec/test/generated/HashTableGen2$BatchHolder.vv24 : > Lorg/apache/drill/exec/vector/NullableBigIntVector; > 00036 ? : INVOKEVIRTUAL > org/apache/drill/exec/vector/NullableBigIntVector.getMutator > ()Lorg/apache/drill/exec/vector/NullableBigIntVector$Mutator; > 00037 ? : ILOAD 2 > 00038 ? : ILOAD 3 > 00039 ? : LLOAD 4 > 00040 ? : INVOKEVIRTUAL > org/apache/drill/exec/vector/NullableBigIntVector$Mutator.set (IIJ)V > 00041 R I I I J . . : :L5 > 00042 R I I I J . . : : LINENUMBER 259 L5 > 00043 R I I I J . . : : ALOAD 6 > 00044 ? : GETFIELD > org/apache/drill/exec/expr/holders/NullableBigIntHolder.isSet : I > 00045 ? : ICONST_0 > 00046 ? : IF_ICMPEQ L7 > 00047 ? :L8 > 00048 ? : LINENUMBER 260 L8 > 00049 ? : ALOAD 0 > 00050 ? : GETFIELD > org/apache/drill/exec/test/generated/HashTableGen2$BatchHolder.vv27 : > Lorg/apache/drill/exec/vector/NullableBigIntVector; > 00051 ? : INVOKEVIRTUAL > org/apache/drill/exec/vector/NullableBigIntVector.getMutator > ()Lorg/apache/drill/exec/vector/NullableBigIntVector$Mutator; > 00052 ? : ILOAD 2 > 00053 ? : ALOAD 6 > 00054 ? : GETFIELD > org/apache/drill/exec/expr/holders/NullableBigIntHolder.isSet : I &g
[jira] [Resolved] (DRILL-4098) Assembly code in drillbit.log
[ https://issues.apache.org/jira/browse/DRILL-4098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Volodymyr Vysotskyi resolved DRILL-4098. Resolution: Fixed Fix Version/s: 1.16.0 Fixed in DRILL-2326 > Assembly code in drillbit.log > - > > Key: DRILL-4098 > URL: https://issues.apache.org/jira/browse/DRILL-4098 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Flow >Affects Versions: 1.3.0 >Reporter: Khurram Faraaz >Priority: Major > Fix For: 1.16.0 > > > We are seeing the below assembly code and the stack trace in drillbit.log > after a Functional test run on a 4 node cluster on CentOS using MapR-Drill > 1.3 RPM (latest 11/11). > From drillbit.log > {code} > 2015-11-12 06:36:53,553 [29bbcc7a-36dd-dc7a-d77a-388b228896a4:frag:0:0] INFO > o.a.d.e.w.f.FragmentStatusReporter - > 29bbcc7a-36dd-dc7a-d77a-388b228896a4:0:0: State to report: RUNNING > 2015-11-12 06:36:53,588 [29bbcc7a-36dd-dc7a-d77a-388b228896a4:frag:0:0] ERROR > o.a.drill.exec.compile.MergeAdapter - Failure while merging classes. > java.lang.RuntimeException: Error at instruction 26: Expected an object > reference, but found . doEval(II)V > 0 R I I . . . : :L0 > 1 R I I . . . : : LINENUMBER 104 L0 > 2 R I I . . . : : LCONST_0 > 3 R I I . . . : J : LSTORE 3 > 4 R I I J . . : :L1 > 5 R I I J . . : : LINENUMBER 106 L1 > 6 R I I J . . : : ALOAD 0 > 7 R I I J . . : R : GETFIELD > org/apache/drill/exec/test/generated/ProjectorGen4245.vv0 : > Lorg/apache/drill/exec/vector/BigIntVector; > 8 R I I J . . : R : INVOKEVIRTUAL > org/apache/drill/exec/vector/BigIntVector.getAccessor > ()Lorg/apache/drill/exec/vector/BigIntVector$Accessor; > 9 R I I J . . : R : ILOAD 1 > 00010 R I I J . . : R I : INVOKEVIRTUAL > org/apache/drill/exec/vector/BigIntVector$Accessor.get (I)J > 00011 R I I J . . : J : LSTORE 3 > 00012 R I I J . . : :L2 > 00013 R I I J . . : : LINENUMBER 108 L2 > 00014 R I I J . . : : ALOAD 0 > 00015 R I I J . . 
: R : GETFIELD > org/apache/drill/exec/test/generated/ProjectorGen4245.vv4 : > Lorg/apache/drill/exec/vector/BigIntVector; > 00016 R I I J . . : R : INVOKEVIRTUAL > org/apache/drill/exec/vector/BigIntVector.getMutator > ()Lorg/apache/drill/exec/vector/BigIntVector$Mutator; > 00017 R I I J . . : R : ILOAD 2 > 00018 R I I J . . : R I : LLOAD 3 > 00019 R I I J . . : R I J : INVOKEVIRTUAL > org/apache/drill/exec/vector/BigIntVector$Mutator.set (IJ)V > 00020 R I I J . . : :L3 > 00021 R I I J . . : : LINENUMBER 109 L3 > 00022 R I I J . . : : ALOAD 0 > 00023 R I I J . . : R : GETFIELD > org/apache/drill/exec/test/generated/ProjectorGen4245.vv7 : > Lorg/apache/drill/exec/vector/BigIntVector; > 00024 R I I J . . : R : INVOKEVIRTUAL > org/apache/drill/exec/vector/BigIntVector.getMutator > ()Lorg/apache/drill/exec/vector/BigIntVector$Mutator; > 00025 R I I J . . : R : ILOAD 2 > 00026 R I I J . . : R I : ALOAD 5 > 00027 ? : GETFIELD > org/apache/drill/exec/expr/holders/BigIntHolder.value : J > 00028 ? : INVOKEVIRTUAL > org/apache/drill/exec/vector/BigIntVector$Mutator.set (IJ)V > 00029 ? :L4 > 00030 ? : LINENUMBER 110 L4 > 00031 ? : ALOAD 0 > 00032 ? : GETFIELD > org/apache/drill/exec/test/generated/ProjectorGen4245.vv10 : > Lorg/apache/drill/exec/vector/BigIntVector; > 00033 ? : INVOKEVIRTUAL > org/apache/drill/exec/vector/BigIntVector.getMutator > ()Lorg/apache/drill/exec/vector/BigIntVector$Mutator; > 00034 ? : ILOAD 2 > 00035 ? : ALOAD 5 > 00036 ? : GETFIELD > org/apache/drill/exec/expr/holders/BigIntHolder.value : J > 00037 ? : INVOKEVIRTUAL > org/apache/drill/exec/vector/BigIntVector$Mutator.set (IJ)V > 00038 ? :L5 > 00039 ? : LINENUMBER 111 L5 > 00040 ? : ALOAD 0 > 00041 ? : GETFIELD > org/apache/drill/exec/test/generated/ProjectorGen4245.vv13 : > Lorg/apache/drill/exec/vector/BigIntVector; > 00042 ? : INVOKEVIRTUAL > org/apache/drill/exec/vector/BigIntVector.getMutator > ()Lorg/apache/drill/exec/vector/BigIntVector$Mutator; > 00043 ? : ILOAD 2 > 00044 ? 
: ALOAD 5 > 00045 ? : GETFIELD > org/apache/drill/exec/expr/holders/BigIntHol
[jira] [Resolved] (DRILL-4299) Query that involves convert_from succeeds, exception is found in drillbit.log
[ https://issues.apache.org/jira/browse/DRILL-4299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Volodymyr Vysotskyi resolved DRILL-4299. Resolution: Fixed Fix Version/s: 1.16.0 Fixed in DRILL-2326 > Query that involves convert_from succeeds, exception is found in drillbit.log > - > > Key: DRILL-4299 > URL: https://issues.apache.org/jira/browse/DRILL-4299 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Codegen >Reporter: Victoria Markman >Priority: Major > Fix For: 1.16.0 > > Attachments: data, drillbit.log > > > Here is example of the exception, please refer to drillbit.log that is > attached. > {code} > 2016-01-22 00:16:12,065 [295e8b32-e2e1-6966-2b13-389cd724ac73:foreman] INFO > o.a.drill.exec.work.foreman.Foreman - Query text for query id > 295e8b32-e2e1-6966-2b13-389cd724ac73: select c_row, c_float4, > convert_from(convert_to(c_float4, 'FLOAT'), 'FLOAT') from data > 2016-01-22 00:16:12,166 [295e8b32-e2e1-6966-2b13-389cd724ac73:foreman] INFO > o.a.d.exec.store.parquet.Metadata - Took 0 ms to get file statuses > 2016-01-22 00:16:12,173 [295e8b32-e2e1-6966-2b13-389cd724ac73:foreman] INFO > o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of > 1 using 1 threads. Time: 6ms total, 6.100645ms avg, 6ms max. > 2016-01-22 00:16:12,173 [295e8b32-e2e1-6966-2b13-389cd724ac73:foreman] INFO > o.a.d.exec.store.parquet.Metadata - Fetch parquet metadata: Executed 1 out of > 1 using 1 threads. Earliest start: 1.338000 μs, Latest start: 1.338000 μs, > Average start: 1.338000 μs . 
> 2016-01-22 00:16:12,173 [295e8b32-e2e1-6966-2b13-389cd724ac73:foreman] INFO > o.a.d.exec.store.parquet.Metadata - Took 6 ms to read file metadata > 2016-01-22 00:16:12,229 [295e8b32-e2e1-6966-2b13-389cd724ac73:frag:0:0] INFO > o.a.d.e.w.fragment.FragmentExecutor - > 295e8b32-e2e1-6966-2b13-389cd724ac73:0:0: State change requested > AWAITING_ALLOCATION --> RUNNING > 2016-01-22 00:16:12,229 [295e8b32-e2e1-6966-2b13-389cd724ac73:frag:0:0] INFO > o.a.d.e.w.f.FragmentStatusReporter - > 295e8b32-e2e1-6966-2b13-389cd724ac73:0:0: State to report: RUNNING > 2016-01-22 00:16:12,300 [295e8b32-e2e1-6966-2b13-389cd724ac73:frag:0:0] ERROR > o.a.drill.exec.compile.MergeAdapter - Failure while merging classes. > java.lang.RuntimeException: Error at instruction 147: Expected an object > reference, but found . doEval(II)V > 0 R I I . . . . . . . . . . . . . . . . . . . . . . : :L0 > 1 R I I . . . . . . . . . . . . . . . . . . . . . . : : LINENUMBER > 60 L0 > 2 R I I . . . . . . . . . . . . . . . . . . . . . . : : ICONST_0 > 3 R I I . . . . . . . . . . . . . . . . . . . . . . : I : ISTORE 3 > 4 R I I I . . . . . . . . . . . . . . . . . . . . . : : FCONST_0 > 5 R I I I . . . . . . . . . . . . . . . . . . . . . : F : FSTORE 4 > 6 R I I I F . . . . . . . . . . . . . . . . . . . . : :L1 > 7 R I I I F . . . . . . . . . . . . . . . . . . . . : : LINENUMBER > 62 L1 > 8 R I I I F . . . . . . . . . . . . . . . . . . . . : : ALOAD 0 > 9 R I I I F . . . . . . . . . . . . . . . . . . . . : R : GETFIELD > org/apache/drill/exec/test/generated/ProjectorGen11.vv1 : > Lorg/apache/drill/exec/vector/NullableFloat4Vector; > 00010 R I I I F . . . . . . . . . . . . . . . . . . . . : R : > INVOKEVIRTUAL org/apache/drill/exec/vector/NullableFloat4Vector.getAccessor > ()Lorg/apache/drill/exec/vector/NullableFloat4Vector$Accessor; > 00011 R I I I F . . . . . . . . . . . . . . . . . . . . : R : ILOAD 1 > 00012 R I I I F . . . . . . . . . . . . . . . . . . . . 
: R I : > INVOKEVIRTUAL > org/apache/drill/exec/vector/NullableFloat4Vector$Accessor.isSet (I)I > 00013 R I I I F . . . . . . . . . . . . . . . . . . . . : I : ISTORE 3 > 00014 R I I I F . . . . . . . . . . . . . . . . . . . . : :L2 > 00015 R I I I F . . . . . . . . . . . . . . . . . . . . : : LINENUMBER > 63 L2 > 00016 R I I I F . . . . . . . . . . . . . . . . . . . . : : ILOAD 3 > 00017 R I I I F . . . . . . . . . . . . . . . . . . . . : I : ICONST_1 > 00018 R I I I F . . . . . . . . . . . . . . . . . . . . : I I : > IF_ICMPNE L3 > 00019 R I I I F . . . . . . . . . . . . . . . . . . . . : :L4 > 00020 ? : LINENUMBER 64 L4 > 00021 ? : ALOAD 0 > 00022 ? : GETFIELD > org/apache/drill/exec/test/generat
[jira] [Created] (DRILL-7241) Hash aggregate does not work with interval types
Volodymyr Vysotskyi created DRILL-7241:
--

Summary: Hash aggregate does not work with interval types
Key: DRILL-7241
URL: https://issues.apache.org/jira/browse/DRILL-7241
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.16.0
Reporter: Volodymyr Vysotskyi

Queries that use hash aggregation on interval data types fail with a code generation error.

*Steps to reproduce:*
Disable stream aggregate to force hash aggregate:
{code:sql}
set `planner.enable_streamagg`=false;
{code}
Submit a query with aggregation on an interval type:
{code:sql}
select max(age) as max_age from (select AGE('1957-06-13') as age) group by age;
{code}
It fails with the error:
{noformat}
Error: UNSUPPORTED_OPERATION ERROR: Code generation error - likely an error in the code.

Fragment 0:0

[Error Id: fdd9ee7d-9991-42e2-ad2b-eb36503051f8 on user515050-pc:31019] (state=,code=0)
{noformat}
Stack trace from logs:
{noformat}
2019-05-06 16:42:06,643 [Client-1] INFO o.a.d.j.i.DrillCursor$ResultsListener - [#10] Query failed:
org.apache.drill.common.exceptions.UserRemoteException: UNSUPPORTED_OPERATION ERROR: Code generation error - likely an error in the code.
Fragment 0:0 [Error Id: 65d849df-408c-4851-88be-b13d99c7b6e6 on user515050-pc:31019] at org.apache.drill.exec.rpc.user.QueryResultHandler.resultArrived(QueryResultHandler.java:123) [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT] at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:422) [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT] at org.apache.drill.exec.rpc.user.UserClient.handle(UserClient.java:96) [drill-java-exec-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT] at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:273) [drill-rpc-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT] at org.apache.drill.exec.rpc.RpcBus$InboundHandler.decode(RpcBus.java:243) [drill-rpc-1.17.0-SNAPSHOT.jar:1.17.0-SNAPSHOT] at io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:88) [netty-codec-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.handler.timeout.IdleStateHandler.channelRead(IdleStateHandler.java:287) [netty-handler-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at 
io.netty.handler.codec.MessageToMessageDecoder.channelRead(MessageToMessageDecoder.java:102) [netty-codec-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.handler.codec.ByteToMessageDecoder.fireChannelRead(ByteToMessageDecoder.java:312) [netty-codec-4.0.48.Final.jar:4.0.48.Final] at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:286) [netty-codec-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:356) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:342) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:335) [netty-transport-4.0.48.Final.jar:4.0.48.Final] at io.netty.channel.ChannelInboundHandlerAdapter.channelRead(ChannelInboundHandlerAdapter.java:86) [netty-transport-4.0.48.Final.jar:4.0.48.Final
Re: May Apache Drill board report
Looks good, +1

On Fri, May 3, 2019 at 23:32, Arina Ielchiieva wrote:

> Hi all,
>
> please take a look at the draft board report for the last quarter and let
> me know if you have any comments.
>
> Thanks,
> Arina
>
> =
>
> ## Description:
> - Drill is a Schema-free SQL Query Engine for Hadoop, NoSQL and Cloud
>   Storage.
>
> ## Issues:
> - There are no issues requiring board attention at this time.
>
> ## Activity:
> - Since the last board report, Drill has released version 1.16.0, including
>   the following enhancements:
>   - CREATE OR REPLACE SCHEMA command to define a schema for text files
>   - REFRESH TABLE METADATA command can generate metadata cache files for
>     specific columns
>   - ANALYZE TABLE statement to compute statistics on Parquet data
>   - SYSLOG (RFC-5424) Format Plugin
>   - NEAREST DATE function to facilitate time series analysis
>   - Format plugin for LTSV files
>   - Ability to query Hive views
>   - Upgrade to SQLLine 1.7
>   - Apache Calcite upgrade to 1.18.0
>   - Several Drill Web UI improvements, including:
>     - Storage plugin management improvements
>     - Query progress indicators and warnings
>     - Ability to limit the result size for better UI response
>     - Ability to sort the list of profiles in the Drill Web UI
>     - Display query state in query result page
>     - Button to reset the options filter
>
> - Drill User Meetup will be held on May 22, 2019. Two talks are planned:
>   - Alibaba's Usage of Apache Drill for querying a Time Series Database
>   - What’s new with Apache Drill 1.16 & a demo of Schema Provisioning
>
> ## Health report:
> - The project is healthy. Development activity as reflected in the pull
>   requests and JIRAs is good.
> - Activity on the dev and user mailing lists is stable.
> - One PMC member was added in the last period.
>
> ## PMC changes:
>
> - Currently 24 PMC members.
> - Sorabh Hamirwasia was added to the PMC on Fri Apr 05 2019
>
> ## Committer base changes:
>
> - Currently 51 committers.
> - No new committers added in the last 3 months
> - Last committer addition was Salim Achouche on Mon Dec 17 2018
>
> ## Releases:
>
> - 1.16.0 was released on Thu May 02 2019
>
> ## Mailing list activity:
>
> - dev@drill.apache.org:
>   - 406 subscribers (down -10 in the last 3 months):
>   - 2299 emails sent to list (1903 in previous quarter)
>
> - iss...@drill.apache.org:
>   - 17 subscribers (down -1 in the last 3 months):
>   - 2373 emails sent to list (2233 in previous quarter)
>
> - u...@drill.apache.org:
>   - 582 subscribers (down -15 in the last 3 months):
>   - 235 emails sent to list (227 in previous quarter)
>
> ## JIRA activity:
>
> - 214 JIRA tickets created in the last 3 months
> - 212 JIRA tickets closed/resolved in the last 3 months
>
Re: [VOTE] Apache Drill Release 1.16.0 - RC2
Downloaded binary and source tarballs, verified signatures and checksums.
Ran unit tests on Ubuntu with JDK 8 using the tar with sources (31:52 min), built on Windows using the sources tarball without any issues.
Ran Drill in embedded and distributed modes on Ubuntu and in embedded mode on Windows, submitted several TPC-DS queries, verified that profiles displayed correctly.
Checked JDBC driver using SQuirreL SQL client and custom Java client.
Verified that jars from the prebuilt tar do not contain the extraneous files I mentioned before.

+1 (binding)

Kind regards,
Volodymyr Vysotskyi

On Wed, May 1, 2019 at 5:50 AM Boaz Ben-Zvi wrote:

> Downloaded both the binary and src tarballs, and verified the SHA signatures and the PGP.
>
> Built and ran the full unit tests on both Linux and Mac.
>
> Successfully ran some old favorite queries, and several manual tests of REFRESH METADATA with COLUMNS, and verified the metadata files and summaries.
>
> +1 from me for RC2 .
>
> -- Boaz
>
> On 4/30/19 11:26 AM, Kunal Khatua wrote:
> > Ran manual tests with random queries, trying out the UI and running joins on small tables.
> >
> > HOCON export of storage plugins does not actually export in HOCON format, but that is not a blocker.
> >
> > +1 (binding)
> >
> > ~ Kunal
> >
> > On 4/30/2019 4:53:48 AM, Arina Yelchiyeva wrote:
> > Downloaded binary tarball and ran Drill in embedded mode.
> > Verified schema provisioning for text files, dynamic UDFs.
> > Ran random queries, including long-running, queried system tables, created tables with different formats.
> > Checked Web UI (queries, profiles, storage plugins, logs pages).
> >
> > +1 (binding)
> >
> > Kind regards,
> > Arina
> >
> >> On Apr 30, 2019, at 8:33 AM, Aman Sinha wrote:
> >>
> >> Downloaded binary tarball on my Mac and ran in embedded mode.
> >> Verified Sorabh's release signature and the tar file's checksum
> >> Did a quick glance through maven artifacts
> >> Did some manual tests with TPC-DS Web_Sales table and ran REFRESH METADATA
> >> command against the same table
> >> Checked runtime query profiles of above queries and verified COUNT(*),
> >> COUNT(column) optimization is getting applied.
> >> Also did a build from source on my linux VM.
> >>
> >> RC2 looks good ! +1
> >>
> >> On Fri, Apr 26, 2019 at 8:28 AM SorabhApache wrote:
> >>
> >>> Hi Drillers,
> >>> I'd like to propose the third release candidate (RC2) for the Apache Drill,
> >>> version 1.16.0.
> >>>
> >>> Changes since the previous release candidate:
> >>> DRILL-7201: Strange symbols in error window (Windows)
> >>> DRILL-7202: Failed query shows warning that fragments has made no progress
> >>> DRILL-7207: Update the copyright year in NOTICE.txt file
> >>> DRILL-7212: Add gpg key with apache.org email for sorabh
> >>> DRILL-7213: drill-format-mapr.jar contains stale git.properties file
> >>>
> >>> The RC2 includes total of 220 resolved JIRAs [1].
> >>> Thanks to everyone for their hard work to contribute to this release.
> >>>
> >>> The tarball artifacts are hosted at [2] and the maven artifacts are hosted
> >>> at [3].
> >>>
> >>> This release candidate is based on commit
> >>> 751e87736c2ddbc184b52cfa56f4e29c68417cfe located at [4].
> >>>
> >>> Please download and try out the release candidate.
> >>>
> >>> The vote ends at 04:00 PM UTC (09:00 AM PDT, 07:00 PM EET, 09:30 PM IST),
> >>> May 1st, 2019
> >>>
> >>> [ ] +1
> >>> [ ] +0
> >>> [ ] -1
> >>>
> >>> Here is my vote: +1
> >>>
> >>> [1] https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820=12344284
> >>> [2] http://home.apache.org/~sorabh/drill/releases/1.16.0/rc2/
> >>> [3] https://repository.apache.org/content/repositories/orgapachedrill-1073/
> >>> [4] https://github.com/sohami/drill/commits/drill-1.16.0
> >>>
> >>> Thanks,
> >>> Sorabh
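Several of the votes in this thread mention verifying the tarball checksums. For anyone repeating that step, here is a minimal sketch in Python. It assumes the published checksum is a SHA-512 hex digest stored as `<hex digest> <file name>` on one line. That layout and the helper names `sha512_of` and `matches_published` are assumptions for illustration, not something the thread specifies. It does not cover the PGP signature check, which still needs a tool such as gpg.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Return the hex SHA-512 digest of the file at `path`, read in chunks."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large tarballs are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_line):
    """Compare the file's digest with a line assumed to look like '<hex digest> <file name>'."""
    expected = published_line.split()[0].lower()
    return sha512_of(path) == expected
```

A caller would read the published checksum file next to the tarball and pass its first line as `published_line`; if the checksum file uses a different layout, only the parsing in `matches_published` needs to change.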
Re: [VOTE] Apache Drill Release 1.16.0 - RC1
Hi Sorabh,

I have noticed that the jars in the prebuilt tar contain some strange files. For example, *drill-jdbc-all-1.16.0.jar* contains the following files:

*javac.sh*
*org.codehaus.plexus.compiler.javac.JavacCompiler1256088670033285178arguments*
*org.codehaus.plexus.compiler.javac.JavacCompiler1458111453480208588arguments*
*org.codehaus.plexus.compiler.javac.JavacCompiler2392560589194600493arguments*
*org.codehaus.plexus.compiler.javac.JavacCompiler4475905192586529595arguments*
*org.codehaus.plexus.compiler.javac.JavacCompiler4524532450095901144arguments*
*org.codehaus.plexus.compiler.javac.JavacCompiler4670895443631397937arguments*
*org.codehaus.plexus.compiler.javac.JavacCompiler5215058338087807885arguments*
*org.codehaus.plexus.compiler.javac.JavacCompiler7526103232425779297arguments*

These files contain some info about your machine (username, etc.).

Jars from the previous release didn't contain these files. Also, I have built master on my machine, and these files are absent for me.

Could you please take a look? This problem is observed for both RCs.

Kind regards,
Volodymyr Vysotskyi

On Thu, Apr 25, 2019 at 8:20 AM SorabhApache wrote:

> Update:
> 1) DRILL-7208 is there in 1.15 release as well, so it's not a blocker for 1.16
> 2) DRILL-7213: drill-format-mapr.jar contains stale git.properties file
>
>    - Still investigating on the above issue.
>
> 3) DRILL-7201: Strange symbols in error window (Windows)
>
>    - Issue is not reproducible on Kunal's machine. He is having discussion on JIRA to see if it's treated as a blocker or not.
>
> To investigate for DRILL-7213 I have to drop the RC1 candidate since again performing the release required to push it to my remote repo and publish to maven repo as well. So I don't have RC1 binaries if we consider all the issues as non-blocking.
>
> I will re-share the RC candidate once either fix for DRILL-7213/DRILL-7201 are available or it's considered as non-blockers. Any thoughts?
> > Thanks, > Sorabh > > On Wed, Apr 24, 2019 at 9:17 PM Boaz Ben-Zvi wrote: > > > Downloaded both the binary and src tarballs, and verified the SHA > > signatures and the PGP. > > > > Built and ran the full unit tests on both Linux and Mac (took 3:05 hours > > on my Mac). > > > > Successfully ran some old favorite queries with Sort/Hash-Join/Hash-Agg > > spilling. > > > > Ran several manual tests of REFRESH METADATA with COLUMNS, and verified > > the metadata files and summaries. > > > > Noticed that when specifying a COLUMN which a sub-field in a complex > > type (e.g., a key in a map), the whole column (i.e. all the other keys > > as well) was marked as "interesting"; but this may be "by design", as > > the refresh granularity is the whole column. > > > > Also noticed the sys.version issue (DRILL-7208 > > <https://issues.apache.org/jira/browse/DRILL-7208>) - should be minor as > > only affecting users of the SRC tarball, likely developers who > > build/modify the code anyway. > > > > Hence my vote is +1 . > > > >-- Boaz > > > > On 4/24/19 10:57 AM, Kunal Khatua wrote: > > > Downloaded the tarball and tried it in embedded mode. > > > > > > Ran simple join queries and interacted with the WebUI. > > > > > > Issues confirmed were DRILL-7192 and DRILL-7203. > > > I'm unable to repro DRILL-7201 and DRILL-7202, though I have a fix for > > the latter. Will work with Arina to identify repro steps. > > > > > > None of these are blockers IMO, so I'll vote +1. > > > > > > ~ Kunal > > > > > > > > > On 4/24/2019 10:38:31 AM, Khurram Faraaz wrote: > > > i see the correct version and commit, I deployed the binaries to test. > > > > > > Apache Drill 1.16.0 > > > "Start your SQL engine." 
> > > apache drill> select * from sys.version; > > > > > > +-+--+-+---+---+---+ > > > | version | commit_id | > > > commit_message | commit_time | > > > build_email | build_time | > > > > > > +-+--+-+---+---+---+ > > > | 1.16.0 | cf5b758e0a4c22b75bfb02ac2653ff09415ddf53 | > > > [maven-release-plugin] prepare release drill-1.16.0 | 22.04.2019 @ > > 09:08:36 > > > PDT | sor...@apache.org | 22.04.2019 @ 09:53:25 PDT | > >
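The stray compiler files reported at the top of this message can be looked for mechanically. The following is a rough sketch in Python, assuming a jar can be read as an ordinary zip archive and that the unwanted entries can be recognised by the name prefixes quoted in the report. The `SUSPICIOUS_MARKERS` tuple and the `suspicious_entries` helper are hypothetical names for illustration and are not part of the Drill build.

```python
import zipfile

# Hypothetical prefixes, taken from the file names quoted in the report above.
SUSPICIOUS_MARKERS = (
    "javac.sh",
    "org.codehaus.plexus.compiler.javac.JavacCompiler",
)


def suspicious_entries(jar_path, markers=SUSPICIOUS_MARKERS):
    """Return the sorted jar entries whose base name starts with one of `markers`."""
    with zipfile.ZipFile(jar_path) as jar:
        names = jar.namelist()
    # Compare on the base name so entries nested in a directory are caught too.
    return sorted(
        name for name in names
        if name.rsplit("/", 1)[-1].startswith(markers)
    )
```

Running this over every jar in the unpacked tarball and expecting an empty list for each would be one way to repeat the check for a new release candidate.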
Re: [VOTE] Apache Drill Release 1.16.0 - RC1
Hi Aman, There are two different issues connected with *git.properties* file. Regarding the problem I have mentioned, prebuilt tar (apache-drill-1.16.0.tar.gz) contains *drill-format-mapr-1.16.0.jar* jar which contains a *git.properties* file with the incorrect version. When *select * from sys.version* query is submitted, class loader finds the first file named as *git.properties* from the classpath (each drill jar contains its own *git.properties* file) and for my case file from *drill-format-mapr-1.16.0.jar *is picked up, so the incorrect result is returned. But it may not be reproducible for other machines since it depends on the order of files for the class loader. Regarding the problem Anton has mentioned, Drill should be built from the sources (apache-drill-1.16.0-src.tar.gz), and for that version, *select * from sys.version* returns the result without information about commit. Kind regards, Volodymyr Vysotskyi On Wed, Apr 24, 2019 at 6:33 PM Aman Sinha wrote: > This works fine for me with the binary tarball that I installed on my Mac. > ..it shows the correct commit message. > > Apache Drill 1.16.0 > > "This isn't your grandfather's SQL." > > apache drill> *select* * *from* sys.version; > > > +-+--+-+---+---+---+ > | version |commit_id | > commit_message|commit_time| > build_email|build_time | > > +-+--+-+---+---+---+ > | 1.16.0 | cf5b758e0a4c22b75bfb02ac2653ff09415ddf53 | > [maven-release-plugin] prepare release drill-1.16.0 | 22.04.2019 @ 09:08:36 > PDT | sor...@apache.org | 22.04.2019 @ 09:54:09 PDT | > > +-+--+-+---+---+---+ > > I don't see any extraneous git.properties anywhere in the source > distribution that I downloaded: > > [root@aman1 apache-drill-1.16.0-src]# find . -name "git.properties" > > > ./distribution/target/apache-drill-1.16.0/apache-drill-1.16.0/git.properties > > ./git.properties > > > > On Wed, Apr 24, 2019 at 4:51 AM Arina Ielchiieva wrote: > > > Taking into account previous emails, looks like we'll need to have new > RC. 
> > I also suggest to include > https://issues.apache.org/jira/browse/DRILL-7201 > > into > > new RC. > > > > Kind regards, > > Arina > > > > On Wed, Apr 24, 2019 at 2:44 PM Volodymyr Vysotskyi < > volody...@apache.org> > > wrote: > > > > > Also, I have noticed that for the prebuilt tar, the following query on > my > > > machine returns the wrong results: > > > > > > apache drill> select * from sys.version; > > > > > > > > > +-+--++---+---+---+ > > > | version |commit_id | > > >commit_message |commit_time| > > > build_email|build_time | > > > > > > > > > +-+--++---+---+---+ > > > | 1.16.0 | b3db1ff4b0d29210593c4485125578cca7a64b42 | DRILL-7188: > Revert > > > DRILL-6642: Update protocol-buffers version | 21.04.2019 @ 15:35:28 > PDT | > > > sor...@apache.org | 22.04.2019 @ 09:07:35 PDT | > > > > > > > > > +-+--++---+-------+---+ > > > 1 row selected (1.318 seconds) > > > > > > The root cause for this problem is that drill-format-mapr-1.16.0.jar > jar > > > contains git.properties file with incorrect version, and this file was > > the > > > first one which was found by the class loader. > > > > > > I think this is a blocker for the release. > > > > > > Kind regards, > > > Volodymyr Vysotskyi > > > > > > > > > On Wed, Apr 24, 2019 at 2:31 PM Anton Gozhiy
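The *git.properties* explanation in this message (each Drill jar carries its own copy, and the class loader returns whichever comes first) suggests a simple diagnostic: list every jar that bundles the file together with the commit it records. Here is a minimal sketch in Python, assuming the file sits at the jar root and uses the `git.commit.id=` key shown elsewhere in this thread. The helper names are hypothetical, and the script only inspects the jars; it does not reproduce the JVM's actual class-loading order.

```python
import zipfile


def git_commit_ids(jar_paths):
    """Return (jar_path, commit_id) pairs for jars with a root-level git.properties."""
    found = []
    for jar_path in jar_paths:  # keep the caller's order, e.g. the classpath order
        with zipfile.ZipFile(jar_path) as jar:
            if "git.properties" not in jar.namelist():
                continue
            text = jar.read("git.properties").decode("utf-8", errors="replace")
        commit = None
        for line in text.splitlines():
            # 'git.commit.id.abbrev=' and similar keys do not match this prefix.
            if line.startswith("git.commit.id="):
                commit = line.split("=", 1)[1].strip()
                break
        found.append((jar_path, commit))
    return found


def has_conflict(found):
    """True when the jars disagree about which commit they were built from."""
    return len({commit for _, commit in found}) > 1
```

If `has_conflict` is true for the jars of a release candidate, the result of `select * from sys.version` may depend on classpath order, which matches the behaviour described above.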
Re: [VOTE] Apache Drill Release 1.16.0 - RC1
Also, I have noticed that for the prebuilt tar, the following query on my machine returns the wrong results: apache drill> select * from sys.version; +-+--++---+---+---+ | version |commit_id | commit_message |commit_time| build_email|build_time | +-+--++---+---+---+ | 1.16.0 | b3db1ff4b0d29210593c4485125578cca7a64b42 | DRILL-7188: Revert DRILL-6642: Update protocol-buffers version | 21.04.2019 @ 15:35:28 PDT | sor...@apache.org | 22.04.2019 @ 09:07:35 PDT | +-+--++---+---+---+ 1 row selected (1.318 seconds) The root cause for this problem is that drill-format-mapr-1.16.0.jar jar contains git.properties file with incorrect version, and this file was the first one which was found by the class loader. I think this is a blocker for the release. Kind regards, Volodymyr Vysotskyi On Wed, Apr 24, 2019 at 2:31 PM Anton Gozhiy wrote: > Clarification to my last message: > I downloaded Drill from here: > > http://home.apache.org/~sorabh/drill/releases/1.16.0/rc1/apache-drill-1.16.0-src.tar.gz > and built it by command: > mvn clean install -DskipTests > > On Wed, Apr 24, 2019 at 1:53 PM Anton Gozhiy wrote: > > > Hi All, > > > > I found an issue with Drill version, used the provided rc1 source: > > apache drill> select * from sys.version; > > > > > +-+---++-+-++ > > | version | commit_id | commit_message | commit_time | build_email | > > build_time | > > > > > +-+---++-+-++ > > | 1.16.0 | Unknown || | Unknown | > > | > > > > > +-+---++-+-++ > > > > Although there is a valid git.properties file in the Drill root > directory: > > #Generated by Git-Commit-Id-Plugin > > #Mon Apr 22 09:52:07 PDT 2019 > > git.branch=cf5b758e0a4c22b75bfb02ac2653ff09415ddf53 > > git.build.host=SHamirw-E755.local > > git.build.time=22.04.2019 @ 09\:52\:07 PDT > > git.build.user.email=sor...@apache.org > > git.build.user.name=Sorabh Hamirwasia > > git.build.version=1.16.0 > > git.closest.tag.commit.count=0 > > git.closest.tag.name=drill-1.16.0 > > git.commit.id=cf5b758e0a4c22b75bfb02ac2653ff09415ddf53 > 
> git.commit.id.abbrev=cf5b758 > > git.commit.id.describe=drill-1.16.0-0-gcf5b758 > > git.commit.id.describe-short=drill-1.16.0-0 > > git.commit.message.full=[maven-release-plugin] prepare release > drill-1.16.0 > > git.commit.message.short=[maven-release-plugin] prepare release > > drill-1.16.0 > > git.commit.time=22.04.2019 @ 09\:08\:36 PDT > > git.commit.user.email=sor...@apache.org > > git.commit.user.name=Sorabh Hamirwasia > > git.dirty=false > > git.remote.origin.url=https\://github.com/apache/drill.git > > git.tags=drill-1.16.0 > > git.total.commit.count=3568 > > > > But looks like it doesn't get into the classpath. > > Could someone take a look into this? > > > > Thanks! > > > > On Wed, Apr 24, 2019 at 11:50 AM Volodymyr Vysotskyi < > volody...@apache.org> > > wrote: > > > >> Hi Sorabh, > >> > >> Sorry for being picky, but looks like the key you have published was > >> generated for non-apache email: sohami.apa...@gmail.com. According to > the > >> [1], it is highly recommended to use Apache email address as the primary > >> User-ID. > >> > >> [1] https://www.apache.org/dev/release-signing#user-id > >> > >> Kind regards, > >> Volodymyr Vysotskyi > >> > >> > >> On Wed, Apr 24, 2019 at 10:10 AM Jyothsna Reddy > > >> wrote: > >> > >> > Built it from cloning the git branch and unit tests on my Linux VM > (time > >> > taken - 43 min). > >> > Tested new features of metadata caching by creating v4 cache files > using > >> > new Refresh Metadata commands and manually verified the cache files. > >> Tried > >> >
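The root cause described at the top of this message (a stale git.properties bundled in one jar and picked up first by the class loader) can be checked for before a vote. The following is only a minimal sketch, not part of Drill's build: the function names `find_commit_ids` and `mismatched_jars` are hypothetical, and it assumes every relevant jar sits under a single directory and stores the file at the archive root.

```python
import zipfile
from pathlib import Path


def find_commit_ids(jars_dir):
    """Map jar file name -> git.commit.id for every jar that bundles git.properties."""
    commit_ids = {}
    for jar in sorted(Path(jars_dir).rglob("*.jar")):
        with zipfile.ZipFile(jar) as archive:
            if "git.properties" not in archive.namelist():
                continue
            text = archive.read("git.properties").decode("utf-8", errors="replace")
        for line in text.splitlines():
            # Java .properties entries are key=value; lines starting with '#' are comments.
            if line.startswith("git.commit.id="):
                commit_ids[jar.name] = line.split("=", 1)[1].strip()
    return commit_ids


def mismatched_jars(jars_dir, expected_commit):
    """Jars whose bundled git.properties points at a commit other than the release commit."""
    return {name: commit
            for name, commit in find_commit_ids(jars_dir).items()
            if commit != expected_commit}
```

For RC1 the expected commit would be cf5b758e0a4c22b75bfb02ac2653ff09415ddf53; any jar returned by `mismatched_jars` is a candidate for the behaviour reported above.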
Re: [VOTE] Apache Drill Release 1.16.0 - RC1
Hi Sorabh, Sorry for being picky, but looks like the key you have published was generated for non-apache email: sohami.apa...@gmail.com. According to the [1], it is highly recommended to use Apache email address as the primary User-ID. [1] https://www.apache.org/dev/release-signing#user-id Kind regards, Volodymyr Vysotskyi On Wed, Apr 24, 2019 at 10:10 AM Jyothsna Reddy wrote: > Built it from cloning the git branch and unit tests on my Linux VM (time > taken - 43 min). > Tested new features of metadata caching by creating v4 cache files using > new Refresh Metadata commands and manually verified the cache files. Tried > a few queries that use metadata cache and verified results. > > The release looks good to me +1. > > Thank you, > Jyothsna > > On Wed, Apr 24, 2019 at 12:09 AM Jyothsna Reddy > wrote: > > > Built it from cloning the git branch and unit tests on my Linux VM (time > > taken - 43 min). > > Tested new features of metadata caching by creating v4 cache files using > > new Refresh Metadata commands and manually verified the cache files. > Tried > > a few queries that use metadata cache and verified results. > > Did a few manual tests with REFRESH METADATA by creating the new V4 > > > > The release looks good to me +1. > > > > Thank you, > > Jyothsna > > > > On Tue, Apr 23, 2019 at 1:56 PM Sorabh Hamirwasia > > wrote: > > > >> Hi Volodymyr, > >> The KEYS file on svn will be updated when a release candidate is > approved > >> and all the artifacts are copied to the svn. > >> > >> NOTICE is not updated per release so I won't treat it as blocker. But > >> would > >> be good to add it in the wiki below to ensure from next time onwards > it's > >> updated. 
> >> > >> For release I am following this wiki[1] which is part of Parth's > >> repository. I will update it to include both the steps above as well. > >> > >> [1]: https://github.com/parthchandra/drill/wiki/Drill-Release-Process > >> > >> Thanks, > >> Sorabh > >> > >> On Tue, Apr 23, 2019 at 1:19 PM Volodymyr Vysotskyi < > volody...@apache.org > >> > > >> wrote: > >> > >> > Sorabh, could you please add your key to the > >> > https://dist.apache.org/repos/dist/release/drill/KEYS file? > >> > > >> > Not sure that it is a blocker, but the year in NOTICE is 2018. > >> > > >> > Do we have any guides for basic checks for release? If no, it would be > >> good > >> > to introduce such a list of things to check for the release manager. > >> > > >> > Kind regards, > >> > Volodymyr Vysotskyi > >> > > >> > > >> > On Tue, Apr 23, 2019 at 11:09 PM Aman Sinha > >> wrote: > >> > > >> > > Downloaded source tarball on my Linux VM and built and ran unit > tests > >> > > successfully (elapsed time 46 mins). > >> > > Downloaded binary tarball on my Mac and ran in embedded mode. > >> > > Verified Sorabh's release signature using gpg --verify > >> > > Checked the maven artifacts are published > >> > > Checked Ran a few queries against TPC-DS SF1 and examined query > >> profiles > >> > in > >> > > the Web UI. Looked good. > >> > > Did a few manual tests with REFRESH METADATA by creating the new V4 > >> > > metadata cache and checked EXPLAIN plans and query results. > >> > > Found an issue with control-c handling and filed DRILL-7198 and > >> noted in > >> > > the JIRA that I don't think it is a blocker. > >> > > > >> > > Overall, release looks good ! +1 > >> > > > >> > > Aman > >> > > > >> > > > >> > > On Tue, Apr 23, 2019 at 10:01 AM SorabhApache > >> wrote: > >> > > > >> > > > Thanks Aman and Volodymyr for discussing on this issue. Just to > >> clarify > >> > > on > >> > > > the thread that RC1 still stands as valid, since the issue is not > >> > blocker > >> > > > anymore. > >&g
Re: [VOTE] Apache Drill Release 1.16.0 - RC1
Sorabh, could you please add your key to the https://dist.apache.org/repos/dist/release/drill/KEYS file? Not sure that it is a blocker, but the year in NOTICE is 2018. Do we have any guides for basic checks for release? If no, it would be good to introduce such a list of things to check for the release manager. Kind regards, Volodymyr Vysotskyi On Tue, Apr 23, 2019 at 11:09 PM Aman Sinha wrote: > Downloaded source tarball on my Linux VM and built and ran unit tests > successfully (elapsed time 46 mins). > Downloaded binary tarball on my Mac and ran in embedded mode. > Verified Sorabh's release signature using gpg --verify > Checked the maven artifacts are published > Checked Ran a few queries against TPC-DS SF1 and examined query profiles in > the Web UI. Looked good. > Did a few manual tests with REFRESH METADATA by creating the new V4 > metadata cache and checked EXPLAIN plans and query results. > Found an issue with control-c handling and filed DRILL-7198 and noted in > the JIRA that I don't think it is a blocker. > > Overall, release looks good ! +1 > > Aman > > > On Tue, Apr 23, 2019 at 10:01 AM SorabhApache wrote: > > > Thanks Aman and Volodymyr for discussing on this issue. Just to clarify > on > > the thread that RC1 still stands as valid, since the issue is not blocker > > anymore. > > > > On Tue, Apr 23, 2019 at 9:18 AM Volodymyr Vysotskyi < > volody...@apache.org> > > wrote: > > > > > Discussed with Aman and concluded that this issue is not a blocker for > > the > > > release. > > > > > > Kind regards, > > > Volodymyr Vysotskyi > > > > > > > > > On Tue, Apr 23, 2019 at 6:39 PM Aman Sinha > wrote: > > > > > > > Hi Vova, > > > > I added some thoughts in the DRILL-7195 JIRA. 
> > > > > > > > Aman > > > > > > > > On Tue, Apr 23, 2019 at 6:06 AM Volodymyr Vysotskyi < > > > volody...@apache.org> > > > > wrote: > > > > > > > > > Hi all, > > > > > > > > > > I did some checks and found the following issues: > > > > > - DRILL-7195 <https://issues.apache.org/jira/browse/DRILL-7195> > > > > > - DRILL-7194 <https://issues.apache.org/jira/browse/DRILL-7194> > > > > > - DRILL-7192 <https://issues.apache.org/jira/browse/DRILL-7192> > > > > > > > > > > One of them (DRILL-7194) is also reproduced on the previous > version, > > > > > another is connected with the new feature (DRILL-7192), so I don't > > > think > > > > > that we should treat them as blockers. > > > > > The third one (DRILL-7195) is a regression and in some cases may > > cause > > > > the > > > > > wrong results, so I think that it should be fixed before the > release. > > > > > Any thoughts? > > > > > > > > > > Kind regards, > > > > > Volodymyr Vysotskyi > > > > > > > > > > > > > > > On Mon, Apr 22, 2019 at 8:58 PM SorabhApache > > > wrote: > > > > > > > > > > > *< Please disregard previous email, one of the link is not > correct > > in > > > > it. > > > > > > Use the information in this email instead >* > > > > > > > > > > > > Hi Drillers, > > > > > > I'd like to propose the second release candidate (RC1) for the > > Apache > > > > > > Drill, > > > > > > version 1.16.0. 
> > > > > > > > > > > > Changes since the previous release candidate: > > > > > > DRILL-7185: Drill Fails to Read Large Packets > > > > > > DRILL-7186: Missing storage.json REST endpoint > > > > > > DRILL-7190: Missing backward compatibility for REST API with > > > DRILL-6562 > > > > > > > > > > > > Also below 2 JIRA's were created to separately track revert of > > > protbuf > > > > > > changes in 1.16.0: > > > > > > DRILL-7188: Revert DRILL-6642: Update protocol-buffers version > > > > > > DRILL-7189: Revert DRILL-7105 Error while building the Drill > native > > > > > client > > > > > > > > > > > > The RC1 includes total of 215 resolved JIRAs [1]. > > > > > > Thanks to everyone for their hard work to contribute to this > > release. > > > > > > > > > > > > The tarball artifacts are hosted at [2] and the maven artifacts > are > > > > > hosted > > > > > > at [3]. > > > > > > > > > > > > This release candidate is based on commit > > > > > > cf5b758e0a4c22b75bfb02ac2653ff09415ddf53 located at [4]. > > > > > > > > > > > > Please download and try out the release candidate. > > > > > > > > > > > > The vote ends at 06:00 PM UTC (11:00 AM PDT, 09:00 PM EET, 11:30 > PM > > > > IST), > > > > > > Apr 25th, 2019 > > > > > > > > > > > > [ ] +1 > > > > > > [ ] +0 > > > > > > [ ] -1 > > > > > > > > > > > > Here is my vote: +1 > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > > > > > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820=12344284 > > > > > > [2] http://home.apache.org/~sorabh/drill/releases/1.16.0/rc1/ > > > > > > [3] > > > > > > > > > > > > https://repository.apache.org/content/repositories/orgapachedrill-1067/ > > > > > > [4] https://github.com/sohami/drill/commits/drill-1.16.0 > > > > > > > > > > > > Thanks, > > > > > > Sorabh > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > >
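The question above about a list of basic checks for a release can be made concrete with one such check. This is a sketch only, under the assumption that a SHA-512 digest is published next to each tarball; the helper names `sha512_of` and `checksum_matches` are hypothetical, and signature verification with gpg --verify (as Aman did) remains a separate step.

```python
import hashlib


def sha512_of(path, chunk_size=1 << 20):
    """Compute the SHA-512 hex digest of a file without loading it fully into memory."""
    digest = hashlib.sha512()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, published):
    """Compare a file against the text of a published checksum file.

    Checksum files often hold "<hex digest>  <file name>", so only the
    first whitespace-separated token is compared.
    """
    parts = published.split()
    return bool(parts) and sha512_of(path) == parts[0].lower()
```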
Re: [VOTE] Apache Drill Release 1.16.0 - RC1
Discussed with Aman and concluded that this issue is not a blocker for the release. Kind regards, Volodymyr Vysotskyi On Tue, Apr 23, 2019 at 6:39 PM Aman Sinha wrote: > Hi Vova, > I added some thoughts in the DRILL-7195 JIRA. > > Aman > > On Tue, Apr 23, 2019 at 6:06 AM Volodymyr Vysotskyi > wrote: > > > Hi all, > > > > I did some checks and found the following issues: > > - DRILL-7195 <https://issues.apache.org/jira/browse/DRILL-7195> > > - DRILL-7194 <https://issues.apache.org/jira/browse/DRILL-7194> > > - DRILL-7192 <https://issues.apache.org/jira/browse/DRILL-7192> > > > > One of them (DRILL-7194) is also reproduced on the previous version, > > another is connected with the new feature (DRILL-7192), so I don't think > > that we should treat them as blockers. > > The third one (DRILL-7195) is a regression and in some cases may cause > the > > wrong results, so I think that it should be fixed before the release. > > Any thoughts? > > > > Kind regards, > > Volodymyr Vysotskyi > > > > > > On Mon, Apr 22, 2019 at 8:58 PM SorabhApache wrote: > > > > > *< Please disregard previous email, one of the link is not correct in > it. > > > Use the information in this email instead >* > > > > > > Hi Drillers, > > > I'd like to propose the second release candidate (RC1) for the Apache > > > Drill, > > > version 1.16.0. > > > > > > Changes since the previous release candidate: > > > DRILL-7185: Drill Fails to Read Large Packets > > > DRILL-7186: Missing storage.json REST endpoint > > > DRILL-7190: Missing backward compatibility for REST API with DRILL-6562 > > > > > > Also below 2 JIRA's were created to separately track revert of protbuf > > > changes in 1.16.0: > > > DRILL-7188: Revert DRILL-6642: Update protocol-buffers version > > > DRILL-7189: Revert DRILL-7105 Error while building the Drill native > > client > > > > > > The RC1 includes total of 215 resolved JIRAs [1]. > > > Thanks to everyone for their hard work to contribute to this release. 
> > > > > > The tarball artifacts are hosted at [2] and the maven artifacts are > > hosted > > > at [3]. > > > > > > This release candidate is based on commit > > > cf5b758e0a4c22b75bfb02ac2653ff09415ddf53 located at [4]. > > > > > > Please download and try out the release candidate. > > > > > > The vote ends at 06:00 PM UTC (11:00 AM PDT, 09:00 PM EET, 11:30 PM > IST), > > > Apr 25th, 2019 > > > > > > [ ] +1 > > > [ ] +0 > > > [ ] -1 > > > > > > Here is my vote: +1 > > > [1] > > > > > > > > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820=12344284 > > > [2] http://home.apache.org/~sorabh/drill/releases/1.16.0/rc1/ > > > [3] > > > > https://repository.apache.org/content/repositories/orgapachedrill-1067/ > > > [4] https://github.com/sohami/drill/commits/drill-1.16.0 > > > > > > Thanks, > > > Sorabh > > > > > > > > > > > > >
Re: [VOTE] Apache Drill Release 1.16.0 - RC1
Hi all, I did some checks and found the following issues: - DRILL-7195 <https://issues.apache.org/jira/browse/DRILL-7195> - DRILL-7194 <https://issues.apache.org/jira/browse/DRILL-7194> - DRILL-7192 <https://issues.apache.org/jira/browse/DRILL-7192> One of them (DRILL-7194) is also reproduced on the previous version, another is connected with the new feature (DRILL-7192), so I don't think that we should treat them as blockers. The third one (DRILL-7195) is a regression and in some cases may cause the wrong results, so I think that it should be fixed before the release. Any thoughts? Kind regards, Volodymyr Vysotskyi On Mon, Apr 22, 2019 at 8:58 PM SorabhApache wrote: > *< Please disregard previous email, one of the link is not correct in it. > Use the information in this email instead >* > > Hi Drillers, > I'd like to propose the second release candidate (RC1) for the Apache > Drill, > version 1.16.0. > > Changes since the previous release candidate: > DRILL-7185: Drill Fails to Read Large Packets > DRILL-7186: Missing storage.json REST endpoint > DRILL-7190: Missing backward compatibility for REST API with DRILL-6562 > > Also below 2 JIRA's were created to separately track revert of protbuf > changes in 1.16.0: > DRILL-7188: Revert DRILL-6642: Update protocol-buffers version > DRILL-7189: Revert DRILL-7105 Error while building the Drill native client > > The RC1 includes total of 215 resolved JIRAs [1]. > Thanks to everyone for their hard work to contribute to this release. > > The tarball artifacts are hosted at [2] and the maven artifacts are hosted > at [3]. > > This release candidate is based on commit > cf5b758e0a4c22b75bfb02ac2653ff09415ddf53 located at [4]. > > Please download and try out the release candidate. 
> > The vote ends at 06:00 PM UTC (11:00 AM PDT, 09:00 PM EET, 11:30 PM IST), > Apr 25th, 2019 > > [ ] +1 > [ ] +0 > [ ] -1 > > Here is my vote: +1 > [1] > > https://issues.apache.org/jira/secure/ReleaseNote.jspa?projectId=12313820=12344284 > [2] http://home.apache.org/~sorabh/drill/releases/1.16.0/rc1/ > [3] > https://repository.apache.org/content/repositories/orgapachedrill-1067/ > [4] https://github.com/sohami/drill/commits/drill-1.16.0 > > Thanks, > Sorabh > > > >
[jira] [Created] (DRILL-7195) Query returns incorrect result or does not fail when cast with is null is used in filter condition
Volodymyr Vysotskyi created DRILL-7195:
------------------------------------------

Summary: Query returns incorrect result or does not fail when cast with is null is used in filter condition
Key: DRILL-7195
URL: https://issues.apache.org/jira/browse/DRILL-7195
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.16.0
Reporter: Volodymyr Vysotskyi
Fix For: 1.16.0

1. When a query contains a filter that applies {{is null}} to a {{cast}} which cannot be performed, the query does not fail:
{code:sql}
select * from dfs.tmp.`a.json` as t where cast(t.a as integer) is null;
+---+
| a |
+---+
+---+
No rows selected (0.142 seconds)
{code}
where
{noformat}
cat /tmp/a.json
{"a":"aaa"}
{noformat}
But when the same condition is specified in the projection, the query fails, as expected:
{code:sql}
select cast(t.a as integer) is null from dfs.tmp.`a.json` t;
Error: SYSTEM ERROR: NumberFormatException: aaa

Fragment 0:0

Please, refer to logs for more information.

[Error Id: ed3982ce-a12f-4d63-bc6e-cafddf28cc24 on user515050-pc:31010] (state=,code=0)
{code}
This is a regression: in Drill 1.15, both the first and the second queries fail:
{code:sql}
select * from dfs.tmp.`a.json` as t where cast(t.a as integer) is null;
Error: SYSTEM ERROR: NumberFormatException: aaa

Fragment 0:0

Please, refer to logs for more information.

[Error Id: 2f878f15-ddaa-48cd-9dfb-45c04db39048 on user515050-pc:31010] (state=,code=0)
{code}

2. When {{drill.exec.functions.cast_empty_string_to_null}} is enabled, this issue causes wrong results:
{code:sql}
alter system set `drill.exec.functions.cast_empty_string_to_null`=true;
select * from dfs.tmp.`a1.json` t where cast(t.a as integer) is null;
+---+
| a |
+---+
+---+
No rows selected (1.759 seconds)
{code}
where
{noformat}
cat /tmp/a1.json
{"a":"1"}
{"a":""}
{noformat}
Result for Drill 1.15.0:
{code:sql}
select * from dfs.tmp.`a1.json` t where cast(t.a as integer) is null;
+---+
| a |
+---+
|   |
+---+
1 row selected (1.724 seconds)
{code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
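The expected behaviour in both cases of DRILL-7195 can be restated as a small model. This is a sketch of the semantics the ticket describes for Drill 1.15, not Drill's implementation; `cast_to_int` and `filter_cast_is_null` are hypothetical names, and Python's ValueError stands in for the NumberFormatException.

```python
def cast_to_int(value, cast_empty_string_to_null=False):
    """Model of cast(... as integer): NULL stays NULL, bad input raises."""
    if value is None:
        return None
    if value == "" and cast_empty_string_to_null:
        # Mirrors drill.exec.functions.cast_empty_string_to_null=true.
        return None
    return int(value)  # raises ValueError for "aaa"


def filter_cast_is_null(rows, cast_empty_string_to_null=False):
    """Model of: select * from t where cast(t.a as integer) is null."""
    return [row for row in rows
            if cast_to_int(row["a"], cast_empty_string_to_null) is None]
```

With the contents of a1.json and the option enabled, the model keeps only the row holding the empty string, which is the 1.15.0 result shown in the ticket; with the contents of a.json it raises, which is the failure 1.15 reported.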
[jira] [Created] (DRILL-7194) Wrong result when non-deterministic functions are used in filter
Volodymyr Vysotskyi created DRILL-7194:
------------------------------------------

Summary: Wrong result when non-deterministic functions are used in filter
Key: DRILL-7194
URL: https://issues.apache.org/jira/browse/DRILL-7194
Project: Apache Drill
Issue Type: Bug
Reporter: Volodymyr Vysotskyi
Fix For: Future

Drill returns the wrong result when non-deterministic functions are used in a filter condition. For example, the following query:
{code:sql}
select 1 from (values(1)) where random()=random();
{code}
returns
{noformat}
+---------+
| EXPR$0  |
+---------+
| 1       |
+---------+
1 row selected (0.105 seconds)
{noformat}
but {{random()=random()}} should be {{false}}, and therefore the query shouldn't return any rows.

If this condition is used in the projection, it returns the correct result:
{code:sql}
select random()=random();
{code}
returns
{noformat}
+---------+
| EXPR$0  |
+---------+
| false   |
+---------+
1 row selected (1.558 seconds)
{noformat}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
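The expectation in DRILL-7194 is that each call to a non-deterministic function is evaluated independently for every row. A minimal sketch of that expectation follows; `filter_rows` is a hypothetical helper, and the fixed seed is only there to make the run repeatable. The ticket does not state the root cause, so nothing here models the faulty behaviour.

```python
import random


def filter_rows(rows, predicate):
    """Keep the rows for which the predicate, evaluated once per row, is true."""
    return [row for row in rows if predicate(row)]


rng = random.Random(7194)  # fixed seed so the sketch is repeatable

# Each rng.random() call draws a fresh value, so the two sides of the
# comparison differ and the single input row is filtered out.
selected = filter_rows([(1,)], lambda _row: rng.random() == rng.random())
print(len(selected))
```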
[jira] [Created] (DRILL-7192) Drill limits rows when autoLimit is disabled
Volodymyr Vysotskyi created DRILL-7192:
------------------------------------------

Summary: Drill limits rows when autoLimit is disabled
Key: DRILL-7192
URL: https://issues.apache.org/jira/browse/DRILL-7192
Project: Apache Drill
Issue Type: Bug
Affects Versions: 1.16.0
Reporter: Volodymyr Vysotskyi
Fix For: Future

autoLimit for JDBC and REST clients was implemented in DRILL-7048.

*Steps to reproduce the issue:*
1. Check that autoLimit is disabled; if not, disable it and restart Drill.
2. Submit any query and verify that the row count is correct. For example,
{code:sql}
SELECT * FROM cp.`employee.json`;
{code}
returns 1,155 rows.
3. Enable autoLimit for the sqlLine client:
{code:sql}
!set rowLimit 10
{code}
4. Submit the same query and verify that the result has 10 rows.
5. Disable autoLimit:
{code:sql}
!set rowLimit 0
{code}
6. Submit the same query again. This time, *it returns 10 rows instead of 1,155*.

The correct row count is returned only after creating a new connection.
The same issue is also observed with the SQuirreL SQL client when it is connected to Drill; with Postgres, for example, it works correctly.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
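The reproduction steps in DRILL-7192 rely on a row limit of 0 meaning "no limit", which is also how JDBC's Statement.setMaxRows treats zero. The sketch below models only that expected behaviour; `Session` and `apply_row_limit` are hypothetical names and do not correspond to Drill or sqlLine classes.

```python
def apply_row_limit(rows, row_limit):
    """Expected semantics: a limit of 0 (or less) means no limit at all."""
    rows = list(rows)
    if row_limit <= 0:
        return rows
    return rows[:row_limit]


class Session:
    """Toy connection whose limit is re-read for every query it runs."""

    def __init__(self):
        self.row_limit = 0

    def run(self, rows):
        return apply_row_limit(rows, self.row_limit)


session = Session()
employees = range(1155)
session.row_limit = 10
limited = len(session.run(employees))    # step 4: limit applied
session.row_limit = 0
unlimited = len(session.run(employees))  # step 6: limit lifted on the same session
```

In this model step 6 returns all 1,155 rows without reconnecting, which is the outcome the ticket expects.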
[jira] [Resolved] (DRILL-7059) Operator profile does not show JDBC metrics
[ https://issues.apache.org/jira/browse/DRILL-7059?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Volodymyr Vysotskyi resolved DRILL-7059. Resolution: Duplicate Fix Version/s: 1.14.0 It was fixed in DRILL-6455. > Operator profile does not show JDBC metrics > --- > > Key: DRILL-7059 > URL: https://issues.apache.org/jira/browse/DRILL-7059 > Project: Apache Drill > Issue Type: Improvement > Components: Execution - Monitoring >Affects Versions: 1.14.0, 1.15.0 > Environment: Drill 1.14 >Reporter: Aditya Allamraju >Priority: Major > Fix For: 1.14.0 > > Attachments: 23a59a1c-e0bb-d5d2-a686-8c4c9ddc6e0f.sys.drill, > 23a59f4b-7424-d19e-3b83-d871c2e58ff8.sys.drill > > > This issue was discovered while debugging a performance issue of a query that > is run on an Oracle(via storage plugin) and Hive tables. > > Listing the query as below that is taking nearly 4 hrs. > Oracle table: *e35_eos.finapp.sales* > Hive table: *hive.temp.sales* > {code:java} > SELECT Sum(source_cnt) source_cnt, > Sum(target_cnt) target_cnt, > sale_id, > prod_id, > cust_id, > time_id, > channel_id, > promo_id, > quantity_sold, > amount_sold, > createddate, > modifieddate > FROM (SELECT 1 source_cnt, > 0 target_cnt, > sale_id, > prod_id, > cust_id, > time_id, > Trim(channel_id) CHANNEL_ID, > promo_id, > quantity_sold, > amount_sold, > createddate, > modifieddate > FROM e35_eos.finapp.sales > UNION ALL > SELECT 0 source_cnt, > 1 target_cnt, > sale_id, > prod_id, > cust_id, > time_id, > Trim(channel_id) CHANNEL_ID, > promo_id, > quantity_sold, > amount_sold, > createddate, > modifieddate > FROM hive.temp.sales > WHERE top_rank = 1 > AND header__change_oper <> 'D') > GROUP BY sale_id, > prod_id, > cust_id, > time_id, > channel_id, > promo_id, > quantity_sold, > amount_sold, > createddate, > modifieddate > HAVING Sum(source_cnt) <> Sum(target_cnt) > LIMIT 1000 > {code} > The Physical Plan shows the step for Operator(02-08). But the "*operator > profile*", is missing the step. 
> > {code:java} > 02-08Jdbc(sql=[SELECT 1 "source_cnt", 0 > "target_cnt", "SALE_ID", "PROD_ID", "CUST_ID", "TIME_ID", TRIM(BOTH ' ' FROM > "CHANNEL_ID") "CHANNEL_ID", "PROMO_ID", "QUANTITY_SOLD", "AMOUNT_SOLD", > "CREATEDDATE", "MODIFIEDDATE" > FROM "FINAPP"."SALES"]) : rowType = RecordType(INTEGER source_cnt, INTEGER > target_cnt, DECIMAL(0, 0) SALE_ID, DECIMAL(6, 0) PROD_ID, DECIMAL(0, 0) > CUST_ID, TIMESTAMP(0) TIME_ID, VARCHAR(65535) CHANNEL_ID, DECIMAL(6, 0) > PROMO_ID, DECIMAL(3, 0) QUANTITY_SOLD, DECIMAL(10, 2) AMOUNT_SOLD, > TIMESTAMP(0) CREATEDDATE, TIMESTAMP(0) MODIFIEDDATE): rowcount = 100.0, > cumulative cost = {100.0 rows, 100.0 cpu, 0.0 io, 0.0 network, 0.0 memory}, > id = 4648 > {code} > Below is the attached profile for your reference: > [^23a59a1c-e0bb-d5d2-a686-8c4c9ddc6e0f.sys.drill] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (DRILL-7116) Adapt statistics to use Drill Metastore API
[ https://issues.apache.org/jira/browse/DRILL-7116?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Volodymyr Vysotskyi resolved DRILL-7116. Resolution: Fixed Fix Version/s: (was: 1.17.0) 1.16.0 Fixed in the scope of DRILL-7089 > Adapt statistics to use Drill Metastore API > --- > > Key: DRILL-7116 > URL: https://issues.apache.org/jira/browse/DRILL-7116 > Project: Apache Drill > Issue Type: Sub-task >Affects Versions: 1.16.0 > Reporter: Volodymyr Vysotskyi > Assignee: Volodymyr Vysotskyi >Priority: Major > Fix For: 1.16.0 > > > The current implementation of statistics supposes the usage of files for > storing and reading statistics. > The aim of this Jira is to adapt statistics to use Drill Metastore API so in > future it may be stored in other metastore implementations. > Implementation details: > - Move statistics info into {{TableMetadata}} > - Provide a way for obtaining {{TableMetadata}} in the places where > statistics may be used (partially implemented in the scope of DRILL-7089) > - Investigate and implement (if possible) lazy materialization of > {{DrillStatsTable}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (DRILL-4946) org.objectweb.asm.tree.analysis.AnalyzerException printed to console in embedded mode
[ https://issues.apache.org/jira/browse/DRILL-4946?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Volodymyr Vysotskyi resolved DRILL-4946. Resolution: Cannot Reproduce Fix Version/s: 1.16.0 Resolving this Jira, since it is not reproduced anymore using query from the description. > org.objectweb.asm.tree.analysis.AnalyzerException printed to console in > embedded mode > - > > Key: DRILL-4946 > URL: https://issues.apache.org/jira/browse/DRILL-4946 > Project: Apache Drill > Issue Type: Bug >Reporter: Chunhui Shi >Assignee: Chunhui Shi >Priority: Critical > Fix For: 1.16.0 > > > Testing by querying a json file got AnalyzerException printed. > The problem was due to scalar_replacement mode is default to be 'try', and > org.objectweb.asm.util.CheckMethodAdapter is printing stack trace to stderr. > [shi@cshi-centos1 private-drill]$ cat /tmp/conv.json > {"row": "0", "key": "\\x4a\\x31\\x39\\x38", "key2": "4a313938", "kp1": > "4a31", "kp2": "38"} > {"row": "1", "key": null, "key2": null, "kp1": null, "kp2": null} > {"row": "2", "key": "\\x4e\\x4f\\x39\\x51", "key2": "4e4f3951", "kp1": > "4e4f", "kp2": "51"} > {"row": "3", "key": "\\x6e\\x6f\\x39\\x31", "key2": "6e6f3931", "kp1": > "6e6f", "kp2": "31"} > 0: jdbc:drill:zk=local> SELECT convert_from(binary_string(key), 'INT_BE') as > intkey from dfs.`/tmp/conv.json`; > org.objectweb.asm.tree.analysis.AnalyzerException: Error at instruction 158: > Expected an object reference, but found . 
> at org.objectweb.asm.tree.analysis.Analyzer.analyze(Analyzer.java:294) > at > org.objectweb.asm.util.CheckMethodAdapter$1.visitEnd(CheckMethodAdapter.java:450) > at org.objectweb.asm.MethodVisitor.visitEnd(MethodVisitor.java:877) > at > org.objectweb.asm.util.CheckMethodAdapter.visitEnd(CheckMethodAdapter.java:1028) > at org.objectweb.asm.MethodVisitor.visitEnd(MethodVisitor.java:877) > at > org.apache.drill.exec.compile.CheckMethodVisitorFsm.visitEnd(CheckMethodVisitorFsm.java:114) > at org.objectweb.asm.MethodVisitor.visitEnd(MethodVisitor.java:877) > at > org.apache.drill.exec.compile.CheckMethodVisitorFsm.visitEnd(CheckMethodVisitorFsm.java:114) > at org.objectweb.asm.MethodVisitor.visitEnd(MethodVisitor.java:877) > at org.objectweb.asm.MethodVisitor.visitEnd(MethodVisitor.java:877) > at > org.apache.drill.exec.compile.bytecode.InstructionModifier.visitEnd(InstructionModifier.java:508) > at org.objectweb.asm.tree.MethodNode.accept(MethodNode.java:837) > at > org.apache.drill.exec.compile.bytecode.ScalarReplacementNode.visitEnd(ScalarReplacementNode.java:87) > at org.objectweb.asm.MethodVisitor.visitEnd(MethodVisitor.java:877) > at > org.apache.drill.exec.compile.bytecode.AloadPopRemover.visitEnd(AloadPopRemover.java:136) > at org.objectweb.asm.tree.MethodNode.accept(MethodNode.java:837) > at org.objectweb.asm.tree.MethodNode.accept(MethodNode.java:726) > at org.objectweb.asm.tree.ClassNode.accept(ClassNode.java:412) > at > org.apache.drill.exec.compile.MergeAdapter.getMergedClass(MergeAdapter.java:223) > at > org.apache.drill.exec.compile.ClassTransformer.getImplementationClass(ClassTransformer.java:263) > at > org.apache.drill.exec.compile.CodeCompiler$Loader.load(CodeCompiler.java:78) > at > org.apache.drill.exec.compile.CodeCompiler$Loader.load(CodeCompiler.java:74) > at > com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3527) > at > com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2319) > at > 
com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2282) > at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2197) > at com.google.common.cache.LocalCache.get(LocalCache.java:3937) > at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3941) > at > com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4824) > at > org.apache.drill.exec.compile.CodeCompiler.getImplementa
[jira] [Created] (DRILL-7160) exec.query.max_rows QUERY-level options are shown on Profiles tab
Volodymyr Vysotskyi created DRILL-7160:
------------------------------------------

Summary: exec.query.max_rows QUERY-level options are shown on Profiles tab
Key: DRILL-7160
URL: https://issues.apache.org/jira/browse/DRILL-7160
Project: Apache Drill
Issue Type: Bug
Components: Web Server
Affects Versions: 1.16.0
Reporter: Volodymyr Vysotskyi
Assignee: Kunal Khatua
Fix For: 1.16.0

As [~arina] has noticed, the option {{exec.query.max_rows}} is shown on the Web UI's Profiles tab even when it was not set explicitly. This happens because the option is set internally at the query level.

From the code, it looks like it is set in {{DrillSqlWorker.checkAndApplyAutoLimit()}}, and perhaps a check whether the value differs from the existing one should be added.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
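The check suggested in DRILL-7160 (only record the option when it actually changes the value in effect) can be sketched as below. This is an illustration of the idea, not Drill's code: `apply_auto_limit` and both dictionaries are hypothetical stand-ins for the query-level and effective option stores.

```python
OPTION = "exec.query.max_rows"


def apply_auto_limit(query_options, effective_options, max_rows):
    """Set the option at QUERY level only if it differs from the value in effect.

    Returns True when the option was written, False when it was skipped.
    """
    if effective_options.get(OPTION, 0) == max_rows:
        return False  # nothing changes, so the profile shows no extra option
    query_options[OPTION] = max_rows
    return True
```

Under this model a query that never asked for a limit leaves `query_options` empty, so nothing unexpected would be listed on the Profiles tab.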
[jira] [Created] (DRILL-7150) Fix timezone conversion for timestamp from maprdb after the transition from PDT to PST
Volodymyr Vysotskyi created DRILL-7150:
------------------------------------------

Summary: Fix timezone conversion for timestamp from maprdb after the transition from PDT to PST
Key: DRILL-7150
URL: https://issues.apache.org/jira/browse/DRILL-7150
Project: Apache Drill
Issue Type: Bug
Components: Storage - MapRDB
Affects Versions: 1.16.0
Reporter: Volodymyr Vysotskyi
Assignee: Volodymyr Vysotskyi
Fix For: 1.16.0

Steps to reproduce:
0. Set the PST timezone and the date: {{date +%Y%m%d -s "20190329"}}
1. Create the table in the MaprDB shell:
{noformat}
create /tmp/testtimestamp
insert /tmp/testtimestamp --value '{"_id":"eot","str":"-01-01T23:59:59.999","ts":{"$date":"-01-02T07:59:59.999Z"}}'
insert /tmp/testtimestamp --value '{"_id":"pdt","str":"2019-04-01T23:59:59.999","ts":{"$date":"2019-04-02T06:59:59.999Z"}}'
insert /tmp/testtimestamp --value '{"_id":"pst","str":"2019-01-01T23:59:59.999","ts":{"$date":"2019-01-02T07:59:59.999Z"}}'
insert /tmp/testtimestamp --value '{"_id":"unk","str":"2017-07-08T20:01:49.885","ts":{"$date":"2017-07-09T03:01:49.885Z"}}'
{noformat}
2. Create a hive table:
{code:sql}
CREATE EXTERNAL TABLE default.timeTest
(`_id` string,
`str` string,
`ts` timestamp)
ROW FORMAT SERDE
'org.apache.hadoop.hive.maprdb.json.serde.MapRDBSerDe'
STORED BY
'org.apache.hadoop.hive.maprdb.json.MapRDBJsonStorageHandler'
TBLPROPERTIES (
'maprdb.column.id'='_id',
'maprdb.table.name'='/tmp/timeTest')
{code}
3. Enable the native reader and timezone conversion for maprdb timestamps:
{code:sql}
alter session set store.hive.maprdb_json.optimize_scan_with_native_reader=true;
alter session set store.hive.maprdb_json.read_timestamp_with_timezone_offset=true;
{code}
4. Run the query on the table from Drill using the hive plugin:
{code}
0: jdbc:drill:drillbit=ldevdmhn005:31010> select * from hive.default.timeTest;
+------+--------------------------+--------------------------+
| _id  | str                      | ts                       |
+------+--------------------------+--------------------------+
| eot  | -01-01T23:59:59.999      | -01-02 00:59:59.999      |
| pdt  | 2019-04-01T23:59:59.999  | 2019-04-01 23:59:59.999  |
| pst  | 2019-01-01T23:59:59.999  | 2019-01-02 00:59:59.999  |
| unk  | 2017-07-08T20:01:49.885  | 2017-07-08 20:01:49.885  |
+------+--------------------------+--------------------------+
4 rows selected (0.343 seconds)
{code}

Please note that the results for the {{eot}} and {{pst}} values are wrong.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
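The arithmetic behind the {{pst}} and {{pdt}} rows in DRILL-7150 can be worked through with fixed UTC offsets. This is a sketch using only the values from the ticket; `to_local` is a hypothetical helper, and the reading that a -07:00 offset was applied to a winter timestamp is an inference from the numbers, not something the ticket states.

```python
from datetime import datetime, timedelta, timezone

PST = timezone(timedelta(hours=-8), "PST")  # Pacific standard time
PDT = timezone(timedelta(hours=-7), "PDT")  # Pacific daylight time


def to_local(utc_text, zone):
    """Convert an ISO-style UTC timestamp to local wall-clock text in the given zone."""
    utc = datetime.strptime(utc_text, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)
    # %f prints microseconds; dropping the last three digits leaves milliseconds.
    return utc.astimezone(zone).strftime("%Y-%m-%d %H:%M:%S.%f")[:-3]


pst_expected = to_local("2019-01-02T07:59:59.999Z", PST)  # January falls in PST
pst_observed = to_local("2019-01-02T07:59:59.999Z", PDT)  # same instant with -07:00
pdt_expected = to_local("2019-04-02T06:59:59.999Z", PDT)  # April falls in PDT
```

Here `pst_expected` equals the `str` column for the {{pst}} row, while `pst_observed` reproduces the value Drill printed, which suggests the offset in force on the machine's current date was used instead of the offset valid at the timestamp itself.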
[jira] [Created] (DRILL-7116) Adapt statistics to use Drill Metastore API
Volodymyr Vysotskyi created DRILL-7116: -- Summary: Adapt statistics to use Drill Metastore API Key: DRILL-7116 URL: https://issues.apache.org/jira/browse/DRILL-7116 Project: Apache Drill Issue Type: Sub-task Affects Versions: 1.16.0 Reporter: Volodymyr Vysotskyi Assignee: Volodymyr Vysotskyi Fix For: 1.17.0 The current implementation of statistics supposes the usage of files for storing and reading statistics. The aim of this Jira is to adapt statistics to use Drill Metastore API so in future it may be stored in other metastore implementations. Implementation details: - Move statistics info into {{TableMetadata}} - Provide a way for obtaining {{TableMetadata}} in the places where statistics may be used (partially implemented in the scope of DRILL-7089) - Investigate and implement (if possible) lazy materialization of {{DrillStatsTable}}. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (DRILL-5581) Query with CASE statement returns wrong results
[ https://issues.apache.org/jira/browse/DRILL-5581?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Volodymyr Vysotskyi resolved DRILL-5581. Resolution: Fixed Assignee: Volodymyr Vysotskyi (was: Karthikeyan Manivannan) Fix Version/s: 1.16.0 Fixed in DRILL-6524 > Query with CASE statement returns wrong results > --- > > Key: DRILL-5581 > URL: https://issues.apache.org/jira/browse/DRILL-5581 > Project: Apache Drill > Issue Type: Bug > Components: Execution - Flow >Affects Versions: 1.11.0 >Reporter: Khurram Faraaz >Assignee: Volodymyr Vysotskyi >Priority: Major > Fix For: 1.16.0 > > > A query that uses case statement, returns wrong results. > {noformat} > Apache Drill 1.11.0-SNAPSHOT, commit id: 874bf629 > [test@centos-101 ~]# cat order_sample.csv > 202634342,2101,20160301 > apache drill 1.11.0-SNAPSHOT > "this isn't your grandfather's sql" > 0: jdbc:drill:schema=dfs.tmp> ALTER SESSION SET `store.format`='csv'; > +---++ > | ok |summary | > +---++ > | true | store.format updated. | > +---++ > 1 row selected (0.245 seconds) > 0: jdbc:drill:schema=dfs.tmp> CREATE VIEW `vw_order_sample_csv` as > . . . . . . . . . . . . . . > SELECT > . . . . . . . . . . . . . . > `columns`[0] AS `ND`, > . . . . . . . . . . . . . . > CAST(`columns`[1] AS BIGINT) AS `col1`, > . . . . . . . . . . . . . . > CAST(`columns`[2] AS BIGINT) AS `col2` > . . . . . . . . . . . . . . > FROM `order_sample.csv`; > +---+--+ > | ok | summary > | > +---+--+ > | true | View 'vw_order_sample_csv' created successfully in 'dfs.tmp' schema > | > +---+--+ > 1 row selected (0.253 seconds) > 0: jdbc:drill:schema=dfs.tmp> select > . . . . . . . . . . . . . . > case > . . . . . . . . . . . . . . > when col1 > col2 then col1 > . . . . . . . . . . . . . . > else col2 > . . . . . . . . . . . . . . > end as temp_col, > . . . . . . . . . . . . . . > case > . . . . . . . . . . . . . . > when col1 = 2101 and (20170302 - col2) > > 1 then 'D' > . . . . . . . . . . . . . . > when col2 = 2101 then 'P' > . . 
. . . . . . . . . . . . > when col1 - col2 > 1 then '0' > . . . . . . . . . . . . . . > else 'A' > . . . . . . . . . . . . . . > end as status > . . . . . . . . . . . . . . > from `vw_order_sample_csv`; > +---+-+ > | temp_col | status | > +---+-+ > | 20160301 | A | > +---+-+ > 1 row selected (0.318 seconds) > 0: jdbc:drill:schema=dfs.tmp> explain plan for > . . . . . . . . . . . . . . > select > . . . . . . . . . . . . . . > case > . . . . . . . . . . . . . . > when col1 > col2 then col1 > . . . . . . . . . . . . . . > else col2 > . . . . . . . . . . . . . . > end as temp_col, > . . . . . . . . . . . . . . > case > . . . . . . . . . . . . . . > when col1 = 2101 and (20170302 - col2) > > 1 then 'D' > . . . . . . . . . . . . . . > when col2 = 2101 then 'P' > . . . . . . . . . . . . . . > when col1 - col2 > 1 then '0' > . . . . . . . . . . . . . . > else 'A' > . . . . . . . . . . . . . . > end as status > . . . . . . . . . . . . . . > from `vw_order_sample_csv`; > +--+--+ > | text | json | > +--+--+ > | 00-00Screen > 00-01 Project(temp_col=[CASE(>(CAST(ITEM($0, 1)):BIGINT, CAST(ITEM($0, > 2)):BIGINT), CAST(ITEM($0, 1)):BIGINT, CAST(ITEM($0, 2)):BIGINT)], > status=[CASE(AND(=(CAST(ITEM($0, 1)):BIGINT, 2101), >(-(20170302, > CAST(ITEM($0, 2)):BIGINT), 1)), 'D', =(CAST(ITEM($0, 2)):BIGINT, > 2101), 'P', >(-(CAST(ITEM($0, 1)):BIGINT, CAST(ITEM($0, 2)):BIGINT), > 1), '0', 'A')]) > 00-02Scan(groupscan=[EasyGroupScan > [selectionRoot=maprfs:/tmp/order_sample.csv, numFiles=1, > columns=[`columns`[1], `columns`[2]], > files=[maprfs:///tmp/order_sample.csv]]]) > // Details of Java compiler from sys.options > 0: jdbc:drill:schema=dfs.tmp> select name, status from sys.options where name > like '%java_compiler%'; > ++--+ > |
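For DRILL-5581 above, the expected result can be worked out by hand from the single CSV row in the report. The Python sketch below mirrors the two CASE expressions from the quoted query; the function name is hypothetical and the code only illustrates the arithmetic, not how Drill evaluates the expression.

```python
def evaluate_row(col1, col2):
    """Mirror the two CASE expressions from the quoted query."""
    # case when col1 > col2 then col1 else col2 end as temp_col
    temp_col = col1 if col1 > col2 else col2

    # case ... end as status, branches tested in the order written in the query
    if col1 == 2101 and (20170302 - col2) > 1:
        status = "D"
    elif col2 == 2101:
        status = "P"
    elif col1 - col2 > 1:
        status = "0"
    else:
        status = "A"
    return temp_col, status


# The only row of order_sample.csv shown in the report.
fields = "202634342,2101,20160301".split(",")
col1, col2 = int(fields[1]), int(fields[2])
result = evaluate_row(col1, col2)
print(result)
```

Since col1 is 2101 and 20170302 - 20160301 is 10001, the first branch applies, so the status should be 'D' rather than the 'A' shown in the reported output; temp_col matches the reported 20160301.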
[jira] [Resolved] (DRILL-6722) Query from parquet with case-then and arithmetic operation returns a wrong result
[ https://issues.apache.org/jira/browse/DRILL-6722?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Volodymyr Vysotskyi resolved DRILL-6722.

Resolution: Fixed
Assignee: Volodymyr Vysotskyi
Fix Version/s: 1.16.0

Fixed in DRILL-6524

> Query from parquet with case-then and arithmetic operation returns a wrong result
> -
>
> Key: DRILL-6722
> URL: https://issues.apache.org/jira/browse/DRILL-6722
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Codegen
> Affects Versions: 1.14.0
> Reporter: Oleg Zinoviev
> Assignee: Volodymyr Vysotskyi
> Priority: Major
> Fix For: 1.16.0
>
> Attachments: JaininoJava.class, JaininoJava2_merged.class, correct.csv, result.csv
>
>
> Steps to reproduce:
> 1) Create sample table:
> {code:sql}
> create table dfs.tmp.test as
> select 1 as a, 2 as b
> union all
> select 3 as a, 2 as b
> union all
> select 1 as a, 4 as b
> union all
> select 2 as a, 2 as b
> {code}
> 2) Execute query:
> {code:sql}
> select
> case when s.a > s.b then s.a else s.b end as b,
> abs(s.a - s.b) as d
> from dfs.tmp.test s
> {code}
> 3) Drill returns: [^result.csv]
> 4) Result of query without parquet:
> {code:sql}
> select
> case when s.a > s.b then s.a else s.b end as b,
> abs(s.a - s.b) as d
> from (
> select 1 as a, 2 as b
> union all
> select 3 as a, 2 as b
> union all
> select 1 as a, 4 as b
> union all
> select 2 as a, 2 as b
> ) s
> {code}
> [^correct.csv]
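The attachments for DRILL-6722 above are not reproduced in this message, so the values the query should produce can be recomputed from the four sample rows. The Python sketch below applies the same two expressions to each row; it illustrates the expected arithmetic only, and the variable names are hypothetical.

```python
# The four (a, b) rows inserted by the CREATE TABLE statement in step 1.
rows = [(1, 2), (3, 2), (1, 4), (2, 2)]


def project(a, b):
    """case when a > b then a else b end as b, abs(a - b) as d"""
    greater = a if a > b else b
    return greater, abs(a - b)


expected = [project(a, b) for a, b in rows]
for (a, b), (out_b, out_d) in zip(rows, expected):
    print(f"a={a} b={b} -> b={out_b} d={out_d}")
```

Row order after UNION ALL is not guaranteed, so a comparison against Drill's output should treat the results as an unordered collection.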
[jira] [Resolved] (DRILL-5683) Incorrect query result when query uses NOT(IS NOT NULL) expression
[ https://issues.apache.org/jira/browse/DRILL-5683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Volodymyr Vysotskyi resolved DRILL-5683.

Resolution: Fixed
Fix Version/s: 1.16.0

Fixed in DRILL-6524

> Incorrect query result when query uses NOT(IS NOT NULL) expression
> ---
>
> Key: DRILL-5683
> URL: https://issues.apache.org/jira/browse/DRILL-5683
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Jinfeng Ni
> Assignee: Volodymyr Vysotskyi
> Priority: Major
> Fix For: 1.16.0
>
>
> The following repro was modified from a testcase provided by Arjun Rajan(ara...@mapr.com).
> 1. Prepare dataset with null.
> {code}
> create table dfs.tmp.t1 as
> select r_regionkey, r_name, case when mod(r_regionkey, 3) > 0 then mod(r_regionkey, 3) else null end as flag
> from cp.`tpch/region.parquet`;
> select * from dfs.tmp.t1;
> +--------------+--------------+-------+
> | r_regionkey  |    r_name    | flag  |
> +--------------+--------------+-------+
> | 0            | AFRICA       | null  |
> | 1            | AMERICA      | 1     |
> | 2            | ASIA         | 2     |
> | 3            | EUROPE       | null  |
> | 4            | MIDDLE EAST  | 1     |
> +--------------+--------------+-------+
> {code}
> 2. Query with NOT(IS NOT NULL) expression in the filter.
> {code}
> select * from dfs.tmp.t1 where NOT (flag IS NOT NULL);
> +--------------+--------------+-------+
> | r_regionkey  |    r_name    | flag  |
> +--------------+--------------+-------+
> | 0            | AFRICA       | null  |
> | 3            | EUROPE       | null  |
> +--------------+--------------+-------+
> {code}
> 3. Switch run-time code compiler from default to 'JDK', and get wrong result.
> {code}
> alter system set `exec.java_compiler` = 'JDK';
> +-------+------------------------------+
> |  ok   |           summary            |
> +-------+------------------------------+
> | true  | exec.java_compiler updated.  |
> +-------+------------------------------+
> select * from dfs.tmp.t1 where NOT (flag IS NOT NULL);
> +--------------+--------------+-------+
> | r_regionkey  |    r_name    | flag  |
> +--------------+--------------+-------+
> | 0            | AFRICA       | null  |
> | 1            | AMERICA      | 1     |
> | 2            | ASIA         | 2     |
> | 3            | EUROPE       | null  |
> | 4            | MIDDLE EAST  | 1     |
> +--------------+--------------+-------+
> {code}
> 4. Wrong result could happen too, when NOT(IS NOT NULL) in Project operator.
> {code}
> select r_regionkey, r_name, NOT(flag IS NOT NULL) as exp1 from dfs.tmp.t1;
> +--------------+--------------+-------+
> | r_regionkey  |    r_name    | exp1  |
> +--------------+--------------+-------+
> | 0            | AFRICA       | true  |
> | 1            | AMERICA      | true  |
> | 2            | ASIA         | true  |
> | 3            | EUROPE       | true  |
> | 4            | MIDDLE EAST  | true  |
> +--------------+--------------+-------+
> {code}
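The report for DRILL-5683 above shows the wrong output for step 4 but not the expected one. Because IS NOT NULL always yields true or false (never null), NOT(flag IS NOT NULL) is equivalent to flag IS NULL. The Python sketch below models the five rows of dfs.tmp.t1 with None standing in for SQL null; it illustrates the expected semantics only, and the names are hypothetical.

```python
# Rows of dfs.tmp.t1 as shown in step 1; None stands in for SQL NULL.
rows = [
    (0, "AFRICA", None),
    (1, "AMERICA", 1),
    (2, "ASIA", 2),
    (3, "EUROPE", None),
    (4, "MIDDLE EAST", 1),
]


def not_is_not_null(value):
    """NOT (value IS NOT NULL); IS NOT NULL never yields NULL itself."""
    return not (value is not None)


# Step 2 and step 3: the filter should keep only rows whose flag is null.
filtered_keys = [key for key, _, flag in rows if not_is_not_null(flag)]

# Step 4: the projected exp1 column should be true only for null flags.
exp1 = [not_is_not_null(flag) for _, _, flag in rows]

print(filtered_keys)
print(exp1)
```

Under these semantics the filter keeps only regions 0 and 3 regardless of the compiler setting, and exp1 is true only for those two rows, not for all five as in the reported output.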
[jira] [Resolved] (DRILL-4093) Use CalciteSchema from Calcite master branch
[ https://issues.apache.org/jira/browse/DRILL-4093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Volodymyr Vysotskyi resolved DRILL-4093.

Resolution: Done

Done in the scope of DRILL-3993.

> Use CalciteSchema from Calcite master branch
> 
>
> Key: DRILL-4093
> URL: https://issues.apache.org/jira/browse/DRILL-4093
> Project: Apache Drill
> Issue Type: Task
> Components: Query Planning Optimization
> Reporter: Jinfeng Ni
> Assignee: Jinfeng Ni
> Priority: Major
>
> Calcite-911 (https://issues.apache.org/jira/browse/CALCITE-911) pushed some Drill specific change related to CalciteSchema code to Calcite master branch.
> Drill should pick those changes in the fork, and make adjustment in Drill's code.
> This would reduce the required effort when Drill wants to rebase the fork onto Calcite master, or wants to get rid of the fork.
[jira] [Resolved] (DRILL-4004) Fix bugs in JDK8 Tests before updating enforcer to JDK8
[ https://issues.apache.org/jira/browse/DRILL-4004?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Volodymyr Vysotskyi resolved DRILL-4004.

Resolution: Fixed
Fix Version/s: 1.13.0

Fixed in DRILL-4329

> Fix bugs in JDK8 Tests before updating enforcer to JDK8
> ---
>
> Key: DRILL-4004
> URL: https://issues.apache.org/jira/browse/DRILL-4004
> Project: Apache Drill
> Issue Type: Bug
> Reporter: Jacques Nadeau
> Priority: Major
> Fix For: 1.13.0
>
>
> The following tests fail on JDK8
> {code}
> org.apache.drill.exec.store.mongo.TestMongoFilterPushDown.testFilterPushDownIsEqual
> org.apache.drill.exec.store.mongo.TestMongoFilterPushDown.testFilterPushDownGreaterThanWithSingleField
> org.apache.drill.exec.store.mongo.TestMongoFilterPushDown.testFilterPushDownLessThanWithSingleField
> org.apache.drill.TestFrameworkTest.testRepeatedColumnMatching
> org.apache.drill.TestFrameworkTest.testCSVVerificationOfOrder_checkFailure
> org.apache.drill.exec.physical.impl.flatten.TestFlattenPlanning.testFlattenPlanningAvoidUnnecessaryProject
> org.apache.drill.exec.record.vector.TestValueVector.testFixedVectorReallocation
> org.apache.drill.exec.record.vector.TestValueVector.testVariableVectorReallocation
> {code}
[jira] [Created] (DRILL-7091) Query with EXISTS and correlated subquery fails with NPE in HashJoinMemoryCalculatorImpl$BuildSidePartitioningImpl
Volodymyr Vysotskyi created DRILL-7091: -- Summary: Query with EXISTS and correlated subquery fails with NPE in HashJoinMemoryCalculatorImpl$BuildSidePartitioningImpl Key: DRILL-7091 URL: https://issues.apache.org/jira/browse/DRILL-7091 Project: Apache Drill Issue Type: Bug Affects Versions: 1.15.0 Reporter: Volodymyr Vysotskyi Steps to reproduce: 1. Create view: {code:sql} create view dfs.tmp.nation_view as select * from cp.`tpch/nation.parquet`; {code} Run the following query: {code:sql} SELECT n_nationkey, n_name FROM dfs.tmp.nation_view a WHERE EXISTS (SELECT 1 FROM cp.`tpch/region.parquet` b WHERE b.r_regionkey = a.n_regionkey) {code} This query fails with NPE: {noformat} [Error Id: 9a592635-f792-4403-965c-bd2eece7e8fc on cv1:31010] at org.apache.drill.common.exceptions.UserException$Builder.build(UserException.java:633) ~[drill-common-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.sendFinalState(FragmentExecutor.java:364) [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.cleanup(FragmentExecutor.java:219) [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:330) [drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) [drill-common-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [na:1.8.0_161] at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [na:1.8.0_161] at java.lang.Thread.run(Thread.java:748) [na:1.8.0_161] Caused by: java.lang.NullPointerException: null at org.apache.drill.exec.physical.impl.join.HashJoinMemoryCalculatorImpl$BuildSidePartitioningImpl.initialize(HashJoinMemoryCalculatorImpl.java:267) ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at 
org.apache.drill.exec.physical.impl.join.HashJoinBatch.executeBuildPhase(HashJoinBatch.java:959) ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.join.HashJoinBatch.innerNext(HashJoinBatch.java:525) ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186) ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126) ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116) ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:141) ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186) ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126) ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.test.generated.HashAggregatorGen2.doWork(HashAggTemplate.java:642) ~[na:na] at org.apache.drill.exec.physical.impl.aggregate.HashAggBatch.innerNext(HashAggBatch.java:295) ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186) ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126) ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116) 
~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:141) ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186) ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT] at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126) ~[drill-java-exec-1.16.0-SNAPSHOT.jar:1.16.0-SNAPSHOT
[jira] [Resolved] (DRILL-5295) Unable to query INFORMATION_SCHEMA.`TABLES` if MySql storage plugin enabled
[ https://issues.apache.org/jira/browse/DRILL-5295?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Volodymyr Vysotskyi resolved DRILL-5295.

Resolution: Cannot Reproduce
Fix Version/s: 1.16.0

> Unable to query INFORMATION_SCHEMA.`TABLES` if MySql storage plugin enabled
> ---
>
> Key: DRILL-5295
> URL: https://issues.apache.org/jira/browse/DRILL-5295
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.9.0
> Environment: Drill 1.9. Error can be reproduced running Drill locally on Windows and on Linux within zookeeper. I can reproduce it with 2 mysql servers.
> Reporter: Martina Ponca
> Priority: Major
> Fix For: 1.16.0
>
>
> Impact: Unable to connect from Qlik Sense to Drill because of MySql Storage Plugin enabled.
> Steps to repro:
> 1. Create a new storage plugin to MySql Community Edition 5.5.43. Enable it.
> 2. Run query: "select * from INFORMATION_SCHEMA.`TABLES`"
> 3. Error:
> {code}
> Error: SYSTEM ERROR: NullPointerException: Error. Type information for table bistoremysql.information_schema.CHARACTER_SETS provided is null.
> Fragment 0:0
> [Error Id: 2717cfe1-413d-4330-ab3f-720ae92ebc50 on mycomputer.domain.lan:31010]
> (java.lang.NullPointerException) Error. Type information for table bistoremysql.information_schema.CHARACTER_SETS provided is null.
> com.google.common.base.Preconditions.checkNotNull():250
> org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator$Tables.visitTableWithType():314
> org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator$Tables.visitTables():308
> org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema():215
> org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema():208
> org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema():208
> org.apache.drill.exec.store.ischema.InfoSchemaRecordGenerator.scanSchema():195
> org.apache.drill.exec.store.ischema.InfoSchemaTableType.getRecordReader():58
> org.apache.drill.exec.store.ischema.InfoSchemaBatchCreator.getBatch():36
> org.apache.drill.exec.store.ischema.InfoSchemaBatchCreator.getBatch():30
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():148
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren():171
> org.apache.drill.exec.physical.impl.ImplCreator.getRecordBatch():128
> org.apache.drill.exec.physical.impl.ImplCreator.getChildren():171
> org.apache.drill.exec.physical.impl.ImplCreator.getRootExec():101
> org.apache.drill.exec.physical.impl.ImplCreator.getExec():79
> org.apache.drill.exec.work.fragment.FragmentExecutor.run():206
> org.apache.drill.common.SelfCleaningRunnable.run():38
> java.util.concurrent.ThreadPoolExecutor.runWorker():1145
> java.util.concurrent.ThreadPoolExecutor$Worker.run():615
> java.lang.Thread.run():745 (state=,code=0)
> {code}
> The full query Qlik Sense runs:
> {code:sql}
> select TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME, TABLE_TYPE from INFORMATION_SCHEMA.`TABLES` WHERE TABLE_CATALOG LIKE 'DRILL' ESCAPE '\' AND TABLE_SCHEMA <> 'sys' AND TABLE_SCHEMA <> 'INFORMATION_SCHEMA'ORDER BY TABLE_TYPE, TABLE_CATALOG, TABLE_SCHEMA, TABLE_NAME
> {code}
> If I disable the mysql storage plugin, I can run the query and connect from Qlik (not a workaround).
> This issue cannot be reproduced using Drill 1.5.