[jira] [Commented] (DRILL-7405) Build fails due to inaccessible apache-drill on S3 storage
[ https://issues.apache.org/jira/browse/DRILL-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16955023#comment-16955023 ]

ASF GitHub Bot commented on DRILL-7405:
---------------------------------------

sohami commented on issue #1874: DRILL-7405: Avoiding download of TPC-H data
URL: https://github.com/apache/drill/pull/1874#issuecomment-544024689

Looks like these files were packaged as a jar on the Drill classpath as example data for users to run some exploratory queries. I think putting these files in the source repo should be fine.

@vvysotskyi: I think your main concern is the unit test data files that are merged with the source files. I guess that was done to keep test execution time lower; ideally, unit tests should use the in-memory data generator instead. Maybe we should come up with a policy that dictates when it is fine to check in a test data file and when one should use the in-memory data generator. Also, how will moving the data files to a separate git repo help here?

This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Build fails due to inaccessible apache-drill on S3 storage
> ----------------------------------------------------------
>
> Key: DRILL-7405
> URL: https://issues.apache.org/jira/browse/DRILL-7405
> Project: Apache Drill
> Issue Type: Task
> Components: Tools, Build Test
> Affects Versions: 1.16.0
> Reporter: Boaz Ben-Zvi
> Assignee: Abhishek Girish
> Priority: Critical
> Fix For: 1.17.0
>
> A new clean build (e.g. after deleting the ~/.m2 local repository) would fail now due to:
> Access denied to: http://apache-drill.s3.amazonaws.com/files/sf-0.01_tpc-h_parquet_typed.tgz
> (e.g., for the test data sf-0.01_tpc-h_parquet_typed.tgz)
> A new publicly available storage place is needed, plus appropriate changes in Drill to get to these resources.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
[jira] [Created] (DRILL-7411) DRILL 1.16
Sorabh Hamirwasia created DRILL-7411:
-------------------------------------

Summary: DRILL 1.16
Key: DRILL-7411
URL: https://issues.apache.org/jira/browse/DRILL-7411
Project: Apache Drill
Issue Type: Sub-task
Reporter: Sorabh Hamirwasia

Design documents for the following features are added in this JIRA:
[jira] [Created] (DRILL-7410) Design Documents
Sorabh Hamirwasia created DRILL-7410:
-------------------------------------

Summary: Design Documents
Key: DRILL-7410
URL: https://issues.apache.org/jira/browse/DRILL-7410
Project: Apache Drill
Issue Type: Task
Reporter: Sorabh Hamirwasia

This Jira is created to track the design documents available for all the features developed in Apache Drill. It serves as an index for easy access to these documents for future reference.
[jira] [Assigned] (DRILL-7391) Wrong result when doing left outer join on CSV table
[ https://issues.apache.org/jira/browse/DRILL-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Aman Sinha reassigned DRILL-7391:
---------------------------------

Assignee: Vova Vysotskyi (was: Aman Sinha)

> Wrong result when doing left outer join on CSV table
> ----------------------------------------------------
>
> Key: DRILL-7391
> URL: https://issues.apache.org/jira/browse/DRILL-7391
> Project: Apache Drill
> Issue Type: Bug
> Components: Query Planning Optimization
> Affects Versions: 1.16.0
> Reporter: Aman Sinha
> Assignee: Vova Vysotskyi
> Priority: Major
> Fix For: 1.17.0
> Attachments: tt5.tar.gz, tt6.tar.gz
>
> The following query shows 1 row that is incorrect. For the non-null rows, both columns should have the same value. This is on CSV sample data (I will attach the files).
> {noformat}
> apache drill (dfs.tmp)> select tt5.columns[0], tt6.columns[0] from tt5 left outer join tt6 on tt5.columns[0] = tt6.columns[0];
> +---------+---------+
> | EXPR$0  | EXPR$1  |
> +---------+---------+
> | 455     | null    |
> | 455     | null    |
> | 555     | null    |
> | 1414    | 1414    |
> | 455     | null    |
> | 580     | null    |
> |         | null    |
> | 555     | null    |
> | 455     | null    |
> | 455     | null    |
> | 455     | null    |
> | 455     | null    |
> | 455     | null    |
> | 555     | null    |
> | 455     | null    |
> | 455     | null    |
> | 455     | null    |
> | 580     | null    |
> | 6767    | null    |
> | 455     | null    |
> | 555     | null    |
> | 455     | null    |
> | 555     | null    |
> | 555     | null    |
> | 555     | null    |
> | 455     | null    |
> | 555     | null    |
> | 455     | null    |
> | 455     | null    |
> | 455     | null    |
> | 6767    | null    |
> | 555     | null    |
> | 555     | null    |
> | 455     | null    |
> | 555     | null    |
> | 555     | null    |
> | 1414    | 1414    |
> | 455     | null    |
> | 555     | null    |
> | 555     | null    |
> | 455     | null    |
> | 455     | null    |
> | 555     | null    |
> | 455     | null    |
> | 555     | null    |
> | 555     | null    |
> | 455     | null    |
> | 455     | null    |
> | 9669    | 1414    | <--- Wrong result
> | 555     | null    |
> | 455     | null    |
> | 455     | null    |
> | 455     | null    |
> | 555     | null    |
> | 580     | null    |
> | 455     | null    |
> | 555     | null    |
> | 455     | null    |
> | 555     | null    |
> | 455     | null    |
> | 455     | null    |
> | 409     | null    |
> | 455     | null    |
> | 555     | null    |
> | 555     | null    |
> | 455     | null    |
> | 455     | null    |
> | 555     | null    |
> | 455     | null    |
> | 555     | null    |
> | 1414    | 1414    |
> | 455     | null    |
> | 555     | null    |
> | 555     | null    |
> | 555     | null    |
> +---------+---------+
> 75 rows selected
> {noformat}
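The invariant stated in the report (whenever the right-hand column is non-null, it must equal the left-hand column) can be checked mechanically over a result set. The following is a minimal sketch in plain Java, not Drill code; the class name `LeftJoinInvariant` and the representation of rows as two-element string arrays are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: a hypothetical helper, not part of Drill.
class LeftJoinInvariant {
    // Returns the indexes of rows whose right-hand value is non-null
    // but differs from the left-hand value (the join key).
    static List<Integer> violations(List<String[]> rows) {
        List<Integer> bad = new ArrayList<>();
        for (int i = 0; i < rows.size(); i++) {
            String left = rows.get(i)[0];
            String right = rows.get(i)[1];
            if (right != null && !right.equals(left)) {
                bad.add(i);
            }
        }
        return bad;
    }

    public static void main(String[] args) {
        // Three rows taken from the output in the report.
        List<String[]> rows = Arrays.asList(
            new String[] {"455", null},
            new String[] {"1414", "1414"},
            new String[] {"9669", "1414"});
        System.out.println(violations(rows)); // prints [2]
    }
}
```

Unmatched rows (null on the right) are legal for a left outer join, so only the `9669 | 1414` row is flagged.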
[jira] [Commented] (DRILL-7391) Wrong result when doing left outer join on CSV table
[ https://issues.apache.org/jira/browse/DRILL-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954921#comment-16954921 ]

Aman Sinha commented on DRILL-7391:
-----------------------------------

[~volodymyr], I assume this fix will go into 1.17 since it is a wrong-result bug. Since I will be offline for a few days, I am assigning this to you to update the Calcite version in pom.xml once CALCITE-3390 is merged, and to add the following unit test to TestExampleQueries.java:
{noformat}
@Test // DRILL-7391
public void testItemPushdownPastLeftOuterJoin() throws Exception {
  String query = "select t1.columns[0] as a, t2.columns[0] as b from cp.`store/text/data/regions.csv` t1 "
      + " left outer join cp.`store/text/data/regions.csv` t2 on t1.columns[0] = t2.columns[0]";
  PlanTestBase.testPlanMatchingPatterns(query,
      // no patterns are required to appear in the plan
      new String[] {},
      // exclude pattern where Project is projecting the 'columns' field
      new String[] {"Project.*columns"});
}
{noformat}
[jira] [Commented] (DRILL-7405) Build fails due to inaccessible apache-drill on S3 storage
[ https://issues.apache.org/jira/browse/DRILL-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954898#comment-16954898 ]

ASF GitHub Bot commented on DRILL-7405:
---------------------------------------

Agirish commented on issue #1874: DRILL-7405: Avoiding download of TPC-H data
URL: https://github.com/apache/drill/pull/1874#issuecomment-543879126

@vvysotskyi, I understand, but I am not in favor of using personal GitHub repositories. I don't think it's the best approach for an Apache project. The files are small enough (~3 MB); I'm not sure I agree that this counts as large. I think what we have here would be more straightforward.
[jira] [Commented] (DRILL-7391) Wrong result when doing left outer join on CSV table
[ https://issues.apache.org/jira/browse/DRILL-7391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954729#comment-16954729 ]

Aman Sinha commented on DRILL-7391:
-----------------------------------

Adding link to CALCITE-3390 on which this fix depends.
[jira] [Commented] (DRILL-7017) lz4 codec for (un)compression
[ https://issues.apache.org/jira/browse/DRILL-7017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954676#comment-16954676 ]

Arina Ielchiieva commented on DRILL-7017:
-----------------------------------------

To support lz4, the native lz4 library should be accessible (i.e. included in the Drill classpath). A similar issue was discussed in another project: https://issues.apache.org/jira/browse/KYLIN-3201.

> lz4 codec for (un)compression
> -----------------------------
>
> Key: DRILL-7017
> URL: https://issues.apache.org/jira/browse/DRILL-7017
> Project: Apache Drill
> Issue Type: Wish
> Components: Storage - Text CSV
> Affects Versions: 1.15.0
> Reporter: benj
> Priority: Major
>
> I couldn't find in the documentation which compression formats are supported.
> But since it's possible to use Drill on a compressed file, like
> {code:java}
> SELECT * FROM tmp.`myfile.csv.gz`;
> {code}
> it would be useful to be able to use this functionality for lz4 files ([https://github.com/lz4/lz4]).
[jira] [Updated] (DRILL-7401) Sqlline 1.9 upgrade
[ https://issues.apache.org/jira/browse/DRILL-7401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Vova Vysotskyi updated DRILL-7401:
----------------------------------

Labels: ready-to-commit (was: )

> Sqlline 1.9 upgrade
> -------------------
>
> Key: DRILL-7401
> URL: https://issues.apache.org/jira/browse/DRILL-7401
> Project: Apache Drill
> Issue Type: Task
> Reporter: Arina Ielchiieva
> Assignee: Arina Ielchiieva
> Priority: Major
> Labels: ready-to-commit
> Fix For: 1.17.0
>
> Upgrade to SqlLine 1.9 once it is released (https://github.com/julianhyde/sqlline/issues/350).
> *TODO:*
> 1. Add SqlLine properties:
> {{connectInteractionMode: useNPTogetherOrEmpty}} - supports the connection mechanism used in SqlLine 1.17 and earlier:
> a. if user and password are not indicated, connects without them (user and password are set to empty string): {{./drill-embedded}}
> b. if user is indicated, asks for password in interactive mode: {{./drill-embedded -n "user1"}}
> c. if user is indicated as empty string, behaves as in point a (user and password are set to empty string): {{./drill-embedded -n ""}}
> d. if user and password are indicated, connects using the provided input: {{./drill-embedded -n "user1" -p "123"}}
> {{showLineNumbers: true}} - adds line numbers when a query is more than one line:
> {noformat}
> apache drill> select
> 2..semicolon> *
> 3..semicolon> from
> 4..semicolon> sys.version;
> {noformat}
> 2. Remove the nohup support code from sqlline.sh since it is no longer needed (nohup support works without the flag):
> {code}
> # To add nohup support for SQLline script
> if [[ ( ! $(ps -o stat= -p $$) =~ "+" ) && ! ( -p /dev/stdin ) ]]; then
>    export SQLLINE_JAVA_OPTS="$SQLLINE_JAVA_OPTS -Djline.terminal=jline.UnsupportedTerminal"
> fi
> {code}
> 3. Add {{-Dorg.jline.terminal.dumb=true}} to avoid the JLine terminal warning when submitting a query in sqlline.sh to execute via {{-e}} or {{-f}}:
> {noformat}
> Oct 11, 2019 2:14:45 PM org.jline.utils.Log logr
> WARNING: Unable to create a system terminal, creating a dumb terminal (enable debug logging for more information)
> {noformat}
> 4. Remove unneeded echo commands in sqlline.bat during start up:
> {noformat}
> drill-embedded.bat
> DRILL_ARGS - " -u jdbc:drill:zk=local -n user1 -p ppp"
> Calculating HADOOP_CLASSPATH ...
> HBASE_HOME not detected...
> Calculating Drill classpath...
> Apache Drill 1.17.0-SNAPSHOT
> "Data is the new oil. Ready to Drill some?"
> apache drill>
> {noformat}
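The four connection cases (a to d) listed under `connectInteractionMode: useNPTogetherOrEmpty` amount to a small decision table. The sketch below restates them in plain Java; it is not SqlLine's implementation, and the class, enum, and method names are hypothetical. The combination "no user but a password given" is not covered by the issue text, so treating it like case a is an assumption.

```java
// Illustrative only: hypothetical names, not SqlLine or Drill code.
class ConnectDecision {
    enum Action {
        CONNECT_WITH_EMPTY_CREDENTIALS, // cases a and c
        PROMPT_FOR_PASSWORD,            // case b
        CONNECT_WITH_PROVIDED           // case d
    }

    // user and password are null when the -n / -p options are absent.
    static Action decide(String user, String password) {
        if (user == null || user.isEmpty()) {
            // a: nothing indicated; c: user given as empty string.
            // Assumption: a password without a user is handled the same way.
            return Action.CONNECT_WITH_EMPTY_CREDENTIALS;
        }
        if (password == null) {
            // b: user indicated, password requested interactively.
            return Action.PROMPT_FOR_PASSWORD;
        }
        // d: both indicated.
        return Action.CONNECT_WITH_PROVIDED;
    }

    public static void main(String[] args) {
        System.out.println(decide(null, null));      // ./drill-embedded
        System.out.println(decide("user1", null));   // ./drill-embedded -n "user1"
        System.out.println(decide("", null));        // ./drill-embedded -n ""
        System.out.println(decide("user1", "123"));  // ./drill-embedded -n "user1" -p "123"
    }
}
```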
[jira] [Commented] (DRILL-7401) Sqlline 1.9 upgrade
[ https://issues.apache.org/jira/browse/DRILL-7401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954574#comment-16954574 ]

ASF GitHub Bot commented on DRILL-7401:
---------------------------------------

arina-ielchiieva commented on pull request #1875: DRILL-7401: Upgrade to SqlLine 1.9.0
URL: https://github.com/apache/drill/pull/1875

Jira - [DRILL-7401](https://issues.apache.org/jira/browse/DRILL-7401).
[jira] [Commented] (DRILL-7401) Sqlline 1.9 upgrade
[ https://issues.apache.org/jira/browse/DRILL-7401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954575#comment-16954575 ]

ASF GitHub Bot commented on DRILL-7401:
---------------------------------------

arina-ielchiieva commented on issue #1875: DRILL-7401: Upgrade to SqlLine 1.9.0
URL: https://github.com/apache/drill/pull/1875#issuecomment-543730392

@vvysotskyi please review.
[jira] [Assigned] (DRILL-7405) Build fails due to inaccessible apache-drill on S3 storage
[ https://issues.apache.org/jira/browse/DRILL-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arina Ielchiieva reassigned DRILL-7405:
---------------------------------------

Assignee: Abhishek Girish (was: Arina Ielchiieva)
[jira] [Updated] (DRILL-7405) Build fails due to inaccessible apache-drill on S3 storage
[ https://issues.apache.org/jira/browse/DRILL-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arina Ielchiieva updated DRILL-7405:
------------------------------------

Issue Type: Task (was: Bug)
[jira] [Assigned] (DRILL-7405) Build fails due to inaccessible apache-drill on S3 storage
[ https://issues.apache.org/jira/browse/DRILL-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arina Ielchiieva reassigned DRILL-7405:
---------------------------------------

Assignee: Arina Ielchiieva (was: Abhishek Girish)
[jira] [Updated] (DRILL-7403) Validate batch checks, vector integretity in unit tests
[ https://issues.apache.org/jira/browse/DRILL-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arina Ielchiieva updated DRILL-7403:
------------------------------------

Labels: ready-to-commit (was: )

> Validate batch checks, vector integretity in unit tests
> -------------------------------------------------------
>
> Key: DRILL-7403
> URL: https://issues.apache.org/jira/browse/DRILL-7403
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.16.0, 1.17.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Minor
> Labels: ready-to-commit
> Fix For: 1.17.0
>
> Drill provides a {{BatchValidator}} that checks vectors. It is disabled by default. This enhancement adds more checks, including checks for row counts (of which there are surprisingly many).
> Since most operators will fail if the check is enabled, this enhancement also adds a table to keep track of which operators pass the checks (and for which checks should be enabled) and those that still need work. This allows the checks to exist in the code, and to be enabled incrementally as we fix the various problems.
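The "table of operators" idea described above can be pictured as a lookup from operator to the level of checking it is known to pass, with everything else defaulting to no checks. This is only a sketch of the idea under that assumption, not the code in the pull request: the class `ValidationTable`, the `Level` enum, and the operator names are all hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: hypothetical names, not Drill's BatchValidator.
class ValidationTable {
    // How much validation an operator's output is currently known to pass.
    enum Level { NONE, ROW_COUNTS, ALL }

    private static final Map<String, Level> LEVELS = new HashMap<>();
    static {
        // Operators are added here one by one as their problems are fixed.
        LEVELS.put("ExampleScanOperator", Level.ALL);
        LEVELS.put("ExampleFilterOperator", Level.ROW_COUNTS);
    }

    // Operators not listed in the table are not validated at all.
    static Level levelFor(String operatorName) {
        return LEVELS.getOrDefault(operatorName, Level.NONE);
    }

    public static void main(String[] args) {
        System.out.println(levelFor("ExampleScanOperator")); // prints ALL
        System.out.println(levelFor("ExampleSortOperator")); // prints NONE
    }
}
```

Defaulting to `NONE` is what lets the checks live in the code while being switched on incrementally.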
[jira] [Commented] (DRILL-7403) Validate batch checks, vector integretity in unit tests
[ https://issues.apache.org/jira/browse/DRILL-7403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954521#comment-16954521 ]

ASF GitHub Bot commented on DRILL-7403:
---------------------------------------

arina-ielchiieva commented on issue #1871: DRILL-7403: Validate batch checks, vector integretity in unit tests
URL: https://github.com/apache/drill/pull/1871#issuecomment-543689950

LGTM, +1
[jira] [Updated] (DRILL-7402) Suppress batch dumps for expected failures in tests
[ https://issues.apache.org/jira/browse/DRILL-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arina Ielchiieva updated DRILL-7402:
------------------------------------

Labels: ready-to-commit (was: )

> Suppress batch dumps for expected failures in tests
> ---------------------------------------------------
>
> Key: DRILL-7402
> URL: https://issues.apache.org/jira/browse/DRILL-7402
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.16.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Minor
> Labels: ready-to-commit
> Fix For: 1.17.0
>
> Drill provides a way to dump the last few batches when an error occurs. However, in tests, we often deliberately cause something to fail. In this case, the batch dump is unnecessary.
> This enhancement adds a config property, disabled in tests, that controls the dump activity.
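A config-gated dump of this kind might look roughly like the sketch below. It uses `java.util.Properties` as a stand-in for Drill's configuration, and the property key, class, and method names are hypothetical; the actual property name is not given in the issue text.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Properties;

// Illustrative only: hypothetical names, not Drill's implementation.
class BatchDumpGuard {
    // Hypothetical key; dumping is assumed to be on unless explicitly disabled.
    static final String DUMP_ENABLED_KEY = "example.dump_batches_on_error";

    static boolean dumpEnabled(Properties config) {
        return Boolean.parseBoolean(config.getProperty(DUMP_ENABLED_KEY, "true"));
    }

    // Builds the error report, appending the recent batches only when enabled.
    static String describeFailure(Properties config, String error, List<String> lastBatches) {
        StringBuilder report = new StringBuilder("ERROR: ").append(error);
        if (dumpEnabled(config)) {
            for (String batch : lastBatches) {
                report.append("\n  batch: ").append(batch);
            }
        }
        return report.toString();
    }

    public static void main(String[] args) {
        Properties testConfig = new Properties();
        testConfig.setProperty(DUMP_ENABLED_KEY, "false"); // as a test setup would do
        System.out.println(describeFailure(testConfig, "expected failure", Arrays.asList("b1", "b2")));
    }
}
```

With the property set to `false`, a deliberately triggered failure reports only the error line and skips the batch dump.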
[jira] [Commented] (DRILL-7402) Suppress batch dumps for expected failures in tests
[ https://issues.apache.org/jira/browse/DRILL-7402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16954517#comment-16954517 ]

ASF GitHub Bot commented on DRILL-7402:
---------------------------------------

arina-ielchiieva commented on issue #1872: DRILL-7402: Suppress batch dumps for expected failures in tests
URL: https://github.com/apache/drill/pull/1872#issuecomment-543688813

+1, @paul-rogers thanks for making the changes.
[jira] [Updated] (DRILL-7409) Remove bigIntDictionary.parquet from project sources
[ https://issues.apache.org/jira/browse/DRILL-7409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arina Ielchiieva updated DRILL-7409:
------------------------------------

Fix Version/s: 1.17.0

> Remove bigIntDictionary.parquet from project sources
> ----------------------------------------------------
>
> Key: DRILL-7409
> URL: https://issues.apache.org/jira/browse/DRILL-7409
> Project: Apache Drill
> Issue Type: Task
> Components: Tools, Build Test
> Reporter: Vova Vysotskyi
> Assignee: Denys Ordynskiy
> Priority: Minor
> Fix For: 1.17.0
>
> The {{bigIntDictionary.parquet}} file has a size of 1.8M, but it is used in a single unit test, {{TestColumnReaderFactory.testBigIntWithDictionary}}. We should either move this test to a test-framework or recreate a smaller file that will still allow us to verify this case.
[jira] [Updated] (DRILL-7409) Remove bigIntDictionary.parquet from project sources
[ https://issues.apache.org/jira/browse/DRILL-7409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arina Ielchiieva updated DRILL-7409:
    Affects Version/s: 1.16.0

> Remove bigIntDictionary.parquet from project sources
>
> Key: DRILL-7409
> URL: https://issues.apache.org/jira/browse/DRILL-7409
> Project: Apache Drill
> Issue Type: Task
> Components: Tools, Build & Test
> Affects Versions: 1.16.0
> Reporter: Vova Vysotskyi
> Assignee: Denys Ordynskiy
> Priority: Minor
> Fix For: 1.17.0
>
> The {{bigIntDictionary.parquet}} file is 1.8 MB in size, but it is used in
> only a single unit test, {{TestColumnReaderFactory.testBigIntWithDictionary}}.
> We should either move this test to the test framework or recreate a smaller
> file that still allows us to verify this case.
[jira] [Assigned] (DRILL-7409) Remove bigIntDictionary.parquet from project sources
[ https://issues.apache.org/jira/browse/DRILL-7409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Denys Ordynskiy reassigned DRILL-7409:
    Assignee: Denys Ordynskiy

> Remove bigIntDictionary.parquet from project sources
>
> Key: DRILL-7409
> URL: https://issues.apache.org/jira/browse/DRILL-7409
> Project: Apache Drill
> Issue Type: Task
> Components: Tools, Build & Test
> Reporter: Vova Vysotskyi
> Assignee: Denys Ordynskiy
> Priority: Minor
>
> The {{bigIntDictionary.parquet}} file is 1.8 MB in size, but it is used in
> only a single unit test, {{TestColumnReaderFactory.testBigIntWithDictionary}}.
> We should either move this test to the test framework or recreate a smaller
> file that still allows us to verify this case.
[jira] [Created] (DRILL-7409) Remove bigIntDictionary.parquet from project sources
Vova Vysotskyi created DRILL-7409:

Summary: Remove bigIntDictionary.parquet from project sources
Key: DRILL-7409
URL: https://issues.apache.org/jira/browse/DRILL-7409
Project: Apache Drill
Issue Type: Task
Components: Tools, Build & Test
Reporter: Vova Vysotskyi

The {{bigIntDictionary.parquet}} file is 1.8 MB in size, but it is used in only a single unit test, {{TestColumnReaderFactory.testBigIntWithDictionary}}. We should either move this test to the test framework or recreate a smaller file that still allows us to verify this case.
[jira] [Commented] (DRILL-7405) Build fails due to inaccessible apache-drill on S3 storage
[ https://issues.apache.org/jira/browse/DRILL-7405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16954441#comment-16954441 ]

ASF GitHub Bot commented on DRILL-7405:

vvysotskyi commented on issue #1874: DRILL-7405: Avoiding download of TPC-H data
URL: https://github.com/apache/drill/pull/1874#issuecomment-543633312

I disagree that we should add such large files to the project sources. Some time ago I experimented with alternative solutions to this problem. One idea was to create a new GitHub repository with these files, use JitPack to publish the archive, and use `maven-dependency-plugin` in Drill to obtain and unpack the files when the project is built.

Here is a link to the repo with the files: https://github.com/vvysotskyi/tpch-parquet and the commit with the changes in Drill: https://github.com/vvysotskyi/drill/commit/0635133bbd22945e7648791cfd6e2d146730b219.

What do you think about this? You may use these changes if we choose this approach.

> Build fails due to inaccessible apache-drill on S3 storage
>
> Key: DRILL-7405
> URL: https://issues.apache.org/jira/browse/DRILL-7405
> Project: Apache Drill
> Issue Type: Bug
> Components: Tools, Build & Test
> Affects Versions: 1.16.0
> Reporter: Boaz Ben-Zvi
> Assignee: Abhishek Girish
> Priority: Critical
> Fix For: 1.17.0
>
> A new clean build (e.g., after deleting the ~/.m2 local repository) now
> fails due to:
> Access denied to:
> [http://apache-drill.s3.amazonaws.com|http://apache-drill.s3.amazonaws.com/files/sf-0.01_tpc-h_parquet_typed.tgz]
> (e.g., for the test data sf-0.01_tpc-h_parquet_typed.tgz)
> A new publicly available storage location is needed, plus appropriate changes
> in Drill to reach these resources.