[jira] [Commented] (DRILL-7310) Move schema-related classes from exec module to be able to use them in metastore module
[ https://issues.apache.org/jira/browse/DRILL-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874450#comment-16874450 ]

ASF GitHub Bot commented on DRILL-7310:
---

vvysotskyi commented on pull request #1816: DRILL-7310: Move schema-related classes from exec module to be able to use them in metastore module
URL: https://github.com/apache/drill/pull/1816

For details, please refer to [DRILL-7310](https://issues.apache.org/jira/browse/DRILL-7310).

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at: us...@infra.apache.org

> Move schema-related classes from exec module to be able to use them in metastore module
> ---
>
> Key: DRILL-7310
> URL: https://issues.apache.org/jira/browse/DRILL-7310
> Project: Apache Drill
> Issue Type: Sub-task
> Reporter: Volodymyr Vysotskyi
> Assignee: Volodymyr Vysotskyi
> Priority: Major
> Fix For: 1.17.0
>
> Currently, most of the schema-related classes are placed in the {{exec}} module, but some of them should be used in the {{metastore}} module.
> The {{metastore}} module doesn't have a dependency on the {{exec}} one.
> The solution is to move these classes from {{exec}} into another module which is used by {{metastore}}, so they will be accessible to {{metastore}}.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Created] (DRILL-7310) Move schema-related classes from exec module to be able to use them in metastore module
Volodymyr Vysotskyi created DRILL-7310:
--

Summary: Move schema-related classes from exec module to be able to use them in metastore module
Key: DRILL-7310
URL: https://issues.apache.org/jira/browse/DRILL-7310
Project: Apache Drill
Issue Type: Sub-task
Reporter: Volodymyr Vysotskyi
Assignee: Volodymyr Vysotskyi
Fix For: 1.17.0

Currently, most of the schema-related classes are placed in the {{exec}} module, but some of them should be used in the {{metastore}} module.
The {{metastore}} module doesn't have a dependency on the {{exec}} one.
The solution is to move these classes from {{exec}} into another module which is used by {{metastore}}, so they will be accessible to {{metastore}}.
[jira] [Commented] (DRILL-5983) Unsupported nullable converted type INT_8 for primitive type INT32 error
[ https://issues.apache.org/jira/browse/DRILL-5983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874263#comment-16874263 ]

Stuart Teasdale commented on DRILL-5983:

I'm seeing basically the same issue with Drill 1.16.0. I've attached the file that does this

> select * from test;
Error: INTERNAL_ERROR ERROR: Error in parquet record reader.
Message: Failure in setting up reader
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message schema {
  optional int32 chrom_int (INT_8);
  optional int64 __index_level_0__;
}
, metadata: {pandas={"index_columns": ["__index_level_0__"], "column_indexes": [{"name": null, "field_name": null, "pandas_type": "unicode", "numpy_type": "object", "metadata": {"encoding": "UTF-8"}}], "columns": [{"name": "chrom_int", "field_name": "chrom_int", "pandas_type": "int8", "numpy_type": "int8", "metadata": null}, {"name": null, "field_name": "__index_level_0__", "pandas_type": "int64", "numpy_type": "int64", "metadata": null}], "pandas_version": "0.24.2"}}}, blocks: [BlockMetaData{1, 140 [ColumnMetaData{SNAPPY [chrom_int] optional int32 chrom_int (INT_8) [PLAIN_DICTIONARY, RLE, PLAIN], 24}, ColumnMetaData{SNAPPY [__index_level_0__] optional int64 __index_level_0__ [PLAIN_DICTIONARY, RLE, PLAIN], 158}]}]}

Fragment 0:0

Please, refer to logs for more information.

[Error Id: c3e0e2ea-0e51-4732-96af-88ffe669b22c on mephistopheles.londc.genomicsplc.com:31010] (state=,code=0)

and from the logs:

org.apache.drill.common.exceptions.DrillRuntimeException: Error in parquet record reader.
Message: Failure in setting up reader
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message schema {
  optional int32 chrom_int (INT_8);
  optional int64 __index_level_0__;
}
, metadata: {pandas={"index_columns": ["__index_level_0__"], "column_indexes": [{"name": null, "field_name": null, "pandas_type": "unicode", "numpy_type": "object", "metadata": {"encoding": "UTF-8"}}], "columns": [{"name": "chrom_int", "field_name": "chrom_int", "pandas_type": "int8", "numpy_type": "int8", "metadata": null}, {"name": null, "field_name": "__index_level_0__", "pandas_type": "int64", "numpy_type": "int64", "metadata": null}], "pandas_version": "0.24.2"}}}, blocks: [BlockMetaData{1, 140 [ColumnMetaData{SNAPPY [chrom_int] optional int32 chrom_int (INT_8) [PLAIN_DICTIONARY, RLE, PLAIN], 24}, ColumnMetaData{SNAPPY [__index_level_0__] optional int64 __index_level_0__ [PLAIN_DICTIONARY, RLE, PLAIN], 158}]}]}
    at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.handleException(ParquetRecordReader.java:269) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.setup(ParquetRecordReader.java:253) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas(ScanBatch.java:321) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.physical.impl.ScanBatch.internalNext(ScanBatch.java:216) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:271) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:141) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:83) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:296) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:283) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at ...(:0) ~[na:na]
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746) ~[hadoop-common-2.7.4.jar:na]
    at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:283) ~[drill-java-exec-1.16.0.jar:1.16.0]
    at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) ~[drill-common-1.16.0.jar:1.16.0]
    at ...(:0) ~[na:na]
Caused by:
[jira] [Updated] (DRILL-5983) Unsupported nullable converted type INT_8 for primitive type INT32 error
[ https://issues.apache.org/jira/browse/DRILL-5983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Stuart Teasdale updated DRILL-5983:
---
Attachment: test.parquet

> Unsupported nullable converted type INT_8 for primitive type INT32 error
>
> Key: DRILL-5983
> URL: https://issues.apache.org/jira/browse/DRILL-5983
> Project: Apache Drill
> Issue Type: Bug
> Components: Execution - Data Types
> Affects Versions: 1.10.0, 1.11.0
> Environment: NAME="Ubuntu"
> VERSION="16.04.2 LTS (Xenial Xerus)"
> Reporter: Hakan Sarıbıyık
> Priority: Major
> Labels: parquet, read, types
> Attachments: test.parquet
>
> When I query a table with a byte column in it, it gives an error:
> _Query Failed: An Error Occurred
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: ExecutionSetupException: Unsupported nullable converted type INT_8 for primitive type INT32 Fragment 1:6 [Error Id: 46636b05-cff5-455b-ba25-527217346b3e on bigdata7:31010]_
> Actually, this was supposed to have been solved by
> [DRILL-4764] - Parquet file with INT_16, etc. logical types not supported by simple SELECT
> according to https://drill.apache.org/docs/apache-drill-1-10-0-release-notes/
> But I tried it even with 1.11.0 and it didn't work.
> I am querying a Parquet-formatted file with pySpark:
> tablo1
> sourceid: byte (nullable = true)
> select sourceid from tablo1
> works as expected with pySpark, but not with Drill v1.11.0.
> Thanks.
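For readers unfamiliar with the types named in the error: in the Parquet format, INT_8 is a logical annotation on the INT32 physical type, so the reader receives 32-bit values that are expected to fit in a signed byte. The sketch below only illustrates that narrowing step for a nullable (optional) column. It is not Drill's column reader code, and the class and method names are hypothetical.

```java
// Illustrative only: narrowing a Parquet INT32 physical value that carries
// the INT_8 annotation. Names here are hypothetical, not Drill APIs.
final class Int8Annotation {

  /** Converts a required INT32 value annotated as INT_8 to a Java byte. */
  static byte int32ToInt8(int physical) {
    // A well-formed file keeps INT_8 values inside the signed 8-bit range;
    // reject anything else instead of silently truncating.
    if (physical < Byte.MIN_VALUE || physical > Byte.MAX_VALUE) {
      throw new IllegalArgumentException("value " + physical + " does not fit INT_8");
    }
    return (byte) physical;
  }

  /** Nullable ("optional") variant: an absent value stays null. */
  static Byte nullableInt32ToInt8(Integer physical) {
    return physical == null ? null : int32ToInt8(physical);
  }
}
```

The nullable variant is the case the error message refers to: the column in the attached file is declared `optional int32 chrom_int (INT_8)`.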
[jira] [Commented] (DRILL-6711) Use jitpack repository for Drill Calcite project artifacts instead of repository.mapr.com
[ https://issues.apache.org/jira/browse/DRILL-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874031#comment-16874031 ]

ASF GitHub Bot commented on DRILL-6711:
---

vvysotskyi commented on issue #1815: DRILL-6711: Use jitpack repository for Drill Calcite project artifacts instead of repository.mapr.com
URL: https://github.com/apache/drill/pull/1815#issuecomment-506312844

@arina-ielchiieva, thanks for the review, commits are squashed.

> Use jitpack repository for Drill Calcite project artifacts instead of repository.mapr.com
> -
>
> Key: DRILL-6711
> URL: https://issues.apache.org/jira/browse/DRILL-6711
> Project: Apache Drill
> Issue Type: Task
> Reporter: Arina Ielchiieva
> Assignee: Volodymyr Vysotskyi
> Priority: Major
> Labels: ready-to-commit
> Fix For: 1.17.0
>
> Simplify deployment of Drill Calcite project artifacts by using [https://jitpack.io/].
[jira] [Updated] (DRILL-6711) Use jitpack repository for Drill Calcite project artifacts instead of repository.mapr.com
[ https://issues.apache.org/jira/browse/DRILL-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arina Ielchiieva updated DRILL-6711:
Labels: ready-to-commit (was: )
[jira] [Commented] (DRILL-6711) Use jitpack repository for Drill Calcite project artifacts instead of repository.mapr.com
[ https://issues.apache.org/jira/browse/DRILL-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16873991#comment-16873991 ]

ASF GitHub Bot commented on DRILL-6711:
---

arina-ielchiieva commented on issue #1815: DRILL-6711: Use jitpack repository for Drill Calcite project artifacts instead of repository.mapr.com
URL: https://github.com/apache/drill/pull/1815#issuecomment-506290215

+1, please squash the commits.
[jira] [Commented] (DRILL-7293) Convert the regex ("log") plugin to use EVF
[ https://issues.apache.org/jira/browse/DRILL-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16873852#comment-16873852 ]

ASF GitHub Bot commented on DRILL-7293:
---

paul-rogers commented on issue #1807: DRILL-7293: Convert the regex ("log") plugin to use EVF
URL: https://github.com/apache/drill/pull/1807#issuecomment-506209593

@arina-ielchiieva, added a unit test to show that the schema-only table function works.

Tried to create a test that combined a "plugin" table function with the "schema" attribute. This failed due to the unfortunate use of "schema" as a plugin property name. You've pointed out this issue all along; I finally understand why it is a problem. Still, I'm reluctant to change the config property name for fear of breaking compatibility.

As it turns out, this limitation is only a minor nuisance, since the only reason to combine the two kinds of table functions is to specify the regex property. A unit test shows that the regex can be specified as a table property instead.

Also, went ahead and added support for the `columns` column. If no schema is provided (not in the plugin config, not in a table function, not in a provided schema), then rather than creating a set of dummy fields `field_0`, `field_1`, etc., the plugin now follows the text format plugin and puts the fields into the `columns` array. The dummy fields are still used if the user specifies at least one column schema, but the regex has more groups than specified columns.

This means that, if the user uses a table function to specify just the regex, the user gets a reasonable result: the fields come back in the `columns` array.

Unit tests show the new `columns` array support.

> Convert the regex ("log") plugin to use EVF
> ---
>
> Key: DRILL-7293
> URL: https://issues.apache.org/jira/browse/DRILL-7293
> Project: Apache Drill
> Issue Type: Improvement
> Affects Versions: 1.16.0
> Reporter: Paul Rogers
> Assignee: Paul Rogers
> Priority: Major
> Fix For: 1.17.0
>
> The "log" plugin (which uses a regex to define the row format) is the subject of Chapter 12 of the Learning Apache Drill book (though the version in the book is simpler than the one in the master branch.)
> The recently-completed "Enhanced Vector Framework" (EVF, AKA the "row set framework") gives Drill control over the size of batches created by readers, and allows readers to use the recently-added provided schema mechanism.
> We wish to use the log reader as an example for how to convert a Drill format plugin to use the EVF so that other developers can convert their own plugins.
> This PR provides the first set of log plugin changes to enable us to publish a tutorial on the EVF.
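The column-naming rule described in the comment above can be sketched in plain Java using only `java.util.regex`. This is an illustration of the rule as stated, not the plugin's actual code. The class name, method name, and row representation are hypothetical, and numbering dummy fields by their zero-based group position is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of the naming rule; not Drill's log plugin implementation.
final class LogColumnsSketch {

  /**
   * Maps one log line to a row.
   *
   * With no schema names at all, every regex group goes into a single
   * "columns" list, as the text format plugin does. With at least one schema
   * name, groups take the schema names in order and any extra groups get
   * dummy names of the form field_N (N = zero-based group position, an
   * assumption of this sketch). Schema names beyond the number of regex
   * groups are ignored here. Returns null when the line does not match.
   */
  static Map<String, Object> toRow(Pattern pattern, List<String> schemaNames, String line) {
    Matcher m = pattern.matcher(line);
    if (!m.matches()) {
      return null;
    }
    Map<String, Object> row = new LinkedHashMap<>();
    int groups = m.groupCount();
    if (schemaNames.isEmpty()) {
      List<String> columns = new ArrayList<>();
      for (int i = 1; i <= groups; i++) {
        columns.add(m.group(i)); // regex groups are 1-based
      }
      row.put("columns", columns);
      return row;
    }
    for (int i = 0; i < groups; i++) {
      String name = i < schemaNames.size() ? schemaNames.get(i) : "field_" + i;
      row.put(name, m.group(i + 1));
    }
    return row;
  }
}
```

With a three-group regex and an empty schema list, the row holds one `columns` entry with three values. With a one-name schema, the first group takes that name and the remaining two fall back to dummy field names.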