[jira] [Commented] (DRILL-7310) Move schema-related classes from exec module to be able to use them in metastore module

2019-06-27 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7310?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874450#comment-16874450
 ] 

ASF GitHub Bot commented on DRILL-7310:
---

vvysotskyi commented on pull request #1816: DRILL-7310: Move schema-related 
classes from exec module to be able to use them in metastore module
URL: https://github.com/apache/drill/pull/1816
 
 
   For details, please refer to 
[DRILL-7310](https://issues.apache.org/jira/browse/DRILL-7310).
 

This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


> Move schema-related classes from exec module to be able to use them in 
> metastore module
> ---
>
> Key: DRILL-7310
> URL: https://issues.apache.org/jira/browse/DRILL-7310
> Project: Apache Drill
>  Issue Type: Sub-task
>Reporter: Volodymyr Vysotskyi
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.17.0
>
>
> Currently, most of the schema-related classes are placed in the {{exec}} 
> module, but some of them need to be used in the {{metastore}} module. 
> The {{metastore}} module does not depend on the {{exec}} module.
> The solution is to move these classes from {{exec}} into another module that 
> {{metastore}} depends on, so that they are accessible to {{metastore}}.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Created] (DRILL-7310) Move schema-related classes from exec module to be able to use them in metastore module

2019-06-27 Thread Volodymyr Vysotskyi (JIRA)
Volodymyr Vysotskyi created DRILL-7310:
--

 Summary: Move schema-related classes from exec module to be able 
to use them in metastore module
 Key: DRILL-7310
 URL: https://issues.apache.org/jira/browse/DRILL-7310
 Project: Apache Drill
  Issue Type: Sub-task
Reporter: Volodymyr Vysotskyi
Assignee: Volodymyr Vysotskyi
 Fix For: 1.17.0


Currently, most of the schema-related classes are placed in the {{exec}} 
module, but some of them need to be used in the {{metastore}} module. The 
{{metastore}} module does not depend on the {{exec}} module.

The solution is to move these classes from {{exec}} into another module that 
{{metastore}} depends on, so that they are accessible to {{metastore}}.





[jira] [Commented] (DRILL-5983) Unsupported nullable converted type INT_8 for primitive type INT32 error

2019-06-27 Thread Stuart Teasdale (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-5983?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874263#comment-16874263
 ] 

Stuart Teasdale commented on DRILL-5983:


I'm seeing basically the same issue with Drill 1.16.0. I've attached the file 
that triggers it.

> select * from test;
Error: INTERNAL_ERROR ERROR: Error in parquet record reader.
Message: Failure in setting up reader
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message schema {
 optional int32 chrom_int (INT_8);
 optional int64 __index_level_0__;
}
, metadata: \{pandas={"index_columns": ["__index_level_0__"], "column_indexes": 
[{"name": null, "field_name": null, "pandas_type": "unicode", "numpy_type": 
"object", "metadata": {"encoding": "UTF-8"}}], "columns": [\{"name": 
"chrom_int", "field_name": "chrom_int", "pandas_type": "int8", "numpy_type": 
"int8", "metadata": null}, \{"name": null, "field_name": "__index_level_0__", 
"pandas_type": "int64", "numpy_type": "int64", "metadata": null}], 
"pandas_version": "0.24.2"}}}, blocks: [BlockMetaData\{1, 140 
[ColumnMetaData{SNAPPY [chrom_int] optional int32 chrom_int (INT_8) 
[PLAIN_DICTIONARY, RLE, PLAIN], 24}, ColumnMetaData\{SNAPPY [__index_level_0__] 
optional int64 __index_level_0__ [PLAIN_DICTIONARY, RLE, PLAIN], 158}]}]}

Fragment 0:0

Please, refer to logs for more information.

[Error Id: c3e0e2ea-0e51-4732-96af-88ffe669b22c on 
mephistopheles.londc.genomicsplc.com:31010] (state=,code=0)

 

and from the logs:

org.apache.drill.common.exceptions.DrillRuntimeException: Error in parquet record reader.
Message: Failure in setting up reader
Parquet Metadata: ParquetMetaData{FileMetaData{schema: message schema {
 optional int32 chrom_int (INT_8);
 optional int64 __index_level_0__;
}
, metadata: \{pandas={"index_columns": ["__index_level_0__"], "column_indexes": 
[{"name": null, "field_name": null, "pandas_type": "unicode", "numpy_type": 
"object", "metadata": {"encoding": "UTF-8"}}], "columns": [\{"name": 
"chrom_int", "field_name": "chrom_int", "pandas_type": "int8", "numpy_type": 
"int8", "metadata": null}, \{"name": null, "field_name": "__index_level_0__", 
"pandas_type": "int64", "numpy_type": "int64", "metadata": null}], 
"pandas_version": "0.24.2"}}}, blocks: [BlockMetaData\{1, 140 
[ColumnMetaData{SNAPPY [chrom_int] optional int32 chrom_int (INT_8) 
[PLAIN_DICTIONARY, RLE, PLAIN], 24}, ColumnMetaData\{SNAPPY [__index_level_0__] 
optional int64 __index_level_0__ [PLAIN_DICTIONARY, RLE, PLAIN], 158}]}]}
 at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.handleException(ParquetRecordReader.java:269) ~[drill-java-exec-1.16.0.jar:1.16.0]
 at org.apache.drill.exec.store.parquet.columnreaders.ParquetRecordReader.setup(ParquetRecordReader.java:253) ~[drill-java-exec-1.16.0.jar:1.16.0]
 at org.apache.drill.exec.physical.impl.ScanBatch.getNextReaderIfHas(ScanBatch.java:321) ~[drill-java-exec-1.16.0.jar:1.16.0]
 at org.apache.drill.exec.physical.impl.ScanBatch.internalNext(ScanBatch.java:216) ~[drill-java-exec-1.16.0.jar:1.16.0]
 at org.apache.drill.exec.physical.impl.ScanBatch.next(ScanBatch.java:271) ~[drill-java-exec-1.16.0.jar:1.16.0]
 at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:126) ~[drill-java-exec-1.16.0.jar:1.16.0]
 at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:116) ~[drill-java-exec-1.16.0.jar:1.16.0]
 at org.apache.drill.exec.record.AbstractUnaryRecordBatch.innerNext(AbstractUnaryRecordBatch.java:63) ~[drill-java-exec-1.16.0.jar:1.16.0]
 at org.apache.drill.exec.physical.impl.project.ProjectRecordBatch.innerNext(ProjectRecordBatch.java:141) ~[drill-java-exec-1.16.0.jar:1.16.0]
 at org.apache.drill.exec.record.AbstractRecordBatch.next(AbstractRecordBatch.java:186) ~[drill-java-exec-1.16.0.jar:1.16.0]
 at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:104) ~[drill-java-exec-1.16.0.jar:1.16.0]
 at org.apache.drill.exec.physical.impl.ScreenCreator$ScreenRoot.innerNext(ScreenCreator.java:83) ~[drill-java-exec-1.16.0.jar:1.16.0]
 at org.apache.drill.exec.physical.impl.BaseRootExec.next(BaseRootExec.java:94) ~[drill-java-exec-1.16.0.jar:1.16.0]
 at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:296) ~[drill-java-exec-1.16.0.jar:1.16.0]
 at org.apache.drill.exec.work.fragment.FragmentExecutor$1.run(FragmentExecutor.java:283) ~[drill-java-exec-1.16.0.jar:1.16.0]
 at ...(:0) ~[na:na]
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1746) ~[hadoop-common-2.7.4.jar:na]
 at org.apache.drill.exec.work.fragment.FragmentExecutor.run(FragmentExecutor.java:283) ~[drill-java-exec-1.16.0.jar:1.16.0]
 at org.apache.drill.common.SelfCleaningRunnable.run(SelfCleaningRunnable.java:38) ~[drill-common-1.16.0.jar:1.16.0]
 at ...(:0) ~[na:na]
Caused by: 

[jira] [Updated] (DRILL-5983) Unsupported nullable converted type INT_8 for primitive type INT32 error

2019-06-27 Thread Stuart Teasdale (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-5983?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Stuart Teasdale updated DRILL-5983:
---
Attachment: test.parquet

> Unsupported nullable converted type INT_8 for primitive type INT32 error
> 
>
> Key: DRILL-5983
> URL: https://issues.apache.org/jira/browse/DRILL-5983
> Project: Apache Drill
>  Issue Type: Bug
>  Components: Execution - Data Types
>Affects Versions: 1.10.0, 1.11.0
> Environment: NAME="Ubuntu"
> VERSION="16.04.2 LTS (Xenial Xerus)"
>Reporter: Hakan Sarıbıyık
>Priority: Major
>  Labels: parquet, read, types
> Attachments: test.parquet
>
>
> When I query a table that has a byte column, Drill returns an error:
> _Query Failed: An Error Occurred
> org.apache.drill.common.exceptions.UserRemoteException: SYSTEM ERROR: 
> ExecutionSetupException: Unsupported nullable converted type INT_8 for 
> primitive type INT32 Fragment 1:6 [Error Id: 
> 46636b05-cff5-455b-ba25-527217346b3e on bigdata7:31010]_
> Actually, this should have been fixed by
> [DRILL-4764] - Parquet file with INT_16, etc. logical types not supported by 
> simple SELECT
> according to https://drill.apache.org/docs/apache-drill-1-10-0-release-notes/
> But I tried it even with 1.11.0 and it did not work.
> I am querying a Parquet-formatted file with pySpark:
> tablo1
> sourceid: byte (nullable = true)
> select sourceid from tablo1
> works as expected with pySpark, but not with Drill v1.11.0.
> Thanks.
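
For context on the error text: in the Parquet format, INT_8 is a logical annotation on the INT32 physical type, so the stored values are ordinary 32-bit integers restricted to the signed 8-bit range. The sketch below is only an illustration of how a reader might dispatch on such an annotation and how a missing case surfaces as an "unsupported converted type" failure. The class, enum, and method names are hypothetical and are not taken from Drill or parquet-mr; the exception message merely mirrors the wording in the report.

```java
// Illustrative only: these types are hypothetical, not Drill or parquet-mr classes.
final class Int32AnnotationSketch {

  // A few annotations that can sit on an INT32 physical column.
  enum Annotation { NONE, INT_8, INT_16, INT_32, DATE }

  /** Converts a stored INT32 value to the Java type implied by its annotation. */
  static Object decode(int stored, Annotation annotation) {
    switch (annotation) {
      case NONE:
      case INT_32:
        return stored;
      case INT_8:
        if (stored < Byte.MIN_VALUE || stored > Byte.MAX_VALUE) {
          throw new IllegalArgumentException("Value out of INT_8 range: " + stored);
        }
        return (byte) stored;
      case INT_16:
        if (stored < Short.MIN_VALUE || stored > Short.MAX_VALUE) {
          throw new IllegalArgumentException("Value out of INT_16 range: " + stored);
        }
        return (short) stored;
      default:
        // A reader with no branch for an annotation fails at setup time,
        // which is the kind of failure the report describes.
        throw new UnsupportedOperationException(
            "Unsupported converted type " + annotation + " for primitive type INT32");
    }
  }
}
```

In this sketch, `decode(-5, Annotation.INT_8)` yields a `Byte`, while `decode(1, Annotation.DATE)` throws because no branch handles that annotation.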





[jira] [Commented] (DRILL-6711) Use jitpack repository for Drill Calcite project artifacts instead of repository.mapr.com

2019-06-27 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16874031#comment-16874031
 ] 

ASF GitHub Bot commented on DRILL-6711:
---

vvysotskyi commented on issue #1815: DRILL-6711: Use jitpack repository for 
Drill Calcite project artifacts instead of repository.mapr.com
URL: https://github.com/apache/drill/pull/1815#issuecomment-506312844
 
 
   @arina-ielchiieva, thanks for the review, commits are squashed.
 



> Use jitpack repository for Drill Calcite project artifacts instead of 
> repository.mapr.com
> -
>
> Key: DRILL-6711
> URL: https://issues.apache.org/jira/browse/DRILL-6711
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Arina Ielchiieva
>Assignee: Volodymyr Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.17.0
>
>
> Simplify deployment of Drill Calcite project artifacts by using 
> [https://jitpack.io/].





[jira] [Updated] (DRILL-6711) Use jitpack repository for Drill Calcite project artifacts instead of repository.mapr.com

2019-06-27 Thread Arina Ielchiieva (JIRA)


 [ 
https://issues.apache.org/jira/browse/DRILL-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Arina Ielchiieva updated DRILL-6711:

Labels: ready-to-commit  (was: )

> Use jitpack repository for Drill Calcite project artifacts instead of 
> repository.mapr.com
> -
>
> Key: DRILL-6711
> URL: https://issues.apache.org/jira/browse/DRILL-6711
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Arina Ielchiieva
>Assignee: Volodymyr Vysotskyi
>Priority: Major
>  Labels: ready-to-commit
> Fix For: 1.17.0
>
>
> Simplify deployment of Drill Calcite project artifacts by using 
> [https://jitpack.io/].





[jira] [Commented] (DRILL-6711) Use jitpack repository for Drill Calcite project artifacts instead of repository.mapr.com

2019-06-27 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-6711?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16873991#comment-16873991
 ] 

ASF GitHub Bot commented on DRILL-6711:
---

arina-ielchiieva commented on issue #1815: DRILL-6711: Use jitpack repository 
for Drill Calcite project artifacts instead of repository.mapr.com
URL: https://github.com/apache/drill/pull/1815#issuecomment-506290215
 
 
   +1, please squash the commits.
 



> Use jitpack repository for Drill Calcite project artifacts instead of 
> repository.mapr.com
> -
>
> Key: DRILL-6711
> URL: https://issues.apache.org/jira/browse/DRILL-6711
> Project: Apache Drill
>  Issue Type: Task
>Reporter: Arina Ielchiieva
>Assignee: Volodymyr Vysotskyi
>Priority: Major
> Fix For: 1.17.0
>
>
> Simplify deployment of Drill Calcite project artifacts by using 
> [https://jitpack.io/].





[jira] [Commented] (DRILL-7293) Convert the regex ("log") plugin to use EVF

2019-06-27 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/DRILL-7293?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16873852#comment-16873852
 ] 

ASF GitHub Bot commented on DRILL-7293:
---

paul-rogers commented on issue #1807: DRILL-7293: Convert the regex ("log") 
plugin to use EVF
URL: https://github.com/apache/drill/pull/1807#issuecomment-506209593
 
 
   @arina-ielchiieva, added a unit test to show that the schema-only table 
function works.
   
   Tried to create a test that combined a "plugin" table function with the 
"schema" attribute. This failed due to the unfortunate use of "schema" as 
plugin property name. You've pointed out this issue all along; I finally 
understood why it was a problem. Still, I'm reluctant to change the config 
property name for fear of breaking compatibility.
   
   As it turns out, this limitation is only a minor nuisance since the only 
reason to combine the two kinds of table functions is to specify the regex 
property. A unit test shows that the regex can be specified as a table property 
instead.
   
   Also, went ahead and added support for the `columns` column. If no schema is 
provided (not in the plugin config, not in a table function, not in a provided 
schema), then rather than creating a set of dummy fields `field_0`, `field_1`, 
etc., the plugin now follows the text format plugin and puts the fields into 
the `columns` array. The dummy fields are still used if the user specifies at 
least one column schema, but the regex has more groups than specified columns.
   
   This means that, if the user uses a table function to specify just the 
regex, the user gets a reasonable result: the fields come back in the `columns` 
array.
   
   Unit tests show the new `columns` array support.  
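
The column-naming rule described in this comment can be sketched roughly as follows. This is not the plugin's code: the class and method names are hypothetical, and the assumption that a dummy name uses the zero-based group position (`field_1` for the second group, and so on) is inferred from the `field_0`, `field_1` wording above rather than confirmed by the source.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: none of these names come from the Drill code base.
final class LogColumnSketch {

  /**
   * Maps one log line to output columns.
   *
   * @param declared column names from any schema source (plugin config,
   *                 table function, or provided schema); may be empty
   */
  static Map<String, Object> mapLine(String regex, List<String> declared, String line) {
    Matcher m = Pattern.compile(regex).matcher(line);
    Map<String, Object> row = new LinkedHashMap<>();
    if (!m.matches()) {
      return row; // this sketch returns an empty row for unmatched lines
    }
    int groups = m.groupCount();
    if (declared.isEmpty()) {
      // No schema anywhere: all groups go into a single `columns` array.
      List<String> columns = new ArrayList<>();
      for (int i = 1; i <= groups; i++) {
        columns.add(m.group(i));
      }
      row.put("columns", columns);
      return row;
    }
    for (int i = 0; i < groups; i++) {
      // Declared names come first; surplus groups fall back to dummy names.
      // Assumption: the dummy index is the zero-based group position.
      String name = i < declared.size() ? declared.get(i) : "field_" + i;
      row.put(name, m.group(i + 1));
    }
    return row;
  }
}
```

With the regex `(\d+) (\w+) (\w+)` and no declared columns, the three groups land in `columns`; with one declared column `year`, the row has the keys `year`, `field_1`, and `field_2`.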
 



> Convert the regex ("log") plugin to use EVF
> ---
>
> Key: DRILL-7293
> URL: https://issues.apache.org/jira/browse/DRILL-7293
> Project: Apache Drill
>  Issue Type: Improvement
>Affects Versions: 1.16.0
>Reporter: Paul Rogers
>Assignee: Paul Rogers
>Priority: Major
> Fix For: 1.17.0
>
>
> The "log" plugin (which uses a regex to define the row format) is the subject 
> of Chapter 12 of the Learning Apache Drill book (though the version in the 
> book is simpler than the one in the master branch.)
> The recently-completed "Enhanced Vector Framework" (EVF, AKA the "row set 
> framework") gives Drill control over the size of batches created by readers, 
> and allows readers to use the recently-added provided schema mechanism.
> We wish to use the log reader as an example for how to convert a Drill format 
> plugin to use the EVF so that other developers can convert their own plugins.
> This PR provides the first set of log plugin changes to enable us to publish 
> a tutorial on the EVF.


