[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-04-07 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16429335#comment-16429335
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user asfgit closed the pull request at:

https://github.com/apache/drill/pull/1166


> Error reading INT96 created by Apache Spark
> ---
>
> Key: DRILL-6016
> URL: https://issues.apache.org/jira/browse/DRILL-6016
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.13.0
> Reporter: Rahul Raj
> Assignee: Rahul Raj
> Priority: Major
> Labels: ready-to-commit
> Fix For: 1.14.0
>
>
> Hi,
> I am getting the error "SYSTEM ERROR: ClassCastException: 
> org.apache.drill.exec.vector.TimeStampVector cannot be cast to 
> org.apache.drill.exec.vector.VariableWidthVector" while trying to read a Spark 
> INT96 datetime field on Drill 1.11, in spite of setting the property 
> store.parquet.reader.int96_as_timestamp to true.
> I believe this was fixed in Drill 1.10 
> (https://issues.apache.org/jira/browse/DRILL-4373). What could be wrong?
> I have attached the dataset at 
> https://github.com/rajrahul/files/blob/master/result.tar.gz



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-04-02 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16422075#comment-16422075
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/1166
  
@rajrahul thanks for making all the changes (and of course for the fix)!




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-04-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16421944#comment-16421944
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user rajrahul commented on the issue:

https://github.com/apache/drill/pull/1166
  
@vdiravka removed the extra line.




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-04-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16421643#comment-16421643
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1166#discussion_r178456861
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java ---
@@ -61,6 +60,7 @@
 import org.junit.runners.Parameterized;
 
 @RunWith(Parameterized.class)
+
--- End diff --

ok, just remove it




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-04-01 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16421642#comment-16421642
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1166#discussion_r178456935
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java ---
@@ -780,17 +780,31 @@ public void testImpalaParquetBinaryAsVarBinary_DictChange() throws Exception {
   Test the reading of a binary field as drill timestamp where data is in dictionary _and_ non-dictionary encoded pages
   */
   @Test
-  @Ignore("relies on particular time zone, works for UTC")
   public void testImpalaParquetBinaryAsTimeStamp_DictChange() throws Exception {
     try {
       testBuilder()
-          .sqlQuery("select int96_ts from dfs.`parquet/int96_dict_change` order by int96_ts")
+          .sqlQuery("select min(int96_ts) date_value from dfs.`parquet/int96_dict_change`")
--- End diff --

It is just more obvious what result is expected. But using MIN is ok.




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16420683#comment-16420683
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user rajrahul commented on a diff in the pull request:

https://github.com/apache/drill/pull/1166#discussion_r178324303
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java ---
@@ -61,6 +60,7 @@
 import org.junit.runners.Parameterized;
 
 @RunWith(Parameterized.class)
+
--- End diff --

Actually, it is not required. I tried to add another RunWith for mocking and 
removed it later, leaving the newline behind.




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16420493#comment-16420493
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user rajrahul commented on a diff in the pull request:

https://github.com/apache/drill/pull/1166#discussion_r178290675
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java ---
@@ -780,17 +780,31 @@ public void testImpalaParquetBinaryAsVarBinary_DictChange() throws Exception {
   Test the reading of a binary field as drill timestamp where data is in dictionary _and_ non-dictionary encoded pages
   */
   @Test
-  @Ignore("relies on particular time zone, works for UTC")
   public void testImpalaParquetBinaryAsTimeStamp_DictChange() throws Exception {
     try {
       testBuilder()
-          .sqlQuery("select int96_ts from dfs.`parquet/int96_dict_change` order by int96_ts")
+          .sqlQuery("select min(int96_ts) date_value from dfs.`parquet/int96_dict_change`")
--- End diff --

I did not try a WHERE clause; MIN was used to select a single record to 
compare. Was there a specific reason to use WHERE?




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16420294#comment-16420294
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1166#discussion_r178255942
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java ---
@@ -780,17 +780,31 @@ public void testImpalaParquetBinaryAsVarBinary_DictChange() throws Exception {
   Test the reading of a binary field as drill timestamp where data is in dictionary _and_ non-dictionary encoded pages
   */
   @Test
-  @Ignore("relies on particular time zone, works for UTC")
   public void testImpalaParquetBinaryAsTimeStamp_DictChange() throws Exception {
     try {
       testBuilder()
-          .sqlQuery("select int96_ts from dfs.`parquet/int96_dict_change` order by int96_ts")
+          .sqlQuery("select min(int96_ts) date_value from dfs.`parquet/int96_dict_change`")
--- End diff --

Did you try a WHERE clause?


> Error reading INT96 created by Apache Spark
> ---
>
> Key: DRILL-6016
> URL: https://issues.apache.org/jira/browse/DRILL-6016
> Project: Apache Drill
> Issue Type: Bug
> Affects Versions: 1.13.0
> Reporter: Rahul Raj
> Assignee: Rahul Raj
> Priority: Major
> Fix For: 1.14.0
>
>
> Hi,
> I am getting the error "SYSTEM ERROR: ClassCastException: 
> org.apache.drill.exec.vector.TimeStampVector cannot be cast to 
> org.apache.drill.exec.vector.VariableWidthVector" while trying to read a Spark 
> INT96 datetime field on Drill 1.11, in spite of setting the property 
> store.parquet.reader.int96_as_timestamp to true.
> I believe this was fixed in Drill 1.10 
> (https://issues.apache.org/jira/browse/DRILL-4373). What could be wrong?
> I have attached the dataset at 
> https://github.com/rajrahul/files/blob/master/result.tar.gz





[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-30 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16420295#comment-16420295
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1166#discussion_r178255699
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java ---
@@ -61,6 +60,7 @@
 import org.junit.runners.Parameterized;
 
 @RunWith(Parameterized.class)
+
--- End diff --

new line?




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16420144#comment-16420144
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user rajrahul commented on the issue:

https://github.com/apache/drill/pull/1166
  
@vdiravka Done. Please review.




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419087#comment-16419087
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1166#discussion_r178071020
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java ---
@@ -35,6 +36,7 @@
 import java.util.Map;
 
 import com.google.common.base.Joiner;
+import mockit.integration.junit4.JMockit;
--- End diff --

the same


> Error reading INT96 created by Apache Spark
> ---
>
> Key: DRILL-6016
> URL: https://issues.apache.org/jira/browse/DRILL-6016
> Project: Apache Drill
> Issue Type: Bug
> Environment: Drill 1.11
> Reporter: Rahul Raj
> Assignee: Rahul Raj
> Priority: Major
> Fix For: 1.14.0
>
>
> Hi,
> I am getting the error "SYSTEM ERROR: ClassCastException: 
> org.apache.drill.exec.vector.TimeStampVector cannot be cast to 
> org.apache.drill.exec.vector.VariableWidthVector" while trying to read a Spark 
> INT96 datetime field on Drill 1.11, in spite of setting the property 
> store.parquet.reader.int96_as_timestamp to true.
> I believe this was fixed in Drill 1.10 
> (https://issues.apache.org/jira/browse/DRILL-4373). What could be wrong?
> I have attached the dataset at 
> https://github.com/rajrahul/files/blob/master/result.tar.gz





[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419088#comment-16419088
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1166#discussion_r178070635
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java ---
@@ -780,17 +783,42 @@ public void testImpalaParquetBinaryAsVarBinary_DictChange() throws Exception {
   Test the reading of a binary field as drill timestamp where data is in dictionary _and_ non-dictionary encoded pages
   */
   @Test
-  @Ignore("relies on particular time zone, works for UTC")
   public void testImpalaParquetBinaryAsTimeStamp_DictChange() throws Exception {
     try {
       testBuilder()
           .sqlQuery("select int96_ts from dfs.`parquet/int96_dict_change` order by int96_ts")
           .optionSettingQueriesForTestQuery(
               "alter session set `%s` = true", ExecConstants.PARQUET_READER_INT96_AS_TIMESTAMP)
           .ordered()
-          .csvBaselineFile("testframework/testParquetReader/testInt96DictChange/q1.tsv")
-          .baselineTypes(TypeProtos.MinorType.TIMESTAMP)
           .baselineColumns("int96_ts")
+          .baselineValues(new DateTime(convertToLocalTimestamp("1970-01-01 00:00:01.000")))
--- End diff --

One baselineValue is enough. Please use `where` in the query.




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16419086#comment-16419086
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1166#discussion_r178072549
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java ---
@@ -27,6 +27,7 @@
 import java.math.BigDecimal;
 import java.nio.file.Paths;
 import java.sql.Date;
+import java.sql.Timestamp;
--- End diff --

unused import?




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418981#comment-16418981
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user rajrahul commented on the issue:

https://github.com/apache/drill/pull/1166
  
@vdiravka I have made similar changes for 
testSparkParquetBinaryAsTimeStamp_DictChange, 
testHiveParquetTimestampAsInt96_basic and 
testImpalaParquetBinaryAsTimeStamp_DictChange. All tests are passing, please 
have a look.




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-29 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418688#comment-16418688
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1166#discussion_r178009588
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java ---
@@ -797,6 +797,24 @@ public void testImpalaParquetBinaryAsTimeStamp_DictChange() throws Exception {
     }
   }
 
+  @Test
+  public void testSparkParquetBinaryAsTimeStamp_DictChange() throws Exception {
+    try {
+      mockUtcDateTimeZone();
--- End diff --

I see that `TestParquetWriter` uses only one parameter, `repeat`. I think 
you can replace the `Parameterized` runner for this test with a simple 
variable. 
Another approach is to set up JMockit programmatically.

But I prefer not to use mocks if possible, so try to use 
`convertToLocalTimestamp`. With it you can also enable the 
`testHiveParquetTimestampAsInt96_basic` and 
`testImpalaParquetBinaryAsTimeStamp_DictChange` tests, removing the redundant rows.




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-28 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16418423#comment-16418423
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user rajrahul commented on a diff in the pull request:

https://github.com/apache/drill/pull/1166#discussion_r177950795
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java ---
@@ -797,6 +797,24 @@ public void testImpalaParquetBinaryAsTimeStamp_DictChange() throws Exception {
     }
   }
 
+  @Test
+  public void testSparkParquetBinaryAsTimeStamp_DictChange() throws Exception {
+    try {
+      mockUtcDateTimeZone();
--- End diff --

@vdiravka your thoughts on comment above?




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-27 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16415088#comment-16415088
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user rajrahul commented on a diff in the pull request:

https://github.com/apache/drill/pull/1166#discussion_r177318780
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java ---
@@ -797,6 +797,24 @@ public void testImpalaParquetBinaryAsTimeStamp_DictChange() throws Exception {
     }
   }
 
+  @Test
+  public void testSparkParquetBinaryAsTimeStamp_DictChange() throws Exception {
+    try {
+      mockUtcDateTimeZone();
--- End diff --

I can see two ways of doing this within the code itself.
1. Mock the time zone, run with UTC, and compare the results in UTC, as in 
TestCastFunctions#testToDateForTimeStamp. Since TestParquetWriter already has a 
RunWith annotation, we might have to create another class and move both 
methods there.
2. Run with the JVM time zone (no mocking) and compare the results after a 
'convertToLocalTimestamp', as in TestParquetWriter#testInt96TimeStampValueWidth.

Approach 2 does not use a fixed UTC time zone. Which approach do you suggest?
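
For illustration, the idea behind approach 2 can be sketched with plain 
`java.time`: the expected value is written once as a UTC wall-clock string and 
then shifted to whatever zone the JVM runs in before it is compared. This is a 
minimal sketch, not Drill's `convertToLocalTimestamp`; the class 
`LocalBaselineSketch` and the method `utcToZone` are hypothetical names, and 
the text format is assumed from the baseline value used in this thread.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration only; this is not Drill's convertToLocalTimestamp.
class LocalBaselineSketch {
  // Assumed text format, matching values such as "1970-01-01 00:00:01.000".
  private static final DateTimeFormatter FORMAT =
      DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss.SSS");

  // Reads utcText as a UTC wall-clock time and returns the wall-clock time
  // of the same instant in the given zone.
  static LocalDateTime utcToZone(String utcText, ZoneId zone) {
    return LocalDateTime.parse(utcText, FORMAT)
        .atOffset(ZoneOffset.UTC)
        .atZoneSameInstant(zone)
        .toLocalDateTime();
  }

  public static void main(String[] args) {
    // Expected baseline expressed in the zone the JVM happens to run in.
    System.out.println(utcToZone("1970-01-01 00:00:01.000", ZoneId.systemDefault()));
  }
}
```

Because the expected value follows the JVM zone, no mocking and no second 
test runner are needed; the trade-off is that the test no longer pins one 
fixed zone.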




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16414091#comment-16414091
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user vdiravka commented on a diff in the pull request:

https://github.com/apache/drill/pull/1166#discussion_r177154051
  
--- Diff: exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java ---
@@ -797,6 +797,24 @@ public void testImpalaParquetBinaryAsTimeStamp_DictChange() throws Exception {
     }
   }
 
+  @Test
+  public void testSparkParquetBinaryAsTimeStamp_DictChange() throws Exception {
+    try {
+      mockUtcDateTimeZone();
--- End diff --

It doesn't work without `@RunWith(JMockit.class)`. 
Also, please enable the test case above, 
`testImpalaParquetBinaryAsTimeStamp_DictChange`, with the same change, and 
make sure the tests pass in other time zones.
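
One way to check locally that an assertion does not depend on the machine's 
zone is to rerun it under several JVM default time zones. The sketch below 
uses only the JDK; `TimeZoneSweepSketch`, `holdsInAllZones`, and 
`localEpochSecondIsOne` are hypothetical names and are not part of Drill or 
its test framework. The sample check is deliberately zone-dependent, to show 
the kind of assertion the sweep is meant to catch.

```java
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.util.TimeZone;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration; not part of Drill's test framework.
class TimeZoneSweepSketch {
  // Runs the check once per zone id with that zone as the JVM default,
  // then restores the original default zone.
  static boolean holdsInAllZones(String[] zoneIds, BooleanSupplier check) {
    TimeZone original = TimeZone.getDefault();
    try {
      for (String id : zoneIds) {
        TimeZone.setDefault(TimeZone.getTimeZone(id));
        if (!check.getAsBoolean()) {
          return false;
        }
      }
      return true;
    } finally {
      TimeZone.setDefault(original);
    }
  }

  // A zone-dependent check: it holds only when the default zone had a zero
  // UTC offset at 1970-01-01 00:00:01 local time.
  static boolean localEpochSecondIsOne() {
    return LocalDateTime.of(1970, 1, 1, 0, 0, 1)
        .atZone(ZoneId.systemDefault())
        .toInstant()
        .toEpochMilli() == 1000L;
  }

  public static void main(String[] args) {
    String[] zones = {"UTC", "America/Los_Angeles", "Asia/Kolkata"};
    System.out.println(holdsInAllZones(zones, TimeZoneSweepSketch::localEpochSecondIsOne));
  }
}
```

Note that `TimeZone.setDefault` changes global JVM state, so a sweep like this 
is only safe when tests do not run in parallel within the same JVM.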




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-26 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16413783#comment-16413783
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user rajrahul commented on the issue:

https://github.com/apache/drill/pull/1166
  
@vdiravka I have made the changes. Please have a look.




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-24 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16412718#comment-16412718
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user vdiravka commented on the issue:

https://github.com/apache/drill/pull/1166
  
@rajrahul The unit test from your PR relies on a particular time zone, similar to 
`TestParquetWriter.testImpalaParquetBinaryAsTimeStamp_DictChange`.

Could you please edit the test case so that it works in any time zone?
Please see PR #904 for more details. 
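
To illustrate why such a test can depend on the time zone: the same stored instant renders as a different local date-time depending on the zone used for the conversion, so an expected value written as a local timestamp only matches in one zone. The following is a minimal sketch in plain `java.time`, not Drill's test code; the class and method names (`TimeZoneIndependentCheckSketch`, `localView`) are hypothetical.

```java
import java.time.Instant;
import java.time.LocalDateTime;
import java.time.ZoneId;
import java.time.ZoneOffset;

// Hypothetical illustration only; not part of Drill or of PR #1166.
final class TimeZoneIndependentCheckSketch {

  // Render an epoch-millisecond instant as a local date-time in the given zone.
  static LocalDateTime localView(long epochMillis, ZoneId zone) {
    return LocalDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), zone);
  }

  public static void main(String[] args) {
    long stored = 0L; // the Unix epoch, as an example stored value

    // Pinning the zone (here UTC) gives the same answer on every machine.
    System.out.println(localView(stored, ZoneOffset.UTC));

    // A fixed +05:30 offset shifts the local rendering of the same instant.
    System.out.println(localView(stored, ZoneOffset.ofHoursMinutes(5, 30)));

    // Using the JVM default zone is what makes an assertion machine-dependent.
    System.out.println(localView(stored, ZoneId.systemDefault()));
  }
}
```

A test that compares against a fixed expected timestamp would therefore need either to pin the zone, as the mocked UTC zone in the existing tests appears to do, or to compute the expected value in the same zone as the reader.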




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-21 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16407690#comment-16407690
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/1166
  
+1. LGTM




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401444#comment-16401444
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user rajrahul commented on the issue:

https://github.com/apache/drill/pull/1166
  
The schema given below triggers the issue. As @vdiravka pointed out, int96 is 
marked required here. This parquet file was generated with an older version of 
Spark and is included in the test case.

```
message spark_schema {
  optional binary article_no (UTF8);
  optional binary qty (UTF8);
  required int96 run_date;
}

```
A newer Spark version created the schema below, where int96 has become 
optional.

```
message spark_schema {
  optional binary country (UTF8);
  optional double sales;
  optional int96 targetDate;
}
```




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-15 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16400151#comment-16400151
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user rajrahul commented on the issue:

https://github.com/apache/drill/pull/1166
  
@parthchandra @vdiravka I have added the test case using the same parquet 
file (2.9k bytes). I tried creating a smaller file using Spark, but could not 
replicate the behavior. I have rebased the changes on the same commit and PR.




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398652#comment-16398652
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user vdiravka commented on the issue:

https://github.com/apache/drill/pull/1166
  
@parthchandra I have compared the metadata of the files from 
`TestParquetWriter.testImpalaParquetBinaryAsTimeStamp_DictChange` with the 
metadata of Rahul's dataset. The test case does indeed query two parquet files: 
one is dictionary encoded and the other isn't. But the dataMode of the column 
there is `Optional`, which is why the `Nullable` column reader is used.
In Rahul's dataset the INT96 column has `required` mode. That is the 
difference, so a separate non-nullable column reader is necessary. 

I also believe the names of those column readers are a bit of a mess. Some 
refactoring might be worthwhile. What do you think? For example, we could 
remove the `Dictionary` prefix from the nested classes but keep it in the 
top-level class name.
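
The distinction described above can be sketched as a small decision function. This is only an illustration of how the column mode and the encoding together pick a reader; the enum and method names (`Repetition`, `ReaderKind`, `choose`) are hypothetical and are not Drill's actual reader classes or factory code.

```java
// Hypothetical illustration only; not Drill's column reader factory.
final class Int96ReaderChoiceSketch {

  enum Repetition { REQUIRED, OPTIONAL }

  enum ReaderKind {
    VARBINARY,                     // INT96 left as raw bytes (the default)
    NULLABLE_TIMESTAMP,            // optional column read as timestamp
    REQUIRED_PLAIN_TIMESTAMP,      // required column, plain encoding
    REQUIRED_DICTIONARY_TIMESTAMP  // required column, dictionary encoding
  }

  // Pick a reader for an INT96 column from its mode, its encoding, and the
  // int96_as_timestamp setting.
  static ReaderKind choose(Repetition mode, boolean dictionaryEncoded, boolean int96AsTimestamp) {
    if (!int96AsTimestamp) {
      return ReaderKind.VARBINARY;
    }
    if (mode == Repetition.OPTIONAL) {
      // The existing Impala test files take this path.
      return ReaderKind.NULLABLE_TIMESTAMP;
    }
    // Required columns need their own branch for each encoding. Going by the
    // discussion in this thread, the dictionary-encoded case is the one that
    // seems to have been missing for the Spark file.
    return dictionaryEncoded
        ? ReaderKind.REQUIRED_DICTIONARY_TIMESTAMP
        : ReaderKind.REQUIRED_PLAIN_TIMESTAMP;
  }
}
```

If the required, dictionary-encoded case were instead to fall through to a variable-width reader while the output vector is a timestamp vector, a cast failure like the one in the issue description would be a plausible result.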




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398508#comment-16398508
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user rajrahul commented on the issue:

https://github.com/apache/drill/pull/1166
  
@parthchandra I will create a unit test with a few timestamp fields.




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398446#comment-16398446
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/1166
  
@rajrahul this link is good. As expected, the int96 column is dictionary 
encoded. 
Is it possible for you to extract just a couple of records from this file 
and then use that for a unit test? 
see 
[TestParquetWriter.testImpalaParquetBinaryAsTimeStamp_DictChange](https://github.com/apache/drill/blob/master/exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java#L784)

@vdiravka TestParquetWriter.testImpalaParquetBinaryAsTimeStamp_DictChange 
also uses an int96 that is dictionary encoded. Any idea whether (and why) it 
might be going through a different code path?






[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398379#comment-16398379
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user rajrahul commented on the issue:

https://github.com/apache/drill/pull/1166
  
@parthchandra please use the link 
https://github.com/rajrahul/files/raw/master/result.tar.gz
The files are present inside result/parquet/latest.




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398344#comment-16398344
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user parthchandra commented on the issue:

https://github.com/apache/drill/pull/1166
  
@rajrahul, thanks for submitting the patch. It looks good. I guess we 
missed dictionary encoded int96 timestamps (even though timestamps with 
nanosecond precision are the one thing that should never, ever, be dictionary 
encoded)!

Just to make sure, I tried to use the sample file in DRILL-6016, but I 
could not even unzip it! Can you please check and see if the file is correct? 
We can use that to create the unit test as well.





[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398190#comment-16398190
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user rajrahul commented on the issue:

https://github.com/apache/drill/pull/1166
  
@parthchandra @vdiravka I do not have a test case for this. I have 
manually verified the scenario with and without the patch. The sample input 
file is attached to https://issues.apache.org/jira/browse/DRILL-6016.






[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2018-03-14 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16398167#comment-16398167
 ] 

ASF GitHub Bot commented on DRILL-6016:
---

Github user priteshm commented on the issue:

https://github.com/apache/drill/pull/1166
  
@parthchandra would you please review this?




[jira] [Commented] (DRILL-6016) Error reading INT96 created by Apache Spark

2017-12-07 Thread Vitalii Diravka (JIRA)

[ 
https://issues.apache.org/jira/browse/DRILL-6016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16281561#comment-16281561
 ] 

Vitalii Diravka commented on DRILL-6016:


Interesting dataset.
Drill reads INT96 as VARBINARY by default: 
https://drill.apache.org/docs/parquet-format/#sql-data-types-to-parquet
But with the provided dataset it returns an error. Even with an explicit 
conversion it returns an error:
{code}
0: jdbc:drill:zk=local> select CONVERT_FROM(run_date, 'TIMESTAMP_IMPALA') from 
dfs.`/home/vitalii/Downloads/result/parquet/latest/part-r-0-0c44161e-49e7-4b40-b4ab-c3d8e492bf33.snappy.parquet`
 limit 1; 
Error: DATA_READ ERROR: Error reading from Parquet file

File:  
/home/vitalii/Downloads/result/parquet/latest/part-r-0-0c44161e-49e7-4b40-b4ab-c3d8e492bf33.snappy.parquet
Column:  run_date
Row Group Start:  5523
Fragment 0:0
{code}

But the schema looks good:
{code}
vitalii@vitalii-pc:~/parquet-tools/parquet-mr/parquet-tools/target$ java -jar 
parquet-tools-1.6.0rc3-SNAPSHOT.jar schema 
/home/vitalii/Downloads/result/parquet/latest/part-r-0-0c44161e-49e7-4b40-b4ab-c3d8e492bf33.snappy.parquet
message spark_schema {
  optional binary article_no (UTF8);
  optional binary qty (UTF8);
  required int96 run_date;
}
{code}
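
For context, an INT96 timestamp as written by Impala and Spark is 12 bytes: 8 bytes holding the nanoseconds within the day, followed by a 4-byte Julian day number, both little-endian. The sketch below shows that conversion to epoch milliseconds in plain Java. It is not Drill's reader code, and the class and method names (`Int96TimestampSketch`, `toEpochMillis`, `encode`) are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; not Drill's INT96 conversion code.
final class Int96TimestampSketch {

  // Julian day number of 1970-01-01, the Unix epoch.
  static final long JULIAN_DAY_OF_EPOCH = 2_440_588L;
  static final long MILLIS_PER_DAY = 86_400_000L;
  static final long NANOS_PER_MILLI = 1_000_000L;

  // Convert a 12-byte INT96 value to milliseconds since the Unix epoch (UTC).
  // Sub-millisecond precision is truncated.
  static long toEpochMillis(byte[] int96) {
    if (int96.length != 12) {
      throw new IllegalArgumentException("INT96 value must be 12 bytes, got " + int96.length);
    }
    ByteBuffer buf = ByteBuffer.wrap(int96).order(ByteOrder.LITTLE_ENDIAN);
    long nanosOfDay = buf.getLong(); // first 8 bytes
    long julianDay = buf.getInt();   // last 4 bytes
    return (julianDay - JULIAN_DAY_OF_EPOCH) * MILLIS_PER_DAY + nanosOfDay / NANOS_PER_MILLI;
  }

  // Build a 12-byte INT96 value; used here only to produce sample input.
  static byte[] encode(int julianDay, long nanosOfDay) {
    return ByteBuffer.allocate(12)
        .order(ByteOrder.LITTLE_ENDIAN)
        .putLong(nanosOfDay)
        .putInt(julianDay)
        .array();
  }
}
```

This conversion is independent of whether the column is `required` or `optional` and of whether the page is dictionary encoded; those properties only affect how the 12 bytes are located in the file.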

> Error reading INT96 created by Apache Spark
> ---
>
> Key: DRILL-6016
> URL: https://issues.apache.org/jira/browse/DRILL-6016
> Project: Apache Drill
>  Issue Type: Bug
> Environment: Drill 1.11
>Reporter: Rahul Raj
>
> Hi,
> I am getting the error - SYSTEM ERROR : ClassCastException: 
> org.apache.drill.exec.vector.TimeStampVector cannot be cast to 
> org.apache.drill.exec.vector.VariableWidthVector while trying to read a spark 
> INT96 datetime field on Drill 1.11 in spite of setting the property 
> store.parquet.reader.int96_as_timestamp to  true.
> I believe this was fixed in drill 
> 1.10(https://issues.apache.org/jira/browse/DRILL-4373). What could be wrong.
> I have attached the dataset at 
> https://github.com/rajrahul/files/blob/master/result.tar.gz



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)