[GitHub] drill pull request #1067: DRILL-3958: Return a valid error message when stor...

2017-12-12 Thread rajrahul
GitHub user rajrahul opened a pull request:

https://github.com/apache/drill/pull/1067

DRILL-3958: Return a valid error message when storage plugin fails



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rajrahul/drill DRILL-3958

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1067.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1067


commit d5db51c795e9a9537c9feb4cf330d3b11045be8f
Author: Rahul Raj 
Date:   2017-12-12T08:51:53Z

DRILL-3958: Return a valid error message when storage plugin fails




---


[GitHub] drill pull request #1166: DRILL-6016 - Fix for Error reading INT96 created b...

2018-03-13 Thread rajrahul
GitHub user rajrahul opened a pull request:

https://github.com/apache/drill/pull/1166

DRILL-6016 - Fix for Error reading INT96 created by Apache Spark

This fixes DRILL-6016 where drill was failing to read int96 generated by 
Apache Spark even after setting store.parquet.reader.int96_as_timestamp to true.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/rajrahul/drill DRILL-6016

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/drill/pull/1166.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1166


commit 6d74884f314638e8d95e0c6ea3431222cb308ec5
Author: Rahul Raj 
Date:   2018-03-14T06:35:45Z

DRILL-6016 - Fix for Error reading INT96 created by Apache Spark




---


[GitHub] drill issue #1166: DRILL-6016 - Fix for Error reading INT96 created by Apach...

2018-03-14 Thread rajrahul
Github user rajrahul commented on the issue:

https://github.com/apache/drill/pull/1166
  
@parthchandra @vdiravka  I do not have a test case for this. I have 
manually verified the scenario with and without the patch. The sample input 
file is attached with https://issues.apache.org/jira/browse/DRILL-6016.




---


[GitHub] drill issue #1166: DRILL-6016 - Fix for Error reading INT96 created by Apach...

2018-03-14 Thread rajrahul
Github user rajrahul commented on the issue:

https://github.com/apache/drill/pull/1166
  
@parthchandra please use the link 
https://github.com/rajrahul/files/raw/master/result.tar.gz
The files are present inside result/parquet/latest.


---


[GitHub] drill issue #1166: DRILL-6016 - Fix for Error reading INT96 created by Apach...

2018-03-14 Thread rajrahul
Github user rajrahul commented on the issue:

https://github.com/apache/drill/pull/1166
  
@parthchandra I will create a unit test with few time stamp fields.


---


[GitHub] drill issue #1166: DRILL-6016 - Fix for Error reading INT96 created by Apach...

2018-03-15 Thread rajrahul
Github user rajrahul commented on the issue:

https://github.com/apache/drill/pull/1166
  
@parthchandra @vdiravka I have added the test case using the same parquet 
file(2.9k bytes). I tried creating a smaller file using Spark, but could not 
replicate the behavior. I have rebased the changes on the same commit and PR.


---


[GitHub] drill issue #1166: DRILL-6016 - Fix for Error reading INT96 created by Apach...

2018-03-15 Thread rajrahul
Github user rajrahul commented on the issue:

https://github.com/apache/drill/pull/1166
  
The schema given below creates the issue, as @vdiravka pointed int96 is 
marked required here. This parquet was generated with an older version of spark 
and is included in the test case.

```
message spark_schema {
  optional binary article_no (UTF8);
  optional binary qty (UTF8);
  required int96 run_date;
}

```
Newer spark version created the schema below where int96 has become 
optional.

```
message spark_schema {
  optional binary country (UTF8);
  optional double sales;
  optional int96 targetDate;
}
```


---


[GitHub] drill issue #1166: DRILL-6016 - Fix for Error reading INT96 created by Apach...

2018-03-26 Thread rajrahul
Github user rajrahul commented on the issue:

https://github.com/apache/drill/pull/1166
  
@vdiravka I have made the changes. Please have a look.


---


[GitHub] drill pull request #1166: DRILL-6016 - Fix for Error reading INT96 created b...

2018-03-26 Thread rajrahul
Github user rajrahul commented on a diff in the pull request:

https://github.com/apache/drill/pull/1166#discussion_r177318780
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java
 ---
@@ -797,6 +797,24 @@ public void 
testImpalaParquetBinaryAsTimeStamp_DictChange() throws Exception {
 }
   }
 
+  @Test
+  public void testSparkParquetBinaryAsTimeStamp_DictChange() throws 
Exception {
+try {
+  mockUtcDateTimeZone();
--- End diff --

I could see two ways of doing this within the code itself.
1. Mock and run with UTC, and compare the results in UTC as in 
TestCastFunctions#testToDateForTimeStamp. Since TestParquetWriter already has a 
RunWith annotation, we might have to create another class and move both the 
methods.
2. Run with the JVM timezone(no mocking) and compare the results after a 
'convertToLocalTimestamp' as in TestParquetWriter#testInt96TimeStampValueWidth

Approach 2 does not used fixed UTC timezone. Which approach do you suggest?


---


[GitHub] drill pull request #1166: DRILL-6016 - Fix for Error reading INT96 created b...

2018-03-28 Thread rajrahul
Github user rajrahul commented on a diff in the pull request:

https://github.com/apache/drill/pull/1166#discussion_r177950795
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java
 ---
@@ -797,6 +797,24 @@ public void 
testImpalaParquetBinaryAsTimeStamp_DictChange() throws Exception {
 }
   }
 
+  @Test
+  public void testSparkParquetBinaryAsTimeStamp_DictChange() throws 
Exception {
+try {
+  mockUtcDateTimeZone();
--- End diff --

@vdiravka your thoughts on comment above?


---


[GitHub] drill issue #1166: DRILL-6016 - Fix for Error reading INT96 created by Apach...

2018-03-29 Thread rajrahul
Github user rajrahul commented on the issue:

https://github.com/apache/drill/pull/1166
  
@vdiravka I have made similar changes for 
testSparkParquetBinaryAsTimeStamp_DictChange, 
testHiveParquetTimestampAsInt96_basic and 
testImpalaParquetBinaryAsTimeStamp_DictChange. All tests are passing, please 
have a look.


---


[GitHub] drill issue #1166: DRILL-6016 - Fix for Error reading INT96 created by Apach...

2018-03-29 Thread rajrahul
Github user rajrahul commented on the issue:

https://github.com/apache/drill/pull/1166
  
@vdiravka Done. Please review.


---


[GitHub] drill pull request #1166: DRILL-6016 - Fix for Error reading INT96 created b...

2018-03-30 Thread rajrahul
Github user rajrahul commented on a diff in the pull request:

https://github.com/apache/drill/pull/1166#discussion_r178290675
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java
 ---
@@ -780,17 +780,31 @@ public void 
testImpalaParquetBinaryAsVarBinary_DictChange() throws Exception {
   Test the reading of a binary field as drill timestamp where data is in 
dictionary _and_ non-dictionary encoded pages
*/
   @Test
-  @Ignore("relies on particular time zone, works for UTC")
   public void testImpalaParquetBinaryAsTimeStamp_DictChange() throws 
Exception {
 try {
   testBuilder()
-  .sqlQuery("select int96_ts from dfs.`parquet/int96_dict_change` 
order by int96_ts")
+  .sqlQuery("select min(int96_ts) date_value from 
dfs.`parquet/int96_dict_change`")
--- End diff --

I did not try a WHERE statement, MIN was used to select a single record to 
compare. Was there any specific reason to use WHERE?


---


[GitHub] drill pull request #1166: DRILL-6016 - Fix for Error reading INT96 created b...

2018-03-30 Thread rajrahul
Github user rajrahul commented on a diff in the pull request:

https://github.com/apache/drill/pull/1166#discussion_r178324303
  
--- Diff: 
exec/java-exec/src/test/java/org/apache/drill/exec/physical/impl/writer/TestParquetWriter.java
 ---
@@ -61,6 +60,7 @@
 import org.junit.runners.Parameterized;
 
 @RunWith(Parameterized.class)
+
--- End diff --

Actually not required, tried to add another RunWith for Mocking and removed 
later on leaving the newline.


---


[GitHub] drill issue #1166: DRILL-6016 - Fix for Error reading INT96 created by Apach...

2018-04-01 Thread rajrahul
Github user rajrahul commented on the issue:

https://github.com/apache/drill/pull/1166
  
@vdiravka removed the extra line.


---