Interesting. Thanks

On Fri, Sep 30, 2016 at 11:46 PM, Jinfeng Ni <[email protected]> wrote:
> This seems to be the same issue as reported in DRILL-4203 [1]. You are
> right that it causes a problem when exchanging data between the two
> products.
>
> There is a pull request under review. Hopefully, once DRILL-4203 is
> fixed, the issue you saw will be fixed as well.
>
> [1] https://issues.apache.org/jira/browse/DRILL-4203
>
> On Fri, Sep 30, 2016 at 6:26 PM, Minnow Noir <[email protected]> wrote:
> > I'm trying to process data using Spark and then query it using Drill.
> >
> > When I create a Parquet file with a Spark 1.6.1 job and then try to
> > query it in Drill 1.8.0, I notice that the dates are in an unknown
> > format. All string and other types seem fine. I'm using the
> > java.sql.Date class because I get "unsupported" errors when I use
> > java.util.Date and try to save in Parquet format. If I create the
> > Parquet file using CTAS in Drill, I don't have this problem; it is
> > strictly a problem exchanging data between the two products.
> >
> > For example, if I create an RDD of dates, convert it to a DataFrame,
> > save that DataFrame, and read the file back into Spark, it sees the
> > correct values:
> >
> > ...
> > case class foo(dt: java.sql.Date)
> > val format = new java.text.SimpleDateFormat("MM/dd/yyyy")
> > val dates = test.map(x => foo(new java.sql.Date(format.parse(x).getTime)))
> > val df = dates.toDF
> > df.write.save("blah/test.parquet")
> > val df2 = sqlContext.read.parquet("blah/test.parquet")
> > df2.first
> > res10: org.apache.spark.sql.Row = [2016-06-08]
> >
> > However, if I query the file using Drill, I get a different result:
> >
> > select * from blah limit 1;
> > +-------------+---------------+
> > | dt          | dir0          |
> > +-------------+---------------+
> > | 349-06-19   | test.parquet  |
> > +-------------+---------------+
> >
> > Any idea what I need to do in order to be able to query dates in
> > Spark-created Parquet files with Drill?
> >
> > Thanks
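For context on why the same file can decode to two different dates: the Parquet DATE logical type stores a plain INT32 count of days since the Unix epoch (1970-01-01), and a reader that applies a different epoch or offset to that raw integer will reconstruct an entirely different calendar date, which is the kind of writer/reader mismatch DRILL-4203 describes. A minimal, self-contained Java sketch (illustrative only, not taken from the thread) showing the raw value a spec-conforming writer would store for the 2016-06-08 date in the example above:

```java
import java.time.LocalDate;

public class Main {
    public static void main(String[] args) {
        // Parquet's DATE logical type is an INT32 count of days
        // since the Unix epoch, 1970-01-01.
        LocalDate date = LocalDate.of(2016, 6, 8);
        long epochDays = date.toEpochDay();
        System.out.println(epochDays);

        // A spec-conforming reader inverts the same mapping and
        // recovers the original date; a reader using a different
        // epoch convention would print something else entirely.
        System.out.println(LocalDate.ofEpochDay(epochDays));
    }
}
```

Running this prints `16960` followed by `2016-06-08`, i.e. the round trip is lossless as long as writer and reader agree on the epoch convention.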
