Re: [Spark SQL] ceil and floor functions on doubles

2017-05-22 Thread Vadim Semenov
Yes, it was done on purpose to match the behavior of Hive (https://issues.apache.org/jira/browse/SPARK-10865). And I believe Hive returns `Long`s because they adopted the definition used in MySQL (https://issues.apache.org/jira/browse/HIVE-615). On Fri, May 19, 2017 at 10:51 AM, Anton Okolnychyi …
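The Hive/MySQL semantics referenced above can be illustrated in plain Scala (no Spark required): `ceil` and `floor` on a `Double` yield an integral `Long`, not a `Double`. A minimal sketch, with `ceilAsLong`/`floorAsLong` as hypothetical helper names:

```scala
// Sketch of Hive/MySQL-style ceil/floor semantics: integral Long results.
object CeilFloorDemo {
  def ceilAsLong(x: Double): Long  = math.ceil(x).toLong
  def floorAsLong(x: Double): Long = math.floor(x).toLong

  def main(args: Array[String]): Unit = {
    // A Long such as 3, not a Double such as 3.0
    assert(ceilAsLong(2.1) == 3L)
    assert(floorAsLong(-2.1) == -3L)
    println("ok")
  }
}
```

In Spark SQL itself, `ceil(col)` / `floor(col)` on a `DoubleType` column likewise produce a `LongType` column, per the JIRA above.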

Re: Anyone knows how to build and spark on jdk9?

2017-10-27 Thread Vadim Semenov
If someone else is looking at how to try JDK 9: you can pass your own JAVA_HOME environment variables: spark.yarn.appMasterEnv.JAVA_HOME=/usr/lib/jvm/java-1.8.0 spark.executorEnv.JAVA_HOME=/usr/lib/jvm/java-1.8.0 On Fri, Oct 27, 2017 at 5:14 AM, Steve Loughran wrote: > > On 27 Oct 2017, at 0…
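The two settings above can be placed in `spark-defaults.conf` (or passed as `--conf` flags to `spark-submit`); the JDK path is illustrative and depends on your hosts:

```properties
# spark-defaults.conf: point the YARN ApplicationMaster and the executors
# at a specific JDK install (path is an example, adjust for your machines)
spark.yarn.appMasterEnv.JAVA_HOME=/usr/lib/jvm/java-1.8.0
spark.executorEnv.JAVA_HOME=/usr/lib/jvm/java-1.8.0
```

Note these only override the JVM used on the cluster side; the driver in client mode still uses whatever `JAVA_HOME` is set in its local environment.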

Re: Task failures and other problems

2017-11-09 Thread Vadim Semenov
Probably not Oracle but Cloudera 🙂 Jan, I think your DataNodes might be overloaded. I'd suggest reducing `spark.executor.cores` if you run executors alongside DataNodes, so the DataNode process gets some resources. The other thing you can do is increase `dfs.client.socket-timeout` in hado…
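A sketch of both suggestions as configuration, with illustrative values (the core count and timeout depend on your hardware; `dfs.client.socket-timeout` is in milliseconds, and Spark's `spark.hadoop.*` prefix forwards a property into the Hadoop `Configuration`):

```properties
# spark-defaults.conf: leave CPU headroom for the co-located DataNode
# (example value; pick it based on cores per node)
spark.executor.cores=3

# Raise the HDFS client read timeout (milliseconds); this can also be set
# directly in the client-side hdfs-site.xml instead of via the spark.hadoop.* prefix
spark.hadoop.dfs.client.socket-timeout=120000
```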

Re: Spark Writing to parquet directory : java.io.IOException: Disk quota exceeded

2017-11-22 Thread Vadim Semenov
The error message seems self-explanatory: try to figure out what disk quota you have for your user. On Wed, Nov 22, 2017 at 8:23 AM, Chetan Khatri wrote: > Anybody reply on this? > > On Tue, Nov 21, 2017 at 3:36 PM, Chetan Khatri < > chetan.opensou...@gmail.com> wrote: > >> >> Hello Spark…

Re: RDD[internalRow] -> DataSet

2017-12-12 Thread Vadim Semenov
Not directly possible, but you can add your own object in your project to Spark's package, which gives you access to its private methods: package org.apache.spark.sql import org.apache.spark.rdd.RDD import org.apache.spark.sql.catalyst.InternalRow import org.apache.spark.sql.execution.LogicalRDD import …
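A minimal sketch of the package trick described above, assuming Spark 2.x internals (the object name `InternalRowUtil` is made up here, and `LogicalRDD`'s constructor signature has changed across releases, so treat this as a starting point rather than a stable API):

```scala
// Sketch only: this file lives in YOUR project, but declares Spark's own
// package so that the package-private Dataset.ofRows and LogicalRDD
// become reachable. Internal APIs like these can break between versions.
package org.apache.spark.sql

import org.apache.spark.rdd.RDD
import org.apache.spark.sql.catalyst.InternalRow
import org.apache.spark.sql.execution.LogicalRDD
import org.apache.spark.sql.types.StructType

object InternalRowUtil {
  // Wrap an RDD[InternalRow] with a known schema into a DataFrame
  // by planting it behind a LogicalRDD leaf node.
  def toDataFrame(spark: SparkSession, rdd: RDD[InternalRow], schema: StructType): DataFrame =
    Dataset.ofRows(spark, LogicalRDD(schema.toAttributes, rdd)(spark))
}
```

From the resulting `DataFrame`, an ordinary `.as[T]` with an implicit `Encoder[T]` then yields the `Dataset[T]`.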