[ANNOUNCE] Apache Toree 0.2.0-incubating Released

2018-08-15 Thread Luciano Resende
Apache Toree is a kernel for the Jupyter Notebook platform, providing interactive and remote access to Apache Spark. The Apache Toree community is pleased to announce the release of Apache Toree 0.2.0-incubating, which provides various bug fixes and the following enhancements. * Support Apache

JdbcRDD - schema always resolved as nullable=true

2018-08-15 Thread Subhash Sriram
Hi Spark Users, We do a lot of processing in Spark using data that is in MS SQL server. Today, I created a DataFrame against a table in SQL Server using the following: val dfSql=spark.read.jdbc(connectionString, table, props) I noticed that every column in the DataFrame showed as
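As a point of reference, here is a minimal sketch of the kind of read being described, printing the nullability the JDBC source reports for each column; the connection string, table name, and credentials are hypothetical placeholders:

```
import java.util.Properties

import org.apache.spark.sql.SparkSession

object JdbcNullableCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("jdbc-nullable-check").getOrCreate()

    // Hypothetical connection details, for illustration only.
    val connectionString = "jdbc:sqlserver://dbhost:1433;databaseName=mydb"
    val table = "dbo.my_table"
    val props = new Properties()
    props.setProperty("user", "spark_reader")
    props.setProperty("password", "secret")

    val dfSql = spark.read.jdbc(connectionString, table, props)

    // Print each field's reported nullability; the observation in this thread is
    // that every column comes back with nullable = true, regardless of NOT NULL
    // constraints on the SQL Server side.
    dfSql.schema.fields.foreach(f => println(s"${f.name}: nullable=${f.nullable}"))

    spark.stop()
  }
}
```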

Re: Unable to see completed application in Spark 2 history web UI

2018-08-15 Thread Manu Zhang
If you are able to log onto the node where the UI has been launched, then try `ps -aux | grep HistoryServer`; the first column of the output should be the user. On Wed, Aug 15, 2018 at 10:26 PM Fawze Abujaber wrote: > Thanks Manu, Do you know how I can see which user the UI is running as, > because I'm

java.lang.UnsupportedOperationException: No Encoder found for Set[String]

2018-08-15 Thread V0lleyBallJunki3
Hello, I am using Spark 2.2.2 with Scala 2.11.8. I wrote a short program: val spark = SparkSession.builder().master("local[4]").getOrCreate() case class TestCC(i: Int, ss: Set[String]) import spark.implicits._ import spark.sqlContext.implicits._ val testCCDS = Seq(TestCC(1,Set("SS","Salil")),
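For readers hitting the same error, a hedged sketch of two common workarounds follows: supplying a Kryo encoder explicitly for the case class, or modelling the field as Seq[String] so the built-in product encoder applies. The object and value names are illustrative, not from the original program.

```
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

case class TestCC(i: Int, ss: Set[String])
// Modelling the collection as Seq[String] is the simplest fix, since a
// product encoder for Seq fields is derived automatically.
case class TestCCSeq(i: Int, ss: Seq[String])

object SetEncoderWorkaround {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[4]").getOrCreate()
    import spark.implicits._

    // Option 1: keep Set[String] but supply a Kryo encoder explicitly.
    // The Dataset is then stored as opaque binary, so individual fields
    // are not queryable as columns with Spark SQL.
    val kryoEncoder: Encoder[TestCC] = Encoders.kryo[TestCC]
    val kryoDS = spark.createDataset(Seq(TestCC(1, Set("SS", "Salil"))))(kryoEncoder)

    // Option 2: convert the Set to a Seq and keep a regular columnar schema.
    val seqDS = Seq(TestCCSeq(1, Set("SS", "Salil").toSeq)).toDS()
    seqDS.printSchema()

    spark.stop()
  }
}
```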

from_json schema order

2018-08-15 Thread Brandon Geise
Hi, Can someone confirm whether ordering matters between the schema and underlying JSON string? Thanks, Brandon
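One way to check this empirically is to declare the schema fields in the opposite order of the keys in the JSON string and inspect the result. The sketch below assumes from_json matches struct fields by name, so key order in the input string should not matter, while the output struct follows the schema's declared field order:

```
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types._

object FromJsonOrderCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    // Schema fields declared in the opposite order of the keys in the JSON string.
    val schema = new StructType()
      .add("b", StringType)
      .add("a", IntegerType)

    val df = Seq("""{"a": 1, "b": "x"}""").toDF("json")

    // Struct fields are matched by name, so key order in the input string should
    // not matter; the output struct follows the schema's declared field order.
    df.select(from_json($"json", schema).as("parsed")).show(false)

    spark.stop()
  }
}
```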

Dynamic Allocation not removing executors

2018-08-15 Thread Maximiliano Patricio Méndez
Hi, I found an issue trying to use dynamic allocation in 2.3.1 where the driver does not remove idle executors under some circumstances. For the first instance of this happening, it seems that a change introduced in 2.2.1/2.3.0 (SPARK-21656)
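For context, a minimal sketch of the dynamic allocation settings involved in such a setup; the values are illustrative and not taken from the original report:

```
import org.apache.spark.sql.SparkSession

object DynamicAllocationCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("dyn-alloc-check")
      .config("spark.dynamicAllocation.enabled", "true")
      .config("spark.shuffle.service.enabled", "true")              // external shuffle service is required
      .config("spark.dynamicAllocation.minExecutors", "1")
      .config("spark.dynamicAllocation.maxExecutors", "20")
      .config("spark.dynamicAllocation.executorIdleTimeout", "60s") // idle executors should be released after this
      .getOrCreate()

    // Run some work, then let the application sit idle and watch the Executors tab
    // in the UI to see whether idle executors are actually removed.
    spark.range(0, 1000000).selectExpr("id % 10 as k").groupBy("k").count().collect()
    Thread.sleep(5 * 60 * 1000)

    spark.stop()
  }
}
```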

Re: from_json function

2018-08-15 Thread Maxim Gekk
Hello Denis, The from_json function supports only the fail fast mode, see: https://github.com/apache/spark/blob/e2ab7deae76d3b6f41b9ad4d0ece14ea28db40ce/sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/jsonExpressions.scala#L568 Your settings "mode" -> "PERMISSIVE" will be
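A small sketch of the kind of call being discussed (illustrative; per the reply above, the "mode" option passed to from_json is expected to have no effect in this version):

```
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types._

object FromJsonModeCheck {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    val schema = new StructType().add("number", IntegerType)
    val df = Seq("{'number': 1}", "{'number': }").toDF("json")

    // Options are passed through to the JSON parser, but per the code referenced
    // above the parse mode itself is not configurable from here, so
    // "mode" -> "PERMISSIVE" is not expected to change the behaviour.
    df.select(from_json($"json", schema, Map("mode" -> "PERMISSIVE")).as("parsed"))
      .show(false)

    spark.stop()
  }
}
```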

[K8S] Spark initContainer custom bootstrap support for Spark master

2018-08-15 Thread Li Gao
Hi, We've noticed that on the latest master (not the Spark 2.3.1 branch), support for the Kubernetes initContainer is no longer there. What would be the path forward if we need to do custom bootstrap actions (i.e. run additional scripts) before the driver/executor containers enter the running state? Thanks,

Shuffle uses Direct Memory Buffer even after setting "spark.shuffle.io.preferDirectBufs = false"

2018-08-15 Thread Vaibhav Kulkarni
Hi, I am using Standalone Spark 2.3 and have a question regarding shuffle. Going by the documentation, the default shuffle behaviour is to use direct memory buffers. But even after I set the following parameter, I notice shuffle still uses direct memory buffers. spark.shuffle.io.preferDirectBufs
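For reference, a sketch of how the setting would typically be applied. As a hedged note, spark.shuffle.io.preferDirectBufs only steers Netty's own buffer allocations toward the heap; direct memory reported by the JVM can also come from other code paths, which may explain the observation above.

```
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

object PreferHeapShuffleBufs {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("shuffle-heap-bufs")
      // Ask the shuffle transport (Netty) to prefer heap buffers over direct buffers.
      .set("spark.shuffle.io.preferDirectBufs", "false")

    val spark = SparkSession.builder().config(conf).getOrCreate()

    // Trigger a shuffle, then inspect direct memory usage (e.g. via JMX or the
    // executor metrics) to see how much of it is actually attributable to Netty.
    spark.range(0, 10000000).selectExpr("id % 100 as k").groupBy("k").count().collect()

    spark.stop()
  }
}
```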

from_json function

2018-08-15 Thread dbolshak
Hello community, I cannot manage to run the from_json method with the "columnNameOfCorruptRecord" option. ``` import org.apache.spark.sql.functions._ val data = Seq( "{'number': 1}", "{'number': }" ) val schema = new StructType() .add($"number".int)
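A runnable reconstruction of the snippet above, offered as a hedged sketch: the corrupt-record column is added to the schema here because the option only makes sense when such a column exists, and whether the option takes effect at all is addressed in the from_json reply earlier in this digest.

```
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.from_json
import org.apache.spark.sql.types._

object FromJsonCorruptRecord {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().master("local[*]").getOrCreate()
    import spark.implicits._

    val data = Seq(
      "{'number': 1}",
      "{'number': }" // malformed record
    )

    // The corrupt-record column must be part of the schema for the option to be meaningful.
    val schema = new StructType()
      .add("number", IntegerType)
      .add("_corrupt_record", StringType)

    val parsed = data.toDF("json").select(
      from_json($"json", schema, Map("columnNameOfCorruptRecord" -> "_corrupt_record")).as("parsed"))

    // Per the from_json reply earlier in this digest, these options are effectively
    // ignored in this version, so the malformed row is expected to parse to null
    // rather than populate _corrupt_record.
    parsed.show(false)

    spark.stop()
  }
}
```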

Re: Unable to see completed application in Spark 2 history web UI

2018-08-15 Thread Fawze Abujaber
Thanks Manu. Do you know how I can see which user the UI is running as? I'm using Cloudera Manager and I created a user for Cloudera Manager and called it spark, but this didn't solve my issue, and here I'm trying to find out the user for the Spark history UI. On Wed, Aug 15, 2018 at 5:11 PM

Re: Unable to see completed application in Spark 2 history web UI

2018-08-15 Thread Manu Zhang
Hi Fawze, A) The file permission is currently hard-coded to 770 ( https://github.com/apache/spark/blob/branch-2.3/core/src/main/scala/org/apache/spark/scheduler/EventLoggingListener.scala#L287 ). B) I think adding all users (including the UI user) to a group like spark will do. On Wed, Aug 15, 2018 at

Java API for statistics of spark job running on yarn

2018-08-15 Thread Serkan TAS
Hi all, I am facing an issue with a long-running Spark job on YARN. If a bottleneck occurs on HDFS and/or Kafka, the active batch count increases immediately. I am planning to check the active batch count with a Java client and create alarms for the operations group. So, is it possible to
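One hedged option is polling Spark's REST monitoring API from a plain HTTP client, as sketched below; the host, port, application id, and the exact JSON fields returned are assumptions that should be verified against the monitoring documentation for the Spark version in use.

```
import scala.io.Source

object ActiveBatchProbe {
  def main(args: Array[String]): Unit = {
    // Hypothetical UI host and YARN application id.
    val appId = "application_1534300000000_0001"
    val url = s"http://spark-ui-host:4040/api/v1/applications/$appId/streaming/statistics"

    // Fetch the streaming statistics JSON and inspect it for the active/waiting
    // batch counters; an alarm could be raised when the active batch count stays
    // above a threshold for several consecutive polls.
    val json = Source.fromURL(url).mkString
    println(json)
  }
}
```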

spark driver pod stuck in Waiting: PodInitializing state in Kubernetes

2018-08-15 Thread purna pradeep
I'm running a Spark 2.3 job on a Kubernetes cluster. kubectl version Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"clean", BuildDate:"2018-02-09T21:51:06Z", GoVersion:"go1.9.4", Compiler:"gc",

Re: Unable to see completed application in Spark 2 history web UI

2018-08-15 Thread Fawze Abujaber
Hi Manu, Thanks for your response. Yes, I see, but it is still interesting to know how I can see these applications from the Spark history UI. How can I know which user I'm logged in as when I'm navigating the Spark history UI? The Spark process is running with cloudera-scm and the events written

Re: Unable to see completed application in Spark 2 history web UI

2018-08-15 Thread Manu Zhang
Hi Fawze, In Spark 2.3, the HistoryServer checks file permissions when reading event logs written by your applications (please check https://issues.apache.org/jira/browse/SPARK-20172). With file permissions of 770, the HistoryServer is not permitted to read the event log. That's why you were
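For completeness, a small sketch for confirming the symptom on a local event log directory by listing owner, group, and POSIX permissions; the path is a placeholder for spark.eventLog.dir, and for logs stored on HDFS the Hadoop FileSystem API would be needed instead.

```
import java.nio.file.{Files, Paths}
import java.nio.file.attribute.{PosixFileAttributes, PosixFilePermissions}

import scala.collection.JavaConverters._

object EventLogPermissionCheck {
  def main(args: Array[String]): Unit = {
    // Placeholder for spark.eventLog.dir on a local filesystem.
    val dir = Paths.get("/user/spark/applicationHistory")

    Files.newDirectoryStream(dir).asScala.foreach { p =>
      val attrs = Files.readAttributes(p, classOf[PosixFileAttributes])
      val perms = PosixFilePermissions.toString(attrs.permissions())
      // With rwxrwx--- (770), only the owner and members of the group can read
      // the log, so the HistoryServer user must be one of the two.
      println(s"$p owner=${attrs.owner()} group=${attrs.group()} perms=$perms")
    }
  }
}
```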