Re: java.lang.UnsupportedOperationException: No Encoder found for Set[String]

2018-08-16 Thread Manu Zhang
You may try applying this PR: https://github.com/apache/spark/pull/18416.

Re: java.lang.UnsupportedOperationException: No Encoder found for Set[String]

2018-08-16 Thread Venkat Dabri
We are using Spark 2.2.0. Is it possible to bring the ExpressionEncoder from 2.3.0 and related classes into my code base and use them? I see the changes in ExpressionEncoder between 2.3.0 and 2.2.0 are not large, but there might be many other classes underneath that have changed.

Re: Pass config file through spark-submit

2018-08-16 Thread yujhe.li
So can you read the file on the executor side? I think a file passed with --files my.app.conf is added to the classpath, so you can use it directly.

Re: Unable to see completed application in Spark 2 history web UI

2018-08-16 Thread Manu Zhang
Hi Fawze, sorry, but I'm not familiar with CM. Maybe you can look into the logs (or turn on DEBUG logging).

something happened to MemoryStream after spark 2.3

2018-08-16 Thread Koert Kuipers
Hi, we just started testing internally with Spark 2.4 snapshots, and it seems our streaming tests are broken. I believe it has to do with MemoryStream. Before, we were able to create a MemoryStream, add data to it, convert it to a streaming unbounded DataFrame, and use it repeatedly. By using it
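The pre-2.4 pattern described above can be sketched as follows (a minimal sketch against the Spark 2.3 API; note that MemoryStream lives in the internal org.apache.spark.sql.execution.streaming package, so it carries no compatibility guarantees across releases — which is consistent with it breaking in 2.4 snapshots):

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.execution.streaming.MemoryStream

object MemoryStreamSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[*]")
      .appName("memorystream-sketch")
      .getOrCreate()
    import spark.implicits._
    implicit val sqlCtx = spark.sqlContext

    // Create a MemoryStream, feed it data, and treat the result as an
    // unbounded streaming DataFrame.
    val stream = MemoryStream[Int]
    stream.addData(1, 2, 3)
    val df = stream.toDF() // df.isStreaming == true

    // Drain it into an in-memory table to inspect the results.
    val query = df.writeStream
      .format("memory")
      .queryName("numbers")
      .outputMode("append")
      .start()
    query.processAllAvailable()

    println(spark.sql("select count(*) from numbers").first().getLong(0)) // 3
    spark.stop()
  }
}
```

The table name "numbers" and the data values are hypothetical placeholders; the shape of the pattern (create, addData, toDF, reuse) is the part taken from the message above.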

Re: [K8S] Spark initContainer custom bootstrap support for Spark master

2018-08-16 Thread Li Gao
Thanks! We will likely use the second option to customize the bootstrap.

Re: [K8S] Spark initContainer custom bootstrap support for Spark master

2018-08-16 Thread Yinan Li
Yes, the init-container has been removed in the master branch. The init-container was used in 2.3.x only for downloading remote dependencies, which is now handled by running spark-submit in the driver. If you need to run custom bootstrap scripts using an init-container, the best option would be to

Re: java.lang.IndexOutOfBoundsException: len is negative - when data size increases

2018-08-16 Thread Vadim Semenov
One of the spills becomes bigger than 2 GiB and can't be loaded fully (arrays in Java can't have more than 2^31 - 1 elements): org.apache.spark.util.collection.unsafe.sort.UnsafeSorterSpillReader.loadNext(UnsafeSorterSpillReader.java:76). You can try increasing the number of partitions, so
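The advice above can be sketched as a tuning change (the partition counts here are hypothetical; the right number depends on your data volume). With more partitions, each task handles less data, so each spill file shrinks and stays under the 2 GiB array limit:

```scala
import org.apache.spark.sql.SparkSession

// Raise the shuffle parallelism for SQL/DataFrame jobs
// (the default spark.sql.shuffle.partitions is 200).
val spark = SparkSession.builder
  .appName("more-partitions")
  .config("spark.sql.shuffle.partitions", "2000")
  .getOrCreate()

// For RDD-based jobs, raise parallelism at the shuffle itself, e.g.:
//   rdd.reduceByKey(_ + _, numPartitions = 2000)
// or repartition an existing dataset before the heavy stage:
//   df.repartition(2000)
```

A quick sanity check on the arithmetic: 4 GiB of shuffle data over 200 partitions is ~20 MiB per partition already, but a skewed key can concentrate far more in one partition; more partitions plus de-skewing (e.g. salting the key) is what actually keeps each spill small.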

java.lang.IndexOutOfBoundsException: len is negative - when data size increases

2018-08-16 Thread Deepak Sharma
Hi All, I am running a Spark-based ETL on Spark 1.6 and facing this weird issue. The same code with the same properties/configuration runs fine in other environments, e.g. PROD, but never completes in CAT. The only difference is the size of the data being processed, and that only by 1-2 GB. This is the

[Spark Streaming] [ML]: Exception handling for the transform method of Spark ML pipeline model

2018-08-16 Thread sudododo
Hi, I'm implementing a Spark Streaming + ML application. The data arrives in a Kafka topic in JSON format. The Spark Kafka connector reads the data from the Kafka topic as a DStream. After several preprocessing steps, the input DStream is transformed into a feature DStream, which is fed into Spark
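One common way to keep a bad micro-batch from killing the whole streaming job is to wrap the model application in a Try and divert failures. This is a minimal sketch, not the poster's code; the helper name safeTransform is hypothetical, and note that transform itself is lazy, so in practice the action on the scored DataFrame must be inside the same guard for the exception to be caught here:

```scala
import scala.util.{Failure, Success, Try}
import org.apache.spark.ml.PipelineModel
import org.apache.spark.sql.DataFrame

// Hypothetical helper: apply the fitted pipeline to one micro-batch and
// skip the batch instead of failing the streaming query.
def safeTransform(model: PipelineModel, batch: DataFrame): Option[DataFrame] =
  Try {
    val scored = model.transform(batch)
    scored.cache()
    scored.count() // force evaluation so errors surface here, not later
    scored
  } match {
    case Success(scored) => Some(scored)
    case Failure(e) =>
      // Log and drop the batch; a real job might divert it to a
      // dead-letter topic for later inspection instead.
      println(s"transform failed for this batch: ${e.getMessage}")
      None
  }
```

For row-level failures (e.g. one malformed record rather than a whole bad batch), filtering/validating the feature DStream before transform is usually more robust than batch-level catch-and-skip.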

Pass config file through spark-submit

2018-08-16 Thread James Starks
I have a config file, using the Typesafe Config library, located on the local file system, and I want to submit that file through spark-submit so that the Spark program can read customized parameters. For instance: my.app { db { host = domain.cc port = 1234 db = dbname user =
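A minimal sketch of one way to do this, assuming the file and key names from the question (my.app.conf, my.app.db.*): ship the file with --files, then load it by name from the working directory, e.g. spark-submit --files my.app.conf --class my.App my-app.jar.

```scala
import java.io.File
import com.typesafe.config.ConfigFactory

// Files passed via --files are copied into the working directory of the
// driver (in cluster mode) and the executors, so they can be opened by
// their bare file name there.
val conf = ConfigFactory.parseFile(new File("my.app.conf"))

val host = conf.getString("my.app.db.host")
val port = conf.getInt("my.app.db.port")
val db   = conf.getString("my.app.db.db")
```

Alternatively, Typesafe Config's own loader can be pointed at the file with --conf "spark.driver.extraJavaOptions=-Dconfig.file=my.app.conf", after which a plain ConfigFactory.load() picks it up; which variant works depends on deploy mode (in client mode the driver reads the original local path, not the shipped copy).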

Re: java.lang.UnsupportedOperationException: No Encoder found for Set[String]

2018-08-16 Thread Manu Zhang
Hi, it was added in Spark 2.3.0: https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/SQLImplicits.scala#L180 Regards, Manu Zhang. On Thu, Aug 16, 2018 at 9:59 AM V0lleyBallJunki3 wrote: Hello, I am using Spark 2.2.2 with Scala 2.11.8. I wrote a short
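On 2.2.x, where the built-in Set encoder from the linked SQLImplicits does not exist yet, one common workaround is to register a Kryo-based encoder explicitly. A minimal sketch (the dataset contents are hypothetical; note Kryo encodes the set as an opaque binary column, so you lose columnar pushdown on it):

```scala
import org.apache.spark.sql.{Encoder, Encoders, SparkSession}

object SetEncoderWorkaround {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder
      .master("local[*]")
      .appName("set-encoder-workaround")
      .getOrCreate()
    import spark.implicits._

    // Spark 2.2.x has no implicit Encoder[Set[String]], so supply one.
    implicit val setEncoder: Encoder[Set[String]] = Encoders.kryo[Set[String]]

    val ds = Seq(Set("a", "b"), Set("c")).toDS()
    println(ds.collect().map(_.size).sum) // 3
    spark.stop()
  }
}
```

The other common workaround is to model the field as Seq[String] (which 2.2 can encode natively) and convert to a Set at the edges, which keeps the column queryable.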

Re: spark driver pod stuck in Waiting: PodInitializing state in Kubernetes

2018-08-16 Thread purna pradeep
Hello, I'm running a Spark 2.3 job on a Kubernetes cluster. kubectl version reports: Client Version: version.Info{Major:"1", Minor:"9", GitVersion:"v1.9.3", GitCommit:"d2835416544f298c919e2ead3be3d0864b52323b", GitTreeState:"clean", BuildDate:"2018-02-09T21:51:06Z", GoVersion:"go1.9.4",

Re: Structured streaming: Tried to fetch $offset but the returned record offset was ${record.offset}"

2018-08-16 Thread andreas . weise
On 2018/04/17 22:34:25, Cody Koeninger wrote: Is this possibly related to the recent post on https://issues.apache.org/jira/browse/SPARK-18057 ? On Mon, Apr 16, 2018 at 11:57 AM, ARAVIND SETHURATHNAM wrote: Hi, We have several

Re: Unable to see completed application in Spark 2 history web UI

2018-08-16 Thread Fawze Abujaber
Hi Manu, I'm using Cloudera Manager with single user mode, and every process is running as the cloudera-scm user; cloudera-scm is a superuser, which is why I was confused about how it worked in Spark 1.6 but not in Spark 2.3.

Re: from_json function

2018-08-16 Thread dbolshak
Maxim, thanks for your reply. I've left a comment in the following JIRA issue: https://issues.apache.org/jira/browse/SPARK-23194?focusedCommentId=16582025&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#comment-16582025