Nothing about Spark depends on a cluster. The Hadoop client libraries are
required because they are part of Spark's API, but there is no need to remove
them if you aren't using YARN. Indeed you can't remove them, but they're just
libraries.
On Sun, Nov 12, 2017, 9:36 PM wrote:
> @Jörn Spark without Hadoop is
Within a CI/CD pipeline I use MiniDFSCluster and MiniYarnCluster if the
production cluster also has HDFS and YARN. This has proven extremely useful
and has caught a lot of errors before they reached the cluster (i.e. it saves
a lot of money).
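The MiniDFSCluster approach above can be sketched roughly as follows. This is only an illustration, assuming the `hadoop-minicluster` test artifact is on the classpath; the path name is made up for the example:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path
import org.apache.hadoop.hdfs.MiniDFSCluster

object MiniDfsSmokeTest {
  def main(args: Array[String]): Unit = {
    // Spin up a single-node, in-process HDFS for the duration of the test.
    val conf = new Configuration()
    val cluster = new MiniDFSCluster.Builder(conf).numDataNodes(1).build()
    try {
      val fs = cluster.getFileSystem
      // Write and read back a file, as a stand-in for the code under test.
      val path = new Path("/tmp/smoke-test")
      fs.create(path).close()
      assert(fs.exists(path))
    } finally {
      cluster.shutdown()
    }
  }
}
```

The same pattern applies to MiniYarnCluster for jobs that will run on YARN in production.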
Cf.
@Jörn Spark without Hadoop is useful
- For using Spark's programming model on a single beefy instance
- For testing and integration within a CI/CD pipeline.
It's ugly to have tests that depend on a cluster running somewhere.
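The single-instance case above needs nothing beyond the Spark jars themselves: a local master runs the driver and executors in one JVM, with no cluster manager involved. A minimal sketch (the app name is arbitrary):

```scala
import org.apache.spark.sql.SparkSession

object LocalModeExample {
  def main(args: Array[String]): Unit = {
    // "local[*]" runs Spark in-process using all available cores;
    // no cluster, no YARN, no HDFS required.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("single-node-example")
      .getOrCreate()
    import spark.implicits._

    val df = Seq((1, "a"), (2, "b"), (3, "c")).toDF("id", "label")
    assert(df.count() == 3)
    spark.stop()
  }
}
```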
On Sun, 12 Nov 2017 at 17:17 Jörn Franke
Bago,
Finally I am able to create one which fails consistently. I think the issue
is caused by the VectorAssembler in the model. In the new code, I have 2
features (1 text and 1 number), and I have to run them through a
VectorAssembler before passing them to LogisticRegression. Code and test
data below.
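The actual code and data are attached elsewhere in the thread, so here is only a hedged sketch of the shape being described: one text and one numeric feature, where the text column first has to be turned into a vector (VectorAssembler itself accepts only numeric, boolean, and vector columns), then both are assembled and fed to LogisticRegression. The column names are invented for the example:

```scala
import org.apache.spark.ml.Pipeline
import org.apache.spark.ml.classification.LogisticRegression
import org.apache.spark.ml.feature.{HashingTF, Tokenizer, VectorAssembler}

// Hypothetical schema: (text: String, amount: Double, label: Double)
val tokenizer = new Tokenizer()
  .setInputCol("text").setOutputCol("words")
val hashingTF = new HashingTF()
  .setInputCol("words").setOutputCol("textVec").setNumFeatures(1000)

// Combine the vectorized text with the raw numeric feature.
val assembler = new VectorAssembler()
  .setInputCols(Array("textVec", "amount"))
  .setOutputCol("features")

val lr = new LogisticRegression()
  .setLabelCol("label").setFeaturesCol("features")

val pipeline = new Pipeline()
  .setStages(Array(tokenizer, hashingTF, assembler, lr))
// val model = pipeline.fit(trainingDF)  // trainingDF: the poster's DataFrame
```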
hey all, i'm finally back from vacation this week and will be following up
once i whittle down my inbox.
in summation: jenkins worker upgrades will be happening. the biggest one
is the move to ubuntu... we need containerized builds for this, but i
don't have the cycles to really do all of this
Hello All,
I have Spark Dataframe with timestamp from 2015-10-07 19:36:59 to
2017-01-01 18:53:23
If I want to split this Dataframe into 3 parts, I wrote the code below to
split it. Can anyone please confirm whether this is the correct approach?
val finalDF1 =
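The `val finalDF1 = ...` line is truncated in the message, but a common way to split on a timestamp column into three contiguous ranges is plain filters on boundary timestamps. A hedged sketch, assuming the column is named `ts` and with cut points chosen arbitrarily for illustration:

```scala
import org.apache.spark.sql.DataFrame
import org.apache.spark.sql.functions.col

// df has a timestamp column "ts" spanning 2015-10-07 .. 2017-01-01.
// The two cut points below are illustrative, not computed from the data.
def splitByTime(df: DataFrame): (DataFrame, DataFrame, DataFrame) = {
  val cut1 = "2016-03-01 00:00:00"
  val cut2 = "2016-08-01 00:00:00"
  val part1 = df.filter(col("ts") < cut1)
  val part2 = df.filter(col("ts") >= cut1 && col("ts") < cut2)
  val part3 = df.filter(col("ts") >= cut2)
  (part1, part2, part3)
}
```

If the goal is three roughly equal-sized parts rather than specific time ranges, `df.randomSplit(Array(1.0, 1.0, 1.0))` is the usual tool instead.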
Why do you even mind?
> On 11. Nov 2017, at 18:42, Cristian Lorenzetto
> wrote:
>
> Considering the case where I don't need HDFS, is there a way to remove
> Hadoop completely from Spark?
> Is YARN the only dependency in Spark?
> is there no java or scala (jdk