Re: Spark Cluster over yarn cluster monitoring
Could someone please help me understand this better?

On Thu, Oct 17, 2019 at 7:41 PM Chetan Khatri wrote:
> Hi Users,
>
> I submit *X* number of jobs with Airflow to YARN as part of a workflow
> for customer *Y*. I could potentially run the workflow for customer *Z*,
> but I need to check how many resources are available on the cluster
> before the next customer's jobs start.
>
> Could you please tell me the best way to handle this? Currently, I just
> check availableMB > 100 and then trigger the next Airflow DAG over YARN:
>
> GET http://rm-http-address:port/ws/v1/cluster/metrics
>
> Thanks.
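A minimal sketch of such a capacity gate, assuming the ResourceManager metrics endpoint quoted above; the host, port, threshold, and the regex-based JSON extraction are placeholders, not a recommended production setup:

import scala.io.Source

// Poll the YARN ResourceManager metrics endpoint and gate the next
// workflow on available memory. The host/port and the 100 MB threshold
// mirror the thread and should be replaced with real values.
object YarnCapacityGate {
  val metricsUrl = "http://rm-http-address:8088/ws/v1/cluster/metrics" // assumed RM web port
  val minAvailableMB = 100L

  // Crude extraction to avoid a JSON-library dependency; a real job
  // would use a proper parser such as jackson or json4s.
  def availableMB(json: String): Option[Long] =
    """"availableMB"\s*:\s*(\d+)""".r.findFirstMatchIn(json).map(_.group(1).toLong)

  def clusterHasCapacity(): Boolean = {
    val src = Source.fromURL(metricsUrl)
    try availableMB(src.mkString).exists(_ > minAvailableMB)
    finally src.close()
  }

  def main(args: Array[String]): Unit =
    if (clusterHasCapacity()) println("capacity available: trigger next DAG")
    else println("cluster busy: defer next DAG")
}

Note that availableMB alone says nothing about CPU; the same metrics response also carries availableVirtualCores, which is worth checking before scheduling the next customer's jobs.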
Re: Spark - configuration setting doesn't work
Could someone please help me.

On Thu, Oct 17, 2019 at 7:29 PM Chetan Khatri wrote:
> Hi Users,
>
> I am setting Spark configuration in the following way:
>
> val spark = SparkSession.builder().appName(APP_NAME).getOrCreate()
>
> spark.conf.set("spark.speculation", "false")
> spark.conf.set("spark.broadcast.compress", "true")
> spark.conf.set("spark.sql.broadcastTimeout", "36000")
> spark.conf.set("spark.network.timeout", "2500s")
> spark.conf.set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
> spark.conf.set("spark.driver.memory", "10g")
> spark.conf.set("spark.executor.memory", "10g")
>
> import spark.implicits._
>
> and submitting the Spark job with spark-submit, but none of the above
> configurations are reflected in the job; I have checked the Spark UI.
>
> I know that setting them like this while creating the Spark session works:
>
> val spark = SparkSession.builder().appName(APP_NAME)
>   .config("spark.network.timeout", "1500s")
>   .config("spark.broadcast.compress", "true")
>   .config("spark.sql.broadcastTimeout", "36000")
>   .getOrCreate()
>
> import spark.implicits._
>
> Can someone please throw light on this?
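The usual explanation for this: spark.conf.set() on an existing session only affects runtime SQL settings. Properties such as spark.serializer, spark.driver.memory, and spark.executor.memory are read when the driver and executors start, so setting them after getOrCreate() is silently ignored. A minimal sketch of the working pattern (APP_NAME stands in for a real application name):

import org.apache.spark.sql.SparkSession

// Static properties must be supplied before the SparkContext exists.
// Caveat: in client mode spark.driver.memory cannot be set even here,
// because the driver JVM is already running; it belongs on the
// spark-submit line (--driver-memory 10g) or in spark-defaults.conf.
val spark = SparkSession.builder()
  .appName("APP_NAME") // placeholder
  .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .config("spark.executor.memory", "10g")
  .config("spark.network.timeout", "2500s")
  .config("spark.sql.broadcastTimeout", "36000")
  .getOrCreate()

// Runtime SQL settings can still be changed on the live session:
spark.conf.set("spark.sql.shuffle.partitions", "400")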
Re: issue with regexp_replace
Hi Amit,

I am able to run this test without any issue:

test("string translate with escape characters") {
  val df = Seq(("ab\\\"ab", "")).toDF("a", "b")
  checkAnswer(df.select(translate($"a", "\\\"", "\"")), Row("ab\"ab"))
  checkAnswer(df.selectExpr("""translate(a, "\\\"", "\"")"""), Row("ab\"ab"))
  df.createOrReplaceTempView("table")
  checkAnswer(spark.sql("select translate(a,'\"','\"') from table"), Row("ab\"ab"))
}

This basically converts the value ab\\\"ab to ab\"ab. I need the exact
query where you are facing the issue; that might help in debugging further.
Do point out if I am mistaken.

Thanks

On Sat, Oct 26, 2019 at 4:47 PM amit kumar singh wrote:
> Hi Team,
>
> I am trying to use regexp_replace in Spark SQL and it is throwing an error:
>
> expected , but found Scalar
> in 'reader', line 9, column 45:
> ... select translate(payload, '"', '"') as payload
>
> I am trying to replace \\\" with "
issue with regexp_replace
Hi Team,

I am trying to use regexp_replace in Spark SQL and it is throwing an error:

expected , but found Scalar
in 'reader', line 9, column 45:
... select translate(payload, '"', '"') as payload

I am trying to replace \\\" with "
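For anyone hitting the same trace: the "expected , but found Scalar ... in 'reader'" message comes from a YAML parser, not from Spark, so the quotes in the SQL string likely need escaping (or block-scalar quoting) at the config layer before Spark ever sees the query. For the replacement itself, a minimal sketch with regexp_replace, assuming a string column named payload (session, column, and app names here are illustrative):

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.regexp_replace

// Turn the escaped-quote sequence \" into a plain " character.
val spark = SparkSession.builder()
  .appName("regexp-replace-demo") // placeholder
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

val df = Seq("""ab\"ab""").toDF("payload")

// The pattern argument is a Java regex: "\\\\\"" in Scala source is the
// regex \\" , i.e. a literal backslash followed by a double quote.
df.select(regexp_replace($"payload", "\\\\\"", "\"").as("payload"))
  .show(false) // prints: ab"ab

Unlike translate, which maps single characters one-for-one, regexp_replace can match the two-character sequence \" as a unit, which is what this thread is after.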