Re: Spark Cluster over yarn cluster monitoring

2019-10-26 Thread Chetan Khatri
Could someone please help me understand this better?

On Thu, Oct 17, 2019 at 7:41 PM Chetan Khatri 
wrote:

> Hi Users,
>
> I submit *X* jobs to Yarn with Airflow as part of a workflow for customer
> *Y*. I could potentially run the workflow for customer *Z* as well, but I
> first need to check how many resources are available on the cluster so that
> the next customer's jobs can start.
>
> Could you please tell me the best way to handle this? Currently, I am just
> checking whether availableMB > 100 and then triggering the next Airflow DAG
> on Yarn.
>
> GET http://rm-http-address:port/ws/v1/cluster/metrics
>
> Thanks.
>
>
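
For reference, a minimal sketch of that check, assuming the ResourceManager
metrics endpoint shown above; the hostname, the 8088 default web port, the
100 MB threshold and the DAG-trigger step are placeholders to adapt:

import scala.io.Source

// Placeholder ResourceManager address; 8088 is the default RM web port.
val metricsUrl = "http://rm-http-address:8088/ws/v1/cluster/metrics"
val minAvailableMb = 100L

// Fetch the cluster metrics JSON from the ResourceManager REST API.
val json = Source.fromURL(metricsUrl).mkString

// Pull availableMB out of the response with a simple regex
// (a real implementation would use a JSON library).
val availableMb = """"availableMB"\s*:\s*(\d+)""".r
  .findFirstMatchIn(json)
  .map(_.group(1).toLong)
  .getOrElse(0L)

if (availableMb > minAvailableMb) {
  // Enough headroom: trigger the next customer's Airflow DAG here,
  // e.g. via Airflow's REST API or CLI.
  println(s"availableMB=$availableMb -> trigger next DAG")
} else {
  println(s"availableMB=$availableMb -> wait before starting the next workflow")
}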


Re: Spark - configuration setting doesn't work

2019-10-26 Thread Chetan Khatri
Could someone please help me?

On Thu, Oct 17, 2019 at 7:29 PM Chetan Khatri 
wrote:

> Hi Users,
>
> I am setting the Spark configuration in the following way:
>
> val spark = SparkSession.builder().appName(APP_NAME).getOrCreate()
>
> spark.conf.set("spark.speculation", "false")
> spark.conf.set("spark.broadcast.compress", "true")
> spark.conf.set("spark.sql.broadcastTimeout", "36000")
> spark.conf.set("spark.network.timeout", "2500s")
> spark.conf.set("spark.serializer", 
> "org.apache.spark.serializer.KryoSerializer")
> spark.conf.set("spark.driver.memory", "10g")
> spark.conf.set("spark.executor.memory", "10g")
>
> import spark.implicits._
>
>
> and submitting the Spark job with spark-submit, but none of the above
> configuration is reflected in the job; I have checked the Spark UI.
>
> I know that setting these while creating the Spark session, as below, works
> well:
>
>
> val spark = SparkSession.builder().appName(APP_NAME)
>   .config("spark.network.timeout", "1500s")
>   .config("spark.broadcast.compress", "true")
>   .config("spark.sql.broadcastTimeout", "36000")
>   .getOrCreate()
>
> import spark.implicits._
>
>
> Can someone please shed some light on this?
>
>
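
For reference, a minimal sketch of the builder-based approach with the same
properties (values illustrative). Note that JVM-level settings such as
spark.driver.memory and spark.executor.memory generally have to be supplied to
spark-submit itself (for example --driver-memory / --executor-memory or --conf),
because the driver JVM is already running by the time application code executes:

import org.apache.spark.sql.SparkSession

// All properties supplied before getOrCreate(); values are illustrative.
val spark = SparkSession.builder()
  .appName(APP_NAME)
  .config("spark.speculation", "false")
  .config("spark.broadcast.compress", "true")
  .config("spark.sql.broadcastTimeout", "36000")
  .config("spark.network.timeout", "2500s")
  .config("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .getOrCreate()

// Inspect what the running application actually picked up.
spark.sparkContext.getConf.getAll.sorted.foreach(println)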


Re: issue with regexp_replace

2019-10-26 Thread Aniket Khandelwal
Hi Amit,

I am able to run this test without any issue.

test("string translate with escape characters") {
  val df = Seq(("ab\\\"ab", "")).toDF("a", "b")
  checkAnswer(df.select(translate($"a", "\\\"", "\"")), Row("ab\"ab"))
  checkAnswer(df.selectExpr("""translate(a, "\\\"", "\"")"""), Row("ab\"ab"))
  df.createOrReplaceTempView("table")
  checkAnswer(spark.sql("select translate(a,'\"','\"') from table"), Row("ab\"ab"))
}

This basically converts the value ab\\\"ab to ab\"ab.

Could you share the exact query where you are facing the issue? That would help debug this further.

Do point out if I am mistaken.


Thanks
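
Since the thread subject mentions regexp_replace, here is a minimal sketch of
the same replacement with regexp_replace. It assumes the column is named
payload and that the data really contains a literal backslash before each
double quote; the app name and local master are placeholders:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.regexp_replace

val spark = SparkSession.builder().appName("payload-clean").master("local[*]").getOrCreate()
import spark.implicits._

// Sample row containing a literal backslash before the quote: ab\"ab
val df = Seq("ab\\\"ab").toDF("payload")

// The regex \\" (written "\\\\\"" in Scala) matches a backslash followed by a
// double quote and replaces the pair with a plain double quote, giving ab"ab.
df.select(regexp_replace($"payload", "\\\\\"", "\"").as("payload")).show(false)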


On Sat, Oct 26, 2019 at 4:47 PM amit kumar singh 
wrote:

> Hi Team,
>
>
> I am trying to use regexp_replace in Spark SQL, but it throws an error:
>
> expected , but found Scalar
>  in 'reader', line 9, column 45:
>  ... select translate(payload, '"', '"') as payload
>
>
> I am trying to replace every occurrence of \\\" with ".
>


issue with regexp_replace

2019-10-26 Thread amit kumar singh
Hi Team,


I am trying to use regexp_replace in Spark SQL, but it throws an error:

expected , but found Scalar
 in 'reader', line 9, column 45:
 ... select translate(payload, '"', '"') as payload


I am trying to replace every occurrence of \\\" with ".