Reply: Re: Is Spark SQL able to auto update partition stats like hive by setting hive.stats.autogather=true

2020-12-18 Thread 疯狂的哈丘
Thanks, but `hive.stats.autogather` does not work for Spark SQL. - Original Message - From: Mich Talebzadeh To: kongt...@sina.com Cc: user Subject: Re: Is Spark SQL able to auto update partition stats like hive by setting hive.stats.autogather=true Date: 2020-12-19 06:45 Hi, A fellow forum member kindly spotted a

Column-level encryption in Spark SQL

2020-12-18 Thread john washington
Dear Spark team members, Can you please advise whether column-level encryption is available in Spark SQL? I am aware that Hive supports column-level encryption. Appreciate your response. Thanks, John

Re: Is Spark SQL able to auto update partition stats like hive by setting hive.stats.autogather=true

2020-12-18 Thread Mich Talebzadeh
Hi, A fellow forum member kindly spotted a lousy error of mine, where a comma was missing at the line above the red line. This appears to be accepted: spark = SparkSession.builder \ .appName("app1") \ .enableHiveSupport() \ .getOrCreate() # Hive settings settings = [
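For reference, a minimal runnable sketch of the corrected builder chain; since the original message is truncated, the entries in the settings list below are hypothetical examples, not the author's actual configuration:

from pyspark.sql import SparkSession

spark = SparkSession.builder \
    .appName("app1") \
    .enableHiveSupport() \
    .getOrCreate()

# Hive settings: every tuple in the list needs a trailing comma;
# a missing one (as described above) breaks the list literal.
settings = [
    ("hive.exec.dynamic.partition", "true"),            # hypothetical key/value
    ("hive.exec.dynamic.partition.mode", "nonstrict"),  # hypothetical key/value
]
for key, value in settings:
    spark.conf.set(key, value)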

Re: Issue while installing dependencies Python Spark

2020-12-18 Thread Patrick McCarthy
At the risk of repeating myself, this is what I was hoping to avoid when I suggested deploying a full, zipped, conda venv. What is your motivation for running an install process on the nodes and risking the process failing, instead of pushing a validated environment artifact and not having that
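For readers following along, a sketch of the approach described above, assuming conda-pack and a YARN deployment in client mode; environment and file names are illustrative:

# Build and pack a validated environment once, on a build machine
conda create -y -n pyspark_env python=3.8 pandas numpy conda-pack
conda activate pyspark_env
conda pack -f -o pyspark_env.tar.gz

# Ship the artifact with the job; executors unpack it as ./environment
export PYSPARK_DRIVER_PYTHON=python
export PYSPARK_PYTHON=./environment/bin/python
spark-submit --master yarn --archives pyspark_env.tar.gz#environment my_job.py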

Re: Issue while installing dependencies Python Spark

2020-12-18 Thread Sachit Murarka
Hi Patrick/Users, I am exploring wheel files for the packages, as this seems simple:- https://bytes.grubhub.com/managing-dependencies-and-artifacts-in-pyspark-7641aa89ddb7 However, I am facing another issue:- I am using pandas, which needs numpy. Numpy is giving an error! ImportError:
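A likely cause: packages with compiled C extensions, such as numpy, generally cannot be imported from zip archives or wheels placed on the PYTHONPATH via --py-files, which is why shipping the whole interpreter environment tends to work instead. A hedged pip-based sketch of that, using venv-pack (file names illustrative, assuming YARN client mode):

python -m venv pyspark_venv
source pyspark_venv/bin/activate
pip install pandas numpy venv-pack
venv-pack -o pyspark_venv.tar.gz

export PYSPARK_DRIVER_PYTHON=python
export PYSPARK_PYTHON=./environment/bin/python
spark-submit --master yarn --archives pyspark_venv.tar.gz#environment my_job.py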

Re: Is Spark SQL able to auto update partition stats like hive by setting hive.stats.autogather=true

2020-12-18 Thread Mich Talebzadeh
I am afraid it is not supported for Spark SQL; see Automatic Statistics Collection For Better Query Performance | Qubole. I tried it as below: spark = SparkSession.builder \ .appName("app1") \
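Since Spark SQL does not gather statistics automatically on writes, one manual alternative is the ANALYZE TABLE command; a sketch with hypothetical database, table, partition, and column names, assuming Spark 2.4 or later:

# Table-level and partition-level statistics (row count, size in bytes)
spark.sql("ANALYZE TABLE mydb.sales COMPUTE STATISTICS")
spark.sql("ANALYZE TABLE mydb.sales PARTITION (dt='2020-12-18') COMPUTE STATISTICS")

# Column-level statistics for the cost-based optimizer
spark.sql("ANALYZE TABLE mydb.sales COMPUTE STATISTICS FOR COLUMNS id, amount")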

Re: How to submit a job via REST API?

2020-12-18 Thread Daniel de Oliveira Mantovani
In my opinion this should be part of the official documentation. Amazing work, Zhou Yang. On Wed, Nov 25, 2020 at 5:45 AM Zhou Yang wrote: > Hi all, > > I found the solution through the source code. Appending the --conf k-v pairs into > `sparkProperties` works. > For example: > > ./spark-submit \ >
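For context, a sketch of what that looks like against the standalone master's REST submission endpoint, which is not officially documented; host, paths, and values below are placeholders. Each --conf key=value that would be passed to spark-submit becomes an entry under sparkProperties:

POST http://<master-host>:6066/v1/submissions/create

{
  "action": "CreateSubmissionRequest",
  "appResource": "hdfs:///apps/my-app.jar",
  "mainClass": "com.example.Main",
  "clientSparkVersion": "3.0.1",
  "appArgs": [],
  "environmentVariables": { "SPARK_ENV_LOADED": "1" },
  "sparkProperties": {
    "spark.app.name": "my-app",
    "spark.master": "spark://<master-host>:7077",
    "spark.submit.deployMode": "cluster",
    "spark.executor.memory": "2g"
  }
}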

Re: Convert Seq[Any] to Seq[String]

2020-12-18 Thread Vikas Garg
I am getting the table schema through a Map, which I have converted to a Seq and am passing to toDF. On Fri, 18 Dec 2020 at 20:13, Sean Owen wrote: > It's not really a Spark question. .toDF() takes column names. > atrb.head.toSeq.map(_.toString)? But it's not clear what you intend the col > names to be >

Re: Convert Seq[Any] to Seq[String]

2020-12-18 Thread Sean Owen
It's not really a Spark question. .toDF() takes column names. atrb.head.toSeq.map(_.toString)? But it's not clear what you intend the col names to be. On Fri, Dec 18, 2020 at 8:37 AM Vikas Garg wrote: > Hi, > > Can someone please help me how to convert Seq[Any] to Seq[String] > > For line > val df

Convert Seq[Any] to Seq[String]

2020-12-18 Thread Vikas Garg
Hi, Can someone please help me convert Seq[Any] to Seq[String]? For the line val df = row.toSeq.toDF(newCol.toSeq: _*) I get that error message. I converted the Map "val aMap = Map("admit" -> ("description","comments"))" to a Seq: var atrb = ListBuffer[(String,String,String)]() for((key,value) <-
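To round out the thread, a minimal spark-shell sketch of Sean's suggestion, reusing the names from the question; the tuple contents and the sample row are hypothetical:

import scala.collection.mutable.ListBuffer
import spark.implicits._  // spark: SparkSession, predefined in spark-shell

val aMap = Map("admit" -> ("description", "comments"))
var atrb = ListBuffer[(String, String, String)]()
for ((key, value) <- aMap)
  atrb += ((key, value._1, value._2))

// toDF takes column names as String*, so a Seq[Any] must be
// converted element by element with .toString first.
val newCol: Seq[Any] = atrb.head.productIterator.toSeq
val colNames: Seq[String] = newCol.map(_.toString)

// Hypothetical single row matching the three column names.
val df = Seq(("a", "b", "c")).toDF(colNames: _*)
df.show()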