Re: A question about rdd bytes size

2019-12-01 Thread Wenchen Fan
When we talk about bytes size, we need to specify how the data is stored. For example, if we cache the DataFrame, then the bytes size is the number of bytes of the in-memory binary format of the table cache. If we write to Hive tables, then the bytes size is the total size of the table's data files.
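Put differently, the two numbers measure different representations of the same data. A rough sketch of how to look at each one, assuming a Hive-backed table with the hypothetical name db.events and a SparkSession built with Hive support:

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder().enableHiveSupport().getOrCreate()
  import spark.implicits._

  // (a) Size of the data files on disk, as recorded in the metastore after ANALYZE;
  //     this is the number that spark.sql.statistics.totalSize reflects.
  spark.sql("ANALYZE TABLE db.events COMPUTE STATISTICS")
  spark.sql("DESCRIBE EXTENDED db.events")
    .filter($"col_name" === "Statistics")
    .show(false)

  // (b) Size of the in-memory binary (columnar) cache of the same data.
  spark.table("db.events").cache().count()            // materialize the cache
  val cachedPlan = spark.table("db.events").queryExecution.optimizedPlan
  println(cachedPlan.stats.sizeInBytes)               // bytes of the cached representation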

ScaledML 2020 Spark Speakers and Promo

2019-12-01 Thread Reza Zadeh
Spark Users, You are all welcome to join us at ScaledML 2020: http://scaledml.org A very steep discount is available for this list, using this link. We'd love to see you there. Best, Reza

connectivity

2019-12-01 Thread Krishna Chandran Nair
Hi Team, Can anyone provide sample code for connecting to ADLS on Azure using Azure Key Vault (user-managed key)?
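Lacking a tested end-to-end sample, here is a rough sketch of one way to do it with the hadoop-azure (ABFS) connector plus the azure-identity and azure-security-keyvault-secrets SDKs; every account, tenant, vault, container and secret name below is a placeholder, and the corresponding jars must be on the Spark classpath:

  import com.azure.identity.DefaultAzureCredentialBuilder
  import com.azure.security.keyvault.secrets.SecretClientBuilder
  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder().appName("adls-keyvault-sketch").getOrCreate()

  // 1) Fetch the service-principal client secret from Azure Key Vault.
  val secretClient = new SecretClientBuilder()
    .vaultUrl("https://<your-key-vault>.vault.azure.net")
    .credential(new DefaultAzureCredentialBuilder().build())
    .buildClient()
  val clientSecret = secretClient.getSecret("<sp-client-secret-name>").getValue

  // 2) Point the ABFS connector at that service principal for the storage account.
  val account = "<storage-account>"
  val conf = spark.sparkContext.hadoopConfiguration
  conf.set(s"fs.azure.account.auth.type.$account.dfs.core.windows.net", "OAuth")
  conf.set(s"fs.azure.account.oauth.provider.type.$account.dfs.core.windows.net",
    "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider")
  conf.set(s"fs.azure.account.oauth2.client.id.$account.dfs.core.windows.net", "<sp-client-id>")
  conf.set(s"fs.azure.account.oauth2.client.secret.$account.dfs.core.windows.net", clientSecret)
  conf.set(s"fs.azure.account.oauth2.client.endpoint.$account.dfs.core.windows.net",
    "https://login.microsoftonline.com/<tenant-id>/oauth2/token")

  // 3) Read through the abfss:// scheme.
  val df = spark.read.parquet(s"abfss://<container>@$account.dfs.core.windows.net/path/to/data")
  df.show()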

A question about rdd bytes size

2019-12-01 Thread zhangliyun
Hi: I want to get the total bytes of a DataFrame with the following function, but when I insert the DataFrame into Hive, I find the value returned by the function differs from spark.sql.statistics.totalSize. The spark.sql.statistics.totalSize is less than the result of the following function
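The function referenced above is cut off, but a common way to get such a number is the optimizer's size estimate, sketched below with a throwaway table name (default.size_demo); the estimate is computed from the plan, while spark.sql.statistics.totalSize records the size of the compressed files actually written, so the two are expected to differ:

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder().enableHiveSupport().getOrCreate()

  val df = spark.range(0, 1000000).toDF("id")

  // Optimizer-side estimate of the plan's output size, in bytes.
  println(df.queryExecution.optimizedPlan.stats.sizeInBytes)

  // After writing to a Hive table, the metastore records the size of the
  // written files (compressed, columnar for Parquet/ORC), which is what
  // spark.sql.statistics.totalSize reports.
  df.write.mode("overwrite").saveAsTable("default.size_demo")
  spark.sql("ANALYZE TABLE default.size_demo COMPUTE STATISTICS")
  spark.sql("DESCRIBE EXTENDED default.size_demo")
    .filter("col_name = 'Statistics'")
    .show(false)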

Re: [Spark SQL]: Is the namespace name always needed in a query for tables from a user-defined catalog plugin

2019-12-01 Thread xufei
Thanks, Terry. Glad to know that it is not the expected behavior. Terry Kim wrote on Mon, Dec 2, 2019, 11:51 AM: > Hi Xufei, > I also noticed the same while looking into relation resolution behavior > (See Appendix A in this doc >

Re: [Spark SQL]: Is the namespace name always needed in a query for tables from a user defined catalog plugin

2019-12-01 Thread Terry Kim
Hi Xufei, I also noticed the same while looking into relation resolution behavior (See Appendix A in this doc). I created SPARK-30094 and will

[Spark SQL]: Is the namespace name always needed in a query for tables from a user defined catalog plugin

2019-12-01 Thread xufei
Hi, I'm trying to write a catalog plugin based on spark-3.0-preview, and I found that even when I use 'USE catalog.namespace' to set the current catalog and namespace, I still need to use the qualified name in the query. For example, I add a catalog named 'example_catalog', and there is a database named 'test'
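A minimal sketch of the setup being described (the catalog implementation class and table name are placeholders; the plugin class just needs to implement the TableCatalog API from 3.0-preview):

  import org.apache.spark.sql.SparkSession

  val spark = SparkSession.builder()
    .config("spark.sql.catalog.example_catalog", "com.example.ExampleCatalog")  // hypothetical plugin class
    .getOrCreate()

  // Switch the current catalog and namespace.
  spark.sql("USE example_catalog.test")

  // Behavior reported in this thread (later tracked as SPARK-30094): the
  // namespace-qualified name resolves, while the unqualified one does not.
  spark.sql("SELECT * FROM test.my_table").show()   // works
  spark.sql("SELECT * FROM my_table").show()        // reportedly fails to resolve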