How to use disk instead of just InMemoryRelation when using the JDBC data source in Spark SQL?

2018-04-10 Thread Louis Hust
We want to extract data from MySQL and run the calculation in Spark SQL. The SQL explain output looks like this:
== Parsed Logical Plan ==
'Sort ['revenue DESC NULLS LAST], true
+- 'Aggregate ['n_name], ['n_name, 'SUM(('l_extendedprice * (1 - 'l_discount))) AS revenue#329]
   +- 'Filter ('c_custkey =

cache OS memory and spark usage of it

2018-04-10 Thread José Raúl Pérez Rodríguez
Hi, When I issue a "free -m" command on a host, I see a lot of memory used by the OS for cache. However, Spark Streaming is not able to request that memory for its own use, and execution fails because it cannot launch executors. What I understand of the OS memory cache (the one in

Re: Accessing Hive Tables in Spark

2018-04-10 Thread Dr. Kent Yao
Applying the fix in https://github.com/apache/spark/pull/19663, or passing --files or --jars /local/path/to/hive-site.xml, may work. Thanks, Kent

Re: Accessing Hive Tables in Spark

2018-04-10 Thread Marco Gaido
Hi Tushar, It seems Spark is not able to access the metastore. This may be because you are using a Derby metastore, which is maintained locally. Please check all your configurations and make sure Spark has access to the hive-site.xml file containing the metastore URI. Thanks, Marco

Accessing Hive Tables in Spark

2018-04-10 Thread Tushar Singhal
Hi Everyone, I was accessing Hive tables in Spark SQL from a Scala application submitted with the spark-submit command. When I ran it in cluster mode, I got an error like: Table not found. But the same application works when submitted in client mode. Please help me understand why. Distribution: Hortonworks. Thanks in