Re: How to run large Hive queries in PySpark 1.2.1

2016-05-26 Thread Nikolay Voronchikhin
Hi Jörn, We will be upgrading to MapR 5.1, Hive 1.2, and Spark 1.6.1 at the end of June. In the meantime, can this still be done with these versions? This is not a firewall issue, since the edge nodes and cluster nodes are hosted in the same location with the same NFS mount.
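
For the post-upgrade stack (Spark 1.6.x) the same HiveContext pattern should still apply; a minimal sketch, assuming submission from the edge node in yarn-client mode and that the cluster's hive-site.xml is present in the Spark conf directory. The table and column names are hypothetical placeholders, not anything from the original thread.

```python
# Sketch for the planned Spark 1.6.x / Hive 1.2 stack, run from the edge node.
# Assumes hive-site.xml is in $SPARK_HOME/conf; table/column names are hypothetical.
from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

conf = (SparkConf()
        .setAppName("hive-query-on-yarn")
        .setMaster("yarn-client"))   # driver on the edge node, executors on the YARN cluster
sc = SparkContext(conf=conf)
hive_ctx = HiveContext(sc)

# Spark 1.6 exposes the DataFrame API; the query itself runs on the executors.
df = hive_ctx.sql(
    "SELECT customer_id, SUM(amount) AS total FROM sales GROUP BY customer_id")

# Writing the result back to a Hive table avoids collecting a large result
# set onto the edge node.
df.write.mode("overwrite").saveAsTable("sales_totals")

sc.stop()
```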

Re: How to run large Hive queries in PySpark 1.2.1

2016-05-26 Thread Jörn Franke
Both are outdated versions; you can usually get better support if you upgrade to the newest releases. A firewall could be an issue here.

Fwd: How to run large Hive queries in PySpark 1.2.1

2016-05-26 Thread Nikolay Voronchikhin
Hi PySpark users, We need to be able to run large Hive queries in PySpark 1.2.1. Users run PySpark on an edge node and submit jobs to a cluster that allocates YARN resources to the clients. We are using MapR as the Hadoop distribution, with Hive 0.13 and Spark 1.2.1. Currently, our
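
For reference, a minimal sketch of how a Hive query can be issued from PySpark 1.2.x through HiveContext, assuming a Hive-enabled Spark build and a reachable Hive metastore (e.g. via the cluster's hive-site.xml on the driver's classpath); the table and column names below are hypothetical placeholders.

```python
# Minimal sketch for PySpark 1.2.x: run a Hive query through HiveContext.
# Assumes a Hive-enabled Spark build and a reachable Hive metastore;
# the table/column names are hypothetical placeholders.
from pyspark import SparkConf, SparkContext
from pyspark.sql import HiveContext

conf = SparkConf().setAppName("large-hive-query")
sc = SparkContext(conf=conf)
hive_ctx = HiveContext(sc)

# In Spark 1.2.x, sql() returns a SchemaRDD that is evaluated lazily on the cluster.
results = hive_ctx.sql(
    "SELECT key, COUNT(*) AS cnt FROM some_table GROUP BY key")

# Pull back only a small sample to the driver; collecting a large result set
# onto the edge node is usually what causes memory trouble with big queries.
for row in results.take(10):
    print(row)

sc.stop()
```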