RE: Input file as an argument of a Spark code

2017-07-25 Thread Joaquín Silva
Well, I think I will build a REST API on top of Livy to import and export
data to and from HDFS.

 

Thanks all!

 

Joaquín Silva | Pentagon Security & AKAINIX

Av. Kennedy 4.700, Piso 10, Of. 1002, Edificio New Century, Vitacura | Código 
Postal (ZIP Code) 7561127

Cel: (56-9) 6304 2498 

 

From: Vivek Suvarna [mailto:vikk...@gmail.com] 
Sent: Tuesday, July 25, 2017 1:19
To: user@livy.incubator.apache.org
Subject: Re: Input file as an argument of a Spark code

 

I had a similar requirement. 

I used WebHDFS to copy the file to HDFS first, before starting the Spark 
job via Livy. 
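
For readers following the thread, the two steps described above (upload through WebHDFS, then submit the batch through Livy) can be sketched roughly as follows. This is only a sketch under assumptions: the host name NAMENODE-HOST, port 50070, the user "spark", the path /user/spark/input/data.csv and both helper function names are placeholders that do not come from this thread. The code only builds the request URL and the JSON body; it does not contact any server.

```python
import json
from urllib.parse import quote, urlencode


def webhdfs_create_url(namenode, hdfs_path, user, overwrite=True):
    """Build the WebHDFS URL for the first step of a file upload (op=CREATE)."""
    query = urlencode({
        "op": "CREATE",
        "user.name": user,
        "overwrite": "true" if overwrite else "false",
    })
    return f"http://{namenode}/webhdfs/v1{quote(hdfs_path)}?{query}"


def livy_batch_payload(jar, class_name, args):
    """Build the JSON body for POST /batches on the Livy server."""
    return json.dumps({"file": jar, "className": class_name, "args": list(args)})


# Hypothetical values: replace with the real NameNode address and HDFS path.
csv_on_hdfs = "/user/spark/input/data.csv"
print(webhdfs_create_url("NAMENODE-HOST:50070", csv_on_hdfs, "spark"))
print(livy_batch_payload("/user/spark/program.jar", "my.program", [csv_on_hdfs]))
```

With WebHDFS, a PUT to the printed URL normally answers with a redirect to a DataNode, and the file contents are then sent with a second PUT to that location. The HDFS path is passed to the Spark program through the "args" field of the batch request, so the job reads the CSV from HDFS rather than from the client machine.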





On 25 Jul 2017, at 9:39 AM, Saisai Shao wrote:

I think you have to make this CSV file accessible from the Spark cluster; 
putting it in HDFS is one possible solution. 

 

On Tue, Jul 25, 2017 at 1:26 AM, Joaquín Silva wrote:

Hello,

 

I'm building a Bash program (using curl) that should run Spark code 
remotely using Livy. One of the program's arguments is a CSV file. How can I 
make Spark read this file? The file is going to be on the client side, not 
on the Spark cluster machines.

 

Regards,

 

Joaquín Silva

 

 



Livy jobs keep running forever

2017-07-25 Thread Joaquín Silva
Hello,

 

I'm new to Livy, and I have this problem.

 

When I run a batch job like this:

curl -X POST --data '{"file": "/user/spark/program.jar", "className": "my.program"}' -H "Content-Type: application/json" LIVY-HOST:8998/batches

 

The program keeps running forever.

Searching the logs, I found the reason:

 

17/07/25 14:21:46 ERROR yarn.ApplicationMaster: User class threw exception: java.lang.OutOfMemoryError: PermGen space

 

So, to solve this issue, I increased the executor and driver memory: 
"driverMemory":"15g","executorMemory":"15g". But I still see this error.

 

 

So I tried running this job directly using spark-submit, and it worked 
perfectly, with no memory error.

 

spark-submit --class my.program --deploy-mode cluster --conf spark.master=yarn hdfs://HDFS_NN:8020/user/spark/program.jar

 

I'm using Livy 0.4.0-SNAPSHOT on Spark 2.1.0. 

 

 


 



Re: Livy jobs keep running forever

2017-07-25 Thread Marcelo Vanzin
On Tue, Jul 25, 2017 at 8:35 AM, Joaquín Silva wrote:
> 17/07/25 14:21:46 ERROR yarn.ApplicationMaster: User class threw exception:
> java.lang.OutOfMemoryError: PermGen space
>
> So, to solve this issue, I increased the executor and driver memory:
> "driverMemory":"15g","executorMemory":"15g". But I still see this error.

That error won't be fixed by adding more memory; you need to set
"-XX:MaxPermSize=blah" to fix it, or use Java 8.

Still, it shouldn't cause the app to just hang; it should eventually
fail. So perhaps there's a bug in Livy's error handling path
somewhere.
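
As a rough illustration of the MaxPermSize suggestion above, one way to pass the JVM flag through a Livy batch request is the "conf" map, using Spark's extraJavaOptions settings. This is a sketch under assumptions, not a confirmed fix for this job: the value 256m is an arbitrary placeholder, the helper name is hypothetical, and it assumes the Livy server allows these Spark settings to be set by the client. The code only builds and prints the JSON body for POST /batches.

```python
import json


def batch_payload_with_permgen(jar, class_name, max_perm_size="256m"):
    """Build a Livy POST /batches body that passes -XX:MaxPermSize to the JVMs."""
    jvm_flag = f"-XX:MaxPermSize={max_perm_size}"
    return {
        "file": jar,
        "className": class_name,
        # Spark settings forwarded by Livy; 256m is a placeholder, not a recommendation.
        "conf": {
            "spark.driver.extraJavaOptions": jvm_flag,
            "spark.executor.extraJavaOptions": jvm_flag,
        },
    }


payload = batch_payload_with_permgen("/user/spark/program.jar", "my.program")
print(json.dumps(payload, indent=2))
```

The printed JSON can be sent with the same curl command used in the original message, in place of the shorter --data body. The flag only matters on Java 7 and earlier, since Java 8 no longer has a PermGen space.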

-- 
Marcelo


Multiple Livy instances and load balancing

2017-07-25 Thread Vivek
Hi,

We are now considering moving to a UAT environment using Livy at my company. 

Has anyone implemented multiple Livy instances on a single cluster with load 
balancing?

A few questions:
1. Is this feature available in the 0.3 release?
2. How would I name/number the multiple instances I bring up?
3. How does one load balance and send requests across the multiple instances?
4. Does Livy have a heartbeat mechanism to understand which or how many 
instances are up?

Any answers would be appreciated. 

Regards
Vivek

