Re: How many Spark streaming applications can be run at a time on a Spark cluster?

2016-12-14 Thread Akhilesh Pathodia
If you have enough cores/resources, run them separately depending on your use case. On Thursday 15 December 2016, Divya Gehlot wrote: > It depends on the use case ... > Spark always depends on resource availability. > As long as you have resources to accommodate, can
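
As a rough illustration of the point about resource headroom, here is a minimal sketch (all names and numbers are illustrative, and the settings assume YARN) that caps one streaming application's footprint so several can share a cluster:

    import org.apache.spark.SparkConf;
    import org.apache.spark.streaming.Durations;
    import org.apache.spark.streaming.api.java.JavaStreamingContext;

    public class StreamApp {
        public static void main(String[] args) throws Exception {
            // Bound this app's share of the cluster so other streaming
            // apps can run alongside it.
            SparkConf conf = new SparkConf()
                .setAppName("stream-app-1")
                .set("spark.executor.instances", "2")  // executors requested from YARN
                .set("spark.executor.cores", "2")      // cores per executor
                .set("spark.executor.memory", "2g");   // memory per executor
            JavaStreamingContext jssc =
                new JavaStreamingContext(conf, Durations.seconds(30));
            // define the input stream and processing here ...
            jssc.start();
            jssc.awaitTermination();
        }
    }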

Re: Reading parquet files into Spark Streaming

2016-08-27 Thread Akhilesh Pathodia
Hi Renato, Which version of Spark are you using? If the Spark version is 1.3.0 or later, you can use SQLContext to read the parquet file, which will give you a DataFrame. Please follow the link below: https://spark.apache.org/docs/1.5.0/sql-programming-guide.html#loading-data-programmatically
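
For reference, a minimal sketch of that approach in Java, assuming Spark 1.4/1.5 and an illustrative HDFS path (on 1.3.x the call is sqlContext.parquetFile(...) instead):

    import org.apache.spark.SparkConf;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.sql.DataFrame;
    import org.apache.spark.sql.SQLContext;

    public class ParquetReadExample {
        public static void main(String[] args) {
            SparkConf conf = new SparkConf().setAppName("parquet-read");
            JavaSparkContext sc = new JavaSparkContext(conf);
            SQLContext sqlContext = new SQLContext(sc);

            // Read the parquet files into a DataFrame (Spark 1.4+ API).
            DataFrame df = sqlContext.read().parquet("hdfs:///data/events");
            df.registerTempTable("events");
            sqlContext.sql("SELECT COUNT(*) FROM events").show();
            sc.stop();
        }
    }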

Failed to get broadcast_1_piece0 of broadcast_1

2016-04-04 Thread Akhilesh Pathodia
Hi, I am running Spark jobs on YARN in cluster mode. The job gets its messages from a Kafka direct stream. I am using broadcast variables and checkpointing every 30 seconds. When I start the job for the first time it runs fine without any issue. If I kill the job and restart it, it throws the below exception in
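
For context, the Spark Streaming programming guide addresses exactly this: broadcast variables are not restored when an application recovers from a checkpoint, so they must be lazily re-created after restart. A sketch of that documented singleton pattern (the payload is illustrative):

    import java.util.Arrays;
    import java.util.List;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.broadcast.Broadcast;

    public class BroadcastHolder {
        private static volatile Broadcast<List<String>> instance = null;

        // Re-creates the broadcast on first use after a checkpoint recovery.
        public static Broadcast<List<String>> getInstance(JavaSparkContext jsc) {
            if (instance == null) {
                synchronized (BroadcastHolder.class) {
                    if (instance == null) {
                        // illustrative payload; load your real data here
                        instance = jsc.broadcast(Arrays.asList("a", "b", "c"));
                    }
                }
            }
            return instance;
        }
    }

getInstance() is then called inside each foreachRDD/transform, so a restarted job rebuilds the broadcast on first use instead of dereferencing the stale one from the checkpoint.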

Reading large set of files in Spark

2016-02-04 Thread Akhilesh Pathodia
Hi, I am using Spark to read a large set of files from HDFS, applying some formatting to each line and then saving each line as a record in Hive. Spark reads directory paths from Kafka. Each directory can contain a large number of files. I am reading one path from Kafka and then processing all
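
A minimal sketch of one way to wire this up, assuming each Kafka record carries a single HDFS directory path (class and variable names are illustrative):

    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.JavaSparkContext;
    import org.apache.spark.streaming.api.java.JavaDStream;

    public class DirectoryProcessor {
        // directoryPaths: a DStream of HDFS directory paths read from Kafka
        public static void process(JavaSparkContext jsc,
                                   JavaDStream<String> directoryPaths) {
            directoryPaths.foreachRDD(rdd -> {
                for (String dir : rdd.collect()) {   // a handful of paths per batch
                    // read every file under the directory
                    JavaRDD<String> lines = jsc.textFile(dir + "/*");
                    JavaRDD<String> formatted = lines.map(line -> line.trim());
                    // convert `formatted` to a DataFrame and write it to Hive here
                }
                return null; // foreachRDD is Function<JavaRDD<String>, Void> in Spark 1.x
            });
        }
    }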

Undefined job output-path error in Spark on hive

2016-01-25 Thread Akhilesh Pathodia
Hi, I am getting the following exception in Spark while writing to a Hive partitioned table in Parquet format: 16/01/25 03:56:40 ERROR executor.Executor: Exception in task 0.2 in stage 1.0 (TID 3) java.io.IOException: Undefined job output-path at

Spark not saving data to Hive

2016-01-23 Thread Akhilesh Pathodia
Hi, I am trying to write data from Spark to a Hive partitioned table: DataFrame dataFrame = sqlContext.createDataFrame(rdd, schema); dataFrame.write().partitionBy("YEAR","MONTH","DAY").saveAsTable(tableName); The data is not being written to the Hive table (HDFS location: /user/hive/warehouse//),
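
A sketch of one commonly suggested workaround, not a confirmed fix, under the assumption that the target table already exists in Hive: write through HiveContext and insertInto() with dynamic partitioning, since saveAsTable() lays data out in Spark's own managed format, which Hive itself may not read. `sc`, `rdd`, `schema`, and `tableName` refer to the poster's variables above:

    import org.apache.spark.sql.DataFrame;
    import org.apache.spark.sql.hive.HiveContext;

    // HiveContext (not plain SQLContext) talks to the Hive metastore.
    HiveContext hiveContext = new HiveContext(sc.sc());
    hiveContext.setConf("hive.exec.dynamic.partition", "true");
    hiveContext.setConf("hive.exec.dynamic.partition.mode", "nonstrict");

    DataFrame dataFrame = hiveContext.createDataFrame(rdd, schema);
    // Partition columns (YEAR, MONTH, DAY) must be the last columns in the
    // schema; insertInto() maps them positionally onto the table's partitions.
    dataFrame.write().insertInto(tableName);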

Spark not writing data in Hive format

2016-01-23 Thread Akhilesh Pathodia
format instead of Hive format. Can anybody tell me how to get rid of this issue? Spark version: 1.5.0, CDH 5.5.1. Thanks, Akhilesh Pathodia

Re: Spark on hbase using Phoenix in secure cluster

2015-12-07 Thread Akhilesh Pathodia
Kerberos ticket for authentication to pass. > > -- > Ruslan Dautkhanov > > On Mon, Dec 7, 2015 at 12:54 PM, Akhilesh Pathodia < > pathodia.akhil...@gmail.com> wrote: > >> Hi, >> >> I am running a Spark job on YARN in cluster mode in a secured cluster. I am
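
For illustration, Phoenix's JDBC URL can embed the Kerberos principal and keytab, which lets each JVM authenticate on its own. A minimal sketch, assuming the phoenix-client jar is on the classpath (hosts, principal, keytab path, and table name are all illustrative):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class SecurePhoenixQuery {
        public static void main(String[] args) throws Exception {
            // URL form: jdbc:phoenix:<zk quorum>:<port>:<znode>:<principal>:<keytab>
            String url = "jdbc:phoenix:zk1,zk2,zk3:2181:/hbase-secure"
                       + ":sparkjob@EXAMPLE.COM:/etc/security/keytabs/sparkjob.keytab";
            try (Connection conn = DriverManager.getConnection(url);
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT COUNT(*) FROM MY_TABLE")) {
                while (rs.next()) {
                    System.out.println(rs.getLong(1));
                }
            }
        }
    }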

Spark on hbase using Phoenix in secure cluster

2015-12-07 Thread Akhilesh Pathodia
successful in running Spark on HBase using Phoenix in YARN cluster or client mode? Thanks, Akhilesh Pathodia

Unable to get phoenix connection in spark job in secured cluster

2015-12-01 Thread Akhilesh Pathodia
the connection are correct. Do we need any additional configuration to make Phoenix work in a Spark job running on YARN in cluster mode in a secured cluster? How do we make Phoenix work with Spark in a secured environment? Thanks, Akhilesh Pathodia
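
One assumption-laden sketch, not a verified recipe: on Spark 1.4+ on YARN, the launcher can log in from a keytab and renew tickets for long-running jobs. These keys are normally passed to spark-submit as --principal and --keytab; the values below are illustrative, as is the shipped HBase config path:

    import org.apache.spark.SparkConf;

    SparkConf conf = new SparkConf()
        .setAppName("phoenix-on-secure-yarn")
        // let Spark log in from the keytab and keep renewing the ticket
        .set("spark.yarn.principal", "sparkjob@EXAMPLE.COM")
        .set("spark.yarn.keytab", "/etc/security/keytabs/sparkjob.keytab")
        // ship the HBase client config so executors can find the secured services
        .set("spark.yarn.dist.files", "/etc/hbase/conf/hbase-site.xml");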

Re: Unable to get phoenix connection in spark job in secured cluster

2015-12-01 Thread Akhilesh Pathodia
Spark - 1.3.1, HBase - 1.0.0, Phoenix - 4.3, Cloudera - 5.4. On Tue, Dec 1, 2015 at 9:35 PM, Ted Yu <yuzhih...@gmail.com> wrote: > What are the versions of Spark / HBase / Phoenix you're using? > > Cheers > > On Tue, Dec 1, 2015 at 4:15 AM, Akhilesh Pathodia < > pathodi