Optimized Spark configuration

2014-12-05 Thread vdiwakar.malladi
Hi, could anyone help with what would be a better / optimized configuration for driver memory, worker memory, degree of parallelism, etc., when we are running 1 master node (itself also acting as a slave node) and 1 slave node? Both have 32 GB RAM with 4 cores. On this, I
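A common starting point for a question like this (not from the thread itself; the numbers below are illustrative assumptions for two 32 GB / 4-core nodes) is to leave memory headroom for the OS and cluster daemons and to size parallelism at roughly 2-4 tasks per core:

```
# conf/spark-defaults.conf -- illustrative starting values for two 32 GB / 4-core
# nodes, not tuned settings; leave headroom for the OS and cluster daemons.
spark.driver.memory        4g
spark.executor.memory      8g
# ~2-4 tasks per core; 2 nodes x 4 cores = 8 cores total
spark.default.parallelism  16
```

These values would then be adjusted based on observed GC pressure and task skew in the application UI.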

Re: Unable to generate assembly jar which includes jdbc-thrift server

2014-11-27 Thread vdiwakar.malladi
Hi, I set up a Maven environment on a Linux machine and was able to build the POM file in the Spark home directory. Each module was refreshed with a corresponding target directory containing jar files. What do I need to do to include all the libraries on the classpath? Earlier, I used a single assembly jar file to

Exception while starting thrift server

2014-11-27 Thread vdiwakar.malladi
Hi, when I'm starting the thrift server, I'm getting the following exception. Could anyone help me with this? I placed hive-site.xml in the $SPARK_HOME/conf folder with the property hive.metastore.sasl.enabled set to 'false'. org.apache.hive.service.ServiceException: Unable to login to kerberos with
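For reference, the property mentioned above would sit in hive-site.xml like this. This is only a minimal sketch of a non-kerberized configuration, not the poster's actual file, and the metastore host is a placeholder:

```xml
<?xml version="1.0"?>
<!-- $SPARK_HOME/conf/hive-site.xml : minimal non-kerberized sketch -->
<configuration>
  <property>
    <name>hive.metastore.sasl.enabled</name>
    <value>false</value>
  </property>
  <property>
    <!-- hypothetical metastore host; replace with your own -->
    <name>hive.metastore.uris</name>
    <value>thrift://metastore-host:9083</value>
  </property>
</configuration>
```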

Unable to generate assembly jar which includes jdbc-thrift server

2014-11-26 Thread vdiwakar.malladi
Hi, when I'm trying to build the Spark assembly to include the dependencies related to the thrift server, the build fails with the following error. Could anyone help me with this? [ERROR] Failed to execute goal on project spark-assembly_2.10: Could not resolve dependencies for project

Re: Unable to generate assembly jar which includes jdbc-thrift server

2014-11-26 Thread vdiwakar.malladi
Thanks for your response. I'm using the following command: mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -DskipTests clean package Regards.
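In the Spark 1.x Maven build, the JDBC/Thrift server sits behind its own profile, so a build intended to include it typically adds -Phive-thriftserver alongside -Phive. This is an assumption about the likely fix; the thread's actual resolution is not shown here:

```shell
# Sketch: same command as above with the thrift-server profile added.
mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 \
    -Phive -Phive-thriftserver \
    -DskipTests clean package
```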

Re: Unable to generate assembly jar which includes jdbc-thrift server

2014-11-26 Thread vdiwakar.malladi
Yes, I'm building it from Spark 1.1.0. Thanks in advance. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-generate-assembly-jar-which-includes-jdbc-thrift-server-tp19887p19937.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

issue while running the code in standalone mode: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

2014-11-24 Thread vdiwakar.malladi
Hi, when I try to execute the program from my laptop by connecting to the HDP environment (on which Spark is also configured), I'm getting the warning (Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory) and the job is being

Re: issue while running the code in standalone mode: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient memory

2014-11-24 Thread vdiwakar.malladi
Thanks for your response. I gave the correct master URL. Moreover, as I mentioned in my post, I was able to run the sample program using spark-submit, but it is not working when I run it from my machine. Any clue on this? Thanks in advance.
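This warning usually means executors never registered with the master the driver is pointing at, or the application requests more memory or cores than any single worker offers. A typical first check (host name, class, and resource numbers below are hypothetical) is to pin the submit command to the exact URL shown on the master web UI and to request resources that fit one worker:

```shell
# Host name and resource numbers are hypothetical.
# --master must match the URL shown at the top of the master's web UI exactly;
# --executor-memory must fit within a single worker's advertised memory.
spark-submit \
  --master spark://master-host:7077 \
  --executor-memory 2g \
  --total-executor-cores 4 \
  --class com.example.MyApp my-app.jar
```

When the driver runs on a laptop outside the cluster, firewalls blocking the driver's callback ports are another common cause of the same warning.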

Getting exception on JavaSchemaRDD; org.apache.spark.SparkException: Task not serializable

2014-11-22 Thread vdiwakar.malladi
Hi, I'm trying to load a Parquet file for querying from my web application. I was able to load it as a JavaSchemaRDD, but when using the map function on the JavaSchemaRDD, I'm getting the following exception. The class in which I'm using this code implements the Serializable interface.

Re: Getting exception on JavaSchemaRDD; org.apache.spark.SparkException: Task not serializable

2014-11-22 Thread vdiwakar.malladi
Thanks for your prompt response. I'm not using anything in my map function; please see the code below. For sample purposes, I would like to use 'select * from '. This code worked for me in standalone mode, but when I integrated it with my web application, it threw the specified exception.

Re: Getting exception on JavaSchemaRDD; org.apache.spark.SparkException: Task not serializable

2014-11-22 Thread vdiwakar.malladi
Thanks. After writing it as a static inner class, that exception no longer occurs, but now I'm getting a Snappy-related exception. I can see the corresponding dependency in the Spark assembly jar, yet the exception persists. Any quick suggestion on this? Here is the stack trace.
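The fix that resolved the "Task not serializable" error above (a static nested class instead of an anonymous inner class) can be demonstrated with plain JDK serialization, no Spark required. This is a sketch: `Controller` stands in for the poster's web-application class, and `RowFn` for Spark's serializable function interfaces. An anonymous inner class carries a hidden `this$0` reference to its enclosing instance, which is exactly what the closure serializer trips over.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class SerializableDemo {

    // Stand-in for Spark's serializable function interfaces.
    interface RowFn extends Serializable {
        String apply(String row);
    }

    // Stand-in for a web-app class holding non-serializable state (e.g. a context object).
    static class Controller {
        private final Object heavyState = new Object(); // not Serializable

        RowFn anonymousFn() {
            // Anonymous inner class: compiled with a hidden this$0 field pointing
            // at the enclosing Controller, so serialization drags heavyState along.
            return new RowFn() {
                public String apply(String row) { return row.toUpperCase(); }
            };
        }
    }

    // Static nested class: no hidden outer reference, serializes cleanly.
    static class StaticFn implements RowFn {
        public String apply(String row) { return row.toUpperCase(); }
    }

    static boolean serializes(Object o) {
        try (ObjectOutputStream out = new ObjectOutputStream(new ByteArrayOutputStream())) {
            out.writeObject(o);
            return true;
        } catch (NotSerializableException e) {
            return false;
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(serializes(new Controller().anonymousFn())); // false
        System.out.println(serializes(new StaticFn()));                 // true
    }
}
```

The same reasoning applies to Spark's closure serializer: a top-level or static nested function class has nothing extra to serialize.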

Re: saveAsParquetFile throwing exception

2014-11-14 Thread vdiwakar.malladi
Thanks for your response. I'm using Spark 1.1.0. Currently I have the Spark setup that comes with Hadoop CDH (using Cloudera Manager). Could you please suggest how I can make use of the patch? Thanks in advance.

Re: loading, querying schemaRDD using SparkSQL

2014-11-13 Thread vdiwakar.malladi
Thanks Michael. I used Parquet files, and that solved my initial problem to some extent (i.e., loading data from one context and reading it from another). But I see another issue: I need to load the Parquet file every time I create the JavaSQLContext using
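With the Spark 1.1 Java API, the usual pattern for a web application is to create one long-lived JavaSQLContext at startup and register the Parquet file as a temp table once, rather than per request. A sketch of that pattern follows; the path and table name are hypothetical, and this needs the Spark jars on the classpath, so it is illustrative only:

```java
// Illustrative only: requires spark-core and spark-sql 1.1.x on the classpath.
JavaSparkContext sc = new JavaSparkContext(conf);    // created once at app startup
JavaSQLContext sqlContext = new JavaSQLContext(sc);  // kept for the app's lifetime

// Load and register once; later requests just run SQL against "people".
JavaSchemaRDD people = sqlContext.parquetFile("hdfs:///data/people.parquet");
people.registerTempTable("people");

// Per-request query, reusing the same context and registered table:
JavaSchemaRDD result = sqlContext.sql("SELECT * FROM people");
```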

loading, querying schemaRDD using SparkSQL

2014-11-04 Thread vdiwakar.malladi
Hi, my application needs to query data loaded into the SparkContext (I mean a SchemaRDD loaded from JSON file(s)). For this purpose, I created the SchemaRDD and called the registerTempTable method in a standalone program, then submitted the application using the spark-submit command. Then I have

Re: loading, querying schemaRDD using SparkSQL

2014-11-04 Thread vdiwakar.malladi
Thanks Michael for your response. Just now I saw the saveAsTable method on the JavaSchemaRDD object (in the Spark 1.1.0 API), but I couldn't find the corresponding documentation. Will that help? Please let me know. Thanks in advance.

Unresolved attributes: SparkSQL on the schemaRDD

2014-09-29 Thread vdiwakar.malladi
Hello, I'm exploring SparkSQL and facing an issue while using queries. Any help on this is appreciated. I have the following schema once loaded as an RDD: root |-- data: array (nullable = true) | |-- element: struct (containsNull = false) | | |-- age: integer (nullable = true) |

Re: Unresolved attributes: SparkSQL on the schemaRDD

2014-09-29 Thread vdiwakar.malladi
Thanks for your prompt response. On a further note, I'm still getting an exception while executing the query. SELECT data[0].name FROM people where data[0].age =13 *Exception in thread main java.lang.RuntimeException: [1.46] failure: ``UNION'' expected but identifier .age found SELECT
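The parse error above comes from the basic SQLContext parser in Spark 1.1, which does not understand field access on an array element (`data[0].age`) in the WHERE clause; the HiveQL parser is generally more permissive about such nested access. A hedged sketch of that workaround, using the same `people` table as the thread (illustrative only, requires spark-hive 1.1.x on the classpath):

```java
// Illustrative only: the basic SQLContext parser rejects data[0].age,
// while the HiveQL parser behind JavaHiveContext typically accepts it.
JavaHiveContext hiveContext = new JavaHiveContext(sc);
JavaSchemaRDD result =
    hiveContext.hql("SELECT data[0].name FROM people WHERE data[0].age = 13");
```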

Re: Unresolved attributes: SparkSQL on the schemaRDD

2014-09-29 Thread vdiwakar.malladi
I'm using the latest version, i.e. Spark 1.1.0. Thanks. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Unresolved-attributes-SparkSQL-on-the-schemaRDD-tp15339p15376.html

Re: Spark SQL

2014-08-07 Thread vdiwakar.malladi
Thanks for your response. I'm now able to compile my code. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-SQL-tp11618p11644.html

Java API for GraphX

2014-08-07 Thread vdiwakar.malladi
Hi, could you please let me know whether a Java API is available for the GraphX component? If so, could you please point me to it? Thanks in advance. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Java-API-for-GraphX-tp11752.html