java.lang.ClassNotFoundException

2015-08-08 Thread Yasemin Kaya
Hi, I have a small Spark program and I am getting an error that I don't understand. My code is https://gist.github.com/yaseminn/522a75b863ad78934bc3. I am using Spark 1.3. Submitting: bin/spark-submit --class MonthlyAverage --master local[4] weather.jar error: ~/spark-1.3.1-bin-hadoop2.4$

Re: Spark on YARN

2015-08-08 Thread Sandy Ryza
Hi Jem, Do they fail with any particular exception? Does YARN just never end up giving them resources? Does an application master start? If so, what is in its logs? If not, is there anything suspicious in the YARN ResourceManager logs? -Sandy On Fri, Aug 7, 2015 at 1:48 AM, Jem Tucker

Re: Spark on YARN

2015-08-08 Thread Jem Tucker
Hi Sandy, The application doesn't fail. It gets accepted by YARN, but the application master never starts and the application state never changes to running. I have checked the resource manager and node manager logs and nothing jumps out. Thanks Jem On Sat, 8 Aug 2015 at 09:20, Sandy Ryza

Pagination on big table, splitting joins

2015-08-08 Thread Gaspar Muñoz
Hi, I have two different parts in my system. 1. A batch application that every x minutes runs SQL queries across several tables containing millions of rows to compose an entity, and sends those entities to Kafka. 2. A streaming application that processes data from Kafka. Now, I have entire system

Re: java.lang.ClassNotFoundException

2015-08-08 Thread Ted Yu
Have you tried including the package name in the class name? Thanks On Aug 8, 2015, at 12:00 AM, Yasemin Kaya godo...@gmail.com wrote: Hi, I have a little spark program and i am getting an error why i dont understand. My code is https://gist.github.com/yaseminn/522a75b863ad78934bc3. I
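
If the main class in weather.jar is declared inside a package, the JVM can only find it by its fully qualified name, and the same applies to the value passed to `--class`. A minimal plain-Java sketch of that lookup rule follows. It is not Spark's code, and `ClassNameLookup` and `canLoad` are hypothetical names used only for illustration.

```java
// Hypothetical helper: shows that a class is found by its fully
// qualified name but not by its simple name alone.
class ClassNameLookup {

    // Returns true if the JVM can load a class with exactly this name.
    static boolean canLoad(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException e) {
            // Same exception type as in the thread subject.
            return false;
        }
    }

    public static void main(String[] args) {
        // Simple name only: no such class in the default package.
        System.out.println("ArrayList -> " + canLoad("ArrayList"));
        // Package plus class name: resolves.
        System.out.println("java.util.ArrayList -> " + canLoad("java.util.ArrayList"));
    }
}
```

For spark-submit this would mean passing `--class <package>.MonthlyAverage`, where `<package>` stands for whatever package the source file declares.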

Re: Spark on YARN

2015-08-08 Thread Jem Tucker
Hi Dustin, Yes, there are enough resources available. The same application runs fine under a different user, so I think it is something to do with permissions, but I can't work out where. Thanks, Jem On Sat, 8 Aug 2015 at 17:35, Dustin Cote dc...@cloudera.com wrote: Hi Jem, In the top of

Re: java.lang.ClassNotFoundException

2015-08-08 Thread Yasemin Kaya
Thanks Ted, I solved it :) 2015-08-08 14:07 GMT+03:00 Ted Yu yuzhih...@gmail.com: Have you tried including package name in the class name ? Thanks On Aug 8, 2015, at 12:00 AM, Yasemin Kaya godo...@gmail.com wrote: Hi, I have a little spark program and i am getting an error why i dont

Re: Spark on YARN

2015-08-08 Thread Shushant Arora
Which scheduler is running on your cluster? Check the scheduler tab of the RM UI for your user and that user's maximum vcore limit. If other applications from that user currently occupy vcores up to that maximum, that could be the reason vcores are not being allocated to this user but for

Re: DataFrame column structure change

2015-08-08 Thread Raghavendra Pandey
You can use the `struct` function of `org.apache.spark.sql.functions` to combine two columns into a struct column. Something like: `val nestedCol = struct(df("d"), df("e"))` `df.select(df("a"), df("b"), df("c"), nestedCol)` On Aug 7, 2015 3:14 PM, Rishabh Bhardwaj rbnex...@gmail.com wrote: I am doing it by creating

Spark sql jobs n their partition

2015-08-08 Thread Raghavendra Pandey
I have complex transformation requirements that I am implementing using DataFrames. They involve a lot of joins, including with a Cassandra table. I was wondering how I can debug the jobs and stages queued by Spark SQL the way I can for RDDs. In one case, Spark SQL creates more than 17 lakh tasks for

How to create DataFrame from a binary file?

2015-08-08 Thread unk1102
Hi, how do we create a DataFrame from a binary file stored in HDFS? I was thinking of using `JavaPairRDD<String, PortableDataStream> pairRdd = javaSparkContext.binaryFiles("/hdfs/path/to/binfile");` `JavaRDD<PortableDataStream> javardd = pairRdd.values();` I can see that PortableDataStream has method called

Re: Spark master driver UI: How to keep it after process finished?

2015-08-08 Thread Andrew Or
Hi Saif, You need to run your application with `spark.eventLog.enabled` set to true. Then if you are using standalone mode, you can view the Master UI at port 8080. Otherwise, you may start a history server through `sbin/start-history-server.sh`, which by default starts the history UI at port

Re: Schema change on Spark Hive (Parquet file format) table not working

2015-08-08 Thread sim
Yes, I've found a number of problems with metadata management in Spark SQL. One core issue is SPARK-9764 https://issues.apache.org/jira/browse/SPARK-9764 . Related issues are SPARK-9342 https://issues.apache.org/jira/browse/SPARK-9342 , SPARK-9761

Re: Spark inserting into parquet files with different schema

2015-08-08 Thread sim
Adam, did you find a solution for this?