Hi,
I have a small Spark program and I am getting an error that I don't
understand.
My code is https://gist.github.com/yaseminn/522a75b863ad78934bc3.
I am using Spark 1.3.
Submitting with: bin/spark-submit --class MonthlyAverage --master local[4]
weather.jar
error:
~/spark-1.3.1-bin-hadoop2.4$
Hi Jem,
Do they fail with any particular exception? Does YARN just never end up
giving them resources? Does an application master start? If so, what is
in its logs? If not, is there anything suspicious in the YARN ResourceManager logs?
-Sandy
On Fri, Aug 7, 2015 at 1:48 AM, Jem Tucker
Hi Sandy,
The application doesn't fail; it gets accepted by YARN, but the application
master never starts and the application state never changes to RUNNING. I
have checked the ResourceManager and NodeManager logs and nothing
jumps out.
Thanks
Jem
On Sat, 8 Aug 2015 at 09:20, Sandy Ryza
Hi,
I have two different parts in my system.
1. A batch application that every x minutes runs SQL queries across several
tables containing millions of rows to compose an entity, and sends those
entities to Kafka.
2. A streaming application that processes data from Kafka.
Now, I have entire system
Have you tried including package name in the class name ?
Thanks
On Aug 8, 2015, at 12:00 AM, Yasemin Kaya godo...@gmail.com wrote:
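Ted's suggestion, concretely: if `MonthlyAverage` is declared inside a package, `--class` must be given the fully qualified class name, not just the simple name. A hedged sketch against the submit command from the original message (the package name `weather` is a hypothetical assumption, not taken from the gist):

```shell
# Assuming the source file begins with `package weather`:
bin/spark-submit --class weather.MonthlyAverage --master local[4] weather.jar
```

If the class is in the default (empty) package, the original `--class MonthlyAverage` form is correct as written.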
Hi Dustin,
Yes, there are enough resources available; the same application run by a
different user works fine, so I think it is something to do with permissions,
but I can't work out where.
Thanks,
Jem
On Sat, 8 Aug 2015 at 17:35, Dustin Cote dc...@cloudera.com wrote:
Hi Jem,
In the top of
Thanks Ted, I solved it :)
2015-08-08 14:07 GMT+03:00 Ted Yu yuzhih...@gmail.com:
Have you tried including package name in the class name ?
Thanks
which is the scheduler on your cluster. Just check the Scheduler tab in the
RM UI and see your user's maximum vcore limit. If that user's other
applications currently occupy all of the user's allowed vcores, that could be
the reason vcores are not being allocated to this user, but for
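If the CapacityScheduler is the one in use, the per-user vcore cap described above is governed by properties like the following; a hedged sketch of `capacity-scheduler.xml` (the queue name `default` and the values are illustrative assumptions, not read from any actual cluster):

```xml
<!-- Illustrative only: let one user exceed the default per-user share
     of the "default" queue (1.0) by up to 2x -->
<property>
  <name>yarn.scheduler.capacity.root.default.user-limit-factor</name>
  <value>2</value>
</property>
<!-- Fraction of cluster resources that application masters may occupy;
     too low a value can leave apps stuck in ACCEPTED -->
<property>
  <name>yarn.scheduler.capacity.maximum-am-resource-percent</name>
  <value>0.5</value>
</property>
```

After editing, the scheduler normally needs to be refreshed (e.g. `yarn rmadmin -refreshQueues`) for the new limits to take effect.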
You can use the struct function of the org.apache.spark.sql.functions class
to combine two columns into a struct column.
Something like:
val nestedCol = struct(df("d"), df("e"))
df.select(df("a"), df("b"), df("c"), nestedCol)
On Aug 7, 2015 3:14 PM, Rishabh Bhardwaj rbnex...@gmail.com wrote:
I am doing it by creating
I have complex transformation requirements that I am implementing using
DataFrames. It involves a lot of joins, including with a Cassandra table.
I was wondering how I can debug the jobs and stages queued by Spark SQL the
way I can for RDDs.
In one case, Spark SQL creates more than 1.7 million tasks for
Hi, how do we create a DataFrame from a binary file stored in HDFS? I was
thinking to use:
JavaPairRDD<String, PortableDataStream> pairRdd =
    javaSparkContext.binaryFiles("/hdfs/path/to/binfile");
JavaRDD<PortableDataStream> javardd = pairRdd.values();
I can see that PortableDataStream has a method called
Hi Saif,
You need to run your application with `spark.eventLog.enabled` set to true.
Then if you are using standalone mode, you can view the Master UI at port
8080. Otherwise, you may start a history server through
`sbin/start-history-server.sh`, which by default starts the history UI at
port
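As a concrete sketch of the event-log settings described above; a `spark-defaults.conf` fragment (the HDFS path is an illustrative assumption, and the event-log directory must exist before the application starts):

```
# spark-defaults.conf -- illustrative values
spark.eventLog.enabled           true
spark.eventLog.dir               hdfs:///spark-event-logs
spark.history.fs.logDirectory    hdfs:///spark-event-logs
```

With these in place, `sbin/start-history-server.sh` serves the history UI on port 18080 by default.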
Yes, I've found a number of problems with metadata management in Spark SQL.
One core issue is SPARK-9764
https://issues.apache.org/jira/browse/SPARK-9764 . Related issues are
SPARK-9342 https://issues.apache.org/jira/browse/SPARK-9342 , SPARK-9761
Adam, did you find a solution for this?
--
View this message in context:
http://apache-spark-user-list.1001560.n3.nabble.com/Spark-inserting-into-parquet-files-with-different-schema-tp20706p24181.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.