The issue seems to be with the primordial class loader. I cannot place the
drivers at the same location on all the nodes, but I have loaded the jars to
HDFS. I have tried SPARK_YARN_DIST_FILES as well as SPARK_CLASSPATH on the
edge node with no luck. Is there another way to load these jars
through
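One way to make jars that live on HDFS available to the executors, sketched here only as an illustration (the HDFS path, jar name and app name are assumptions), is to list them in spark.jars so they are shipped with the application:

import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class ShipJarsFromHdfs {
    public static void main(String[] args) {
        SparkConf conf = new SparkConf()
                .setAppName("ship-driver-jars")
                // comma-separated list; hdfs:// URIs are accepted, so the jars
                // do not need to exist at the same local path on every node
                .set("spark.jars", "hdfs:///user/shared/libs/some-driver.jar");
        JavaSparkContext sc = new JavaSparkContext(conf);
        // ... job code that needs the shipped classes on the executors ...
        sc.stop();
    }
}

Note that jars distributed this way are only seen by Spark's task class loader; if the classes must be visible to the primordial/system class loader (as JDBC's DriverManager expects), spark.executor.extraClassPath with a node-local path may still be needed.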
Are you using Spark's textFile method? If so, go through this blog :-
http://tech.kinja.com/how-not-to-pull-from-s3-using-apache-spark-1704509219
Anubhav
On Mon, Apr 24, 2017 at 12:48 PM, Afshin, Bardia <
bardia.afs...@capitalone.com> wrote:
> Hi there,
>
>
>
> I have a process that downloads
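For context, the textFile call the reply asks about looks roughly like the sketch below; the s3a bucket, prefix, and master are placeholders, and hadoop-aws plus AWS credentials are assumed to be configured separately:

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class ReadFromS3 {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext("local[*]", "read-from-s3");
        // each matching object contributes partitions; listing and reading a
        // large number of small objects this way can be slow
        JavaRDD<String> lines = sc.textFile("s3a://some-bucket/some-prefix/*");
        System.out.println("line count: " + lines.count());
        sc.stop();
    }
}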
Hi,
I am having log4j trouble while running Spark with YARN as the cluster manager
on CDH 5.3.3.
I get the following error:-
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in
.
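As a small diagnostic for the warning above (illustrative only, SLF4J 1.x, nothing here is specific to CDH), the snippet below prints which of the competing SLF4J bindings actually won on the classpath:

import org.slf4j.LoggerFactory;
import org.slf4j.impl.StaticLoggerBinder;

public class WhichSlf4jBinding {
    public static void main(String[] args) {
        // the binder reports the logger factory class it was built against
        System.out.println(StaticLoggerBinder.getSingleton().getLoggerFactoryClassStr());
        // and this is the factory actually in use at runtime
        System.out.println(LoggerFactory.getILoggerFactory().getClass().getName());
    }
}

The usual remedy is then to exclude the losing binding (for example slf4j-log4j12) from the application's dependencies so that only one remains on the classpath.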
On Tue, Nov 3, 2015 at 7:48 AM, Ted Yu <yuzhih...@gmail.com> wrote:
> I am a bit curious: why is the synchronization on finalLock needed?
>
> Thanks
>
> On Oct 23, 2015, at 8:25 AM, Anubhav Agarwal <anubha...@gmail.com> wrote:
>
> I have a spark job that
I have a Spark job that creates 6 million rows in RDDs. I convert the RDD
into a DataFrame and write it to HDFS. Currently it takes 3 minutes to write
it to HDFS.
Here is the snippet:-
RDDList.parallelStream().forEach(mapJavaRDD -> {
if (mapJavaRDD != null) {
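For comparison, a minimal Spark 1.4+ style sketch (not the poster's code; the schema, output path, and Parquet format are assumptions) of building a DataFrame from a JavaRDD of Rows and writing it to HDFS in a single distributed write, rather than looping over RDDs on the client:

import java.util.Arrays;

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;
import org.apache.spark.sql.RowFactory;
import org.apache.spark.sql.SQLContext;
import org.apache.spark.sql.types.DataTypes;
import org.apache.spark.sql.types.StructType;

public class WriteRddToHdfs {
    public static void main(String[] args) {
        JavaSparkContext sc = new JavaSparkContext("local[*]", "rdd-to-hdfs");
        SQLContext sqlContext = new SQLContext(sc);

        // stand-in data; in the real job this would be the 6 million rows
        JavaRDD<Row> rows = sc.parallelize(Arrays.asList(1, 2, 3))
                .map(i -> RowFactory.create(i, "value-" + i));

        StructType schema = DataTypes.createStructType(Arrays.asList(
                DataTypes.createStructField("id", DataTypes.IntegerType, false),
                DataTypes.createStructField("payload", DataTypes.StringType, false)));

        DataFrame df = sqlContext.createDataFrame(rows, schema);
        // one distributed write; parallelism comes from the RDD's partitions,
        // not from a client-side parallelStream over a list of RDDs
        df.write().mode("overwrite").parquet("hdfs:///tmp/example-output");

        sc.stop();
    }
}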
Hi Ankit,
Here is my solution for this:-
1) Download the latest Spark 1.5.1 (I just copied the following link from
spark.apache.org; if it doesn't work, grab a new one from the website.)
wget http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz
2) Unzip the folder and rename/move
I am running Spark 1.3 on the CDH 5.4 stack. I am getting the following error
when I spark-submit my application:-
15/08/11 16:03:49 INFO Remoting: Starting remoting
15/08/11 16:03:49 INFO Remoting: Remoting started; listening on addresses
Hi,
I am trying to modify my code to use HDFS and multiple nodes. The code
works fine when I run it locally on a single machine with a single worker.
I have been trying to modify it and I get the following error. Any hint
would be helpful.
java.lang.NullPointerException
at
do you use ?
Cheers
On Mon, Aug 3, 2015 at 3:13 PM, Anubhav Agarwal anubha...@gmail.com
wrote:
Hi,
I am trying to modify my code to use HDFS and multiple nodes. The code
works fine when I run it locally on a single machine with a single worker.
I have been trying to modify it and I get
Zhan, specifying the port fixed the port issue.
Is it possible to specify the log directory while starting the Spark thrift
server?
Still getting this error even though the folder exists and everyone has
permission to use that directory.
drwxr-xr-x 2 root root 4096 Mar 24 19:04