Hi all,
I am using Hadoop 2.6.5 and Spark 2.1.0, and I run a job using spark-submit
with the master set to "yarn". When Spark starts, I can load the Spark UI
page on port 4040, but no job is shown on the page. After the following logs
(registering the application master on YARN) the Spark UI is not accessible
any
Hello all,
Can someone help me solve the following fundamental problem?
I have a JavaRDD and, as its flatMap function, I pass a new instance of a
class which implements FlatMapFunction. This class has a constructor and a
call method. In the constructor, I set the values for "List" variabl
Hi All,
I read a text file using sparkContext.textFile(filename), assign it to
an RDD, process the RDD (replacing some words), and finally write it to
a text file using rdd.saveAsTextFile(output).
Is there any way to be sure the order of the sentences will not change?
I need to have the same
Hi all,
Please tell me how I can tune the number of output partitions.
I run my Spark job on my local machine with 8 cores, and the input data is
6.5GB. It creates 193 tasks and puts the output into 193 partitions.
How can I change the number of tasks and, consequently, the number of output
files?
Best,
Soheil
Hi,
I have executed my Spark job using spark-submit on my local machine and on
a cluster.
Now I want to try using HDFS: put the data (a text file) on HDFS, read it
from there, execute the jar file, and finally write the output to HDFS.
I got this error after running the job:
*failed to launch or
Hi
I am new to Spark and have a question from my first steps in learning it.
How can I filter an RDD using a String variable (for example words[i])
instead of a fixed one like "Error"?
Thanks a lot in advance.
Soheila
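
One possible way to approach the filtering question above, as a minimal
sketch rather than a definitive answer: in Spark's Java API, JavaRDD.filter
accepts a function, so with Java 8 lambdas a call such as
rdd.filter(line -> line.contains(word)) can use a variable in place of a
literal. The sketch below uses java.util.stream as a stand-in for an RDD so
that it runs without a Spark installation. The class name FilterByVariable,
the helper methods, and the sample lines are all hypothetical. The point it
tries to show is that words[i] should be copied into a final local variable
first, because a Java lambda cannot capture the changing loop counter i.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

public class FilterByVariable {

    // Builds the filter from a runtime value rather than a literal like "Error".
    static Predicate<String> containsWord(String word) {
        return line -> line.contains(word);
    }

    // Stand-in for rdd.filter(...): keeps only the lines that contain `word`.
    static List<String> filterLines(List<String> lines, String word) {
        return lines.stream()
                .filter(containsWord(word))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not taken from the original post.
        List<String> lines = Arrays.asList(
                "Error: disk full",
                "Warning: low memory",
                "Info: job started",
                "Error: timeout");
        String[] words = {"Error", "Warning"};

        for (int i = 0; i < words.length; i++) {
            // Copy words[i] into a final local: the lambda below may capture
            // `word`, but not the loop counter i, which changes every iteration.
            final String word = words[i];
            List<String> matches = lines.stream()
                    .filter(line -> line.contains(word))
                    .collect(Collectors.toList());
            System.out.println(word + " -> " + matches);
        }
    }
}
```

With a real JavaRDD<String>, the same pattern would presumably carry over by
replacing the stream pipeline with rdd.filter(line -> line.contains(word)),
assuming the job is compiled for Java 8 or later.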