I have a Spark cluster consisting of 5 nodes, and I have a Spark job that
should process some files from a directory and send their contents to Kafka.
I am trying to submit the job using the following command:
bin$ ./spark-submit --total-executor-cores 20 --executor-memory 5G --class
I was able to solve the issue, but the solution is very weird.
I ran this command: cat old_script.py > new_script.py, then I submitted the
job using the new script.
This is the second time I have faced such an issue with a Python script, and
I have no explanation for what happened.
I hope this trick helps someone.
I have a very strange problem.
I wrote a Spark Streaming job that monitors an HDFS directory, reads the newly
added files, and sends their contents to Kafka.
The job is written in Python, and you can get the code from this link:
http://pastebin.com/mpKkMkph
When submitting the job, I got this error:
I have a Spark Streaming job that reads a tweet stream from Gnip and writes it
to Kafka.
Spark and Kafka are running on the same cluster.
My cluster consists of 5 nodes: Kafka-b01 ... Kafka-b05.
The Spark master is running on Kafka-b05.
Here is how we submit the Spark job:
*nohup sh
I think I found a solution, but I have no idea how it affects the execution
of the application.
At the end of the script I added a sleep statement:
import time
time.sleep(1)
This solved the problem.
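One possible explanation, assuming the script sends its messages through kafka-python's KafkaProducer: send() only queues a record and returns immediately, so a script that exits right after the last send() can lose whatever is still buffered. The sleep gives the background sender time to finish, but calling flush() waits for exactly as long as needed. Below is a minimal sketch of that pattern. The names send_lines and RecordingProducer are hypothetical and only illustrate the call order; RecordingProducer is an in-memory stand-in so the snippet runs without a broker.

```python
def send_lines(producer, topic, lines):
    """Queue every line for `topic`, then block until all of them are sent.

    `producer` is assumed to behave like kafka-python's KafkaProducer:
    send() only buffers the record, flush() waits for the buffer to drain.
    """
    count = 0
    for line in lines:
        # Kafka values are bytes, so encode each line explicitly.
        producer.send(topic, value=line.encode("utf-8"))
        count += 1
    # Replaces the time.sleep(1) workaround: wait for delivery, not for a guess.
    producer.flush()
    return count


class RecordingProducer:
    """Hypothetical stand-in for a real producer (assumption, not a Kafka API)."""

    def __init__(self):
        self.pending = []    # records queued by send() but not yet delivered
        self.delivered = []  # records that flush() has pushed out

    def send(self, topic, value):
        self.pending.append((topic, value))

    def flush(self):
        self.delivered.extend(self.pending)
        self.pending.clear()


if __name__ == "__main__":
    demo = RecordingProducer()
    sent = send_lines(demo, "testTopic", ["first tweet", "second tweet"])
    print(sent, "records sent,", len(demo.pending), "still pending")
```

With a real cluster, the same function could be called with a KafkaProducer instance in place of the stand-in. Whether this is what actually happened in the job above is a guess; it only holds if the script exited while records were still buffered.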
rap_servers="10.62.54.111:9092")
tweets = sc.textFile("/home/fanooos/Desktop/historical_scripts/output/1/activities_201603270430_201603270440.json")
tweetsCollection = tweets.collect()
for tweet in tweetsCollection:
    producer.send('testTopic', value=bytes(twe
Dear all,
If I use Kafka as a streaming source for some Spark jobs, is it advisable
to install Spark on the same nodes as the Kafka cluster?
What are the benefits and drawbacks of such a decision?
Regards
This is my first Spark Streaming application. The setup is as follows:
3 nodes running a Spark cluster, with one master node and two slaves.
The application is a simple Java application streaming from Twitter, with
dependencies managed by Maven.
Here is the code of the application:
public class
We have Cloudera CDH 5.3 installed on one machine.
We are trying to use the Spark SQL Thrift server to execute some analysis
queries against a Hive table.
Without any changes to the configuration, we ran the following query on
both Hive and the Spark SQL Thrift server:
*select * from tableName;*
The
I have a Hadoop cluster and I need to query the data stored on HDFS using
the Spark SQL Thrift server.
The Spark SQL Thrift server is up and running. It is configured to read from
a Hive table. The Hive table is an external table that corresponds to a set of
files stored on HDFS. These files contain
I have some applications developed in PHP, and we currently have a problem
connecting these applications to the Spark SQL Thrift server. (Here is the
problem I am talking about:
http://apache-spark-user-list.1001560.n3.nabble.com/Connection-PHP-application-to-Spark-Sql-thrift-server-td21925.html
We have two applications that need to connect to the Spark SQL Thrift server.
The first application is developed in Java. With the Spark SQL Thrift server
running, we followed the steps in this link
https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients#HiveServer2Clients-JDBC
and the
We have installed a Hadoop cluster with Hive and Spark, and the Spark SQL
Thrift server is up and running without any problem.
Now we have a set of applications that need to use the Spark SQL Thrift server
to query some data.
Some of these applications are Java applications and the others are PHP
I have installed a Hadoop cluster (version 2.6.0), Apache Spark (version
1.2.1, prebuilt for Hadoop 2.4 and later), and Hive (version 1.0.0).
When I try to start the Spark SQL Thrift server, I get the following
exception:
Exception in thread main java.lang.RuntimeException:
Hi
I have installed Hadoop on a local virtual machine using the steps from this
URL:
https://www.digitalocean.com/community/tutorials/how-to-install-hadoop-on-ubuntu-13-10
On the local machine I wrote a small Spark application in Java to read a
file from the Hadoop instance installed in the