When you say the driver is running on Mesos, can you explain how you are doing that?
> On Mar 10, 2016, at 4:44 PM, Eran Chinthaka Withana
> wrote:
>
> Yanling, I'm already running the driver on Mesos (through Docker). FYI, I'm
> running this in cluster mode with
r in your image to be spark home?
>
> Tim
>
>
> On Mar 10, 2016, at 3:11 AM, Ashish Soni <asoni.le...@gmail.com> wrote:
You need to install Spark on each Mesos slave, and then while starting the
container set the workdir to your Spark home so that it can find the Spark
classes.
Ashish
> On Mar 10, 2016, at 5:22 AM, Guillaume Eynard Bontemps
> wrote:
>
> For an answer to my question see
since the slave is in a
> chroot.
>
> Can you try mounting in a volume from the host when you launch the slave
> for your slave's workdir?
> docker run -v /tmp/mesos/slave:/tmp/mesos/slave mesos_image mesos-slave
> --work_dir=/tmp/mesos/slave
>
> Tim
>
> On Thu, Mar 3, 2016 a
>> image.
>>
>> Tim
>>
>> On Wed, Mar 2, 2016 at 2:28 PM, Charles Allen <
>> charles.al...@metamarkets.com> wrote:
>>
>>> Re: Spark on Mesos Warning regarding disk space:
>>> https://issues.apache.org/jira/browse/SPARK-12330
Vairavelu <
vsathishkuma...@gmail.com> wrote:
> Try passing the jar using the --jars option
>
> On Wed, Mar 2, 2016 at 10:17 AM Ashish Soni <asoni.le...@gmail.com> wrote:
>
>> I made some progress, but now I am stuck at this point. Please help, as it
>> looks like I am
utor log from
> stderr file and see what the problem is?
>
> Tim
>
> On Mar 1, 2016, at 8:05 AM, Ashish Soni <asoni.le...@gmail.com> wrote:
>
> Not sure what the issue is, but I am getting the below error when I try to
> run the Spark Pi example:
>
> Blacklisting Mesos slave value:
Hi All,
Can someone please help me translate the below spark-submit invocation into a
Marathon JSON request?
docker run -it --rm -e SPARK_MASTER="mesos://10.0.2.15:5050" -e
SPARK_IMAGE="spark_driver:latest" spark_driver:latest
/opt/spark/bin/spark-submit --name "PI Example" --class
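A hedged sketch of what the equivalent Marathon app definition might look like. The --class value is cut off above, so the main class and jar path below are guesses based on the "PI Example" name, and the cpu/mem numbers are placeholders:

{
  "id": "spark-pi",
  "cpus": 1,
  "mem": 1024,
  "instances": 1,
  "env": {
    "SPARK_MASTER": "mesos://10.0.2.15:5050",
    "SPARK_IMAGE": "spark_driver:latest"
  },
  "container": {
    "type": "DOCKER",
    "docker": { "image": "spark_driver:latest", "network": "HOST" }
  },
  "cmd": "/opt/spark/bin/spark-submit --name 'PI Example' --class org.apache.spark.examples.SparkPi /opt/spark/lib/spark-examples.jar"
}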
k your Mesos UI if you see the Spark application in the
> Frameworks tab
>
> On Mon, Feb 29, 2016 at 12:23 PM Ashish Soni <asoni.le...@gmail.com>
> wrote:
>
>> What is the best practice? I have everything running as Docker containers
>> on a single host (Mesos and Marathon
mesosphere/spark:1.6)
> and Mesos will automatically launch docker containers for you.
>
> Tim
>
> On Mon, Feb 29, 2016 at 7:36 AM, Ashish Soni <asoni.le...@gmail.com>
> wrote:
>
>> Yes, I read that, but there are not many details there.
>>
>> Is it true that we nee
ml should be the
> best source, what problems were you running into?
>
> Tim
>
> On Fri, Feb 26, 2016 at 11:06 AM, Yin Yang <yy201...@gmail.com> wrote:
>
>> Have you read this?
>> https://spark.apache.org/docs/latest/running-on-mesos.html
>>
>>
Hi All,
Is there any proper documentation on how to run Spark on Mesos? I have been
trying for the last few days and have not been able to make it work.
Please help.
Ashish
Hi All,
Just wanted to know if there is any workaround or resolution for the below
issue in standalone mode:
https://issues.apache.org/jira/browse/SPARK-9559
Ashish
Hi All,
As per my best understanding, we can have only one log4j configuration for
both Spark and the application, since whichever comes first on the classpath
takes precedence.
Is there any way we can keep one in the application and one in the Spark conf
folder? Is that possible?
Thanks
> spark-submit --conf "spark.executor.memory=512m" --conf
> "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.xml"
>
> Sent from Samsung Mobile.
>
>
> Original message
> From: Ted Yu <yuzhih...@gmail.com>
Hi All,
How do I pass multiple configuration parameters with spark-submit?
Please help; I am trying as below:
spark-submit --conf "spark.executor.memory=512m
spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.xml"
Thanks,
Hi All,
How do I change the log level for a running Spark Streaming job? Any help
will be appreciated.
Thanks,
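One minimal sketch, assuming Spark 1.4+ and a JavaStreamingContext named jssc; note this adjusts the driver-side level, while executors still follow their own log4j configuration:

// set the log level on the underlying context
jssc.sparkContext().setLogLevel("WARN");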
Are there any examples of how to implement the onEnvironmentUpdate method for
a custom listener?
Thanks,
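I haven't seen one posted here, but a minimal hedged sketch for the Spark 1.x Java API (the class and variable names are mine, not from any official example):

import org.apache.spark.JavaSparkListener;
import org.apache.spark.scheduler.SparkListenerEnvironmentUpdate;

// JavaSparkListener provides no-op defaults, so only override the callback you need
public class EnvUpdateListener extends JavaSparkListener {
    @Override
    public void onEnvironmentUpdate(SparkListenerEnvironmentUpdate environmentUpdate) {
        // fired when the environment (JVM, Spark, classpath properties) changes
        System.out.println("Environment updated: " + environmentUpdate.environmentDetails());
    }
}

// registration, assuming a JavaSparkContext named jsc:
// jsc.sc().addSparkListener(new EnvUpdateListener());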
Hi All,
Please let me know how we can redirect the Spark log files, or tell Spark to
log to a Kafka queue instead of files.
Ashish
Hi All,
What is the best way to tell a Spark Streaming job the number of partitions
for a given topic?
Should that be provided as a parameter or command-line argument,
or
should we connect to Kafka in the driver program and query it?
Map fromOffsets = new
Gerard
>
> On Mon, Jan 25, 2016 at 5:31 PM, Ashish Soni <asoni.le...@gmail.com>
> wrote:
>
Hi,
I see strange behavior when I create a standalone Spark container using
Docker.
Not sure why, but by default it assigns 4 cores to the first job submitted,
and then all the other jobs are in a waiting state. Please suggest if there
is a setting to change this.
I tried --executor-cores 1 but
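A hedged guess, since the full setup isn't shown: in standalone mode the first application grabs every available core unless capped, so the usual suggestion is spark.cores.max, e.g.:

spark-submit --conf spark.cores.max=1 ...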
Need more details, but you might want to filter the data first (create
multiple RDDs) and then process each one.
> On Oct 5, 2015, at 8:35 PM, Chen Song wrote:
>
> We have a use case with the following design in Spark Streaming.
>
> Within each batch,
> * data is read and
Try this:
You can use dstream.map to convert it to a JavaDStream with only the data you
are interested in (probably returning a POJO of your JSON),
and then call foreachRDD, and inside that call the line below:
javaFunctions(rdd).writerBuilder("table", "keyspace",
mapToRow(Class.class)).saveToCassandra();
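A fuller hedged sketch of that suggestion (Spark 1.x Java API; the Event POJO, the parseEvent helper, and the keyspace/table names are all illustrative, not from the original mail — and note the connector's writerBuilder takes the keyspace first, then the table):

import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.Function;
import org.apache.spark.streaming.api.java.JavaDStream;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapToRow;

// keep only the fields you care about by mapping the JSON to a POJO
JavaDStream<Event> events = lines.map(new Function<String, Event>() {
    @Override
    public Event call(String json) throws Exception {
        return parseEvent(json); // illustrative JSON-to-POJO mapping
    }
});

// write each micro-batch out to Cassandra
events.foreachRDD(new Function<JavaRDD<Event>, Void>() {
    @Override
    public Void call(JavaRDD<Event> rdd) {
        javaFunctions(rdd)
            .writerBuilder("my_keyspace", "events", mapToRow(Event.class))
            .saveToCassandra();
        return null;
    }
});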
On
gmail.com> wrote:
>
> You can use JavaSparkContext.setLogLevel to set the log level in your
> code.
>
> Best Regards,
> Shixiong Zhu
>
> 2015-09-28 22:55 GMT+08:00 Ashish Soni <asoni.le...@gmail.com>:
>
>> I am not running it using spark-submit; I am running it locally insid
w.com/questions/28840438/how-to-override-sparks-log4j-properties-per-driver
>
> From: Ashish Soni
> Date: Monday, September 28, 2015 at 5:18 PM
> To: user
> Subject: Spark Streaming Log4j Inside Eclipse
>
> I need to turn off the verbose logging of Spark Streaming Code when i
Hi All,
I need to turn off the verbose logging of my Spark Streaming code when I am
running inside Eclipse. I tried creating a log4j.properties file and placing
it inside /src/main/resources, but I do not see it having any effect. Please
help, as I am not sure what else needs to be done to change the log at
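One workaround often suggested for IDE runs — a hedged sketch, assuming log4j 1.x is on the classpath — is to set the levels programmatically at the top of main, before creating the context:

import org.apache.log4j.Level;
import org.apache.log4j.Logger;

// quiet Spark's and Akka's internal logging before the streaming context starts
Logger.getLogger("org.apache.spark").setLevel(Level.WARN);
Logger.getLogger("akka").setLevel(Level.WARN);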
Hi All,
Just wanted to know if there are any benefits to installing Kafka brokers and
Spark nodes on the same machines.
Is it possible for Spark to pull data from Kafka locally, i.e. when the broker
or partition is on the same machine as the node?
Thanks,
Ashish
Hi,
How can I pass a dynamic value to the filter in the function below, instead
of hardcoding it?
I have an existing RDD and would like to use its data for the filter, so
instead of doing .where("name=?","Anna") I want to do
.where("name=?",someobject.value)
Please help
JavaRDD rdd3 =
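Assuming this is the spark-cassandra-connector's .where, a hedged sketch (Person, keyspace, and table are illustrative). The usual trick is to copy the value into a local variable first, so the closure captures only the string and not the whole enclosing object:

import org.apache.spark.api.java.JavaRDD;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.javaFunctions;
import static com.datastax.spark.connector.japi.CassandraJavaUtil.mapRowTo;

String name = someobject.value; // copy to a local so the closure stays serializable
JavaRDD<Person> rdd3 = javaFunctions(sc)
    .cassandraTable("my_keyspace", "people", mapRowTo(Person.class))
    .where("name=?", name);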
Hi All,
Are there any frameworks that can be used to execute workflows within Spark?
Or is it possible to use the ML Pipeline API for workflow execution without
doing ML?
Thanks,
Ashish
>
> Flat map concatenates the results, so you get
>
> 1,2,3, 2,3, 3,3
>
> You should get the same with any Scala collection
>
> Cheers
>
> *From:* Ashish Soni [mailto:asoni.le...@gmail.com]
> *Sent:* Thursday, Se
Hi,
Can someone please explain the output of the flatMap below?
The data in the RDD is:
{1, 2, 3, 3}
rdd.flatMap(x => x.to(3))
The output is:
{1, 2, 3, 2, 3, 3, 3}
I am not able to understand how the output came out as above.
Thanks,
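For the Java API used elsewhere in this thread, a minimal hedged sketch of the same expansion, assuming a JavaSparkContext named sc: each element x becomes the range x..3, and flatMap concatenates the per-element results (1 -> [1,2,3], 2 -> [2,3], 3 -> [3], 3 -> [3]):

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.function.FlatMapFunction;

JavaRDD<Integer> rdd = sc.parallelize(Arrays.asList(1, 2, 3, 3));
JavaRDD<Integer> expanded = rdd.flatMap(new FlatMapFunction<Integer, Integer>() {
    @Override
    public Iterable<Integer> call(Integer x) {
        List<Integer> range = new ArrayList<Integer>();
        for (int i = x; i <= 3; i++) {
            range.add(i);
        }
        return range; // Spark 1.x flatMap expects an Iterable per input element
    }
});
// expanded.collect() -> [1, 2, 3, 2, 3, 3, 3]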
Please help, as I am not sure what is incorrect with the below code; it gives
me a compilation error in Eclipse:
SparkConf sparkConf = new
SparkConf().setMaster("local[4]").setAppName("JavaDirectKafkaWordCount");
JavaStreamingContext jssc = new JavaStreamingContext(sparkConf,
Hi All,
I am having a class-loading issue: the Spark assembly uses Google Guice
internally, and one of the jars I am using depends on sisu-guice-3.1.0-no_aop.jar.
How do I get my classes loaded first, so that this doesn't result in an error,
telling Spark to load its assembly later?
Ashish
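One knob often suggested for this — an assumption on my part, since I haven't tried it with sisu-guice specifically — is the experimental user-classpath-first setting available in Spark 1.3+:

spark-submit \
  --conf spark.driver.userClassPathFirst=true \
  --conf spark.executor.userClassPathFirst=true \
  --jars sisu-guice-3.1.0-no_aop.jar \
  ...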
Hi All,
I have an XML file with the same tag repeated multiple times, as below. Please
suggest what would be the best way to process this data inside Spark.
How can I extract each opening and closing tag and process them, or how can I
combine multiple lines into a single line?
<review>
</review>
<review>
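If each record is wrapped in a repeated <review> tag, one hedged option is the third-party spark-xml package (com.databricks:spark-xml), which handles multi-line records via a row tag; a sketch assuming a Spark 1.x SQLContext named sqlContext and an illustrative file name:

DataFrame reviews = sqlContext.read()
    .format("com.databricks.spark.xml")
    .option("rowTag", "review")   // each <review>...</review> becomes one row
    .load("reviews.xml");
reviews.printSchema();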
Hi All,
How can I broadcast a data change to all the executors every 10 minutes or
every 1 minute?
Ashish
Hi All,
If someone can help me understand which portion of the below code gets
executed on the driver and which portion on the executors, it would be a great
help.
I have to load data from 10 tables and then use that data in various
manipulations, and I am using Spark SQL.
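A rough hedged illustration of the split, reusing the jdbc options style from this thread (the "amount" column and variable names are illustrative). The rule of thumb: code that builds the plan runs on the driver, the bodies of functions passed to transformations run on the executors, and actions bring results back to the driver:

// driver: these lines only *define* the computation
DataFrame df = sqlContext.read().format("jdbc").options(options).load();
DataFrame filtered = df.filter("amount > 100");

// executors: the body of this function runs on the workers, once per row
JavaRDD<String> names = filtered.toJavaRDD().map(new Function<Row, String>() {
    @Override
    public String call(Row r) {
        return r.getString(0);
    }
});

// driver: actions pull the results back into the driver JVM
List<String> result = names.collect();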
Hi All,
I have a stream of events coming in, and I want to fetch some additional data
from the database based on the values in the incoming data. For example, below
is the data coming in:
loginName
Email
address
city
Now for each login name I need to go to the Oracle database and get the userId
from
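A common hedged pattern for this kind of lookup — assuming an Event POJO with loginName/userId fields, an events JavaDStream, and a reachable Oracle JDBC URL, all illustrative — is to enrich per partition, so you open one connection per partition rather than per record:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import org.apache.spark.api.java.function.FlatMapFunction;
import org.apache.spark.streaming.api.java.JavaDStream;

JavaDStream<Event> enriched = events.mapPartitions(
    new FlatMapFunction<Iterator<Event>, Event>() {
        @Override
        public Iterable<Event> call(Iterator<Event> part) throws Exception {
            // one connection per partition, opened on the executor
            Connection conn = DriverManager.getConnection(
                "jdbc:oracle:thin:@//dbhost:1521/orcl", "user", "pass");
            PreparedStatement ps = conn.prepareStatement(
                "SELECT user_id FROM users WHERE login_name = ?");
            List<Event> out = new ArrayList<Event>();
            while (part.hasNext()) {
                Event e = part.next();
                ps.setString(1, e.loginName);
                ResultSet rs = ps.executeQuery();
                if (rs.next()) e.userId = rs.getString(1);
                rs.close();
                out.add(e);
            }
            ps.close();
            conn.close();
            return out;
        }
    });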
Hi,
I need to load 10 tables in memory and have them available to all the
workers. Please let me know the best way to broadcast them;
sc.broadcast(df) allows only one
Thanks,
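sc.broadcast is not limited to one call — you can keep one broadcast variable per table. A hedged sketch, assuming the tables are small enough to collect, a Spark 1.x SQLContext named sqlContext, a JavaSparkContext named sc, and illustrative table names:

import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import org.apache.spark.broadcast.Broadcast;
import org.apache.spark.sql.DataFrame;
import org.apache.spark.sql.Row;

Map<String, Broadcast<List<Row>>> lookups = new HashMap<String, Broadcast<List<Row>>>();
for (String table : Arrays.asList("users", "accounts", "rates")) {
    options.put("dbtable", table);
    DataFrame df = sqlContext.read().format("jdbc").options(options).load();
    // collect to the driver (the table must fit in memory), then broadcast
    lookups.put(table, sc.broadcast(Arrays.asList(df.collect())));
}
// inside a closure on the workers: List<Row> users = lookups.get("users").value();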
Hi,
How can I use the map function in Java to convert all the lines of a CSV file
into a list of objects? Can someone please help...
JavaRDD<List<Charge>> rdd = sc.textFile("data.csv").map(new
Function<String, List<Charge>>() {
@Override
public List<Charge> call(String s) {
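A hedged completion under the same assumptions (the Charge fields are illustrative; mapping each line to a single Charge is usually more natural than a List<Charge> per line):

JavaRDD<Charge> charges = sc.textFile("data.csv").map(new Function<String, Charge>() {
    @Override
    public Charge call(String line) throws Exception {
        String[] cols = line.split(",");
        Charge c = new Charge();
        c.id = cols[0];                          // illustrative column layout
        c.amount = Double.parseDouble(cols[1]);
        return c;
    }
});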
Hi All,
I have a DataFrame created as below:
options.put("dbtable", "(select * from user) as account");
DataFrame accountRdd =
sqlContext.read().format("jdbc").options(options).load();
and I have another RDD which contains login names; I want to find the
userid from the above DataFrame and return
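Rather than calling the DataFrame from inside rdd.map (which fails, as the reply below notes, because it is not serializable), one hedged alternative is to turn the login-name RDD into a second DataFrame and join; loginsDf and the column names are illustrative:

// loginsDf: a DataFrame built from the login-name RDD, e.g. via sqlContext.createDataFrame
DataFrame withUserId = loginsDf.join(
        accountRdd,
        loginsDf.col("loginName").equalTo(accountRdd.col("loginName")))
    .select(loginsDf.col("loginName"), accountRdd.col("userid"));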
rdd.map function...
The RDD object is not serializable. Whatever objects you use inside the map
function should be serializable, as they get transferred to the executor nodes.
On Jul 2, 2015 6:13 AM, Ashish Soni asoni.le...@gmail.com wrote:
Hi All,
I am not sure what is wrong with the below code, as it gives the below error
when I access it inside the map, but it works outside:
JavaRDD<Charge> rdd2 = rdd.map(new Function<Charge, Charge>() {
@Override
public Charge call(Charge ch) throws Exception {
Hi All,
What is the best possible way to load multiple database tables using Spark SQL?
Map<String, String> options = new HashMap<String, String>();
options.put("driver", MYSQLDR);
options.put("url", MYSQL_CN_URL);
options.put("dbtable", "(select * from courses)");
Can I add multiple tables to the options map?
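The jdbc source takes a single dbtable per load, so the usual hedged answer is one DataFrame per table rather than multiple tables in one options map (table names illustrative):

Map<String, DataFrame> tables = new HashMap<String, DataFrame>();
for (String t : Arrays.asList("courses", "students", "teachers")) {
    Map<String, String> opts = new HashMap<String, String>(options);
    opts.put("dbtable", t);
    tables.put(t, sqlContext.read().format("jdbc").options(opts).load());
}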
Not sure what the issue is, but when I run spark-submit or spark-shell I get
the below error:
/usr/bin/spark-class: line 24: /usr/bin/load-spark-env.sh: No such file or
directory
Can someone please help?
Thanks,
Hi,
If I have the below data format, how can I use the Kafka direct stream to
deserialize it? I am not able to understand all the parameters I need to
pass. Can someone explain what the arguments should be, as I am not clear
about this?
JavaPairInputDStream
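For plain string payloads, a hedged sketch of the Spark 1.x direct API (broker address and topic name are illustrative; the four class parameters are just the key/value types plus the Kafka decoders that deserialize the raw bytes):

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import kafka.serializer.StringDecoder;
import org.apache.spark.streaming.api.java.JavaPairInputDStream;
import org.apache.spark.streaming.kafka.KafkaUtils;

Map<String, String> kafkaParams = new HashMap<String, String>();
kafkaParams.put("metadata.broker.list", "broker1:9092");
Set<String> topics = Collections.singleton("events");

JavaPairInputDStream<String, String> stream = KafkaUtils.createDirectStream(
    jssc,                                      // the JavaStreamingContext
    String.class, String.class,                // key and value types
    StringDecoder.class, StringDecoder.class,  // decoders for the raw Kafka bytes
    kafkaParams, topics);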
for you to start.
Thanks
Best Regards
On Fri, Jun 26, 2015 at 6:09 PM, Ashish Soni asoni.le...@gmail.com
wrote:
Hi All,
We are looking to use Spark as our stream-processing framework, and it would
be helpful if experts could weigh in on whether we made the right choice,
given the requirement below:
Given a stream of data, we need to take those events through multiple stages
(pipeline processing), and in those stages the customer will
Hi All,
What is the difference between the below, in terms of execution on a cluster
with 1 or more worker nodes?
rdd.map(...).map(...)...map(..)
vs
val rdd1 = rdd.map(...)
val rdd2 = rdd1.map(...)
val rdd3 = rdd2.map(...)
Thanks,
Ashish
Hi All,
What is the best way to install a Spark cluster alongside a Hadoop cluster?
Any recommendation for the below deployment topology would be a great help.
Also, is it necessary to put the Spark Workers on the DataNodes so that, when
they read blocks from HDFS, the data is local to the server / worker, or
Can anyone help? I am getting the below error when I try to start the History
Server.
I do not see any org.apache.spark.deploy.yarn.history package inside the
assembly jar, and I am not sure how to get it.
java.lang.ClassNotFoundException:
org.apache.spark.deploy.yarn.history.YarnHistoryProvider
Thanks,
hw...@qilinsoft.com
*Date:* 2015-06-19 18:47
*To:* Enno Shioji eshi...@gmail.com; Tathagata Das t...@databricks.com
*CC:* prajod.vettiyat...@wipro.com; Cody Koeninger c...@koeninger.org;
bit1...@163.com; Jordan Pilat jrpi...@gmail.com; Will Briggs
wrbri...@gmail.com; Ashish Soni asoni.le
Hi,
Has anyone been able to install Spark 1.4 on HDP 2.2? Please let me know how
I can do the same.
Ashish
version?
On Fri, Jun 19, 2015 at 10:22 PM, Ashish Soni asoni.le...@gmail.com
wrote:
--
Best Regards,
Ayan Guha
Can someone please let me know everything I need to configure to have Spark
run on YARN?
There is a lot of documentation, but none of it says how and which files need
to be changed.
Let's say I have 4 nodes for Spark: SparkMaster, SparkSlave1, SparkSlave2,
SparkSlave3.
Now on which node
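A minimal hedged sketch, assuming the Hadoop client configs live in /etc/hadoop/conf on the node you submit from (the application class and jar are illustrative). In yarn mode Spark ships itself to the cluster, so typically only the submitting node needs the Spark install plus a pointer to the YARN configs:

# in conf/spark-env.sh on the submitting node
export HADOOP_CONF_DIR=/etc/hadoop/conf

# then submit against YARN instead of a standalone master
spark-submit --master yarn-client --class com.example.MyApp myapp.jar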
My use case is below:
We are going to receive a lot of events as a stream (basically a Kafka
stream), and then we need to process and compute.
Consider that you have a phone contract with AT&T, and every call / SMS / data
usage you make is an event; your bill then needs to be calculated in real
time, so
Hi Sparkers,
https://dl.acm.org/citation.cfm?id=2742788
Twitter recently released a paper on Heron as a replacement for Apache Storm,
and I would like to know whether Apache Spark currently suffers from the same
issues they have outlined.
Any input / thoughts will be helpful.
Thanks,
Ashish
idempotent. Replacing the entire state is the easiest way to do
it, but it's obviously expensive.
The alternative is to do something similar to what Storm does. At that
point, though, you'll have to ask whether just using Storm is easier than that.
On Wed, Jun 17, 2015 at 1:50 PM, Ashish Soni asoni.le
something yourself, or you can use Storm
Trident (or transactional low-level API).
On Wed, Jun 17, 2015 at 1:26 PM, Ashish Soni asoni.le...@gmail.com
wrote:
would definitely pursue this route as
our transformations are really simple.
Best
On Wed, Jun 17, 2015 at 10:26 PM, Ashish Soni asoni.le...@gmail.com
wrote: