Re: Can I control the execution of Spark jobs?

2016-06-18 Thread Jacek Laskowski
Hi,

Ahh, that makes sense now.

Spark works like this by default. You just run your first pipeline and
then another one (and perhaps some more). Since the pipelines are
processed serially (one by one), you implicitly create a dependency
between the Spark jobs. No special steps are needed.

pipeline == load a dataset, transform it and save it to persistent storage
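A minimal, Spark-free sketch of that control flow in plain Python (the pipeline names and data are made up for illustration): each pipeline runs as an ordinary sequential step, so the second one cannot start before the first finishes, which is the same implicit ordering you get between Spark jobs in a single driver program.

```python
import json
import os
import tempfile

def pipeline_one(out_dir):
    # Load a dataset, transform it, and save it to persistent storage.
    events = [{"id": i, "value": i * 2} for i in range(3)]
    path = os.path.join(out_dir, "events.json")
    with open(path, "w") as f:
        json.dump(events, f)
    return path

def pipeline_two(in_path):
    # Consumes the output of pipeline_one, so it can only run afterwards.
    with open(in_path) as f:
        events = json.load(f)
    return sum(e["value"] for e in events)

out_dir = tempfile.mkdtemp()
events_path = pipeline_one(out_dir)   # first "job" finishes here
total = pipeline_two(events_path)     # second "job" starts only now
print(total)  # 0 + 2 + 4 = 6
```

In a real Spark driver the same thing happens: each action (e.g. a save) blocks until its job completes, so simply writing the pipelines one after another in the driver enforces the order.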

Regards,
Jacek Laskowski

https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Fri, Jun 17, 2016 at 4:15 AM, Haopu Wang <hw...@qilinsoft.com> wrote:
> Jacek,
>
> For example, one ETL job saves raw events and updates a file.
> The other job then uses that file's content to process the data set.
>
> In this case, the first job has to be done before the second one. That's what 
> I mean by dependency. Any suggestions/comments are appreciated.
>
> -Original Message-
> From: Jacek Laskowski [mailto:ja...@japila.pl]
> Sent: June 16, 2016 19:09
> To: user
> Subject: Re: Can I control the execution of Spark jobs?
>
> Hi,
>
> When you say "several ETL types of things", what is this exactly? What
> would an example of "dependency between these jobs" be?
>
> Regards,
> Jacek Laskowski
> 
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Thu, Jun 16, 2016 at 11:36 AM, Haopu Wang <hw...@qilinsoft.com> wrote:
>> Hi,
>>
>>
>>
>> Suppose I have a Spark application which is doing several ETL types of
>> things.
>>
>> I understand Spark can analyze and generate several jobs to execute.
>>
>> The question is: is it possible to control the dependency between these
>> jobs?
>>
>>
>>
>> Thanks!
>>
>>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>




RE: Can I control the execution of Spark jobs?

2016-06-16 Thread Haopu Wang
Jacek,

For example, one ETL job saves raw events and updates a file.
The other job then uses that file's content to process the data set.

In this case, the first job has to be done before the second one. That's what I 
mean by dependency. Any suggestions/comments are appreciated.
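To make that concrete, here is a small plain-Python sketch of the pattern (no Spark; the function and file names are made up): the first job writes the raw events plus a `_SUCCESS` control file, and the second job checks for that control file before reading, borrowing the marker-file convention that Hadoop output committers use.

```python
import os
import tempfile

def etl_save_raw_events(out_dir):
    # First job: persist the raw events, then write a control file
    # to signal that the output is complete.
    with open(os.path.join(out_dir, "raw_events.txt"), "w") as f:
        f.write("event-1\nevent-2\n")
    with open(os.path.join(out_dir, "_SUCCESS"), "w") as f:
        f.write("done")

def etl_process(out_dir):
    # Second job: refuse to start until the first job's control
    # file exists, then consume its output.
    if not os.path.exists(os.path.join(out_dir, "_SUCCESS")):
        raise RuntimeError("first job has not finished yet")
    with open(os.path.join(out_dir, "raw_events.txt")) as f:
        return [line.strip() for line in f if line.strip()]

work_dir = tempfile.mkdtemp()
etl_save_raw_events(work_dir)   # must complete before the next call
events = etl_process(work_dir)  # safe: _SUCCESS exists by now
```

Because the two calls run serially in one driver, the dependency is enforced by ordinary program order; the control-file check only matters if the jobs might ever be launched independently.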

-Original Message-
From: Jacek Laskowski [mailto:ja...@japila.pl] 
Sent: June 16, 2016 19:09
To: user
Subject: Re: Can I control the execution of Spark jobs?

Hi,

When you say "several ETL types of things", what is this exactly? What
would an example of "dependency between these jobs" be?

Regards,
Jacek Laskowski

https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Thu, Jun 16, 2016 at 11:36 AM, Haopu Wang <hw...@qilinsoft.com> wrote:
> Hi,
>
>
>
> Suppose I have a Spark application which is doing several ETL types of
> things.
>
> I understand Spark can analyze and generate several jobs to execute.
>
> The question is: is it possible to control the dependency between these
> jobs?
>
>
>
> Thanks!
>
>






Re: Can I control the execution of Spark jobs?

2016-06-16 Thread Alonso Isidoro Roman
Hi Wang,

maybe you can consider using an integration framework like Apache Camel in
order to run the different jobs...

Alonso Isidoro Roman
https://about.me/alonso.isidoro.roman


2016-06-16 13:08 GMT+02:00 Jacek Laskowski:

> Hi,
>
> When you say "several ETL types of things", what is this exactly? What
> would an example of "dependency between these jobs" be?
>
> Regards,
> Jacek Laskowski
> 
> https://medium.com/@jaceklaskowski/
> Mastering Apache Spark http://bit.ly/mastering-apache-spark
> Follow me at https://twitter.com/jaceklaskowski
>
>
> On Thu, Jun 16, 2016 at 11:36 AM, Haopu Wang  wrote:
> > Hi,
> >
> >
> >
> > Suppose I have a Spark application which is doing several ETL types of
> > things.
> >
> > I understand Spark can analyze and generate several jobs to execute.
> >
> > The question is: is it possible to control the dependency between these
> > jobs?
> >
> >
> >
> > Thanks!
> >
> >
>
>
>


Re: Can I control the execution of Spark jobs?

2016-06-16 Thread Jacek Laskowski
Hi,

When you say "several ETL types of things", what is this exactly? What
would an example of "dependency between these jobs" be?

Regards,
Jacek Laskowski

https://medium.com/@jaceklaskowski/
Mastering Apache Spark http://bit.ly/mastering-apache-spark
Follow me at https://twitter.com/jaceklaskowski


On Thu, Jun 16, 2016 at 11:36 AM, Haopu Wang  wrote:
> Hi,
>
>
>
> Suppose I have a Spark application which is doing several ETL types of
> things.
>
> I understand Spark can analyze and generate several jobs to execute.
>
> The question is: is it possible to control the dependency between these
> jobs?
>
>
>
> Thanks!
>
>
