Re: Is spark a right tool for updating a dataframe repeatedly

2016-10-17 Thread Mike Metzger
E.g. if the data is completely contained in memory and there is no spillover to disk, it might not be a big issue (of course there will still be memory, CPU and network overhead/latency). If you are looking at storing the data on disk (e.g. as pa

Re: Is spark a right tool for updating a dataframe repeatedly

2016-10-17 Thread Mungeol Heo
ata on disk (e.g. as part of a checkpoint or explicit storage), then there can be substantial I/O activity. From: Xi Shen <davidshe...@gmail.com> Date: Monday, October 17, 2016 at 2:54 AM To: Divya Gehlot <divya.htco...@gmail.com>, Munge

Re: Is spark a right tool for updating a dataframe repeatedly

2016-10-17 Thread Thakrar, Jayesh
com> Date: Monday, October 17, 2016 at 2:54 AM To: Divya Gehlot <divya.htco...@gmail.com>, Mungeol Heo <mungeol@gmail.com> Cc: "user @spark" <user@spark.apache.org> Subject: Re: Is spark a right tool for updating a dataframe repeatedly I think most of the "

Re: Is spark a right tool for updating a dataframe repeatedly

2016-10-17 Thread Xi Shen
mungeol@gmail.com> wrote: Hello, everyone. As I mentioned in the title, I wonder whether Spark is the right tool for updating a data frame repeatedly until there is no more data to update. For example: while (there was an update) {

Re: Is spark a right tool for updating a dataframe repeatedly

2016-10-17 Thread Divya Gehlot
everyone. As I mentioned in the title, I wonder whether Spark is the right tool for updating a data frame repeatedly until there is no more data to update. For example: while (there was an update) { update data frame A } If it is the right tool,

Is spark a right tool for updating a dataframe repeatedly

2016-10-16 Thread Mungeol Heo
Hello, everyone. As I mentioned in the title, I wonder whether Spark is the right tool for updating a data frame repeatedly until there is no more data to update. For example: while (there was an update) { update data frame A } If it is the right tool, then what is the best practice

Re: Is Spark the right tool for me?

2014-12-02 Thread andy petrella
this is ok? ~Ben From: andy petrella andy.petre...@gmail.com Date: Monday, December 1, 2014 15:48 To: Benjamin Stadin benjamin.sta...@heidelberg-mobil.com, user@spark.apache.org Subject: Re: Is Spark the right tool for me? Indeed. However, I guess the important load

Re: Is Spark the right tool for me?

2014-12-02 Thread Stadin, Benjamin
user@spark.apache.org Subject: Re: Is Spark the right tool for me? Point 4 looks weird to me, I mean if you intend to have such a workflow run in a single session (maybe consider a sessionless arch). Is such a process for each user? If it's the case, maybe finding a way

Re: Is Spark the right tool for me?

2014-12-02 Thread andy petrella
2014 10:00 To: Benjamin Stadin benjamin.sta...@heidelberg-mobil.com, user@spark.apache.org Subject: Re: Is Spark the right tool for me? Point 4 looks weird to me, I mean if you intend to have such a workflow run in a single session (maybe consider a sessionless arch

Re: Is Spark the right tool for me?

2014-12-02 Thread Roger Hoover
of a user session. = Oozie Do you think this is ok? ~Ben From: andy petrella andy.petre...@gmail.com Date: Monday, December 1, 2014 15:48 To: Benjamin Stadin benjamin.sta...@heidelberg-mobil.com, user@spark.apache.org Subject: Re: Is Spark the right tool for me

Is Spark the right tool for me?

2014-12-01 Thread Stadin, Benjamin
Hi all, I need some advice on whether Spark is the right tool for my zoo. My requirements share commonalities with "big data", workflow coordination and "reactive" event-driven data processing (as in, for example, Haskell Arrows), which doesn’t make it any easier to decide on a tool set. NB: I

Re: Is Spark the right tool for me?

2014-12-01 Thread andy petrella
transaction... unless you go to a dedicated database like Vertica (just mentioning) kr, andy On Mon Dec 01 2014 at 2:49:44 PM Stadin, Benjamin benjamin.sta...@heidelberg-mobil.com wrote: Hi all, I need some advice on whether Spark is the right tool for my zoo. My requirements share

Re: Is Spark the right tool for me?

2014-12-01 Thread Stadin, Benjamin
2014 15:07 To: Benjamin Stadin benjamin.sta...@heidelberg-mobil.com, user@spark.apache.org Subject: Re: Is Spark the right tool for me? Not quite sure which geo processing

Re: Is Spark the right tool for me?

2014-12-01 Thread Stadin, Benjamin
, user@spark.apache.org Subject: Re: Is Spark the right tool for me? Not quite sure which geo processing you're doing: is it raster or vector? More info would help me to help you further. Meanwhile I can

Re: Is Spark the right tool for me?

2014-12-01 Thread andy petrella
://www.deep-map.com From: andy petrella andy.petre...@gmail.com Date: Monday, December 1, 2014 15:07 To: Benjamin Stadin benjamin.sta...@heidelberg-mobil.com, user@spark.apache.org Subject: Re: Is Spark the right tool for me? Not quite sure which geo processing you're

Re: Is Spark the right tool?

2014-10-28 Thread Akhil
it. I need insertion, deletion, and lookup to be fast. Is this something that can be done with Spark and is Spark the right tool to use in terms of latency and throughput? Pardon me if I don't know what I am talking about. All these are very new to me. Thanks! -- View this message

Re: Is Spark the right tool?

2014-10-28 Thread Koert Kuipers
and list (or vector) of transaction records as data. This RDD needs to be thread (or process) safe since different threads and processes will be reading and modifying it. I need insertion, deletion, and lookup to be fast. Is this something that can be done with Spark and is Spark the right tool

Is Spark the right tool?

2014-10-19 Thread kc66
with Spark and is Spark the right tool to use in terms of latency and throughput? Pardon me if I don't know what I am talking about. All these are very new to me. Thanks! -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Is-Spark-the-right-tool-tp16775.html Sent