Re: Catalyst dependency on Spark Core

2014-07-15 Thread Sean Owen
Agree. You end up with a core and a corer core to distinguish between and it ends up just being more complicated. This sounds like something that doesn't need a module. On Tue, Jul 15, 2014 at 5:59 AM, Patrick Wendell pwend...@gmail.com wrote: Adding new build modules is pretty high overhead, so

traveling next week

2014-07-15 Thread Cody Koeninger
I'm going to be on a plane Wed 23, return flight Monday 28, so will miss daily call those days. I'll be pushing forward on projects as I can, but Skype availability may be limited, so email if you need something from me.

[brainsotrming] Generalization of DStream, a ContinuousRDD ?

2014-07-15 Thread andy petrella
Dear Sparkers, [Sorry for the lengthy email; head to the gist https://gist.github.com/andypetrella/12228eb24eea6b3e1389 for a preview :-p] I would like to share some thoughts prompted by a use case I faced. Basically, as the subject announces, it's a generalization of the DStream

Re: traveling next week

2014-07-15 Thread Patrick Wendell
Cody - did you mean to send this to the spark dev list? On Tue, Jul 15, 2014 at 7:15 AM, Cody Koeninger cody.koenin...@mediacrossing.com wrote: I'm going to be on a plane Wed 23, return flight Monday 28, so will miss daily call those days. I'll be pushing forward on projects as I can, but

Re: traveling next week

2014-07-15 Thread Cody Koeninger
No, sorry for the mixup, it was a helpful autocomplete similarity between an internal work list and the spark dev list :( Switched my spark mailing list subscription back to my personal email so you guys won't be subjected to further unwanted email. On Tue, Jul 15, 2014 at 12:36 PM, Patrick

Re: Reproducible deadlock in 1.0.1, possibly related to Spark-1097

2014-07-15 Thread Cody Koeninger
We tested that patch from aarondav's branch, and are no longer seeing that deadlock. Seems to have solved the problem, at least for us. On Mon, Jul 14, 2014 at 7:22 PM, Patrick Wendell pwend...@gmail.com wrote: Andrew and Gary, Would you guys be able to test

Re: Hadoop's Configuration object isn't threadsafe

2014-07-15 Thread yao
Good catch, Andrew. In addition to your proposed solution, is it possible to fix the Configuration class and make it thread-safe? I think the fix should be trivial, just use a ConcurrentHashMap, but I am not sure if we can push this change upstream (will the Hadoop guys accept this change? for them, it
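To illustrate the ConcurrentHashMap idea mentioned above, here is a minimal sketch. `ThreadSafeConf` is a hypothetical stand-in, not Hadoop's Configuration class, which has more state than a single map, so the real change may be less trivial than this suggests.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a configuration object (NOT Hadoop's Configuration).
final class ThreadSafeConf {
    // ConcurrentHashMap allows concurrent put/get without external locking.
    private final Map<String, String> props = new ConcurrentHashMap<>();

    void set(String key, String value) { props.put(key, value); }
    String get(String key) { return props.get(key); }
    int size() { return props.size(); }

    // Writes distinct keys from several threads at once and returns the final entry count.
    static int concurrentWrites(int threads, int keysPerThread) {
        ThreadSafeConf conf = new ThreadSafeConf();
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int id = t;
            workers[t] = new Thread(() -> {
                for (int k = 0; k < keysPerThread; k++) {
                    conf.set("key." + id + "." + k, "v");
                }
            });
            workers[t].start();
        }
        try {
            for (Thread w : workers) {
                w.join(); // wait for every writer to finish
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return conf.size();
    }
}
```

With a plain HashMap in place of the ConcurrentHashMap, the same concurrent writes could lose entries or corrupt the map.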

Re: Hadoop's Configuration object isn't threadsafe

2014-07-15 Thread Andrew Ash
Hi Shengzhe, Even if we did make Configuration threadsafe, it'd take quite some time for that to trickle down to a Hadoop release that we could actually rely on Spark users having installed. I agree we should consider whether making Configuration threadsafe is something that Hadoop should do,

Re: Hadoop's Configuration object isn't threadsafe

2014-07-15 Thread Patrick Wendell
Hey Andrew, Cloning the conf might be a good/simple fix for this particular problem. It's definitely worth looking into. There are a few things we can probably do in Spark to deal with non-thread-safety inside of the Hadoop FileSystem and Configuration classes. One thing we can do in
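A rough sketch of the "clone the conf" idea, for illustration only. `SimpleConf` and `CloneDemo` are hypothetical stand-ins so the example runs without Hadoop on the classpath. Hadoop's Configuration does offer a copy constructor, `new Configuration(other)`, which would play the role of the copy constructor here.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a non-thread-safe configuration object.
final class SimpleConf {
    private final Map<String, String> props = new HashMap<>();

    SimpleConf() {}

    // Copy constructor: the copy gets its own map, so later edits do not touch the source.
    SimpleConf(SimpleConf other) { props.putAll(other.props); }

    void set(String key, String value) { props.put(key, value); }
    String get(String key) { return props.get(key); }
}

final class CloneDemo {
    // Returns the shared conf's value after a worker thread mutates its private copy.
    static String sharedValueAfterWorkerEdit() {
        SimpleConf shared = new SimpleConf();
        shared.set("example.key", "original"); // illustrative key, not a real setting

        // Take the copy on the calling thread, before any worker runs.
        SimpleConf local = new SimpleConf(shared);
        Thread worker = new Thread(() -> local.set("example.key", "changed"));
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return shared.get("example.key");
    }
}
```

The copy itself still has to be made while no other thread is writing to the shared object; cloning only removes contention after that point.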

Re: [brainsotrming] Generalization of DStream, a ContinuousRDD ?

2014-07-15 Thread Tathagata Das
Very interesting ideas, Andy! Conceptually I think it makes sense. In fact, it is true that dealing with time series data, windowing over application time, and windowing over number of events are things that DStream does not natively support. The real challenge is actually mapping the conceptual

Re: Catalyst dependency on Spark Core

2014-07-15 Thread Matei Zaharia
Yeah, that seems like something we can inline :). On Jul 15, 2014, at 7:30 PM, Baofeng Zhang pelickzh...@qq.com wrote: Is Matei following this? Catalyst uses the Utils to get the ClassLoader which loaded Spark. Can Catalyst directly do getClass.getClassLoader to avoid the dependency on
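For reference, the inlining discussed here amounts to something like the following sketch. `ClassLoaderLookup` is a hypothetical name, and the context-loader fallback is an added assumption rather than something stated in the thread.

```java
// Hypothetical helper showing a class-loader lookup without a shared utility module.
final class ClassLoaderLookup {
    private ClassLoaderLookup() {}

    // The loader that loaded this class: the Java form of getClass.getClassLoader.
    static ClassLoader ownLoader() {
        return ClassLoaderLookup.class.getClassLoader();
    }

    // Assumption for illustration: prefer the thread's context loader when one is set.
    static ClassLoader contextOrOwnLoader() {
        ClassLoader ctx = Thread.currentThread().getContextClassLoader();
        return ctx != null ? ctx : ownLoader();
    }
}
```

Because the lookup is a one-liner, copying it into the calling module avoids pulling in a dependency just for this.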