+1
Thanks,
Mridul
On Thu, Jul 30, 2020 at 4:49 PM Holden Karau wrote:
> Hi Spark Developers,
>
> After the discussion of the proposal to amend Spark committer guidelines,
> it appears folks are generally in agreement on policy clarifications. (See
>
I agree, that would be a new feature; and unless there is a compelling reason
(like security concerns), it would not qualify.
Regards,
Mridul
On Wed, Jul 15, 2020 at 11:46 AM Wenchen Fan wrote:
> Supporting Python 3.8.0 sounds like a new feature, and doesn't qualify for a
> backport. But I'm open to other
Thanks Holden, this version looks good to me.
+1
Regards,
Mridul
On Thu, Jul 23, 2020 at 3:56 PM Imran Rashid wrote:
> Sure, that sounds good to me. +1
>
> On Wed, Jul 22, 2020 at 1:50 PM Holden Karau wrote:
>
>>
>>
>> On Wed, Jul 22, 2020 at 7:39 AM Imran Rashid <iras...@apache.org>
>>
Congratulations !
Regards,
Mridul
On Tue, Jul 14, 2020 at 12:37 PM Matei Zaharia
wrote:
> Hi all,
>
> The Spark PMC recently voted to add several new committers. Please join me
> in welcoming them to their new roles! The new committers are:
>
> - Huaxin Gao
> - Jungtaek Lim
> - Dilip Biswal
>
+1
Thanks,
Mridul
On Wed, Jul 1, 2020 at 6:36 PM Hyukjin Kwon wrote:
> +1
>
> On Thu, Jul 2, 2020 at 10:08 AM Marcelo Vanzin wrote:
>
>> I reviewed the docs and PRs from way before an SPIP was explicitly
>> asked, so I'm comfortable with giving a +1 even if I haven't really
>> fully read the new
Thanks for shepherding this Holden !
I left a few comments, but overall it looks good to me.
Regards,
Mridul
On Sat, Jun 27, 2020 at 9:34 PM Holden Karau wrote:
> There’s been some comments & a few additions in the doc, but it seems like
> the folks taking a look generally agree on the
Great job everyone ! Congratulations :-)
Regards,
Mridul
On Thu, Jun 18, 2020 at 10:21 AM Reynold Xin wrote:
> Hi all,
>
> Apache Spark 3.0.0 is the first release of the 3.x line. It builds on many
> of the innovations from Spark 2.x, bringing new ideas as well as continuing
> long-term
+1
Regards,
Mridul
On Sat, Jun 6, 2020 at 1:20 PM Reynold Xin wrote:
> Apologies for the mistake. The vote is open till 11:59pm Pacific time on
> Mon June 9th.
>
> On Sat, Jun 6, 2020 at 1:08 PM Reynold Xin wrote:
>
>> Please vote on releasing the following candidate as Apache Spark version
Is this a behavior change in 2.4.x from an earlier version ?
Or are we proposing to introduce new functionality to help with adoption ?
Regards,
Mridul
On Wed, Jun 3, 2020 at 10:32 AM Xiao Li wrote:
> Yes. Spark 3.0 RC2 works well.
>
> I think the current behavior in Spark 2.4 affects the
+1 (binding)
Thanks,
Mridul
On Sun, May 31, 2020 at 4:47 PM Holden Karau wrote:
> Please vote on releasing the following candidate as Apache Spark
> version 2.4.6.
>
> The vote is open until June 5th at 9AM PST and passes if a majority +1 PMC
> votes are cast, with a minimum of 3 +1 votes.
>
I agree with what Sean detailed.
The only place where I can see some amount of investigation being required
would be for security issues or correctness issues.
Knowing the affected versions, particularly if an earlier supported version
does not have the bug, will help users understand the
I am in broad agreement with the proposal; like any developer, I prefer
stable, well-designed APIs :-)
Can we tie the proposal to the stability guarantees given by spark and
reasonable expectations from users ?
In my opinion, an unstable or evolving api could change - while an
experimental api which has been
Very well put Imran. This is a variant of executor failure after an RDD has
been computed (including caching). In general, non-determinism in spark is
going to lead to inconsistency.
The only reasonable solution for us, at that time, was to make
pseudo-randomness repeatable and checkpoint after so
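A minimal Scala sketch of that repeatable-randomness idea (seed value, partition count, and checkpoint path are all illustrative): seed the RNG from the partition index so a recomputed partition regenerates identical data, then checkpoint to truncate the lineage.

    import scala.util.Random
    import org.apache.spark.{SparkConf, SparkContext}

    object RepeatableRandom {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("repeatable-random"))
        sc.setCheckpointDir("/tmp/spark-checkpoints") // illustrative path

        val seed = 42L
        val randomized = sc.parallelize(1 to 1000000, 8).mapPartitionsWithIndex {
          (idx, iter) =>
            val rng = new Random(seed + idx)     // deterministic per partition: a
            iter.map(x => (x, rng.nextDouble())) // re-run yields the same values
        }
        randomized.checkpoint() // cut the lineage once the values are fixed
        randomized.count()      // materializes and writes the checkpoint
        sc.stop()
      }
    }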
Just for completeness' sake, spark is not version neutral to hadoop;
particularly in yarn mode, there is a minimum version requirement
(though fairly generous I believe).
I agree with Steve, it is a long standing pain that we are bundling a
positively ancient version of hive.
Having said that, we
It makes more sense to drop support for zstd, assuming the fix is not
something at spark's end (configuration, etc).
It does not make sense to try to detect deadlock in the codec.
Regards,
Mridul
On Tue, Oct 1, 2019 at 8:39 PM Jungtaek Lim
wrote:
>
> Hi devs,
>
> I've discovered an issue with event logger,
Add a +1 from me as well.
Just managed to finish going over it.
Thanks Bobby for leading this effort !
Regards,
Mridul
On Wed, May 29, 2019 at 2:51 PM Tom Graves wrote:
>
> Ok, I'm going to call this vote and send the result email. We had 9 +1's (4
> binding) and 1 +0 and no -1's.
>
> Tom
>
>
Unfortunately I do not have bandwidth to do a detailed review, but a few
things come to mind after a quick read:
- While it might be tactically beneficial to align with the existing
implementation, a clean design which does not tie into the existing shuffle
implementation would be preferable (if it can
I am -1 on this vote for pretty much all the reasons that Mark mentioned.
A major version change gives us an opportunity to remove deprecated
interfaces, stabilize experimental/developer api, drop support for
outdated functionality/platforms and evolve the project with a vision
for foreseeable
Is this handling only scala, or java as well ?
Regards,
Mridul
On Thu, Nov 22, 2018 at 9:11 AM Cody Koeninger wrote:
> Plugin invocation is ./build/mvn mvn-scalafmt_2.12:format
>
> It takes about 5 seconds, and errors out on the first different file
> that doesn't match formatting.
>
> I made a
Is it only me, or are all others getting Wenchen's mails ? (Obviously Ryan
did :-) )
I did not see it in the mail thread I received or in the archives ... [1]
Wondering which other senders were getting dropped (if any).
Regards
Mridul
[1]
+1
I left a couple of comments in NiharS's PR, but this is very useful to
have in spark !
Regards,
Mridul
On Fri, Aug 3, 2018 at 10:00 AM Imran Rashid
wrote:
>
> I'd like to propose adding a plugin api for Executors, primarily for
> instrumentation and debugging
>
o non-serializable objects etc.
> In all these cases you know you are adding references you shouldn't.
> If users were used to another UX we can try to fix it; not sure how well
> this worked in the past though, and if it covered all cases.
>
> Regards,
> Stavros
>
> On Mon, Aug 6,
I agree, we should not work around the testcase but rather understand
and fix the root cause.
The closure cleaner should have nulled out the references and allowed it
to be serialized.
Regards,
Mridul
On Sun, Aug 5, 2018 at 8:38 PM Wenchen Fan wrote:
>
> It seems to me that the closure cleaner
I agree, I don't see a pressing need for a major version bump either.
Regards,
Mridul
On Fri, Jun 15, 2018 at 10:25 AM Mark Hamstra wrote:
>
> Changing major version numbers is not about new features or a vague notion
> that it is time to do something that will be seen to be a significant
>
Specifically to run spark with hadoop 3 docker support, I have filed a
few jira's tracked under [1].
Regards,
Mridul
[1] https://issues.apache.org/jira/browse/SPARK-23717
On Mon, Apr 2, 2018 at 1:00 PM, Reynold Xin wrote:
> Does anybody know what needs to be done in order
Congratulations !
Regards,
Mridul
On Fri, Mar 2, 2018 at 2:41 PM, Matei Zaharia wrote:
> Hi everyone,
>
> The Spark PMC has recently voted to add several new committers to the
> project, based on their contributions to Spark 2.3 and other past work:
>
> - Anirudh
On Wed, Jan 31, 2018 at 1:15 AM, Ruifeng Zheng wrote:
> HI all:
>
>
>
>1, Dataset API supports operation “sortWithinPartitions”, but in RDD
> API there is no counterpart (I know there is
> “repartitionAndSortWithinPartitions”, but I don’t want to repartition the
>
We should definitely clean this up and make it the default, nicely done
Marcelo !
Thanks,
Mridul
On Fri, Jan 5, 2018 at 5:06 PM Marcelo Vanzin wrote:
> Hey all, especially those working on the k8s stuff.
>
> Currently we have 3 docker images that need to be built and
We do support running on Apache Mesos via docker images - so this
would not be restricted to k8s.
But unlike mesos support, which has other modes of running, I believe
k8s support more heavily depends on the availability of docker images.
Regards,
Mridul
On Wed, Nov 29, 2017 at 8:56 AM, Sean Owen
I agree, proposal 1 sounds better among the options.
Regards,
Mridul
On Sun, Oct 1, 2017 at 3:50 PM, Reynold Xin wrote:
> Probably should do 1, and then it is an easier transition in 3.0.
>
> On Sun, Oct 1, 2017 at 1:28 AM Sean Owen wrote:
>>
>> I
Congratulations Tejas !
Regards,
Mridul
On Fri, Sep 29, 2017 at 12:58 PM, Matei Zaharia wrote:
> Hi all,
>
> The Spark PMC recently added Tejas Patil as a committer on the
> project. Tejas has been contributing across several areas of Spark for
> a while, focusing
Sounds good to me.
+1
Regards,
Mridul
On Tue, Sep 26, 2017 at 2:36 AM, Sean Owen wrote:
> Not a big deal, but I'm wondering whether Flume integration should at least
> be opt-in and behind a profile? it still sees some use (at least on our end)
> but not applicable to the
Congratulations Jerry, well deserved !
Regards,
Mridul
On Mon, Aug 28, 2017 at 6:28 PM, Matei Zaharia wrote:
> Hi everyone,
>
> The PMC recently voted to add Saisai (Jerry) Shao as a committer. Saisai has
> been contributing to many areas of the project for a long
While I definitely support the idea of Apache Spark being able to
leverage kubernetes, IMO it is better for the long-term evolution of spark
to expose an appropriate SPI such that this support need not necessarily
live within the Apache Spark code base.
It will allow for multiple backends to evolve,
Congratulations Hyukjin, Sameer !
Regards,
Mridul
On Mon, Aug 7, 2017 at 8:53 AM, Matei Zaharia wrote:
> Hi everyone,
>
> The Spark PMC recently voted to add Hyukjin Kwon and Sameer Agarwal as
> committers. Join me in congratulating both of them and thanking them for
Hi,
https://issues.apache.org/jira/browse/SPARK-20202?jql=priority%20%3D%20Blocker%20AND%20affectedVersion%20%3D%20%222.1.1%22%20and%20project%3D%22spark%22
indicates there is another blocker (SPARK-20197 should have shown up in
the list too, but was marked major).
Regards,
Mridul
On Tue, Apr 4,
Congratulations and welcome Holden and Burak !
Regards,
Mridul
On Tue, Jan 24, 2017 at 10:13 AM, Reynold Xin wrote:
> Hi all,
>
> Burak and Holden have recently been elected as Apache Spark committers.
>
> Burak has been very active in a large number of areas in Spark,
Since TaskContext.getPartitionId is part of the public api, it can't be
removed, as user code can be depending on it (unless we go through a
deprecation process for it).
Regards,
Mridul
On Sat, Jan 14, 2017 at 2:02 AM, Jacek Laskowski wrote:
> Hi,
>
> Just noticed that
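For context, such a deprecation step would look roughly like the sketch below (hypothetical code, not the actual Spark source; the replacement call and version string are illustrative):

    object TaskContextCompat {
      // Retire the member gradually instead of removing it outright:
      @deprecated("Use TaskContext.get().partitionId() instead", "1.2.0")
      def getPartitionId(): Int =
        org.apache.spark.TaskContext.get().partitionId() // only valid inside a task
    }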
Can someone add me to the edit list for the spark wiki please ?
Thanks,
Mridul
+1
Regards,
Mridul
On Wed, Sep 28, 2016 at 7:14 PM, Reynold Xin wrote:
> Please vote on releasing the following candidate as Apache Spark version
> 2.0.1. The vote is open until Sat, Oct 1, 2016 at 20:00 PDT and passes if a
> majority of at least 3 +1 PMC votes are cast.
>
>
When numPartitions is 0, there is no data in the rdd: so getPartition is
never invoked.
- Mridul
On Friday, September 16, 2016, WangJianfei
wrote:
> if so, we will get exception when the numPartitions is 0.
> def getPartition(key: Any): Int = key match {
>
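For reference, a hash-partitioner-style sketch of the method under discussion: with numPartitions == 0 the RDD has no partitions and no records, so the modulo below (which would otherwise divide by zero) is never evaluated.

    import org.apache.spark.Partitioner

    class SimpleHashPartitioner(val numPartitions: Int) extends Partitioner {
      def getPartition(key: Any): Int = key match {
        case null => 0
        // Only reachable when a record exists, i.e. when numPartitions > 0:
        case k => ((k.hashCode % numPartitions) + numPartitions) % numPartitions
      }
    }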
It is good to get clarification, but the way I read it, the issue is
whether we publish it as official Apache artifacts (in maven, etc).
Users can of course build it directly (and we can make it easy to do so) -
as they are explicitly agreeing to additional licenses.
Regards
Mridul
On
I agree, we should not be publishing both of them.
Thanks for bringing this up !
Regards,
Mridul
On Wed, Sep 7, 2016 at 1:29 AM, Sean Owen wrote:
> It's worth calling attention to:
>
> https://issues.apache.org/jira/browse/SPARK-17418
>
The example violates the basic contract of a Partitioner.
It does make sense to take a Partitioner as a param to distinct - though it
is fairly trivial to simulate that in user code as well (see the sketch below) ...
Regards
Mridul
On Wednesday, June 8, 2016, 汪洋 wrote:
> Hi Alexander,
>
> I
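The user-code simulation referenced above is a few lines (a sketch mirroring how RDD.distinct is itself built out of reduceByKey):

    import scala.reflect.ClassTag
    import org.apache.spark.Partitioner
    import org.apache.spark.rdd.RDD

    // distinct with a caller-supplied Partitioner, in plain user code:
    def distinctWith[T: ClassTag](rdd: RDD[T], part: Partitioner): RDD[T] =
      rdd.map(x => (x, null))
        .reduceByKey(part, (a, b) => a) // keep one value per key
        .map(_._1)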
Congratulations Yanbo !
Regards
Mridul
On Friday, June 3, 2016, Matei Zaharia wrote:
> Hi all,
>
> The PMC recently voted to add Yanbo Liang as a committer. Yanbo has been a
> super active contributor in many areas of MLlib. Please join me in
> welcoming Yanbo!
>
>
+1 (binding) on removing the maintainer process.
I agree with your opinion of "automatic" instead of a manual list.
Regards
Mridul
On Thursday, May 19, 2016, Matei Zaharia wrote:
> Hi folks,
>
> Around 1.5 years ago, Spark added a maintainer process for reviewing API
>
On Friday, April 15, 2016, Mattmann, Chris A (3980) <
chris.a.mattm...@jpl.nasa.gov> wrote:
> Yeah in support of this statement I think that my primary interest in
> this Spark Extras and the good work by Luciano here is that anytime we
> take bits out of a code base and “move it to GitHub” I see
In general, I agree - it is preferable to break backward compatibility
(where unavoidable) only at major versions.
Unfortunately, this usually is planned better - with earlier versions
announcing intent of the change - deprecation across multiple
releases, defaults changed, etc.
From the thread,
I think Reynold's suggestion of using a ram disk would be a good way to
test whether these are the bottleneck or something else is.
For most practical purposes, pointing local dir to ramdisk should
effectively give you 'similar' performance as shuffling from memory.
Are there concerns with taking that
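A sketch of that experiment, assuming a tmpfs mount already exists (path and size are illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    // Point shuffle/spill files at a RAM-backed directory, e.g. one created with:
    //   mount -t tmpfs -o size=16g tmpfs /mnt/ramdisk
    val conf = new SparkConf()
      .setAppName("shuffle-on-ramdisk")
      .set("spark.local.dir", "/mnt/ramdisk") // shuffle files now live in memory
    val sc = new SparkContext(conf)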
required (and this discussion is a sign that the process has not been
> > conducted properly as people have concerns, me including).
> >
> > Thanks Mridul!
> >
> > Pozdrawiam,
> > Jacek Laskowski
> >
> > https://medium.com/@jaceklaskowski/
> >
ts to support scala 2.10 three years after they did the last
> maintenance release?
>
>
> On Thu, Mar 24, 2016 at 9:59 PM, Mridul Muralidharan <mri...@gmail.com> wrote:
>
>> Removing compatibility (with jdk, etc
Removing compatibility (with jdk, etc) can be done with a major release -
given that 7 has been EOLed a while back and is now unsupported, we have to
decide if we drop support for it in 2.0 or 3.0 (2+ years from now).
Given the functionality & performance benefits of going to jdk8, future
The container Java version can be different from the yarn Java version: we run
jobs with jdk8 on a jdk7 cluster without issues.
Regards
Mridul
On Thursday, March 24, 2016, Koert Kuipers wrote:
> i guess what i am saying is that in a yarn world the only hard
> restrictions left are
+1
Agree, dropping support for java 7 is long overdue - and 2.0 would be
a logical release to do this on.
Regards,
Mridul
On Thu, Mar 24, 2016 at 12:27 AM, Reynold Xin wrote:
> About a year ago we decided to drop Java 6 support in Spark 1.5. I am
> wondering if we should
I was not aware of a discussion on the Dev list about this - agree with most of
the observations.
In addition, I did not see a PMC signoff on moving (sub-)modules out.
Regards
Mridul
On Thursday, March 17, 2016, Marcelo Vanzin wrote:
> Hello all,
>
> Recently a lot of the
We use it in executors to get to:
a) the spark conf (for getting to the hadoop config in a map doing custom
writing of side-files)
b) the shuffle manager (to get a shuffle reader)
Not sure if there are alternative ways to get to these.
Regards,
Mridul
On Wed, Mar 16, 2016 at 2:52 PM, Reynold Xin
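A sketch of that executor-side usage (SparkEnv is a developer API, so this only illustrates the dependency being described):

    import org.apache.spark.{SparkConf, SparkContext, SparkEnv}

    val sc = new SparkContext(new SparkConf().setAppName("sparkenv-usage"))
    sc.parallelize(1 to 100, 4).foreachPartition { _ =>
      val conf = SparkEnv.get.conf                 // (a) SparkConf on the executor,
      println(conf.get("spark.app.id"))            //     e.g. to rebuild a hadoop config
      val shuffleMgr = SparkEnv.get.shuffleManager // (b) shuffle manager (internal API)
    }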
t Kafka specifically
>
> https://issues.apache.org/jira/browse/SPARK-13877
>
>
> On Thu, Mar 17, 2016 at 2:49 PM, Mridul Muralidharan <mri...@gmail.com> wrote:
>>
>> I was not aware of a discussion on the Dev list about this - agree with most of
>> the observations.
>> In add
open by people out there anyway)
>
> On Thu, Dec 31, 2015 at 3:25 AM, Mridul Muralidharan <mri...@gmail.com> wrote:
>> I am not sure of others, but I had a PR closed from under me where
>> ongoing discussion was happening as late as 2 weeks back.
>> Given this, I assumed it was
ividual ones.
>
>
> On Wednesday, December 30, 2015, Mridul Muralidharan <mri...@gmail.com>
> wrote:
>>
>> Is there a script running to close "old" PR's ? I was not aware of any
>> discussion about this in the dev list.
>>
>> - Mridul
>>
Is there a script running to close "old" PR's ? I was not aware of any
discussion about this in the dev list.
- Mridul
There was a proposal to make schedulers pluggable in the context of adding one
which leverages Apache Tez : IIRC it was abandoned - but the jira might
be a good starting point.
Regards
Mridul
On Dec 3, 2015 2:59 PM, "Rad Gruchalski" wrote:
> There was a talk in this thread
Would also be good to fix api breakages introduced as part of 1.0
(where there is missing functionality now), overhaul & remove all
deprecated config/features/combinations, and make the api changes to the
public api which have been deferred for minor releases.
Regards,
Mridul
On Tue, Nov 10,
From what I understood of Imran's mail (and what was referenced in it),
the RDD mentioned seems to be violating some basic contracts on
how partitions are used in spark [1].
They cannot be arbitrarily numbered, have duplicates, etc.
Extending RDD to add functionality is typically for niche
Would be a good idea to generalize this for spark core - and allow for
its use in serde, compression, etc.
Regards,
Mridul
On Thu, Jul 30, 2015 at 11:33 AM, Joseph Batchik
josephbatc...@gmail.com wrote:
Yep I was looking into using the jar service loader.
I pushed a rough draft to my fork of
Simply customize your log4j config instead of modifying code if you don't
want messages from that class.
Regards
Mridul
On Sunday, July 26, 2015, Sea 261810...@qq.com wrote:
This exception is so ugly!!! The screen is full of these information when
the program runs a long time, and they
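Such an override is a single line in log4j.properties (the class name below is a placeholder for whichever logger is noisy):

    # Raise the threshold for one noisy logger without touching any code:
    log4j.logger.org.apache.spark.SomeNoisyClass=ERROR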
the only thing that changed is
the location of some scripts in mesos/ to amplab/).
Thanks
Shivaram
On Mon, Jul 20, 2015 at 12:55 PM, Mridul Muralidharan mri...@gmail.com
wrote:
Might be a good idea to get the PMCs of both projects to sign off to
prevent future issues with apache.
Regards
of the Apache Mesos
project. It was a remnant part of Spark from when Spark used to live at
github.com/mesos/spark.
Shivaram
On Tue, Jul 21, 2015 at 11:03 AM, Mridul Muralidharan mri...@gmail.com
wrote:
If I am not wrong, since the code was hosted within mesos project
repo, I assume (atleast part
Might be a good idea to get the PMCs of both projects to sign off to
prevent future issues with apache.
Regards,
Mridul
On Mon, Jul 20, 2015 at 12:01 PM, Shivaram Venkataraman
shiva...@eecs.berkeley.edu wrote:
I've created https://github.com/amplab/spark-ec2 and added an initial set of
Just to clarify, the proposal is to have a single commit msg giving the
jira and pr id?
That sounds like a good change to have.
Regards
Mridul
On Saturday, July 18, 2015, Reynold Xin r...@databricks.com wrote:
I took a look at the commit messages in git log -- it looks like the
individual
https://plus.google.com/+LinusTorvalds/posts/DiG9qANf5PA
I have noticed a bunch of mails from dev@ and github going to spam -
including the spark mailing list.
Might be a good idea for devs and committers to check if they are missing
things in their spam folder if on gmail.
Regards,
Mridul
description
3. List of authors contributing to the patch
The main thing that changes is 3: we used to also include the individual
commits to the pull request branch that are squashed.
On Sat, Jul 18, 2015 at 3:45 PM, Mridul Muralidharan mri...@gmail.com
If you can scan the input twice, you can of course do a per-partition count
and build a custom RDD which can repartition without shuffle (see the sketch
below). But nothing off the shelf, as Sandy mentioned.
Regards
Mridul
On Thursday, June 18, 2015, Sandy Ryza sandy.r...@cloudera.com wrote:
Hi Alexander,
There is currently
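A sketch of the two-scan idea (the counting half mirrors what RDD.zipWithIndex does internally; a custom repartitioning RDD would build on the same counts):

    import scala.reflect.ClassTag
    import org.apache.spark.rdd.RDD

    def withGlobalIndex[T: ClassTag](rdd: RDD[T]): RDD[(Long, T)] = {
      // Scan 1: per-partition counts (no shuffle).
      val counts = rdd
        .mapPartitionsWithIndex((i, it) => Iterator((i, it.size)))
        .collect().sortBy(_._1).map(_._2.toLong)
      val offsets = counts.scanLeft(0L)(_ + _) // start offset of each partition
      // Scan 2: tag records with globally consistent indices, still no shuffle.
      rdd.mapPartitionsWithIndex { (i, it) =>
        it.zipWithIndex.map { case (x, j) => (offsets(i) + j, x) }
      }
    }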
Hi,
I vaguely remember issues with using float/double as keys in MR (and spark ?).
But I can't seem to find documentation/analysis about the same.
Does anyone have some resource/link I can refer to ?
Thanks,
Mridul
That works when it is launched from the same process - which is
unfortunately not our case :-)
- Mridul
On Sun, May 10, 2015 at 9:05 PM, Manku Timma manku.tim...@gmail.com wrote:
sc.applicationId gives the yarn appid.
On 11 May 2015 at 08:13, Mridul Muralidharan mri...@gmail.com wrote:
We had
For tiny/small clusters (particularly single-tenant), you can set it to a
lower value.
But for anything reasonably large or multi-tenant, the request storm
can be bad if a large enough number of applications start aggressively
polling the RM.
That is why the interval is configurable.
- Mridul
On
We had a similar requirement, and as a stopgap, I currently use a
suboptimal impl-specific workaround - parsing it out of the
stdout/stderr (based on log config).
A better means to get to this is indeed required !
Regards,
Mridul
On Sun, May 10, 2015 at 7:33 PM, Ron's Yahoo!
We could build on the minimum jdk we support for testing pr's - which will
automatically cause build failures in case code uses a newer api ?
Regards,
Mridul
On Fri, May 1, 2015 at 2:46 PM, Reynold Xin r...@databricks.com wrote:
It's really hard to inspect API calls since none of us have the Java
... ;)
On Sat, May 2, 2015 at 1:09 PM, Mridul Muralidharan mri...@gmail.com
wrote:
We could build on the minimum jdk we support for testing pr's - which will
automatically cause build failures in case code uses a newer api ?
Regards,
Mridul
On Fri, May 1, 2015 at 2:46 PM, Reynold Xin r
I agree, this is better handled by the filesystem cache - not to
mention being able to do zero-copy writes.
Regards,
Mridul
On Sat, May 2, 2015 at 10:26 PM, Reynold Xin r...@databricks.com wrote:
I've personally prototyped completely in-memory shuffle for Spark 3 times.
However, it is unclear
This is a great suggestion - definitely makes sense to have it.
Regards,
Mridul
On Fri, Apr 24, 2015 at 11:08 AM, Patrick Wendell pwend...@gmail.com wrote:
It's a bit of a digression - but Steve's suggestion that we have a
mailing list for new issues is a great idea and we can do it easily.
Cross region as in different data centers ?
- Mridul
On Sun, Mar 15, 2015 at 8:08 PM, lonely Feb lonely8...@gmail.com wrote:
Hi all, i meet up with a problem that torrent broadcast hang out in my
spark cluster (1.2, standalone) , particularly serious when driver and
executors are
Let me try to rephrase my query.
How can a user specify, for example, what the executor memory or the
number of cores should be ?
I don't want a situation where some variables can be specified using
one set of idioms (from this PR for example) and another set cannot
be.
Regards,
Mridul
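For reference, the two standard idioms in question (values are illustrative):

    import org.apache.spark.SparkConf

    val conf = new SparkConf()
      .set("spark.executor.memory", "4g") // heap per executor
      .set("spark.executor.cores", "2")   // cores per executor
    // Equivalent spark-submit flags:
    //   spark-submit --executor-memory 4g --executor-cores 2 ...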
Who is managing 1.3 release ? You might want to coordinate with them before
porting changes to branch.
Regards
Mridul
On Friday, March 13, 2015, Sean Owen so...@cloudera.com wrote:
Yeah, I'm guessing that is all happening quite literally as we speak.
The Apache git tag is the one of
In an ideal situation, +1 on removing all vendor-specific builds and
making them just hadoop-version specific - that is what we should depend on
anyway.
Though I hope Sean is correct in assuming that the vendor-specific builds
for hadoop 2.4 are just that; and not 2.4- or 2.4+, which cause
incompatibilities
While I don't have any strong opinions about how we handle enums
either way in spark, I assume the discussion is targeted at (new) api
being designed in spark.
Rewiring what we already have exposed will lead to an incompatible api
change (StorageLevel for example, is in 1.0).
Regards,
Mridul
On
I have a strong dislike for java enums due to the fact that they
are not stable across JVMs - if they undergo serde, you end up with
unpredictable results at times [1].
This is one of the reasons why we prevent enums from being keys : though it is
highly possible users might depend on it internally
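A small demonstration of the instability, using a stock JDK enum (run it in two separate JVMs and the first number will generally differ):

    import java.util.concurrent.TimeUnit

    object EnumHashDemo {
      def main(args: Array[String]): Unit = {
        // Enum.hashCode() is identity-based, so this value is not stable across
        // JVMs - two executors can disagree on the hash of the "same" key.
        println(TimeUnit.SECONDS.hashCode())
        // Stable alternatives for keying/partitioning:
        println(TimeUnit.SECONDS.name())    // always "SECONDS"
        println(TimeUnit.SECONDS.ordinal()) // always 3
      }
    }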
, seems
promising.
thanks,
Imran
On Tue, Feb 3, 2015 at 7:32 PM, Mridul Muralidharan mri...@gmail.com wrote:
That is fairly out of date (we used to run some of our jobs on it ... But
that is forked off 1.1 actually).
Regards
Mridul
Congratulations !
Keep up the good work :-)
Regards
Mridul
On Tuesday, February 3, 2015, Matei Zaharia matei.zaha...@gmail.com wrote:
Hi all,
The PMC recently voted to add three new committers: Cheng Lian, Joseph
Bradley and Sean Owen. All three have been major contributors to Spark in
That is fairly out of date (we used to run some of our jobs on it ... But
that is forked off 1.1 actually).
Regards
Mridul
On Tuesday, February 3, 2015, Imran Rashid iras...@cloudera.com wrote:
Thanks for the explanations, makes sense. For the record looks like this
was worked on a while
I second that !
Would also be great if the JIRA was updated accordingly.
Regards,
Mridul
On Wed, Dec 3, 2014 at 1:53 AM, Kay Ousterhout kayousterh...@gmail.com wrote:
Hi all,
I've noticed a bunch of times lately where a pull request changes to be
pretty different from the original pull
Brilliant stuff ! Congrats all :-)
This is indeed really heartening news !
Regards,
Mridul
On Fri, Oct 10, 2014 at 8:24 PM, Matei Zaharia matei.zaha...@gmail.com wrote:
Hi folks,
I interrupt your regularly scheduled user / dev list to bring you some pretty
cool news for the project, which
Is SPARK-3277 applicable to 1.1 ?
If yes, until it is fixed, I am -1 on the release (I am on break, so can't
verify or help fix, sorry).
Regards
Mridul
On 28-Aug-2014 9:33 pm, Patrick Wendell pwend...@gmail.com wrote:
Please vote on releasing the following candidate as Apache Spark version
and we'll patch it
and spin a new RC. We can also update the test coverage to cover LZ4.
- Patrick
On Thu, Aug 28, 2014 at 9:27 AM, Mridul Muralidharan mri...@gmail.com
wrote:
Is SPARK-3277 applicable to 1.1 ?
If yes, until it is fixed, I am -1 on the release (I am on break, so
can't
Weird that Patrick did not face this while creating the RC.
Essentially the yarn alpha pom.xml has not been updated properly in
the 1.1 branch.
Just change the version to '1.1.1-SNAPSHOT' in yarn/alpha/pom.xml (to
make it the same as any other pom).
Regards,
Mridul
On Thu, Aug 21, 2014 at 5:09 AM,
The issue with supporting this, imo, is the fact that scala-test uses the
same vm for all the tests (the surefire plugin supports forking, but
scala-test ignores it, iirc).
So different tests would initialize different spark contexts, and can
potentially step on each other's toes.
Regards,
Mridul
On Fri, Aug
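The usual mitigation is a teardown trait along these lines (a sketch, similar in spirit to the LocalSparkContext helper in Spark's own test suites):

    import org.apache.spark.{SparkConf, SparkContext}
    import org.scalatest.{BeforeAndAfterEach, Suite}

    trait LocalSparkContext extends BeforeAndAfterEach { self: Suite =>
      @transient var sc: SparkContext = _

      override def beforeEach(): Unit = {
        super.beforeEach()
        sc = new SparkContext(
          new SparkConf().setMaster("local[2]").setAppName("test"))
      }

      override def afterEach(): Unit = {
        if (sc != null) { sc.stop(); sc = null } // release before the next suite runs
        super.afterEach()
      }
    }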
Just came across this mail; thanks for initiating this discussion Kay.
To add: another issue which recurs is very rapid commits - before most
contributors have had a chance to even look at the changes proposed.
There is not much prior discussion on the jira or pr, and the time
between submitting
We tried with a lower block size for lzf, but it barfed all over the place.
Snappy was the way to go for our jobs.
Regards,
Mridul
On Mon, Jul 14, 2014 at 12:31 PM, Reynold Xin r...@databricks.com wrote:
Hi Spark devs,
I was looking into the memory usage of shuffle and one annoying thing is
Hi,
I noticed today that gmail has been marking most of the mails I receive
from spark github/jira as spam; and I was assuming it was a lull in
activity due to spark summit for the past few weeks !
In case I have commented on specific PR/JIRA issues and not followed
up, apologies for
You are ignoring serde costs :-)
- Mridul
On Tue, Jul 8, 2014 at 8:48 PM, Aaron Davidson ilike...@gmail.com wrote:
Tachyon should only be marginally less performant than memory_only, because
we mmap the data from Tachyon's ramdisk. We do not have to, say, transfer
the data over a pipe from