[
https://issues.apache.org/jira/browse/SPARK-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14092427#comment-14092427
]
Mridul Muralidharan edited comment on SPARK-2962 at 8/11/14 4:35 AM
[
https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14091746#comment-14091746
]
Mridul Muralidharan commented on SPARK-2931:
[~kayousterhout] this is weird, I
[
https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mridul Muralidharan updated SPARK-2931:
---
Attachment: test.patch
A patch to showcase the exception
getAllowedLocalityLevel
[
https://issues.apache.org/jira/browse/SPARK-2931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14091881#comment-14091881
]
Mridul Muralidharan commented on SPARK-2931:
[~joshrosen] [~kayousterhout
Issue with supporting this imo is the fact that scala-test uses the
same vm for all the tests (surefire plugin supports fork, but
scala-test ignores it iirc).
So different tests would initialize different spark contexts, and can
potentially step on each other's toes.
Regards,
Mridul
On Fri, Aug
[
https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088018#comment-14088018
]
Mridul Muralidharan commented on SPARK-2881:
To add, this will affect spark
[
https://issues.apache.org/jira/browse/SPARK-2881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14088018#comment-14088018
]
Mridul Muralidharan edited comment on SPARK-2881 at 8/6/14 6:45 PM
Just came across this mail, thanks for initiating this discussion Kay.
To add: another issue which recurs is very rapid commits, before most
contributors have had a chance to even look at the changes proposed.
There is not much prior discussion on the jira or pr, and the time
between submitting
[
https://issues.apache.org/jira/browse/SPARK-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14074186#comment-14074186
]
Mridul Muralidharan commented on SPARK-2685:
We moved to using
Mridul Muralidharan created SPARK-2532:
--
Summary: Fix issues with consolidated shuffle
Key: SPARK-2532
URL: https://issues.apache.org/jira/browse/SPARK-2532
Project: Spark
Issue Type
[
https://issues.apache.org/jira/browse/SPARK-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14060543#comment-14060543
]
Mridul Muralidharan commented on SPARK-2468:
We map the file content
[
https://issues.apache.org/jira/browse/SPARK-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14060545#comment-14060545
]
Mridul Muralidharan commented on SPARK-2468:
Writing mmap'ed buffers
[
https://issues.apache.org/jira/browse/SPARK-2468?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14061094#comment-14061094
]
Mridul Muralidharan commented on SPARK-2468:
Ah, small files - those
We tried with lower block size for lzf, but it barfed all over the place.
Snappy was the way to go for our jobs.
Regards,
Mridul
On Mon, Jul 14, 2014 at 12:31 PM, Reynold Xin r...@databricks.com wrote:
Hi Spark devs,
I was looking into the memory usage of shuffle and one annoying thing is
[
https://issues.apache.org/jira/browse/SPARK-2398?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14060113#comment-14060113
]
Mridul Muralidharan commented on SPARK-2398:
As discussed in the PR, I am
You are lucky :-) For some of our jobs, in an 8gb container, the overhead
is 1.8gb !
On 13-Jul-2014 2:40 pm, nishkamravi2 g...@git.apache.org wrote:
Github user nishkamravi2 commented on the pull request:
https://github.com/apache/spark/pull/1391#issuecomment-48835560
Sean, the
Hi,
I noticed today that gmail has been marking most of the mails I was
receiving from spark github/jira as spam; and I was assuming
it was a lull in activity due to spark summit for the past few weeks !
In case I have commented on specific PR/JIRA issues and not followed
up, apologies for
You are ignoring serde costs :-)
- Mridul
On Tue, Jul 8, 2014 at 8:48 PM, Aaron Davidson ilike...@gmail.com wrote:
Tachyon should only be marginally less performant than memory_only, because
we mmap the data from Tachyon's ramdisk. We do not have to, say, transfer
the data over a pipe from
[
https://issues.apache.org/jira/browse/SPARK-2390?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14054162#comment-14054162
]
Mridul Muralidharan commented on SPARK-2390:
Here, and a bunch of other places
[
https://issues.apache.org/jira/browse/SPARK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052275#comment-14052275
]
Mridul Muralidharan commented on SPARK-2277:
Hmm, good point - that PR does
[
https://issues.apache.org/jira/browse/SPARK-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052289#comment-14052289
]
Mridul Muralidharan commented on SPARK-2017:
With aggregated metrics, we lose
[
https://issues.apache.org/jira/browse/SPARK-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14052679#comment-14052679
]
Mridul Muralidharan commented on SPARK-2017:
Sounds great, ability to get
= 0 using a compressed bitmap. That way we can still avoid
requests for zero-sized blocks.
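The bitmap idea mentioned above could be sketched roughly as follows. This is only an illustration: it uses the plain, uncompressed java.util.BitSet as a stand-in for a compressed bitmap, and the names EmptyBlockBitmap, emptyBlocks and blocksToFetch are hypothetical, not Spark APIs.

```scala
import java.util.BitSet

// Hypothetical sketch: one bit per block, set when the block size is zero.
// java.util.BitSet is an uncompressed stand-in for a compressed bitmap.
object EmptyBlockBitmap {
  // Record which blocks are empty.
  def emptyBlocks(sizes: Array[Long]): BitSet = {
    val bits = new BitSet(sizes.length)
    for (i <- sizes.indices if sizes(i) == 0L) bits.set(i)
    bits
  }

  // Indices of the blocks worth requesting: those whose bit is not set.
  def blocksToFetch(numBlocks: Int, empty: BitSet): Seq[Int] =
    (0 until numBlocks).filterNot(i => empty.get(i))

  def main(args: Array[String]): Unit = {
    val sizes = Array(0L, 128L, 0L, 0L, 4096L)
    val empty = emptyBlocks(sizes)
    // Only blocks 1 and 4 are non-empty, so only they would be requested.
    println(blocksToFetch(sizes.length, empty).mkString(", "))
  }
}
```

The reader needs only the bitmap, not the exact sizes, to skip requests for empty blocks; exact sizes would have to be carried separately if they are needed.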
On Thu, Jul 3, 2014 at 3:12 PM, Reynold Xin r...@databricks.com wrote:
Yes, that number is likely == 0 in any real workload ...
On Thu, Jul 3, 2014 at 8:01 AM, Mridul Muralidharan mri...@gmail.com
Mridul Muralidharan created SPARK-2353:
--
Summary: ArrayIndexOutOfBoundsException in scheduler
Key: SPARK-2353
URL: https://issues.apache.org/jira/browse/SPARK-2353
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14051575#comment-14051575
]
Mridul Muralidharan commented on SPARK-2277:
I have not rechecked
[
https://issues.apache.org/jira/browse/SPARK-2353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mridul Muralidharan updated SPARK-2353:
---
Description:
I suspect the recent changes from SPARK-1937 to compute valid locality
On Thu, Jul 3, 2014 at 11:32 AM, Reynold Xin r...@databricks.com wrote:
On Wed, Jul 2, 2014 at 3:44 AM, Mridul Muralidharan mri...@gmail.com
wrote:
The other thing we do need is the location of blocks. This is actually
just
O(n) because we just need to know where the map was run
[
https://issues.apache.org/jira/browse/SPARK-2277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14050886#comment-14050886
]
Mridul Muralidharan commented on SPARK-2277:
I am not sure I follow
Hi Patrick,
Please see inline.
Regards,
Mridul
On Wed, Jul 2, 2014 at 10:52 AM, Patrick Wendell pwend...@gmail.com wrote:
b) Instead of pulling this information, push it to executors as part
of task submission. (What Patrick mentioned ?)
(1) a.1 from above is still an issue for this.
I
,
Mridul
On Tue, Jul 1, 2014 at 2:51 AM, Mridul Muralidharan mri...@gmail.com
wrote:
We had considered both approaches (if I understood the suggestions right) :
a) Pulling only map output states for tasks which run on the reducer
by modifying the Actor. (Probably along lines of what Aaron
the executor returns the result of a task when it's too big
for akka. We were thinking of refactoring this too, as using the block
manager has much higher latency than a direct TCP send.
On Mon, Jun 30, 2014 at 12:13 PM, Mridul Muralidharan mri...@gmail.com
wrote:
Our current hack is to use
[
https://issues.apache.org/jira/browse/SPARK-2294?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14045433#comment-14045433
]
Mridul Muralidharan commented on SPARK-2294:
I agree; We should bump
[
https://issues.apache.org/jira/browse/SPARK-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043088#comment-14043088
]
Mridul Muralidharan commented on SPARK-2268:
That is not because of this hook
[
https://issues.apache.org/jira/browse/SPARK-2268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14043071#comment-14043071
]
Mridul Muralidharan commented on SPARK-2268:
Setting priority for shutdown
,
Can you comment a little bit more on this issue? We are running into the
same stack trace but not sure whether it is just different Spark versions
on each cluster (doesn't seem likely) or a bug in Spark.
Thanks.
On Sat, May 17, 2014 at 4:41 AM, Mridul Muralidharan mri...@gmail.com
wrote
[
https://issues.apache.org/jira/browse/SPARK-704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039742#comment-14039742
]
Mridul Muralidharan commented on SPARK-704:
---
If remote node goes down
[
https://issues.apache.org/jira/browse/SPARK-704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039742#comment-14039742
]
Mridul Muralidharan edited comment on SPARK-704 at 6/21/14 9:10 AM
[
https://issues.apache.org/jira/browse/SPARK-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039217#comment-14039217
]
Mridul Muralidharan commented on SPARK-2223:
[~tgraves] You could try running
[
https://issues.apache.org/jira/browse/SPARK-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14039236#comment-14039236
]
Mridul Muralidharan commented on SPARK-2089:
[~pwendell] SplitInfo is not from
On Wed, Jun 18, 2014 at 6:19 PM, Surendranauth Hiraman
suren.hira...@velos.io wrote:
Patrick,
My team is using shuffle consolidation but not speculation. We are also
using persist(DISK_ONLY) for caching.
Use of shuffle consolidation is probably what is causing the issue.
Would be good idea
[
https://issues.apache.org/jira/browse/SPARK-1353?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14033625#comment-14033625
]
Mridul Muralidharan commented on SPARK-1353:
This is due to limitation
[
https://issues.apache.org/jira/browse/SPARK-2018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14027808#comment-14027808
]
Mridul Muralidharan commented on SPARK-2018:
Ah ! This is an interesting bug
[
https://issues.apache.org/jira/browse/SPARK-2089?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14026397#comment-14026397
]
Mridul Muralidharan commented on SPARK-2089:
preferredNodeLocationData used
[
https://issues.apache.org/jira/browse/SPARK-2064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020789#comment-14020789
]
Mridul Muralidharan commented on SPARK-2064:
Depending on how long a job runs
[
https://issues.apache.org/jira/browse/SPARK-2064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14020936#comment-14020936
]
Mridul Muralidharan commented on SPARK-2064:
It is 100 MB (or more) of memory
[
https://issues.apache.org/jira/browse/SPARK-2064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14021008#comment-14021008
]
Mridul Muralidharan commented on SPARK-2064:
Unfortunately OOM is a very big
[
https://issues.apache.org/jira/browse/SPARK-2064?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14021011#comment-14021011
]
Mridul Muralidharan commented on SPARK-2064:
I am probably missing the intent
[
https://issues.apache.org/jira/browse/SPARK-2017?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14019394#comment-14019394
]
Mridul Muralidharan commented on SPARK-2017:
Currently, for our jobs, I run
[
https://issues.apache.org/jira/browse/SPARK-1956?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14011741#comment-14011741
]
Mridul Muralidharan commented on SPARK-1956:
shuffle consolidation MUST
[
https://issues.apache.org/jira/browse/SPARK-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14001377#comment-14001377
]
Mridul Muralidharan commented on SPARK-1855:
Did not realize that mail replies
:38 AM, Mridul Muralidharan mri...@gmail.com
wrote:
I had echoed similar sentiments a while back when there was a
discussion
around 0.10 vs 1.0 ... I would have preferred 0.10 to stabilize
the
api
changes, add missing functionality, go through a hardening release
before
guaranteed 1.0.0 baseline.
On Sat, May 17, 2014 at 2:05 PM, Mridul Muralidharan
mri...@gmail.com wrote:
I would make the case for interface stability not just api stability.
Particularly given that we have significantly changed some of our
interfaces, I want to ensure developers/users
avoid hitting disk if we have
enough memory to use. We need to investigate more to find a good
solution. -Xiangrui
On Fri, May 16, 2014 at 4:00 PM, Mridul Muralidharan mri...@gmail.com
wrote:
Effectively this is persist without fault tolerance.
Failure of any node means complete lack of fault
I had echoed similar sentiments a while back when there was a discussion
around 0.10 vs 1.0 ... I would have preferred 0.10 to stabilize the api
changes, add missing functionality, go through a hardening release before
1.0
But the community preferred a 1.0 :-)
Regards,
Mridul
On 17-May-2014
I suspect this is an issue we have fixed internally here as part of a
larger change - the issue we fixed was not a config issue but bugs in spark.
Unfortunately we plan to contribute this as part of 1.1
Regards,
Mridul
On 17-May-2014 4:09 pm, sam (JIRA) j...@apache.org wrote:
sam created
.
On Sat, May 17, 2014 at 4:26 AM, Mridul Muralidharan mri...@gmail.com
wrote:
I had echoed similar sentiments a while back when there was a discussion
around 0.10 vs 1.0 ... I would have preferred 0.10 to stabilize the api
changes, add missing functionality, go through a hardening release
the discussion.
Regards
Mridul
issue, and what I am asking, is which pending bug fixes does anyone
anticipate will require breaking the public API guaranteed in rc9
On Sat, May 17, 2014 at 9:44 AM, Mridul Muralidharan mri...@gmail.com
wrote:
We made incompatible api changes whose impact
Mridul
If you can tell me about specific changes in the current release
candidate
that occasion new arguments for why a 1.0 release is an unacceptable idea,
then I'm listening.
On Sat, May 17, 2014 at 11:59 AM, Mridul Muralidharan mri...@gmail.com
wrote:
On 17-May-2014 11:40 pm, Mark Hamstra m
, Andrew Ash and...@andrewash.com
wrote:
+1 on the next release feeling more like a 0.10 than a 1.0
On May 17, 2014 4:38 AM, Mridul Muralidharan mri...@gmail.com
wrote:
I had echoed similar sentiments a while back when there was a
discussion
around 0.10 vs 1.0 ... I would have preferred
[
https://issues.apache.org/jira/browse/SPARK-1849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14000397#comment-14000397
]
Mridul Muralidharan commented on SPARK-1849:
Looks like textFile is probably
Effectively this is persist without fault tolerance.
Failure of any node means complete lack of fault tolerance.
I would be very skeptical of truncating lineage if it is not reliable.
On 17-May-2014 3:49 am, Xiangrui Meng (JIRA) j...@apache.org wrote:
Xiangrui Meng created SPARK-1855:
So was rc5 cancelled ? Did not see a note indicating that or why ... [1]
- Mridul
[1] could have easily missed it in the email storm though !
On Thu, May 15, 2014 at 1:32 AM, Patrick Wendell pwend...@gmail.com wrote:
Please vote on releasing the following candidate as Apache Spark version
Hi Sandy,
I assume you are referring to caching added to datanodes via new caching
api via NN ? (To preemptively mmap blocks).
I have not looked in detail, but does NN tell us about this in block
locations?
If yes, we can simply make those process local instead of node local for
executors on
[
https://issues.apache.org/jira/browse/SPARK-1813?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13996390#comment-13996390
]
Mridul Muralidharan commented on SPARK-1813:
Writing a KryoRegistrator
On a slightly related note (apologies Soren for hijacking the thread),
Reynold how much better is kryo from spark's usage point of view
compared to the default java serialization (in general, not for
closures) ?
The numbers on kryo site are interesting, but since you have played
the most with kryo
[
https://issues.apache.org/jira/browse/SPARK-1606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13988756#comment-13988756
]
Mridul Muralidharan commented on SPARK-1606:
Crap, got to this too late.
We
[
https://issues.apache.org/jira/browse/SPARK-1706?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13988868#comment-13988868
]
Mridul Muralidharan commented on SPARK-1706:
Oh my, this was supposed
[
https://issues.apache.org/jira/browse/SPARK-1576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13981313#comment-13981313
]
Mridul Muralidharan commented on SPARK-1576:
There is a misunderstanding here
[
https://issues.apache.org/jira/browse/SPARK-1586?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13981321#comment-13981321
]
Mridul Muralidharan commented on SPARK-1586:
Immediate issues fixed though
[
https://issues.apache.org/jira/browse/SPARK-1587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mridul Muralidharan resolved SPARK-1587.
Resolution: Fixed
Fixed, https://github.com/apache/spark/pull/504
Fix thread
[
https://issues.apache.org/jira/browse/BOOKKEEPER-560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mridul Muralidharan updated BOOKKEEPER-560:
---
Assignee: (was: Mridul Muralidharan)
Create readme for hedwig
[
https://issues.apache.org/jira/browse/BOOKKEEPER-648?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Mridul Muralidharan updated BOOKKEEPER-648:
---
Assignee: (was: Mridul Muralidharan)
BasicJMSTest failed
[
https://issues.apache.org/jira/browse/SPARK-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13978827#comment-13978827
]
Mridul Muralidharan commented on SPARK-1588:
Apparently, SPARK_YARN_USER_ENV
An iterator does not imply data has to be memory resident.
Think merge sort output as an iterator (disk backed).
Tom is actually planning to work on something similar with me on this
hopefully this or next month.
Regards,
Mridul
On Sun, Apr 20, 2014 at 11:46 PM, Sandy Ryza
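The point above, that an iterator need not hold its data in memory, can be illustrated with a small disk-backed merge. This is a minimal sketch under stated assumptions, not how Spark implements anything: the names DiskBackedMerge, writeRun and merge are hypothetical, and the runs are plain text files with one integer per line.

```scala
import java.io.{File, PrintWriter}
import scala.io.Source

// Hypothetical sketch: merge sorted runs stored on disk and expose the
// result as an Iterator. Only one buffered head element per run is in memory.
object DiskBackedMerge {
  // Write one sorted run to a temporary file, one integer per line.
  def writeRun(values: Seq[Int]): File = {
    val f = File.createTempFile("run", ".txt")
    f.deleteOnExit()
    val out = new PrintWriter(f)
    try values.sorted.foreach(v => out.println(v)) finally out.close()
    f
  }

  // Lazily merge the runs; lines are read from disk only as elements are consumed.
  def merge(runs: Seq[File]): Iterator[Int] = {
    val sources = runs.toVector.map(f => Source.fromFile(f))
    val heads = sources.map(s => s.getLines().map(_.toInt).buffered)
    new Iterator[Int] {
      private var closed = false
      def hasNext: Boolean =
        if (closed) false
        else if (heads.exists(_.hasNext)) true
        else { sources.foreach(_.close()); closed = true; false } // release file handles
      def next(): Int = {
        if (!hasNext) throw new NoSuchElementException("all runs drained")
        // Linear scan for the smallest head; a heap would scale better with many runs.
        heads.filter(_.hasNext).minBy(_.head).next()
      }
    }
  }

  def main(args: Array[String]): Unit = {
    val runs = Seq(writeRun(Seq(5, 1, 9)), writeRun(Seq(4, 2)), writeRun(Seq(8, 3, 7, 6)))
    println(merge(runs).mkString(" "))
  }
}
```

A consumer sees an ordinary Iterator[Int] and cannot tell that the elements are streamed from files.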
[
https://issues.apache.org/jira/browse/SPARK-1524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13972864#comment-13972864
]
Mridul Muralidharan commented on SPARK-1524:
The expectation is to fallback
[
https://issues.apache.org/jira/browse/SPARK-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13972978#comment-13972978
]
Mridul Muralidharan commented on SPARK-1476:
[~matei] We are having some
[
https://issues.apache.org/jira/browse/SPARK-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13968390#comment-13968390
]
Mridul Muralidharan commented on SPARK-1453:
(d) becomes relevant in case
[
https://issues.apache.org/jira/browse/SPARK-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967854#comment-13967854
]
Mridul Muralidharan edited comment on SPARK-1476 at 4/13/14 2:45 PM
[
https://issues.apache.org/jira/browse/SPARK-1476?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967419#comment-13967419
]
Mridul Muralidharan commented on SPARK-1476:
WIP Proposal:
- All references
[
https://issues.apache.org/jira/browse/SPARK-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13966332#comment-13966332
]
Mridul Muralidharan commented on SPARK-1391:
Another place where
[
https://issues.apache.org/jira/browse/SPARK-542?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967185#comment-13967185
]
Mridul Muralidharan commented on SPARK-542:
---
Spark uses only hostnames - not IPs
[
https://issues.apache.org/jira/browse/SPARK-1453?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13967193#comment-13967193
]
Mridul Muralidharan commented on SPARK-1453:
The timeout gets hit only when we
Hi,
We have a requirement to use a (potential) ephemeral storage, which
is not within the VM, which is strongly tied to a worker node. So
source of truth for a block would still be within spark; but to
actually do computation, we would need to copy data to external device
(where it might lie
is stored in a remote cluster or machines. And the
goal is to load the remote raw data only once?
Haoyuan
On Sat, Apr 5, 2014 at 4:30 PM, Mridul Muralidharan mri...@gmail.com
wrote:
Hi,
We have a requirement to use a (potential) ephemeral storage, which
is not within the VM, which
Hi,
So we are now receiving updates from three sources for each change to the PR.
While each of them handles a corner case which others might miss,
would be great if we could minimize the volume of duplicated
communication.
Regards,
Mridul
unsubscribe yourself from any of these sources, right?
- Patrick
On Sat, Mar 29, 2014 at 11:05 AM, Mridul Muralidharan
mri...@gmail.com wrote:
Hi,
So we are now receiving updates from three sources for each change to
the PR.
While each of them handles a corner case which others might miss
reasonably long running job (30 mins+) working on non
trivial dataset will fail due to accumulated failures in spark.
Regards,
Mridul
TD
On Tue, Mar 25, 2014 at 8:44 PM, Mridul Muralidharan mri...@gmail.com wrote:
Forgot to mention this in the earlier request for PRs.
Would be great if the garbage collection PR is also committed - if not
the whole thing, at least the part to unpersist broadcast variables
explicitly would be great.
Currently we are running with a custom impl which does something
similar, and I would like to move to standard distribution for that.
of April (not too far
;) ).
TD
On Wed, Mar 19, 2014 at 5:57 PM, Mridul Muralidharan mri...@gmail.com wrote:
Would be great if the garbage collection PR is also committed - if not
the whole thing, at least the part to unpersist broadcast variables
explicitly would be great.
Currently we
Wonderful news ! Congrats all :-)
Regards,
Mridul
On Feb 20, 2014 10:07 PM, Andy Konwinski andykonwin...@gmail.com wrote:
Congrats Spark community! I think this means we are officially now a TLP!
-- Forwarded message --
From: Brett Porter chair...@apache.org
Date: Feb 19,
PM, Mridul Muralidharan mri...@gmail.com
wrote:
Case 3 can be a potential issue.
Current implementation might be returning a concrete class which we
might want to change later - making it a type change.
The intention might be to return an RDD (for example), but the
inferred type might
brought up is not a matter of readability or style. If it
returns a different type, it should be declared (otherwise it is just
wrong).
On Wed, Feb 19, 2014 at 12:17 AM, Mridul Muralidharan mri...@gmail.com
wrote:
You are right.
A degenerate case would be :
def createFoo = new FooImpl
example?
def myFunc = createFoo
is disallowed in my guideline. It is invoking a function createFoo, not the
constructor of Foo.
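The difference between an inferred and a declared result type in the createFoo example above can be sketched as follows. The types Foo and FooImpl are hypothetical stand-ins taken from the names in the thread, and the member implOnly is invented purely for illustration.

```scala
// Hypothetical types for illustration only; none of these come from Spark.
trait Foo { def name: String }

class FooImpl extends Foo {
  def name: String = "impl"
  def implOnly: Int = 42 // implementation detail callers should not rely on
}

object ReturnTypes {
  // Inferred result type is FooImpl, so the concrete class leaks into the signature.
  def createFooInferred = new FooImpl

  // Declared result type is Foo; the implementation can change later
  // without changing what callers see.
  def createFoo: Foo = new FooImpl

  def main(args: Array[String]): Unit = {
    // Compiles only because the inferred type is FooImpl.
    println(createFooInferred.implOnly)
    // createFoo.implOnly would not compile: Foo does not declare implOnly.
    println(createFoo.name)
  }
}
```

If createFooInferred later returned a different Foo implementation, its signature would change and any caller using implOnly would break, which is the type-change risk the thread describes.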
On Wed, Feb 19, 2014 at 10:39 AM, Mridul Muralidharan mri...@gmail.com
wrote:
Without bikeshedding this too much ... It is likely incorrect (not
wrong
19, 2014 at 10:39 AM, Mridul Muralidharan mri...@gmail.com
wrote:
Without bikeshedding this too much ... It is likely incorrect (not
wrong) -
and rules like this potentially cause things to slip through.
Explicit return type strictly specifies what is being exposed (think
I had not resolved it in time for 0.9 - but IIRC there was a recent PR
which fixed bugs in spill [1] : are you able to reproduce this with
spark master ?
Regards,
Mridul
[1] https://github.com/apache/incubator-spark/pull/533
On Wed, Feb 19, 2014 at 9:58 AM, Andrew Ash and...@andrewash.com
Case 3 can be a potential issue.
Current implementation might be returning a concrete class which we
might want to change later - making it a type change.
The intention might be to return an RDD (for example), but the
inferred type might be a subclass of RDD - and future changes will
cause
There is nothing wrong with 9k partitions - I actually use much higher :-) [1]
I have not really seen this interesting issue you mentioned - should
investigate more, thanks for the note !
Regards,
Mridul
[1] I do use insanely high frame size anyway - and my workers/master
run with 8g; maybe why
+1 !
- Mridul
On Tue, Feb 11, 2014 at 9:57 AM, Chris Mattmann mattm...@apache.org wrote:
Hi Everyone,
This is a new VOTE to decide if Apache Spark should graduate
from the Incubator. Please VOTE on the resolution pasted below
the ballot. I'll leave this VOTE open for at least 72 hours.
The reason I explicitly mentioned binary compatibility was
because it was sort of hand waved in the proposal as good to have.
My understanding is that scala does make it painful to ensure binary
compatibility - but stability of interfaces is vital to ensure
dependable platforms.
shutdown hooks should not take 15 mins as you mentioned !
On the other hand, how busy was your disk when this was happening ?
(either due to spark or something else ?)
It might just be that there was a lot of stuff to remove ?
Regards,
Mridul
On Thu, Feb 6, 2014 at 3:50 PM, Andrew Ash