update their apps, I think it's better
> to make the other small changes in 2.0 at the same time than to update once
> for Dataset and another time for 2.0.
>
> BTW just refer to Reynold's original post for the other proposed API
> changes.
>
> Matei
>
I think that Kostas' logic still holds. The majority of Spark users, and
likely an even vaster majority of the jobs being run, are still on
RDDs and on the cusp of upgrading to DataFrames. Users will probably want
to upgrade to the stable version of the Dataset / DataFrame API so they
To answer your fourth question from Cloudera's perspective, we would never
support a customer running Spark 2.0 on a Hadoop version < 2.6.
-Sandy
On Fri, Nov 20, 2015 at 1:39 PM, Reynold Xin wrote:
> OK I'm not exactly asking for a vote here :)
>
> I don't think we should
Another +1 to Reynold's proposal.
Maybe this is obvious, but I'd like to advocate against a blanket removal
of deprecated / developer APIs. Many APIs can likely be removed without
material impact (e.g. the SparkContext constructor that takes preferred
node location data), while others likely see
Oh and another question - should Spark 2.0 support Java 7?
On Tue, Nov 10, 2015 at 4:53 PM, Sandy Ryza <sandy.r...@cloudera.com> wrote:
> Another +1 to Reynold's proposal.
>
> Maybe this is obvious, but I'd like to advocate against a blanket removal
> of deprecated / develope
Hi Justin,
The Dataset API proposal is available here:
https://issues.apache.org/jira/browse/SPARK-.
-Sandy
On Tue, Nov 3, 2015 at 1:41 PM, Justin Uang wrote:
> Hi,
>
> I was looking through some of the PRs slated for 1.6.0 and I noted
> something called a Dataset,
+1 (non-binding)
built from source and ran some jobs against YARN
-Sandy
On Sat, Aug 29, 2015 at 5:50 AM, vaquar khan vaquar.k...@gmail.com wrote:
+1 (1.5.0 RC2). Compiled on Windows with YARN.
Regards,
Vaquar khan
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total
I see that there's a 1.5.0-rc2 tag in GitHub now. Is that the official
RC2 tag to start trying out?
-Sandy
On Mon, Aug 24, 2015 at 7:23 AM, Sean Owen so...@cloudera.com wrote:
PS Shixiong Zhu is correct that this one has to be fixed:
https://issues.apache.org/jira/browse/SPARK-10168
For
Cool, thanks!
On Mon, Aug 24, 2015 at 2:07 PM, Reynold Xin r...@databricks.com wrote:
Nope --- I cut that last Friday but had an error. I will remove it and cut
a new one.
On Mon, Aug 24, 2015 at 2:06 PM, Sandy Ryza sandy.r...@cloudera.com
wrote:
I see that there's a 1.5.0-rc2 tag
Edit: the first line should read:
val groupedRdd = rdd.map((_, 1)).reduceByKey(_ + _)
On Sun, Jul 19, 2015 at 11:02 AM, Sandy Ryza sandy.r...@cloudera.com
wrote:
This functionality already basically exists in Spark. To create the
grouped RDD, one can run:
val groupedRdd
, Сергей Лихоман sergliho...@gmail.com
wrote:
Thanks for the answer! Could you please answer one more question? Will we
have both the original RDD and the grouped RDD in memory at the same time?
2015-07-19 21:04 GMT+03:00 Sandy Ryza sandy.r...@cloudera.com:
Edit: the first line should read:
val
This functionality already basically exists in Spark. To create the
grouped RDD, one can run:
val groupedRdd = rdd.reduceByKey(_ + _)
To get it back into the original form:
groupedRdd.flatMap(x => List.fill(x._2)(x._1))
-Sandy
On Sun, Jul 19, 2015 at 10:40 AM, Сергей Лихоман
GMT+03:00 Sandy Ryza sandy.r...@cloudera.com:
The user gets to choose what they want to reside in memory. If they call
rdd.cache() on the original RDD, it will be in memory. If they call
rdd.cache() on the compact RDD, it will be in memory. If cache() is called
on both, they'll both
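For illustration, a minimal sketch (data and names are hypothetical):
val rdd = sc.parallelize(Seq(1, 1, 2, 2, 2, 3))
val compact = rdd.map((_, 1)).reduceByKey(_ + _)  // (element, count) pairs
rdd.cache()      // keeps the original RDD in memory
compact.cache()  // keeps the compact RDD in memory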
Hi Su,
Spark can't read Excel files directly. Your best bet is probably to
export the contents as a CSV and use the csvFile API.
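If it helps, a rough sketch of that path, assuming the com.databricks
spark-csv package is on the classpath (file path hypothetical):
import com.databricks.spark.csv._
// csvFile is spark-csv's implicit extension on SQLContext;
// useHeader treats the first row as column names.
val df = sqlContext.csvFile("hdfs:///data/exported.csv", useHeader = true)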
-Sandy
On Mon, Jul 13, 2015 at 9:22 AM, spark user spark_u...@yahoo.com.invalid
wrote:
Hi
I need your help to save excel data in hive .
1. how to read
Hi Yash,
One of the main advantages is that, if you turn dynamic allocation on, and
executors are discarded, your application is still able to get at the
shuffle data that they wrote out.
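For reference, the two settings involved (a minimal spark-defaults.conf sketch):
spark.dynamicAllocation.enabled  true
spark.shuffle.service.enabled    true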
-Sandy
On Thu, Jun 25, 2015 at 11:08 PM, yash datta sau...@gmail.com wrote:
Hi devs,
Can someone point
Hi Alexander,
There is currently no way to create an RDD with more partitions than its
parent RDD without causing a shuffle.
However, if the files are splittable, you can set the Hadoop configurations
that control split size to something smaller so that the HadoopRDD ends up
with more
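A rough sketch of shrinking the split size (path and size illustrative; key
name as in Hadoop 2):
// A smaller max split size (in bytes) means more input splits, hence more partitions.
sc.hadoopConfiguration.setLong("mapreduce.input.fileinputformat.split.maxsize", 32L * 1024 * 1024)
val rdd = sc.textFile("hdfs:///data/input")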
This looks really awesome.
On Tue, Jun 16, 2015 at 10:27 AM, Huang, Jie jie.hu...@intel.com wrote:
Hi All
We are happy to announce Performance portal for Apache Spark
http://01org.github.io/sparkscore/ !
The Performance Portal for Apache Spark provides performance data on the
Spark
+1 (non-binding)
Built from source and ran some jobs against a pseudo-distributed YARN
cluster.
-Sandy
On Fri, Jun 5, 2015 at 11:05 AM, Ram Sriharsha sriharsha@gmail.com
wrote:
+1, tested with Hadoop 2.6 / YARN on CentOS 6.5 after building with -Pyarn
-Phadoop-2.4 -Dhadoop.version=2.6.0
+1 (non-binding)
Launched against a pseudo-distributed YARN cluster running Hadoop 2.6.0 and
ran some jobs.
-Sandy
On Sat, May 30, 2015 at 3:44 PM, Krishna Sankar ksanka...@gmail.com wrote:
+1 (non-binding, of course)
1. Compiled OSX 10.10 (Yosemite) OK Total time: 17:07 min
mvn clean
Wow, I hadn't noticed this, but 5 seconds is really long. It's true that
it's configurable, but I think we need to provide a decent out-of-the-box
experience. For comparison, the MapReduce equivalent is 1 second.
I filed https://issues.apache.org/jira/browse/SPARK-7533 for this.
-Sandy
Hi Twinkle,
Registering the class makes it so that writeClass only writes out a couple
bytes, instead of a full String of the class name.
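For reference, a quick sketch of registering classes up front (MyRecord is
hypothetical):
import org.apache.spark.SparkConf
val conf = new SparkConf()
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  .registerKryoClasses(Array(classOf[MyRecord]))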
-Sandy
On Thu, Apr 30, 2015 at 4:13 AM, twinkle sachdeva
twinkle.sachd...@gmail.com wrote:
Hi,
As per the code, KryoSerialization used
to store it as a byte buffer. I want to make
sure this will not cause OOM when the file size is large.
--
Kannan
On Tue, Apr 14, 2015 at 9:07 AM, Sandy Ryza sandy.r...@cloudera.com
wrote:
Hi Kannan,
Both in MapReduce and Spark, the amount of shuffle data a task produces
can exceed
On Fri, Apr 24, 2015 at 3:53 PM Sandy Ryza
sandy.r...@cloudera.com
wrote:
I think there are maybe two separate things we're talking about?
1. Design discussions and in-progress design docs.
My two cents are that JIRA is the best place for this. It allows
tracking
I think there are maybe two separate things we're talking about?
1. Design discussions and in-progress design docs.
My two cents are that JIRA is the best place for this. It allows tracking
the progression of a design across multiple PRs and contributors. A piece
of useful feedback that I've
I think one of the benefits of assignee fields that I've seen in other
projects is their potential to coordinate and prevent duplicate work. It's
really frustrating to put a lot of work into a patch and then find out that
someone has been doing the same. It's helpful for the project etiquette to
+1
Built against Hadoop 2.6 and ran some jobs against a pseudo-distributed
YARN cluster.
-Sandy
On Wed, Apr 8, 2015 at 12:49 PM, Patrick Wendell pwend...@gmail.com wrote:
Oh I see - ah okay I'm guessing it was a transient build error and
I'll get it posted ASAP.
On Wed, Apr 8, 2015 at 3:41
I definitely see the value in this. However, I think at this point it
would be an incompatible behavioral change. People often use count in
Spark to exercise their DAG. Omitting processing steps that were
previously included would likely mislead many users into thinking their
pipeline was
Regarding Patrick's question, you can just do new Configuration(oldConf)
to get a cloned Configuration object and add any new properties to it.
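i.e., something along these lines (property name hypothetical):
import org.apache.hadoop.conf.Configuration
val cloned = new Configuration(oldConf)  // copy constructor; oldConf is left untouched
cloned.set("my.new.property", "value")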
-Sandy
On Wed, Mar 25, 2015 at 4:42 PM, Imran Rashid iras...@cloudera.com wrote:
Hi Nick,
I don't remember the exact details of these scenarios, but
2015-03-24 16:30 GMT+01:00 Sandy Ryza sandy.r...@cloudera.com:
Hi Zoltan,
If running on YARN, the YARN NodeManager starts executors. I don't think
there's a 100% precise way for the Spark executor to know how many
resources are allotted to it. It can come close by looking
Hi Zoltan,
If running on YARN, the YARN NodeManager starts executors. I don't think
there's a 100% precise way for the Spark executor to know how many
resources are allotted to it. It can come close by looking at the Spark
configuration options used to request it (spark.executor.memory and
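A rough sketch of that approximation from inside the application (fallback
defaults are mine):
val execMem   = sc.getConf.getOption("spark.executor.memory").getOrElse("1g")
val execCores = sc.getConf.getOption("spark.executor.cores").getOrElse("1")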
Hi Guillaume,
I've long thought something like this would be useful - i.e. the ability to
broadcast RDDs directly without first pulling data through the driver. If
I understand correctly, your requirement to block a matrix up and only
fetch the needed parts could be implemented on top of this by
+1 to what Andrew said, I think both make sense in different situations and
trusting developer discretion here is reasonable.
On Mon, Feb 9, 2015 at 1:48 PM, Andrew Or and...@databricks.com wrote:
In my experience I find it much more natural to use // for short multi-line
comments (2 or 3
JIRA updates don't go to this list, they go to iss...@spark.apache.org. I
don't think many are signed up for that list, and those that are probably
have a flood of emails anyway.
So I'd definitely be in favor of any JIRA cleanup that you're up for.
-Sandy
On Fri, Feb 6, 2015 at 6:45 AM, Sean
Hi Dirceu,
Does the issue not show up if you run map(f => f(1).asInstanceOf[Int]).sum
on the train RDD? It appears that f(1) is a String, not an Int. If you're
looking to parse and convert it, toInt should be used instead of
asInstanceOf.
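i.e., roughly (a sketch; assumes the field holds a numeric string):
train.map(f => f(1).toString.toInt).sum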
-Sandy
On Wed, Jan 21, 2015 at 8:43 AM, Dirceu
I think clarifying these semantics is definitely worthwhile. Maybe this
complicates the process with additional terminology, but the way I've used
these has been:
+1 - I think this is safe to merge and, barring objections from others, would
merge it immediately.
LGTM - I have no concerns
Yeah, the ASF +1 has become partly overloaded to mean both "I would like to
see this feature" and "this patch should be committed", although, at least
in Hadoop, using +1 on JIRA (as opposed to, say, in a release vote) should
unambiguously mean the latter unless qualified in some other way.
I
Hi Harikrishna,
A good place to start is taking a look at the wiki page on contributing:
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark
-Sandy
On Fri, Dec 19, 2014 at 2:43 PM, Harikrishna Kamepalli
harikrishna.kamepa...@gmail.com wrote:
I am interested to contribute
Hi Lochana,
We haven't yet added this in 1.2.
https://issues.apache.org/jira/browse/SPARK-4081 tracks adding categorical
feature indexing, which one-hot encoding can be built on.
https://issues.apache.org/jira/browse/SPARK-1216 also tracks a version of
this prior to the ML pipelines work.
-Sandy
+1 (non-binding). Tested on Ubuntu against YARN.
On Thu, Dec 11, 2014 at 9:38 AM, Reynold Xin r...@databricks.com wrote:
+1
Tested on OS X.
On Wednesday, December 10, 2014, Patrick Wendell pwend...@gmail.com
wrote:
Please vote on releasing the following candidate as Apache Spark
I think that if we were able to maintain the full set of created RDDs as
well as some scheduler and block manager state, it would be enough for most
apps to recover.
On Wed, Dec 10, 2014 at 5:30 AM, Jun Feng Liu liuj...@cn.ibm.com wrote:
Well, it should not be mission impossible thinking there
+1 (non-binding)
built from source
fired up a spark-shell against YARN cluster
ran some jobs using parallelize
ran some jobs that read files
clicked around the web UI
On Sun, Nov 30, 2014 at 1:10 AM, GuoQiang Li wi...@qq.com wrote:
+1 (non-binding)
Quizhang,
This is a known issue that ExternalAppendOnlyMap can do tons of tiny spills
in certain situations. SPARK-4452 aims to deal with this issue, but we
haven't finalized a solution yet.
Dinesh's solution should help as a workaround, but you'll likely experience
suboptimal performance when
You're the second person to request this today. Planning to include this in
my PR for SPARK-4338.
-Sandy
On Nov 14, 2014, at 8:48 AM, Corey Nolet cjno...@gmail.com wrote:
In the past, I've built it by providing -Dhadoop.version=2.5.1 exactly like
you've mentioned. What prompted me to write
Currently there are no mandatory profiles required to build Spark. I.e.
mvn package just works. It seems sad that we would need to break this.
On Wed, Nov 12, 2014 at 10:59 PM, Patrick Wendell pwend...@gmail.com
wrote:
I think printing an error that says -Pscala-2.10 must be enabled is
a storage policy in which you can specify how
data should be stored. I think that would be a great API to have in the
long run. Designing it won't be trivial though.
On Fri, Nov 7, 2014 at 1:05 AM, Sandy Ryza sandy.r...@cloudera.com
wrote:
Hey all,
Was messing around with Spark
Hey all,
Was messing around with Spark and Google FlatBuffers for fun, and it got me
thinking about Spark and serialization. I know there's been work / talk
about in-memory columnar formats in Spark SQL, so maybe there are ways to
provide this flexibility already that I've missed? Either way, my
It looks like the difference between the proposed Spark model and the
CloudStack / SVN model is:
* In the former, maintainers / partial committers are a way of centralizing
oversight over particular components among committers
* In the latter, maintainers / partial committers are a way of giving
This seems like a good idea.
An area that wasn't listed, but that I think could strongly benefit from
maintainers, is the build. Having consistent oversight over Maven, SBT,
and dependencies would allow us to avoid subtle breakages.
Component maintainers have come up several times within the
, Sandy Ryza (sandy.r...@cloudera.com)
wrote:
Hey All,
A couple questions came up about shared variables recently, and I wanted
to
confirm my understanding and update the doc to be a little more clear.
*Broadcast variables*
Now that task data is automatically broadcast, the only occasions
Hi Jun,
I believe that's correct that Spark authentication only works against YARN.
-Sandy
On Thu, Sep 11, 2014 at 2:14 AM, Jun Feng Liu liuj...@cn.ibm.com wrote:
Hi, there
I am trying to enable authentication on Spark in standalone mode.
Seems like only SparkSubmit load the
After the change to broadcast all task data, is there any easy way to
discover the serialized size of the data getting sent down for a task?
thanks,
-Sandy
It used to be available on the UI, no?
On Thu, Sep 11, 2014 at 6:26 PM, Reynold Xin r...@databricks.com wrote:
I don't think so. We should probably add a line to log it.
On Thursday, September 11, 2014, Sandy Ryza sandy.r...@cloudera.com
wrote:
After the change to broadcast all task data
Hmm, well I can't find it now, must have been hallucinating. Do you know
off the top of your head where I'd be able to find the size to log it?
On Thu, Sep 11, 2014 at 6:33 PM, Reynold Xin r...@databricks.com wrote:
I didn't know about that
On Thu, Sep 11, 2014 at 6:29 PM, Sandy Ryza
That's right
On Tue, Sep 9, 2014 at 2:04 PM, Debasish Das debasish.da...@gmail.com
wrote:
Last time it did not show up on the environment tab, but I will give it
another shot... The expected behavior is that this env variable will show
up, right?
On Tue, Sep 9, 2014 at 12:15 PM, Sandy Ryza sandy.r
Hi Deb,
The current state of the art is to increase
spark.yarn.executor.memoryOverhead until the job stops failing. We do have
plans to try to automatically scale this based on the amount of memory
requested, but it will still just be a heuristic.
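For example (values and application names illustrative; the overhead is in
megabytes):
spark-submit --master yarn --conf spark.yarn.executor.memoryOverhead=1024 --class com.example.App app.jar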
-Sandy
On Tue, Sep 9, 2014 at 7:32 AM,
billion ratings...
On Tue, Sep 9, 2014 at 10:49 AM, Sandy Ryza sandy.r...@cloudera.com
wrote:
Hi Deb,
The current state of the art is to increase
spark.yarn.executor.memoryOverhead until the job stops failing. We do have
plans to try to automatically scale this based on the amount of memory
This doesn't help for every dependency, but Spark provides an option to
build the assembly jar without Hadoop and its dependencies. We make use of
this in CDH packaging.
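If memory serves, that build looks something like (flags illustrative):
mvn -Phadoop-provided -Pyarn -DskipTests package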
-Sandy
On Tue, Sep 2, 2014 at 2:12 AM, scwf wangf...@huawei.com wrote:
Hi sean owen,
here are some problems when i used
Hi Debasish,
The fix is to raise spark.yarn.executor.memoryOverhead until this goes
away. This controls the buffer between the JVM heap size and the amount of
memory requested from YARN (JVMs can take up memory beyond their heap
size). You should also make sure that, in the YARN NodeManager
Hi Jun,
Spark currently doesn't have that feature, i.e. it aims for a fixed number
of executors per application regardless of resource usage, but it's
definitely worth considering. We could start more executors when we have a
large backlog of tasks and shut some down when we're underutilized.
the first row for every file, or the header only for
the first file. The former is not really supported out of the box by the
input format I think?
On Mon, Jul 21, 2014 at 10:50 PM, Sandy Ryza sandy.r...@cloudera.com
wrote:
It could make sense to add a skipHeader argument
It could make sense to add a skipHeader argument to SparkContext.textFile?
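In the meantime, the usual workaround is something like this sketch (assumes
the header sits in the first partition, i.e. a single input file):
val noHeader = sc.textFile("hdfs:///data/file.csv").mapPartitionsWithIndex {
  (idx, iter) => if (idx == 0) iter.drop(1) else iter
}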
On Mon, Jul 21, 2014 at 10:37 PM, Reynold Xin r...@databricks.com wrote:
If the purpose is for dropping csv headers, perhaps we don't really need a
common drop and only one that drops the first line in a file? I'd
On Wed, Jul 16, 2014 at 4:19 PM, Sandy Ryza sandy.r...@cloudera.com
wrote:
Hi Ron,
I just checked and this bug is fixed in recent releases of Spark.
-Sandy
On Sun, Jul 13, 2014 at 8:15 PM, Chester Chen ches...@alpinenow.com
wrote:
Ron,
Which distribution and Version
Stephen,
Often the shuffle is bound by writes to disk, so even if disks have enough
space to store the uncompressed data, the shuffle can complete faster by
writing less data.
Reynold,
This isn't a big help in the short term, but if we switch to a sort-based
shuffle, we'll only need a single
Woot!
On Thu, Jul 10, 2014 at 11:15 AM, Patrick Wendell patr...@databricks.com
wrote:
Just a heads up, we merged Prashant's work on having the sbt build read all
dependencies from Maven. Please report any issues you find on the dev list
or on JIRA.
One note here for developers, going
Hi Anish,
Spark, like MapReduce, makes an effort to schedule tasks on the same nodes
and racks that the input blocks reside on.
-Sandy
On Tue, Jul 8, 2014 at 12:27 PM, anishs...@yahoo.co.in
anishs...@yahoo.co.in wrote:
Hi All
My apologies for very basic question, do we have full support
Having a common framework for clustering makes sense to me. While we
should be careful about what algorithms we include, having solid
implementations of minibatch clustering and hierarchical clustering seems
like a worthwhile goal, and we should reuse as much code and APIs as
reasonable.
On
Hi Xiaokai,
I think MLlib is definitely interested in supporting additional GLMs. I'm
not aware of anybody working on this at the moment.
-Sandy
On Tue, Jun 17, 2014 at 5:00 PM, Xiaokai Wei x...@palantir.com wrote:
Hi,
I am an intern at PalantirTech and we are building some stuff on top
They should be - in the sense that the docs now recommend using
spark-submit and thus include entirely different invocations.
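e.g., the new-style invocation (jar path illustrative):
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master local[4] examples/target/spark-examples.jar 10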
On Fri, May 30, 2014 at 12:46 AM, Reynold Xin r...@databricks.com wrote:
Can you take a look at the latest Spark 1.0 docs and see if they are fixed?
+1
On Mon, May 26, 2014 at 7:38 AM, Tathagata Das
tathagata.das1...@gmail.comwrote:
Please vote on releasing the following candidate as Apache Spark version
1.0.0!
This has a few important bug fixes on top of rc10:
SPARK-1900 and SPARK-1918: https://github.com/apache/spark/pull/853
---
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai
On Mon, May 19, 2014 at 7:38 PM, Sandy Ryza sandy.r...@cloudera.com
wrote:
It just hit me why this problem is showing up on YARN and not on
standalone.
The relevant difference between YARN
and throw an exception if the
code is wrapped in a security manager.
Sincerely,
DB Tsai
---
My Blog: https://www.dbtsai.com
LinkedIn: https://www.linkedin.com/in/dbtsai
On Wed, May 21, 2014 at 1:13 PM, Sandy Ryza sandy.r
+1
On Tue, May 20, 2014 at 5:26 PM, Andrew Or and...@databricks.com wrote:
+1
2014-05-20 13:13 GMT-07:00 Tathagata Das tathagata.das1...@gmail.com:
Please vote on releasing the following candidate as Apache Spark version
1.0.0!
This has a few bug fixes on top of rc9:
SPARK-1875:
Reflection lets you pick the ClassLoader, yes.
I would not call setContextClassLoader.
On Mon, May 19, 2014 at 12:00 AM, Sandy Ryza
sandy.r...@cloudera.com
wrote:
I spoke with DB offline about this a little while ago and he
confirmed
that
he was able to access
an object in that
way. Since the jars are already in distributed cache before the
executor starts, is there any reason we cannot add the locally cached
jars to classpath directly?
Best,
Xiangrui
On Sun, May 18, 2014 at 4:00 PM, Sandy Ryza sandy.r...@cloudera.com
wrote:
I spoke with DB offline
+1 (non-binding)
* Built the release from source.
* Compiled Java and Scala apps that interact with HDFS against it.
* Ran them in local mode.
* Ran them against a pseudo-distributed YARN cluster in both yarn-client
mode and yarn-cluster mode.
On Tue, May 13, 2014 at 9:09 PM, witgo wi...@qq.com
Hi AJ,
You might find this helpful -
http://blog.cloudera.com/blog/2014/04/how-to-run-a-simple-apache-spark-app-in-cdh-5/
-Sandy
On Sat, May 3, 2014 at 8:42 AM, Ajay Nair prodig...@gmail.com wrote:
Hi,
I have written a code that works just about fine in the spark shell on EC2.
The ec2
are under spark/docs. You can submit a PR for
changes. -Xiangrui
On Mon, Apr 21, 2014 at 6:01 PM, Sandy Ryza
sandy.r...@cloudera.com wrote:
How do I get permissions to edit the wiki?
On Mon, Apr 21, 2014 at 3:19 PM, Xiangrui Meng men...@gmail.com:
Regards,
Mridul
On Mon, Apr 21, 2014 at 6:25 AM, Sandy Ryza sandy.r...@cloudera.com
wrote:
The issue isn't that the Iterator[P] can't be disk-backed. It's that,
with
a groupBy, each P is a (Key, Values) tuple, and the entire tuple is read
into memory at once. The ShuffledRDD
of this), scalable and parallelizable, well
documented and with reasonable expectation of dev support
On 21 Apr 2014, at 19:59, Sandy Ryza sandy.r...@cloudera.com wrote:
If it's not done already, would it make sense to codify this philosophy
somewhere? I imagine
/docs. You can submit a PR for
changes. -Xiangrui
On Mon, Apr 21, 2014 at 6:01 PM, Sandy Ryza sandy.r...@cloudera.com
wrote:
How do I get permissions to edit the wiki?
On Mon, Apr 21, 2014 at 3:19 PM, Xiangrui Meng men...@gmail.com wrote:
Cannot agree more with your words. Could you add
An iterator does not imply data has to be memory resident.
Think merge sort output as an iterator (disk backed).
Tom is actually planning to work on something similar with me on this,
hopefully this month or next.
Regards,
Mridul
On Sun, Apr 20, 2014 at 11:46 PM, Sandy Ryza sandy.r
Hi Priya,
Here's a good place to start:
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark
-Sandy
On Fri, Apr 11, 2014 at 12:05 PM, priya arora arora.priya4...@gmail.comwrote:
Hi,
May I know how one can contribute in this project
http://spark.apache.org/mllib/ or in
Our guys are looking into it. I'll post when things are back up.
-Sandy
On Mar 14, 2014, at 7:37 AM, Tom Graves tgraves...@yahoo.com wrote:
It appears the cloudera repo for the mqtt stuff is down again.
Did someone ping them the last time?
Can we pick this up from some other repo?
In the mean time, you don't need to wait for the task to be assigned to you
to start work. If you're worried about someone else picking it up, you can
drop a short comment on the JIRA saying that you're working on it.
On Wed, Mar 12, 2014 at 3:25 PM, Konstantin Boudnik c...@apache.org wrote:
Hi Lars,
Unfortunately, due to some incompatible changes we pulled in to be closer
to YARN trunk, Spark-on-YARN does not work against CDH 4.4+ (but does work
against CDH5)
-Sandy
On Tue, Mar 4, 2014 at 6:33 AM, Tom Graves tgraves...@yahoo.com wrote:
What is your question about? Any hints?
@patrick - It seems like my point about being able to inherit the root pom
was addressed and there's a way to handle this.
The larger point I meant to make is that Maven is by far the most common
build tool in projects that are likely to share contributors with Spark. I
personally know 10 people