Have you reviewed this guide?
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark
Nick
On Fri, Apr 10, 2015 at 7:29 PM Nitin Mathur ntnmat...@gmail.com wrote:
Hi Spark Dev Team,
I want to start contributing to Spark open source. This is the first time I
will be doing
I've seen many other OSS projects ask contributors to sign CLAs. I've never
seen us do that.
I assume it's not an issue, since people opening PRs generally understand
what it means. But legally I'm sure there's some danger in taking an
implied vs. explicit license to do something.
So: Do we need
made a
contribution, didn't state anything about the license, but did not
intend somehow that the work could be licensed as the rest of the
project is. For reference, Apache projects do not in general require a CLA.
On Tue, Apr 7, 2015 at 8:59 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote
I've seen other projects use Appveyor http://www.appveyor.com/ for CI on
Windows.
Has anyone used them before?
I've seen on more than one occasion something break on Windows without us
knowing, so it might be worth looking into using something like this if
it's relatively straightforward.
Nick
:54 AM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
This is secondary to Marcelo’s question, but I wanted to comment on this:
Its main limitation is more cultural than technical: you need to get
people
to care about intermittent test runs, otherwise you can end up with
failures
shift towards
2.x at least as defaults.
On Sun, Mar 1, 2015 at 10:59 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
https://github.com/apache/spark/blob/fd8d283eeb98e310b1e85ef8c3a8af9e547ab5e0/ec2/spark_ec2.py#L162-L164
Is there any reason we shouldn't update the default Hadoop major version in
spark-ec2 to 2?
Nick
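For context, that default backs spark-ec2's --hadoop-major-version flag; a hypothetical launch pinning it to 2 would look roughly like this (the key and cluster names are placeholders):

# launch a cluster on Hadoop 2 instead of the current default of 1
./ec2/spark-ec2 -k my-key -i my-key.pem --hadoop-major-version=2 launch my-cluster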
The first concern for Spark will probably be to ensure that we still build
and test against Python 2.6, since that's the minimum version of Python we
support.
Otherwise this seems OK. We use numpy and other Python packages in PySpark,
but I don't think we're pinned to any particular version of
it advance the house-cleaning a bit
more, but I'm sure we'd rediscover some important work and issues that need
attention.
On Sun, Feb 22, 2015 at 7:54 AM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
As of right now, there are no more open JIRA issues without an assigned
component
For fun:
http://acha-acha.co/#/repo/https://github.com/apache/spark
I just added Spark to this site. Some of these “achievements” are hilarious.
Leo Tolstoy: More than 10 lines in a commit message
Dangerous Game: Commit after 6PM Friday
Nick
I guess on a technicality the docs just say "first item in this RDD", not
"first line in the source text file". AFAIK there is no way apart from
filtering to remove header lines
http://stackoverflow.com/a/24734612/877069.
As long as first() always returns the same value for a given RDD, I think
it's
for the cleanup!
Nick
On Sat Feb 07 2015 at 8:29:42 PM Nicholas Chammas nicholas.cham...@gmail.com
wrote:
Oh derp, missed the YARN component.
JIRA does allow admins to make fields mandatory:
https://confluence.atlassian.com/display/JIRA/Specifying+Field
FYI: Here is the matching discussion over on the Pants dev list.
https://groups.google.com/forum/#!topic/pants-devel/rTaU-iIOIFE
On Mon Feb 02 2015 at 4:50:33 PM Nicholas Chammas nicholas.cham...@gmail.com
wrote:
To reiterate, I'm asking from
Random question for the PySpark and Python experts/enthusiasts on here:
How big of a deal would it be for PySpark and PySpark users if you could
run numpy on PyPy?
PySpark already supports running on PyPy
https://github.com/apache/spark/pull/2144, but libraries like MLlib that
use numpy are not
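For reference, PySpark picks up its interpreter from an environment variable, so trying it under PyPy looks roughly like this (a sketch assuming pypy is on the PATH; the example script ships in the Spark repo):

# run a PySpark example under PyPy instead of CPython
PYSPARK_PYTHON=pypy ./bin/spark-submit examples/src/main/python/pi.py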
Found it:
https://github.com/apache/spark/compare/v1.2.0...v1.2.1#diff-73058f8e51951ec0b4cb3d48ade91a1fR73
GRRR BASH WORD SPLITTING
My path has a space in it...
Nick
On Wed Feb 11 2015 at 2:37:39 PM Nicholas Chammas
nicholas.cham...@gmail.com wrote:
This is what I get:
spark-1.2.1-bin
lol yeah, I changed the path for the email... turned out to be the issue
itself.
On Wed Feb 11 2015 at 2:43:09 PM Ted Yu yuzhih...@gmail.com wrote:
I see.
'/path/to/spark-1.2.1-bin-hadoop2.4' didn't contain a space :-)
On Wed, Feb 11, 2015 at 2:41 PM, Nicholas Chammas
nicholas.cham
The tragic thing here is that I was asked to review the patch that
introduced this
https://github.com/apache/spark/pull/3377#issuecomment-68077315, and
totally missed it... :(
On Wed Feb 11 2015 at 2:46:35 PM Nicholas Chammas
nicholas.cham...@gmail.com wrote:
lol yeah, I changed the path
got the following working
(against a directory with a space in its name):
#!/usr/bin/env bash
OLDIFS=$IFS   # save the default field separators
IFS=          # don't split on any whitespace
dir=$1/*
for f in $dir; do
    cat $f
done
IFS=$OLDIFS   # restore IFS
Cheers
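For what it's worth, the more common fix is to quote the expansions rather than clearing IFS; a minimal equivalent sketch:

#!/usr/bin/env bash
# quoting "$1" and "$f" prevents word splitting on the space in the path,
# while the unquoted glob still expands to the directory's contents
for f in "$1"/*; do
    cat "$f"
done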
On Wed, Feb 11, 2015 at 2:47 PM, Nicholas Chammas
-hadoop2.4.0.jar
FYI
On Wed, Feb 11, 2015 at 2:27 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
I just downloaded 1.2.1 pre-built for Hadoop 2.4+ and ran
sbin/start-all.sh
on my OS X machine.
Failed to find Spark assembly in /path/to/spark-1.2.1-bin-hadoop2.4/lib
You need to build Spark
I just downloaded 1.2.1 pre-built for Hadoop 2.4+ and ran sbin/start-all.sh
on my OS X machine.
Failed to find Spark assembly in /path/to/spark-1.2.1-bin-hadoop2.4/lib
You need to build Spark before running this program.
Did the same for 1.2.0 and it worked fine.
Nick
+1 to an official deprecation + redirecting users to some other project
that will or already is taking this on.
Nate?
On Mon Feb 09 2015 at 10:08:27 AM Patrick Wendell pwend...@gmail.com
wrote:
I have wondered whether we should sort of deprecate it more
officially, since otherwise I think
:
I think we already have a YARN component.
https://issues.apache.org/jira/issues/?jql=project%20%3D%20SPARK%20AND%20component%20%3D%20YARN
I don't think JIRA allows it to be mandatory, but if it does, that
would be useful.
On Sat, Feb 7, 2015 at 5:08 PM, Nicholas Chammas
nicholas.cham
at 11:53 AM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
Do we need some new components to be added to the JIRA project?
Like:
- scheduler
- YARN
- spark-submit
- ...?
Nick
On Fri Feb 06 2015 at 10:50:41 AM Nicholas Chammas
nicholas.cham
Lemme butt in randomly here and say there is an interesting discussion on
this Spark PR https://github.com/apache/spark/pull/4448 about
netlib-java, JBLAS, Breeze, and other things I know nothing of, that y'all
may find interesting. Among the participants is the author of netlib-java.
On Sun Feb
Do we need some new components to be added to the JIRA project?
Like:
- scheduler
- YARN
- spark-submit
- …?
Nick
On Fri Feb 06 2015 at 10:50:41 AM Nicholas Chammas
nicholas.cham...@gmail.com wrote:
+9000 on cleaning up JIRA.
Thank you Sean for laying out some
Y’all may already know this, but I haven’t seen it mentioned anywhere in
our docs or on here, and it’s a pretty easy win.
Maven supports parallel builds
https://cwiki.apache.org/confluence/display/MAVEN/Parallel+builds+in+Maven+3
with the -T command line option.
For example:
./build/mvn -T 1C
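That command appears cut off above; a fuller invocation might look like this (the goals and -DskipTests are illustrative; -T 1C asks Maven for one build thread per CPU core):

# independent modules in the reactor build in parallel across all cores
./build/mvn -T 1C -DskipTests clean package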
I believe this was changed for 1.2.1. Here are the relevant JIRA issues
https://issues.apache.org/jira/browse/SPARK-5289?jql=project%20%3D%20SPARK%20AND%20fixVersion%20%3D%201.2.1%20AND%20text%20~%20%22publish%22%20order%20by%20priority
.
On Tue Feb 03 2015 at 10:43:59 AM Dirceu Semighini Filho
Congratulations guys!
On Tue Feb 03 2015 at 2:36:12 PM Matei Zaharia matei.zaha...@gmail.com
wrote:
Hi all,
The PMC recently voted to add three new committers: Cheng Lian, Joseph
Bradley and Sean Owen. All three have been major contributors to Spark in
the past year: Cheng on Spark SQL,
for sbt and with a little bit of tweaking with maven as well.
2015-02-02 16:25 GMT-08:00 Nicholas Chammas nicholas.cham...@gmail.com:
Does anyone here have experience with Pants
http://pantsbuild.github.io/index.html or interest in trying to build
Spark with it?
Pants has an interesting story
2015 at 4:40:45 PM Nicholas Chammas
nicholas.cham...@gmail.com wrote:
I'm asking from an experimental standpoint; this is not happening anytime
soon.
Of course, if the experiment turns out very well, Pants would replace both
sbt and Maven (like it has at Twitter, for example). Pants also works
Does anyone here have experience with Pants
http://pantsbuild.github.io/index.html or interest in trying to build
Spark with it?
Pants has an interesting story. It was born at Twitter to help them build
their Scala, Java, and Python projects as several independent components in
one monolithic
Do we have any open JIRA issues to add automated testing on Windows to
Jenkins? I assume that's something we want to do.
On Sat Jan 31 2015 at 10:37:42 PM Matei Zaharia matei.zaha...@gmail.com
wrote:
This looks like a pretty serious problem, thanks! Glad people are testing
on Windows.
Matei
...@databricks.com wrote:
Thanks. I added one.
On Wed, Oct 8, 2014 at 8:49 AM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
I've created SPARK-3849: Automate remaining Scala style rules
https://issues.apache.org/jira/browse/SPARK-3849.
Please create sub-tasks on this issue for rules
What do y'all think of creating a standardized Spark development
environment, perhaps encoded as a Vagrantfile, and publishing it under
`dev/`?
The goal would be to make it easier for new developers to get started with
all the right configs and tools pre-installed.
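To make that concrete, the provisioning step could be a plain shell script; a hypothetical sketch (the base box, package list, and paths are all assumptions, not a worked-out proposal):

#!/usr/bin/env bash
# hypothetical provisioning for a Spark dev box (Ubuntu base box assumed)
set -e
apt-get update
apt-get install -y openjdk-7-jdk git maven
# pre-clone so the VM is ready to build immediately
git clone https://github.com/apache/spark.git /home/vagrant/spark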
If we use something like
Message -
From: Nicholas Chammas nicholas.cham...@gmail.com
To: Spark dev list dev@spark.apache.org
Sent: Tuesday, January 20, 2015 6:13:31 PM
Subject: Standardized Spark dev environment
What do y'all think of creating a standardized Spark development
environment, perhaps encoded
Just created: Integrate Python unit tests into Jenkins
https://issues.apache.org/jira/browse/SPARK-5178
Nick
On Fri Jan 09 2015 at 2:48:48 PM Josh Rosen rosenvi...@gmail.com wrote:
The Test Result pages for Jenkins builds shows some nice statistics for
the test run, including individual
Side question: Should this section
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-IDESetup
in
the wiki link to Useful Developer Tools
https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools?
On Thu Jan 08 2015 at 6:19:55 PM Sean Owen
You sent this to the dev list. Please send it instead to the user list.
We use the dev list to discuss development on Spark itself, new features,
fixes to known bugs, and so forth.
The user list is to discuss issues using Spark, which I believe is what you
are looking for.
Nick
On Tue Dec 30
Linkies for the curious:
- SPARK-4501 https://issues.apache.org/jira/browse/SPARK-4501: Create
build/mvn to automatically download maven/zinc/scalac
- https://github.com/apache/spark/pull/3707
- New build folder (mvn and sbt):
https://github.com/apache/spark/tree/master/build
Nick
Do we have access to the SQL specification (say, SQL-92) for reference
during Spark SQL development? I know it's not freely available on the web.
Usually, you can only access drafts.
I know that, generally, we look to other systems (especially Hive) when
figuring out how something in Spark SQL
://github.com/apache/spark
Search for "Build Spark with Maven"
On Thu, Dec 25, 2014 at 1:49 AM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
The correct docs link is:
https://spark.apache.org/docs/1.2.0/building-spark.html
Where did you get that bad link from?
Nick
On Thu Dec 25
The correct docs link is:
https://spark.apache.org/docs/1.2.0/building-spark.html
Where did you get that bad link from?
Nick
On Thu Dec 25 2014 at 12:00:53 AM Naveen Madhire vmadh...@umail.iu.edu
wrote:
Hi All,
I am starting to use Spark. I am having trouble getting the latest code
from
a close look
at this and I think we're in good shape here vis-à-vis this policy.
- Patrick
On Mon, Dec 22, 2014 at 5:29 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
Hitesh,
From your link:
You may not use ASF trademarks such as Apache or ApacheFoo or Foo
in
your own
Does this include contributions made against the spark-ec2
https://github.com/mesos/spark-ec2 repo?
On Wed Dec 17 2014 at 12:29:19 AM Patrick Wendell pwend...@gmail.com
wrote:
Hey All,
Due to the very high volume of contributions, we're switching to an
automated process for generating
Nicholas Chammas
nicholas.cham...@gmail.com wrote:
I recently came across this blog post, which reminded me of this thread.
How to Discourage Open Source Contributions
http://danluu.com/discourage-oss/
We are currently at 320+ open PRs, many of which haven't been updated in
over a month. We
to give us OAuth keys with repo:status access?
Nick
On Sat Sep 06 2014 at 1:29:53 PM Nicholas Chammas
nicholas.cham...@gmail.com wrote:
Aww, that's a bummer...
On Sat, Sep 6, 2014 at 1:10 PM, Reynold Xin r...@databricks.com wrote:
that would require github hooks permission and unfortunately asf
at 6:02 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
News flash!
From the latest version of the GitHub API
https://developer.github.com/v3/repos/statuses/:
Note that the repo:status OAuth scope
https://developer.github.com/v3/oauth/#scopes grants targeted access
://issues.apache.org/jira/browse/INFRA-7918
On Tue, Dec 16, 2014 at 6:23 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
Actually, reading through the existing issue opened for this
https://issues.apache.org/jira/browse/INFRA-7367 back in February, I
don’t see any explanation from ASF
Every time we run a test cycle on our Jenkins cluster, we generate hundreds
of XML reports covering all the tests we have (e.g.
`streaming/target/test-reports/org.apache.spark.streaming.util.WriteAheadLogSuite.xml`).
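As a hedged aside, here's a quick way to poke at what's recorded in those files, assuming the standard JUnit-style failures attribute on each testsuite element:

# tally the failure counts recorded across all the generated test reports
grep -ho 'failures="[0-9]*"' */target/test-reports/*.xml | sort | uniq -c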
These reports contain interesting information about whether tests succeeded
or
request builder? what
others?
On Mon, Dec 15, 2014 at 1:33 AM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
Every time we run a test cycle on our Jenkins cluster, we generate
hundreds
of XML reports covering all the tests we have (e.g.
`streaming/target/test-reports
, 2014 at 11:31 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
What do y’all think of a report like this emailed out to the dev list on a
monthly basis?
The goal would be to increase visibility into our open issues and
encourage
developers to tend to our issue tracker more frequently
https://issues.apache.org/jira/browse/SPARK-636: Add mechanism to run
system management/configuration tasks on all workers
Andrew,
Does that seem more useful?
Nick
On Sun Dec 14 2014 at 3:20:54 AM Nicholas Chammas
nicholas.cham...@gmail.com wrote:
I formatted this report using
What do y’all think of a report like this emailed out to the dev list on a
monthly basis?
The goal would be to increase visibility into our open issues and encourage
developers to tend to our issue tracker more frequently.
Nick
There are 1,236 unresolved issues
Nevermind, seems to be back up now.
On Wed Dec 10 2014 at 7:46:30 PM Nicholas Chammas
nicholas.cham...@gmail.com wrote:
For example: https://issues.apache.org/jira/browse/SPARK-3431
Where do we report/track issues with JIRA itself being down?
Nick
So all this time the tests that Jenkins has been running via Jenkins and
SBT + ScalaTest... those haven't been running any of the Java unit tests?
SPARK-4159 https://issues.apache.org/jira/browse/SPARK-4159 only mentions
Maven as a problem, but I'm wondering how these tests got through Jenkins
on
SPARK-4159.
On Tue, Dec 9, 2014 at 11:30 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
So all this time the tests that Jenkins has been running via Jenkins and
SBT
+ ScalaTest... those haven't been running any of the Java unit tests?
SPARK-4159 only mentions Maven
went out to the dev list once a week that a)
reported the number of stale PRs, and b) directly linked to the 5 least
recently updated PRs?
Nick
On Sat Aug 30 2014 at 3:41:39 AM Nicholas Chammas
nicholas.cham...@gmail.com wrote:
On Tue, Aug 26, 2014 at 2:02 AM, Patrick Wendell pwend...@gmail.com
-obvious things we (as contributors) could do to make the committers'
lives easier? Thanks!
-Ilya
On 12/8/14, 11:58 AM, Nicholas Chammas nicholas.cham...@gmail.com
wrote:
I recently came across this blog post, which reminded me of this thread.
How to Discourage Open Source Contributions
Ted,
I posted some updates
https://issues.apache.org/jira/browse/SPARK-3431?focusedCommentId=14236540page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-14236540
on
JIRA on my progress (or lack thereof) getting SBT to parallelize test
suites properly. I'm currently stuck
https://github.com/apache/spark/blob/master/docs/building-spark.md#speeding-up-compilation-with-zinc
Could someone summarize how they invoke zinc as part of a regular
build-test-etc. cycle?
I'll add it in to the aforelinked page if appropriate.
Nick
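One possible cycle, sketched under the assumption that zinc is installed and on the PATH (e.g. via brew install zinc):

# start the long-lived compile server once per session; the scala-maven-plugin
# looks for it on port 3030 and falls back to normal compilation if absent
zinc -start
# build and test as usual; compilation is delegated to the warm zinc server
mvn -DskipTests clean package
# shut the server down when finished
zinc -shutdown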
to do
anything for each build.
On Wed, Dec 3, 2014 at 3:44 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
https://github.com/apache/spark/blob/master/docs/
building-spark.md#speeding-up-compilation-with-zinc
Could someone summarize how they invoke zinc as part of a regular
build
it (either on this
thread or in the JIRA issue).
Nick
On Sun Sep 07 2014 at 8:28:51 PM Nicholas Chammas
nicholas.cham...@gmail.com wrote:
On Fri, Aug 8, 2014 at 1:12 PM, Reynold Xin r...@databricks.com wrote:
Nick,
Would you like to file a ticket to track this?
SPARK-3431 https
- currently the docs only contain information about building with maven,
and even then don’t cover many important cases
All other points aside, I just want to point out that the docs document
how to use both Maven and SBT and clearly state
1.1.1 was just released, and 1.2 is close to a release. That, plus
Thanksgiving in the US (where most Spark committers AFAIK are located),
probably means a temporary lull in committer activity on non-critical items
should be expected.
On Mon Nov 24 2014 at 9:33:27 AM York, Brennon
, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
Howdy folks,
I’m trying to understand why I’m getting “insufficient memory” errors when
trying to run Spark unit tests within a CentOS Docker container.
I’m building Spark and running the tests as follows:
# build
sbt/sbt -Pyarn -Phadoop
Howdy folks,
I’m trying to understand why I’m getting “insufficient memory” errors when
trying to run Spark unit tests within a CentOS Docker container.
I’m building Spark and running the tests as follows:
# build
sbt/sbt -Pyarn -Phadoop-2.3 -Dhadoop.version=2.3.0 -Pkinesis-asl
-Phive
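A hedged sketch of one thing worth trying: cap the JVM heap below the container's memory limit, assuming the sbt launcher honors JAVA_OPTS (the sizes here are illustrative):

# pin the heap so the JVM doesn't try to reserve more than the container allows
JAVA_OPTS="-Xms512m -Xmx2g" sbt/sbt -Pyarn -Phive test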
The docs on using sbt are here:
https://github.com/apache/spark/blob/master/docs/building-spark.md#building-with-sbt
They'll be published with 1.2.0 presumably.
On Mon, Nov 17, 2014 at 2:49 PM Michael Armbrust mich...@databricks.com
wrote:
* I moved from sbt to maven in June specifically due
Yeah, kudos to Josh for putting that together.
On Tue, Nov 11, 2014 at 3:26 AM, Yu Ishikawa yuu.ishikawa+sp...@gmail.com
wrote:
Great jobs!
I didn't know Spark PR Dashboard.
Thanks
Yu Ishikawa
-
-- Yu Ishikawa
--
View this message in context:
or the wiki...
On Tue, Nov 11, 2014 at 12:23 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
Yeah, kudos to Josh for putting that together.
On Tue, Nov 11, 2014 at 3:26 AM, Yu Ishikawa
yuu.ishikawa+sp...@gmail.com
wrote:
Great jobs!
I didn't know Spark PR Dashboard.
Thanks
On Sun, Nov 9, 2014 at 1:51 AM, Tathagata Das tathagata.das1...@gmail.com
wrote:
This causes a scalability vs. latency tradeoff - if your limit is 1000
tasks per second (simplifying from 1500), you could either configure
it to use 100 receivers at 100 ms batches (10 blocks/sec), or 1000
AM, Nicholas Chammas
nicholas.cham...@gmail.com
wrote:
Thanks for posting that script, Patrick. It looks like a good place to
start.
Regarding Docker vs. Packer, as I understand it you can use Packer to
create Docker containers at the same time as AMIs and other image types.
Nick
I just watched Kay's talk from 2013 on Sparrow
https://www.youtube.com/watch?v=ayjH_bG-RC0. Is replacing Spark's native
scheduler with Sparrow still on the books?
The Sparrow repo https://github.com/radlab/sparrow hasn't been updated
recently, and I don't see any JIRA issues about it.
It would
If, for example, you have a cluster of 100 machines, this means the
scheduler can launch 150 tasks per machine per second.
Did you mean 15 tasks per machine per second here? Or alternatively, 10
machines? (1,500 tasks per second spread across 100 machines works out to
15 per machine; 150 per machine per second would imply 10 machines.)
I don't know of any existing Spark clusters that have a large enough number
of
Sounds good. I'm looking forward to tracking improvements in this area.
Also, just to connect some more dots here, I just remembered that there is
currently an initiative to add an IndexedRDD
https://issues.apache.org/jira/browse/SPARK-2365 interface. Some
interesting use cases mentioned there
to complete in milliseconds.
So it looks like I misunderstood the current cost of task initialization.
It's already as low as 5ms (and not 100ms)?
Nick
On Fri, Nov 7, 2014 at 11:15 PM, Shivaram Venkataraman
shiva...@eecs.berkeley.edu wrote:
On Fri, Nov 7, 2014 at 8:04 PM, Nicholas Chammas
I think better tooling will make it much easier for committers to trim the
list of stale JIRA issues and PRs. Convenience enables action.
- Spark PR Dashboard https://spark-prs.appspot.com/: Additional
filters for stale PRs
https://github.com/databricks/spark-pr-dashboard/issues/1 or PRs
Did you mean to send this to the user list?
This is the dev list, where we discuss things related to development on
Spark itself.
On Thu, Nov 6, 2014 at 5:01 PM, Gordon Benjamin gordon.benjami...@gmail.com
wrote:
Hi All,
I'm using Spark/Shark as the foundation for some reporting that I'm
On Fri, Oct 31, 2014 at 3:45 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
I believe that benchmark has a pending certification on it. See
http://sortbenchmark.org under Process.
Regarding this comment, Reynold has just announced that this benchmark is
now certified
Steve Nunez, I believe the information behind the links below should
address your concerns earlier about Databricks's submission to the Daytona
Gray benchmark.
On Wed, Nov 5, 2014 at 6:43 PM, Nicholas Chammas nicholas.cham...@gmail.com
wrote:
On Fri, Oct 31, 2014 at 3:45 PM, Nicholas Chammas
forgetting to
mention that the last record was held by a 2001 Toyota Celica.
- Steve
From: Nicholas Chammas nicholas.cham...@gmail.com
Date: Wednesday, November 5, 2014 at 15:56
To: Steve Nunez snu...@hortonworks.com
Cc: Patrick Wendell pwend...@gmail.com, dev dev@spark.apache.org
Subject: Re
As part of my work for SPARK-3821
https://issues.apache.org/jira/browse/SPARK-3821, I tried building an AMI
today using create_image.sh.
This line
https://github.com/mesos/spark-ec2/blob/f6773584dd71afc49f1225be48439653313c0341/create_image.sh#L68
appears to be broken now (it wasn’t a week or so
?
http://search-hadoop.com/m/LgpTk2Pnw6O/andrew+apache+mirrorsubj=Re+All+mirrored+download+links+from+the+Apache+Hadoop+site+are+broken
Cheers
On Wed, Nov 5, 2014 at 7:36 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
As part of my work for SPARK-3821
https://issues.apache.org/jira
Yup, I just stumbled on that. I'll submit a PR to fix that link. Thanks Ted.
On Wed, Nov 5, 2014 at 11:13 PM, Ted Yu yuzhih...@gmail.com wrote:
The artifacts are in archive:
http://archive.apache.org/dist/hadoop/common/hadoop-2.4.1/
Cheers
On Nov 5, 2014, at 8:07 PM, Nicholas Chammas
FWIW, the official build instructions are here:
https://github.com/apache/spark#building-spark
On Tue, Nov 4, 2014 at 5:11 PM, Ted Yu yuzhih...@gmail.com wrote:
I built based on this commit today and the build was successful.
What command did you use ?
Cheers
On Tue, Nov 4, 2014 at 2:08
is not available at port 3030 - reverting to normal
incremental compile
Alex
On Tue, Nov 4, 2014 at 3:11 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
FWIW, the official build instructions are here:
https://github.com/apache/spark#building-spark
On Tue, Nov 4, 2014 at 5:11 PM, Ted Yu
in downloading and building
dependencies.
Anyway, if sbt is supported it would be great to add docs about it somewhere,
especially since, as you point out, most devs are using it.
Thanks for your help.
Alex
On Tue, Nov 4, 2014 at 5:42 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote
Minor question, but when would be the right time to update the default
Spark version
https://github.com/apache/spark/blob/76386e1a23c55a58c0aeea67820aab2bac71b24b/ec2/spark_ec2.py#L42
in the EC2 script?
On Mon, Nov 3, 2014 at 3:55 AM, Patrick Wendell pwend...@gmail.com wrote:
Hi All,
I've
into documenting it, it's still hard to reproduce :(
On Friday, October 31, 2014, Nicholas Chammas nicholas.cham...@gmail.com
wrote:
I believe that benchmark has a pending certification on it. See
http://sortbenchmark.org under Process.
It's true
.; we'll share the code on the
list as soon as we're done.
-Kay
On Fri, Oct 31, 2014 at 12:45 PM, Nicholas Chammas
nicholas.cham...@gmail.com
wrote:
I believe that benchmark has a pending certification on it. See
http
to this or any other similar vendor
benchmark.
- Patrick
On Fri, Oct 31, 2014 at 10:38 AM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
I know we don't want to be jumping at every benchmark someone posts out
there, but this one surprised me:
http://www.citusdata.com
and publish enough
information and code to let others repeat the exercise easily.
- Steve
On 10/31/14, 11:30, Nicholas Chammas nicholas.cham...@gmail.com
wrote:
Thanks for the response, Patrick.
I guess the key takeaways are 1) the tuning/config details are everything
If one were to put together a short but comprehensive guide to setting up
Spark to run locally on OS X, would it look like this?
# Install Maven. On OS X, we suggest using Homebrew.
brew install maven
# Set some important Java and Maven environment variables.
export
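The snippet is cut off at export; the build docs of that era suggested values along these lines (treat the exact sizes as assumptions):

# assumed reconstruction based on the Spark 1.x building-spark docs
export JAVA_HOME=$(/usr/libexec/java_home)
export MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"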
usually use SBT on Mac and that one doesn't require any setup ...
On Mon, Oct 20, 2014 at 4:43 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
If one were to put together a short but comprehensive guide to setting up
Spark to run locally on OS X, would it look like this?
# Install
even have to brew install it. Surely SBT isn't in the dev tools even?
I recall I had to install it. I'd be surprised to hear it required
zero setup.
On Mon, Oct 20, 2014 at 8:04 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
Yeah, I would use sbt too, but I thought if I wanted to publish
So back to my original question... :)
If we wanted to post this guide to the user list or to a gist for easy
reference, would we rather have Maven or SBT listed? And is there anything
else about the steps that should be modified?
Nick
On Mon, Oct 20, 2014 at 8:25 PM, Sean Owen
https://news.ycombinator.com/item?id=8471812
The parent thread has lots of interesting use cases for Docker, and the
linked comment seems most relevant to our testing predicament.
I might look into this after I finish something presentable with Packer and
our EC2 scripts, but if anyone else is
is that the frequency that these happen has decreased
significantly (3 in the past ~18hr).
seems like the git plugin downgrade has helped relieve the problem, but
hasn't fixed it. i'll be looking in to this more today.
On Wed, Oct 15, 2014 at 7:05 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote
On Thu, Oct 16, 2014 at 3:55 PM, shane knapp skn...@berkeley.edu wrote:
i really, truly hate non-deterministic failures.
Amen bruddah.
I support this effort. :thumbsup:
On Wed, Oct 15, 2014 at 4:52 PM, shane knapp skn...@berkeley.edu wrote:
i'm going to be downgrading our git plugin (from 2.2.7 to 2.2.2) to see if
that helps w/the git fetch timeouts.
this will require a short downtime (~20 mins for builds to finish, ~20
. :crossestoes: :)
On Wed, Oct 15, 2014 at 2:19 PM, shane knapp skn...@berkeley.edu
wrote:
ok, we're up and building... :crossesfingersfortheumpteenthtime:
On Wed, Oct 15, 2014 at 1:59 PM, Nicholas Chammas
nicholas.cham...@gmail.com wrote:
I support this effort. :thumbsup