Open pull request count is down to 254 right now from ~325 several weeks
ago.
Open JIRA count is down slightly to 1262 from a peak of over ~1320.
Obviously, this is in the face of an ever faster and larger stream of contributions.
There's a real positive impact of JIRA being a little more meaningful, a
Open pull request count is down to 254 right now from ~325 several weeks
ago.
This is great. Ideally, we need to get this down to 50 and keep it there.
Having so many open pull requests is just a bad signal to contributors. But
it will take some time to get there.
- 1+ Component
Sean, do you
For fun:
http://acha-acha.co/#/repo/https://github.com/apache/spark
I just added Spark to this site. Some of these “achievements” are hilarious.
Leo Tolstoy: More than 10 lines in a commit message
Dangerous Game: Commit after 6PM friday
Nick
So what are we expecting of Hive 0.12.0 builds with this RC? I know not
every combination of Hadoop and Hive versions, etc., can be supported, but
even an example build from the Building Spark page isn't looking too good
to me.
Working from f97b0d4, the example build command works: mvn -Pyarn
Hi Mike,
glmnet has definitely been very successful, and it would be great to see
how we can improve optimization in MLlib! There is some related work
ongoing; here are the JIRAs:
GLMNET implementation in Spark
https://issues.apache.org/jira/browse/SPARK-1673
LinearRegression with L1/L2
How about persisting the computed result table first before caching it?
So that you only need to cache the result table after restarting your
service without recomputing it. Somewhat like checkpointing.
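
A rough sketch of what I mean, in PySpark terms. The helper name, the `compute` callback, and the local-filesystem existence check are all assumptions made for illustration; on HDFS the check would need to go through the Hadoop filesystem API instead, and the Parquet format is just one possible choice.

```python
import os


def cached_result_table(spark, path, compute):
    """Return a cached DataFrame, recomputing only if nothing is persisted at `path`.

    Hypothetical helper: `spark` is assumed to be a SparkSession and
    `compute` a function that takes it and returns the expensive DataFrame.
    """
    # Assumption: `path` is on the local filesystem, so os.path.exists works.
    if not os.path.exists(path):
        # First run: compute the result once and persist it to disk.
        compute(spark).write.parquet(path)
    # Every run (including after a restart): reload the persisted table
    # and cache it, instead of recomputing it from the source data.
    return spark.read.parquet(path).cache()
```

After a restart the service would call the same function again and only pay for reading the persisted files back in.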
Cheng
On 2/22/15 12:55 AM, nitin wrote:
Hi All,
I intend to build a long running spark
I guess on a technicality the docs just say "first item in this RDD", not
"first line in the source text file". AFAIK there is no way apart from
filtering to remove header lines (see
http://stackoverflow.com/a/24734612/877069).
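
For reference, the filtering approach could look roughly like this in PySpark. This is only a sketch: `drop_header` is a made-up helper name, it assumes first() really does return the header, and it will also drop any data row that happens to be identical to the header.

```python
def drop_header(rdd):
    """Remove every item equal to the first item of the RDD.

    Hypothetical helper; works on anything exposing first() and
    filter(), such as a PySpark RDD of text lines.
    """
    header = rdd.first()  # assumed to be the header line
    return rdd.filter(lambda line: line != header)


# Usage with Spark (assumes an existing SparkContext `sc` and a
# hypothetical input file):
# rows = drop_header(sc.textFile("data.csv"))
```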
As long as first() always returns the same value for a given RDD, I think
it's
Since RDDs are generally unordered, isn't it the case that things like
textFile().first() are not guaranteed to return the first row (for example,
when looking for a header row)? If so,
doesn't that make the example in
http://spark.apache.org/docs/1.2.1/quick-start.html#basics misleading?