I appreciate targets having the strong meaning you suggest, as it's useful
to get a sense of what will realistically be included in a release.
Would it make sense (speaking as a relative outsider here) that we would
not enter into the RC phase of a release until all JIRA targeting that
release
Hi Eron,
Please register your Spark Package on http://spark-packages.org, which
helps users find your work. Do you have any performance benchmarks to
share? Thanks!
Best,
Xiangrui
On Wed, Jun 10, 2015 at 10:48 PM, Nick Pentreath
nick.pentre...@gmail.com wrote:
Looks very interesting, thanks
So I'm a little confused: has Hive 0.12 support disappeared in 1.4.0? The
release notes didn't mention anything, but the documentation doesn't list a
way to build for 0.12 anymore (
http://spark.apache.org/docs/latest/building-spark.html#building-with-hive-and-jdbc-support,
in fact it doesn't
Hey all,
Over the past 1.5 months we added a number of new committers to the project,
and I wanted to welcome them now that all of their respective forms, accounts,
etc are in. Join me in welcoming the following new committers:
- Davies Liu
- DB Tsai
- Kousuke Saruta
- Sandy Ryza
- Yin Huai
Congratulations to all!
DB and Sandy, great work!
On Wed, Jun 17, 2015 at 3:12 PM, Matei Zaharia matei.zaha...@gmail.com
wrote:
Hey all,
Over the past 1.5 months we added a number of new committers to the
project, and I wanted to welcome them now that all of their respective
forms,
This looks really awesome.
On Tue, Jun 16, 2015 at 10:27 AM, Huang, Jie jie.hu...@intel.com wrote:
Hi All
We are happy to announce Performance portal for Apache Spark
http://01org.github.io/sparkscore/ !
The Performance Portal for Apache Spark provides performance data on the
Spark
Hi all,
I think we should refactor some machine learning model classes in Python to
improve maintainability.
By inheriting from the JavaModelWrapper class, we can call the Scala API
for the model directly, without going through PythonMLlibAPI.
In some cases, a machine learning model class in Python has
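The delegation idea described above can be sketched in plain Python. This is a hypothetical illustration, not actual PySpark code: `JavaModelWrapper` below is a simplified stand-in for the class of the same name in `pyspark.mllib`, and `FakeScalaKMeansModel` simulates the JVM-side model that would normally sit across the Py4J gateway.

```python
class JavaModelWrapper:
    """Simplified stand-in for pyspark.mllib's JavaModelWrapper: holds a
    handle to the underlying (here simulated) Scala model and forwards
    method calls to it."""

    def __init__(self, java_model):
        self._java_model = java_model

    def call(self, name, *args):
        # In real PySpark this call crosses the Py4J gateway to the JVM;
        # here we simply dispatch on the stand-in object.
        return getattr(self._java_model, name)(*args)


class FakeScalaKMeansModel:
    """Stand-in for the JVM-side model (names are illustrative)."""

    def __init__(self, centers):
        self.centers = centers

    def predict(self, point):
        # Return the index of the nearest center by squared distance.
        dists = [sum((p - c) ** 2 for p, c in zip(point, center))
                 for center in self.centers]
        return dists.index(min(dists))


class KMeansModel(JavaModelWrapper):
    """Python-facing model: no re-implemented logic, just forwarding."""

    def predict(self, point):
        return self.call("predict", point)


model = KMeansModel(FakeScalaKMeansModel([(0.0, 0.0), (10.0, 10.0)]))
print(model.predict((9.0, 9.5)))  # nearest center is index 1
```

The point of the pattern is that the Python class carries no duplicated model logic: adding a new model method means adding one forwarding call rather than a parallel implementation.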
Hi all,
At present, all the clustering algorithms in MLlib require the number of
clusters to be specified in advance.
The Dirichlet process (DP) is a popular non-parametric Bayesian mixture
model that allows for flexible clustering of data without having to specify
a priori the number of
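As an illustration of the non-parametric behavior (a sketch, not MLlib code): the Chinese restaurant process is the sequential view of the DP prior, and the number of clusters it produces grows with the data rather than being fixed in advance. Here `alpha` is the DP concentration parameter.

```python
import random


def crp_assignments(n, alpha, seed=0):
    """Sample cluster assignments for n points from a Chinese restaurant
    process with concentration alpha (an illustrative helper, not part of
    any Spark API)."""
    rng = random.Random(seed)
    counts = []       # counts[k] = number of points already in cluster k
    assignments = []
    for i in range(n):
        # Point i joins existing cluster k with probability
        # counts[k] / (i + alpha), and opens a new cluster with
        # probability alpha / (i + alpha).
        r = rng.uniform(0.0, i + alpha)
        acc = 0.0
        for k, c in enumerate(counts):
            acc += c
            if r < acc:
                counts[k] += 1
                assignments.append(k)
                break
        else:
            counts.append(1)
            assignments.append(len(counts) - 1)
    return assignments, len(counts)


assignments, k = crp_assignments(1000, alpha=2.0)
print(k)  # typically on the order of alpha * log(n) clusters
```

Note that the expected number of clusters grows roughly as alpha * log(n), so the model adapts its complexity to the data instead of requiring k up front.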
Can someone help? Thank you!
From: Haopu Wang
Sent: Monday, June 15, 2015 3:36 PM
To: user; dev@spark.apache.org
Subject: [SparkStreaming] NPE in DStreamCheckPointData.scala:125
I use the attached program to test checkpoint. It's quite simple.
When I run
We are looking for more workloads – if you guys have any suggestions, let us
know.
-jiangang
From: Sandy Ryza [mailto:sandy.r...@cloudera.com]
Sent: Wednesday, June 17, 2015 5:51 PM
To: Huang, Jie
Cc: u...@spark.apache.org; dev@spark.apache.org
Subject: Re: [SparkScore] Performance portal for
I mostly use Amazon S3 for reading input data and writing output data for my
Spark jobs. I want to know the number of bytes read/written by my job
from/to S3.
In Hadoop, there are FileSystemCounters for this; is there something similar
in Spark? If there is, can you please guide me on how to use
Is there any good sample code in Java showing *Implementing and
Using a Custom Actor-based Receiver*?
--
Thanks Regards,
Anshu Shukla