Hi
The Spark version I am using is spark-0.9.1-bin-hadoop2.
I built spark-assembly_2.10-0.9.1-hadoop2.2.0.jar.
I moved JavaKafkaWordCount.java from the examples to a new directory to play
with it.
My compile commands:
javac -cp
I'm thrilled to announce the availability of Spark 1.0.0! Spark 1.0.0
is a milestone release as the first in the 1.0 line of releases,
providing API stability for Spark's core interfaces.
Spark 1.0.0 is Spark's largest release ever, with contributions from
117 developers. I'd like to thank
Awesome work, Pat et al.!
--
Christopher T. Nguyen
Co-founder CEO, Adatao http://adatao.com
linkedin.com/in/ctnguyen
On Fri, May 30, 2014 at 3:12 AM, Patrick Wendell pwend...@gmail.com wrote:
I'm thrilled to announce the availability of Spark 1.0.0! Spark 1.0.0
is a milestone release as
Please update the http://spark.apache.org/docs/latest/ link
On Fri, May 30, 2014 at 4:03 PM, Margusja mar...@roo.ee wrote:
Is it possible to download a pre-built package?
http://mirror.symnds.com/software/Apache/incubator/spark/spark-1.0.0/spark-1.0.0-bin-hadoop2.tgz
gives me a 404.
Best
It is updated - try holding Shift and refreshing in your browser; it is
probably caching the page.
On Fri, May 30, 2014 at 3:46 AM, prabeesh k prabsma...@gmail.com wrote:
Please update the http://spark.apache.org/docs/latest/ link
On Fri, May 30, 2014 at 4:03 PM, Margusja mar...@roo.ee wrote:
Now I can download. Thanks.
Best regards, Margus (Margusja) Roo
+372 51 48 780
http://margus.roo.ee
http://ee.linkedin.com/in/margusroo
skype: margusja
ldapsearch -x -h ldap.sk.ee -b c=EE (serialNumber=37303140314)
On 30/05/14 13:48, Patrick Wendell wrote:
It is updated - try holding Shift +
Hi all
In https://spark.apache.org/downloads.html, the URL for the 1.0.0 release notes
seems to be wrong.
The URL should be https://spark.apache.org/releases/spark-release-1-0-0.html
but links to https://spark.apache.org/releases/spark-release-1.0.0.html
Best Regards,
Kousuke
All:
In the pom.xml file I see the MapR repository, but it's not included in the
./project/SparkBuild.scala file. Is this expected? I know that to build I have
to add it there; otherwise sbt hates me with evil red messages and such.
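For reference, a sketch of the kind of line I mean in project/SparkBuild.scala
(the repository URL is my assumption, not taken from the Spark build):

    // Hypothetical resolver entry so sbt can find the MapR artifacts.
    resolvers += "MapR Repository" at "http://repository.mapr.com/maven/"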
John
On Fri, May 30, 2014 at 6:24 AM, Kousuke Saruta
Awesome work
On Fri, May 30, 2014 at 12:12 PM, Patrick Wendell pwend...@gmail.com
wrote:
I'm thrilled to announce the availability of Spark 1.0.0! Spark 1.0.0
is a milestone release as the first in the 1.0 line of releases,
providing API stability for Spark's core interfaces.
Spark 1.0.0
By the way:
This is great work. I am new to the Spark world, and have been like a kid
in a candy store learning all it can do.
Is there a good list of build variables? What I mean is something like the
SPARK_HIVE variable described on the Spark SQL page. I'd like to include that,
but once I found that I
My primary goal: to get the top 10 hashtags for every 5-minute interval.
I want to do this efficiently. I have already done this by using
reduceByKeyAndWindow() and then sorting all hashtags in the 5-minute interval,
taking only the top 10 elements. But this is very slow.
So now I am thinking of retaining only
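One sketch of a cheaper approach in Scala, assuming `hashtags` is an existing
DStream[String] (the name is illustrative): rdd.top() keeps only a bounded
number of elements instead of sorting every hashtag in the window.

    // Count per tumbling 5-minute window, then take the top 10 without a full sort.
    import org.apache.spark.streaming.Seconds
    import org.apache.spark.streaming.StreamingContext._ // pair-DStream operations

    val counts = hashtags.map(tag => (tag, 1))
      .reduceByKeyAndWindow((a: Int, b: Int) => a + b, Seconds(300), Seconds(300))

    val top10 = counts.transform { rdd =>
      // top() uses a bounded priority queue, so only 10 elements are retained
      rdd.context.parallelize(rdd.top(10)(Ordering.by[(String, Int), Int](_._2)))
    }
    top10.print()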
Thanks for the reply. I am definitely running 1.0.0; I set it up manually.
To answer my own question: I found out from the examples that it needs a
data type called LabeledPoint instead of a numpy array.
How exciting! Congratulations! :-)
Ognen
On 5/30/14, 5:12 AM, Patrick Wendell wrote:
I'm thrilled to announce the availability of Spark 1.0.0! Spark 1.0.0
is a milestone release as the first in the 1.0 line of releases,
providing API stability for Spark's core interfaces.
Spark 1.0.0 is
Congratulations !!
-chanwit
--
Chanwit Kaewkasi
linkedin.com/in/chanwit
On Fri, May 30, 2014 at 5:12 PM, Patrick Wendell pwend...@gmail.com wrote:
I'm thrilled to announce the availability of Spark 1.0.0! Spark 1.0.0
is a milestone release as the first in the 1.0 line of releases,
providing
I was annoyed by this as well.
It appears that just permuting the order of dependency inclusion solves this
problem:
first Spark, then your CDH Hadoop distro.
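For illustration, in an sbt build that ordering would look something like this
(the version strings are examples only, not taken from this thread):

    // Spark first, then the CDH Hadoop distro.
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "1.0.0",
      "org.apache.hadoop" % "hadoop-client" % "2.3.0-cdh5.0.0"
    )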
HTH,
Pierre
Hi,
I just migrated to 1.0 and am still having the same issue,
either with or without the custom registrator. Just using the
KryoSerializer triggers the exception immediately.
I set the Kryo settings through the property:
System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
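For context, a minimal sketch of the kind of registrator setup being discussed
(MyRegistrator and MyClass are illustrative names, not from this thread):

    // Minimal Kryo registrator sketch for Spark ~0.9/1.0; names are illustrative.
    import com.esotericsoftware.kryo.Kryo
    import org.apache.spark.serializer.KryoRegistrator

    case class MyClass(id: Int, name: String) // stand-in for an application class

    class MyRegistrator extends KryoRegistrator {
      override def registerClasses(kryo: Kryo) {
        kryo.register(classOf[MyClass])
      }
    }

    // Wired up the same property-based way:
    System.setProperty("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    System.setProperty("spark.kryo.registrator", "MyRegistrator")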
Congratulations!!
On Fri, May 30, 2014 at 5:12 AM, Patrick Wendell pwend...@gmail.com wrote:
I'm thrilled to announce the availability of Spark 1.0.0! Spark 1.0.0
is a milestone release as the first in the 1.0 line of releases,
providing API stability for Spark's core interfaces.
Spark
Hi,
I recently posted a question on Stack Overflow but didn't get any reply, so I
have joined the mailing list now. Can any of you suggest an approach for the
problem described at
http://stackoverflow.com/questions/23923966/writing-the-rdd-data-in-excel-file-along-mapping-in-apache-spark
Thanks in advance
The Spark 1.0.0 release notes state that "Internal instrumentation has been
added to allow applications to monitor and instrument Spark jobs." Can
anyone point me to the docs for this?
--
Daniel Siegmann, Software Developer
Velos
Accelerating Machine Learning
440 NINTH AVENUE, 11TH FLOOR, NEW YORK, NY
Congrats
Sent from my Windows Phone
From: Dean Wampler deanwamp...@gmail.com
Sent: 5/30/2014 6:53 AM
To: user@spark.apache.org
Subject: Re: Announcing Spark 1.0.0
Congratulations!!
On Fri, May 30, 2014 at 5:12 AM, Patrick
Hi all,
I am planning to use Spark with HBase, where I generate an RDD by reading data
from an HBase table.
I want to know: in the case when the size of the HBase table grows larger
than the RAM available in the cluster, will the application fail,
or will there just be an impact on performance?
Thanks Mayur for the reply.
Actually, the issue was that I was running the Spark application on
hadoop-2.2.0, where the HBase version was 0.95.2.
But Spark by default gets built against an older HBase version, so I had to
build Spark again with the HBase version set to 0.95.2 in the Spark build file.
And it worked.
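For illustration, the rebuild amounts to bumping the HBase version in the Spark
build file, along these lines (the exact constant name varies across Spark
versions, so treat this as a sketch):

    // Hypothetical edit in project/SparkBuild.scala: build the examples against
    // the HBase version actually running on the cluster.
    val HBASE_VERSION = "0.95.2"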
You guys were up late, eh? :) I'm looking forward to using this latest
version.
Is there any place we can get a list of the new functions in the Python
API? The release notes don't enumerate them.
Nick
On Fri, May 30, 2014 at 10:15 AM, Ian Ferreira ianferre...@hotmail.com
wrote:
Congrats
Is there a way to subscribe to news releases at
http://spark.apache.org/news/index.html? That would be swell.
Nick
Great work!
On May 30, 2014 10:15 PM, Ian Ferreira ianferre...@hotmail.com wrote:
Congrats
Sent from my Windows Phone
--
From: Dean Wampler deanwamp...@gmail.com
Sent: 5/30/2014 6:53 AM
To: user@spark.apache.org
Subject: Re: Announcing Spark 1.0.0
Great news! I've been awaiting this release to start doing some coding
with Spark using Java 8. Can I run the Spark 1.0 examples on a virtual host
with 16 GB RAM and a fairly decent amount of hard disk? Or do I really need
to use a cluster of machines?
Second, are there any good examples of using MLlib
With respect to virtual hosts, my team uses Vagrant/Virtualbox. We have 3
CentOS VMs with 4 GB RAM each - 2 worker nodes and a master node.
Everything works fine, though if you are using MapR, you have to make sure
they are all on the same subnet.
-Suren
On Fri, May 30, 2014 at 12:20 PM,
Also, the Spark examples can run out of the box on a single machine, as
well as on a cluster. See the "Master URLs" heading here:
http://spark.apache.org/docs/latest/submitting-applications.html#master-urls
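For illustration, switching between the two is just a change of master URL in
the application (the values shown here are examples):

    // Illustrative: the same application code runs locally or on a cluster,
    // depending only on the master URL.
    import org.apache.spark.{SparkConf, SparkContext}

    val conf = new SparkConf()
      .setAppName("Example")
      .setMaster("local[4]") // single machine, 4 threads
      // or: .setMaster("spark://master-host:7077") for a standalone cluster
    val sc = new SparkContext(conf)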
On Fri, May 30, 2014 at 9:24 AM, Surendranauth Hiraman
suren.hira...@velos.io wrote:
With
Hi Rahul,
I'll just copy paste your question here to aid with context, and
reply afterwards.
-
Can I write the RDD data to an Excel file, along with the mapping, in
apache-spark? Is that a correct way? Isn't it that the writing will be a
local function and can't be distributed over the cluster?
Below is
Hello there,
On Fri, May 30, 2014 at 9:36 AM, Marcelo Vanzin van...@cloudera.com wrote:
workbook = xlsxwriter.Workbook('output_excel.xlsx')
worksheet = workbook.add_worksheet()
data = sc.textFile("xyz.txt")
# xyz.txt is a file in which each line contains strings delimited by spaces
row = 0
def
Hi Rahul,
Marcelo's explanation is correct. Here's a possible approach to your
program, in pseudo-Python:
# connect to Spark cluster
sc = SparkContext(...)
# load input data
input_data = load_xls(file("input.xls"))
input_rows = input_data['Sheet1'].rows
# create RDD on cluster
input_rdd =
Hey Folks,
I'm really having quite a bit of trouble getting Spark running on EC2. I'm
not using the scripts at https://github.com/apache/spark/tree/master/ec2
because I'd like to know how everything works. But I'm going a little
crazy. I think that something about the networking configuration must
Thanks, Stephen. I have eventually decided to go with assembly, but to leave
out the Spark and Hadoop jars and instead use `spark-submit` to provide these
dependencies automatically. This way no resource conflicts arise and
mergeStrategy needs no modification. To record this stable setup and also
share
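A minimal sketch of the sbt side of that setup (coordinates and versions are
illustrative):

    // Sketch: mark Spark and Hadoop as "provided" so the assembly jar leaves
    // them out; spark-submit supplies them on the cluster at runtime.
    libraryDependencies ++= Seq(
      "org.apache.spark" %% "spark-core" % "1.0.0" % "provided",
      "org.apache.hadoop" % "hadoop-client" % "2.2.0" % "provided"
    )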
Thanks Marcelo,
It actually cleared up a few concepts for me. (y)
On Fri, May 30, 2014 at 10:14 PM, Marcelo Vanzin van...@cloudera.com
wrote:
Hello there,
On Fri, May 30, 2014 at 9:36 AM, Marcelo Vanzin van...@cloudera.com
wrote:
workbook = xlsxwriter.Workbook('output_excel.xlsx')
worksheet
Thanks Jey,
it was helpful.
On Sat, May 31, 2014 at 12:45 AM, Rahul Bhojwani
rahulbhojwani2...@gmail.com wrote:
Thanks Marcelo,
It actually cleared up a few concepts for me. (y)
On Fri, May 30, 2014 at 10:14 PM, Marcelo Vanzin van...@cloudera.com
wrote:
Hello there,
On Fri, May 30, 2014
I'm running some Kafka streaming Spark contexts (on 0.9.1), and they seem
to be dying after 10 or so minutes with a lot of these errors. I can't
really tell what's going on here, except that maybe the driver is
unresponsive somehow? Has anyone seen this before?
14/05/31 01:13:30 ERROR
Congrats on the new 1.0 release. Amazing work!
It looks like there may be some typos in the latest
http://spark.apache.org/docs/latest/sql-programming-guide.html
in the "Running SQL on RDDs" section when choosing the Java example:
1. ctx is an instance of JavaSQLContext but the textFile method
Hi Jeremy,
That's interesting; I don't think anyone has ever reported an issue running
these scripts due to Python incompatibility, but they may require Python
2.7+. I regularly run them from the AWS Ubuntu 12.04 AMI... that might be a
good place to start. But if there is a straightforward way to