Hi,
I'm not an expert on SparkSQL, but from what I understand, the problem is that
you're trying to access an RDD inside an RDD here:
val xyz = file.map(line => extractCurRate(sqlContext.sql("select rate ...
and here:
xyz = file.map(line => extractCurRate(sqlContext.sql("select rate
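A minimal sketch of the usual workaround, assuming the rate table is small
enough to collect to the driver and the usual sc/sqlContext from the shell;
the table, columns, and record layout below are hypothetical:
val rates: Map[String, Double] = sqlContext
  .sql("SELECT currency, rate FROM rates")   // hypothetical table/columns
  .collect()
  .map(row => row.getString(0) -> row.getDouble(1))
  .toMap
val ratesBc = sc.broadcast(rates)            // shipped once per executor
val xyz = file.map { line =>
  val currency = line.split(",")(0)          // hypothetical record layout
  ratesBc.value.getOrElse(currency, 1.0)     // local lookup, no SQL inside the task
}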
Hi,
I am trying to create an RDD out of a large matrix with sc.parallelize.
It was suggested to use broadcast.
But when I do
sc.broadcast(data)
I get this error:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/common/usg/spark/1.0.2/python/pyspark/context.py", line 370, in
Specifically, this is the error I see when I try to operate on an RDD created
by the sc.parallelize method:
: org.apache.spark.SparkException: Job aborted due to stage failure:
Serialized task 12:12 was 12062263 bytes which exceeds spark.akka.frameSize
(10485760 bytes). Consider using broadcast variables for large
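For what it's worth, a minimal Scala sketch of the two remedies that come up
in this thread (the same configuration key and broadcast API exist in
pyspark); the sizes and names are made up:
import org.apache.spark.{SparkConf, SparkContext}
val conf = new SparkConf()
  .setAppName("broadcast-example")
  .set("spark.akka.frameSize", "64")         // in MB on Spark 1.x; the default is 10
val sc = new SparkContext(conf)
val matrix = Array.fill(1000, 1000)(1.0)     // stand-in for the large matrix
val matrixBc = sc.broadcast(matrix)          // sent to each executor once
val rowSums = sc.parallelize(0 until 1000)
  .map(i => matrixBc.value(i).sum)           // the task closure stays tiny
  .collect()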
Hi
I am trying to perform read/write file operations in Spark by creating a
Writable object.
But I am not able to write to a file. The data concerned is not an RDD.
Can someone please tell me how to perform read/write file operations on
non-RDD data in Spark?
Regards
karthik
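One possible approach, only a sketch and not an answer from the thread: write
non-RDD data from the driver with the ordinary Hadoop FileSystem API
(assuming an existing SparkContext sc; the path is hypothetical):
import java.io.{BufferedWriter, OutputStreamWriter}
import org.apache.hadoop.fs.{FileSystem, Path}
val fs = FileSystem.get(sc.hadoopConfiguration)      // HDFS or local, per config
val out = fs.create(new Path("/tmp/non-rdd-output.txt"))
val writer = new BufferedWriter(new OutputStreamWriter(out))
writer.write("some non-RDD data")                    // plain driver-side write
writer.close()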
Hi
I've written a job (I think not very complicated, only one reduceByKey), but
the driver JVM always hangs with an OOM, killing the worker of course. How
can I know what is running on the driver and what is running on the worker,
and how can I debug the memory problem?
I've already used the --driver-memory 4g parameter to
Try increasing the number of partitions while doing a reduceByKey()
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.api.java.JavaPairRDD
Thanks
Best Regards
On Sun, Sep 14, 2014 at 5:11 PM, richiesgr richie...@gmail.com wrote:
Hi
I've written a job (I think not very
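A minimal sketch of the advice above, with made-up data: the second argument
to reduceByKey() sets the number of shuffle partitions, spreading the reduce
work and memory across more, smaller tasks.
// On Spark 1.x outside the shell you also need:
// import org.apache.spark.SparkContext._
val pairs = sc.parallelize(Seq(("a", 1L), ("b", 2L), ("a", 3L)))
val summed = pairs.reduceByKey(_ + _, 200)   // 200 reduce-side partitions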
Hi,
I have tried to run HBaseTest.scala, but I got the following errors. Any
ideas on how to fix them?
Q1)
scala> package org.apache.spark.examples
<console>:1: error: illegal start of definition
package org.apache.spark.examples
Q2)
scala> import
The Spark examples build against HBase 0.94 by default.
If you want to run against 0.98, see:
SPARK-1297 https://issues.apache.org/jira/browse/SPARK-1297
Cheers
On Sun, Sep 14, 2014 at 7:36 AM, arthur.hk.c...@gmail.com
arthur.hk.c...@gmail.com wrote:
Hi,
I have tried to run
Hi,
Thanks!!
I tried to apply the patches, both spark-1297-v2.txt and spark-1297-v4.txt
are good here, but not spark-1297-v5.txt:
$ patch -p1 -i spark-1297-v4.txt
patching file examples/pom.xml
$ patch -p1 -i spark-1297-v5.txt
can't find file to patch at input line 5
Perhaps you used the wrong -p or --strip
spark-1297-v5.txt is a level-0 patch. Please apply it with
patch -p0 -i spark-1297-v5.txt
Cheers
On Sun, Sep 14, 2014 at 8:06 AM, arthur.hk.c...@gmail.com
arthur.hk.c...@gmail.com wrote:
Hi,
Thanks!!
I tried to apply the patches, both spark-1297-v2.txt and spark-1297-v4.txt
are good here, but not
Hi,
Thanks!
patch -p0 -i spark-1297-v5.txt
patching file docs/building-with-maven.md
patching file examples/pom.xml
Hunk #1 FAILED at 45.
Hunk #2 FAILED at 110.
2 out of 2 hunks FAILED -- saving rejects to file examples/pom.xml.rej
Still got errors.
Regards
Arthur
On 14 Sep, 2014, at 11:33
Hi,
My bad. Tried again, worked.
patch -p0 -i spark-1297-v5.txt
patching file docs/building-with-maven.md
patching file examples/pom.xml
Thanks!
Arthur
On 14 Sep, 2014, at 11:38 pm, arthur.hk.c...@gmail.com
arthur.hk.c...@gmail.com wrote:
Hi,
Thanks!
patch -p0 -i spark-1297-v5.txt
I applied the patch on master branch without rejects.
If you use Spark 1.0.2, use the pom.xml attached to the JIRA.
On Sun, Sep 14, 2014 at 8:38 AM, arthur.hk.c...@gmail.com
arthur.hk.c...@gmail.com wrote:
Hi,
Thanks!
patch -p0 -i spark-1297-v5.txt
patching file docs/building-with-maven.md
Hi,
I applied the patch.
1) patched
$ patch -p0 -i spark-1297-v5.txt
patching file docs/building-with-maven.md
patching file examples/pom.xml
2) Compilation result
[INFO]
[INFO] Reactor Summary:
[INFO]
[INFO] Spark
Take a look at bin/run-example
Cheers
On Sun, Sep 14, 2014 at 9:15 AM, arthur.hk.c...@gmail.com
arthur.hk.c...@gmail.com wrote:
Hi,
I applied the patch.
1) patched
$ patch -p0 -i spark-1297-v5.txt
patching file docs/building-with-maven.md
patching file examples/pom.xml
2)
Can you post your whole SBT build file(s)?
Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition
http://shop.oreilly.com/product/0636920033073.do (O'Reilly)
Typesafe http://typesafe.com
@deanwampler http://twitter.com/deanwampler
http://polyglotprogramming.com
On Wed, Sep 10, 2014 at 6:48
Sorry, I meant any *other* SBT files.
However, what happens if you remove the line:
exclude("org.eclipse.jetty.orbit", "javax.servlet")
dean
Dean Wampler, Ph.D.
Author: Programming Scala, 2nd Edition
http://shop.oreilly.com/product/0636920033073.do (O'Reilly)
Typesafe http://typesafe.com
Hello
I'm new to Spark and I couldn't get SimpleApp to run on my MacBook. I suspect
it's related to network configuration. Could anyone take a look?
Thanks.
14/09/14 10:10:36 INFO Utils: Fetching
http://10.63.93.115:59005/jars/simple-project_2.11-1.0.jar to
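In case it is the usual local-network pitfall, a hedged sketch of pinning the
app to a local master and the loopback address (whether this matches the
actual setup is an assumption):
import org.apache.spark.{SparkConf, SparkContext}
val conf = new SparkConf()
  .setAppName("Simple Application")
  .setMaster("local[*]")                     // avoid cluster networking entirely
  .set("spark.driver.host", "127.0.0.1")     // bind the driver to loopback
val sc = new SparkContext(conf)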
I did actually try Sean's suggestion just before I posted for the first
time in this thread. I got an error when doing this and thought that I
was not understanding what Sean was suggesting.
Now I re-attempted your suggestions with spark 1.0.0-cdh5.1.0, hbase
0.98.1-cdh5.1.0 and hadoop
How? An example, please.
Also, if I am running this in the pyspark shell, how do I configure
spark.akka.frameSize?
On Sun, Sep 14, 2014 at 7:43 AM, Akhil Das ak...@sigmoidanalytics.com
wrote:
When the data size is huge, you are better off using the
TorrentBroadcastFactory.
Thanks
Best Regards
On
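If it helps, a minimal sketch of the setting Akhil is referring to, assuming
Spark 1.x (the same key can be set from pyspark):
import org.apache.spark.{SparkConf, SparkContext}
val conf = new SparkConf()
  .set("spark.broadcast.factory",
       "org.apache.spark.broadcast.TorrentBroadcastFactory")
val sc = new SparkContext(conf)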
I've seen the file name too long error when compiling on an encrypted Linux
file system -- some of them have a limit on file name lengths. If you're on
Linux, can you try compiling inside /tmp instead?
Matei
On September 13, 2014 at 10:03:14 PM, Yin Huai (huaiyin@gmail.com) wrote:
Can you
And when I use the spark-submit script, I get the following error:
py4j.protocol.Py4JJavaError: An error occurred while calling
o26.trainKMeansModel.
: org.apache.spark.SparkException: Job aborted due to stage failure: All
masters are unresponsive! Giving up.
at
Hi Andrew,
I agree with Nicholas. That was a nice, concise summary of the
meaning of the locality customization options, indicators and default
Spark behaviors. I haven't combed through the documentation
end-to-end in a while, but I'm also not sure that information is
presently represented
Yeah that issue has been fixed by adding better docs, it just didn't make
it in time for the release:
https://github.com/apache/spark/blob/branch-1.1/make-distribution.sh#L54
On Thu, Sep 11, 2014 at 11:57 PM, Zhanfeng Huo huozhanf...@gmail.com
wrote:
resolved:
./make-distribution.sh --name
Hi,
In the Spark configuration documentation, spark.executor.extraClassPath is
described as a backwards-compatibility option, and the docs say that users
typically should not need to set it.
Now, I must add a classpath to the executor environment (as well as to the
driver in the future, but for
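For reference, a minimal sketch of setting it programmatically; the jar path
is hypothetical, and the same entry can go into spark-defaults.conf instead:
import org.apache.spark.SparkConf
val conf = new SparkConf()
  .set("spark.executor.extraClassPath", "/opt/company/libs/custom.jar")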
Thank you very much.
It is helpful for end users.
Zhanfeng Huo
From: Patrick Wendell
Date: 2014-09-15 10:19
To: Zhanfeng Huo
CC: user
Subject: Re: spark-1.1.0 with make-distribution.sh problem
Yeah that issue has been fixed by adding better docs, it just didn't make it in
time for the
Hi,
I have a directory structure with parquet+avro data in it. There are a
couple of administrative files (.foo and/or _foo) that I need to ignore
when processing this data; otherwise Spark tries to read them as containing
parquet content, which they do not.
How can I set a PathFilter on the
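One possible approach, sketched below: register a custom Hadoop PathFilter in
the configuration Spark uses when listing input paths (assuming an existing
SparkContext sc). The key follows the new-API FileInputFormat convention;
whether the Parquet input path honors it in a given Spark version is an
assumption worth verifying.
import org.apache.hadoop.fs.{Path, PathFilter}
class SkipAdminFiles extends PathFilter {
  // skip files whose names start with '.' or '_'
  override def accept(p: Path): Boolean = {
    val name = p.getName
    !name.startsWith(".") && !name.startsWith("_")
  }
}
sc.hadoopConfiguration.setClass(
  "mapreduce.input.pathFilter.class",
  classOf[SkipAdminFiles],
  classOf[PathFilter])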
Hey Chengi,
Which version of Spark are you using? There are big improvements to
broadcast in 1.1; could you try it?
On Sun, Sep 14, 2014 at 8:29 PM, Chengi Liu chengi.liu...@gmail.com wrote:
Any suggestions.. I am really blocked on this one
On Sun, Sep 14, 2014 at 2:43 PM, Chengi Liu
I am using Spark 1.0.2.
This is my work cluster, so I can't set up a new version readily.
But right now, I am not using broadcast:
conf = SparkConf().set("spark.executor.memory", "32G") \
    .set("spark.akka.frameSize", "1000")
sc = SparkContext(conf=conf)
rdd = sc.parallelize(matrix, 5)
from
And the thing is, the code runs just fine if I reduce the number of rows in
my data.
On Sun, Sep 14, 2014 at 8:45 PM, Chengi Liu chengi.liu...@gmail.com wrote:
I am using Spark 1.0.2.
This is my work cluster, so I can't set up a new version readily.
But right now, I am not using broadcast ...
SPARK-1671 looks really promising.
Note that even right now, you don't need to un-cache the existing
table. You can do something like this:
newAdditionRdd.registerTempTable("table2")
sqlContext.cacheTable("table2")
val unionedRdd = sqlContext.table("table1").unionAll(sqlContext.table("table2"))
When