That fixed it!
Thank you!
--Ben
On Thu, Apr 14, 2016 at 5:53 PM, Marcelo Vanzin wrote:
> On Thu, Apr 14, 2016 at 2:14 PM, Benjamin Zaitlen
> wrote:
> >> spark-submit --master yarn-cluster /home/ubuntu/test_spark.py --files
> >> /home/ubuntu/localtest.txt#appSees.txt
Hi All,
I'm trying to use the --files option with yarn:
spark-submit --master yarn-cluster /home/ubuntu/test_spark.py --files
/home/ubuntu/localtest.txt#appSees.txt
I never see the file in HDFS or in the yarn containers. Am I doing
something incorrect?
I'm running Spark 1.6.0.
Thanks,
--B
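For anyone finding this thread later: the likely cause (consistent with the command Marcelo quoted back above) is flag ordering. spark-submit stops parsing its own options at the application file, and everything after it is passed to the application as arguments, so `--files` here was never seen by spark-submit. A minimal sketch of the corrected ordering, using the paths from the thread:

```shell
# spark-submit parses its own flags only up to the application file;
# anything after test_spark.py is treated as an argument to the app itself.
# With --files placed first, localtest.txt is shipped to the YARN containers,
# where it is visible as appSees.txt (the '#' syntax renames the file).
spark-submit --master yarn-cluster \
  --files /home/ubuntu/localtest.txt#appSees.txt \
  /home/ubuntu/test_spark.py
```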
Hi All,
Sean patiently worked with me in solving this issue. The problem was
entirely my fault: the MAVEN_OPTS env variable was set and was
overriding everything.
--Ben
On Tue, Sep 8, 2015 at 1:37 PM, Benjamin Zaitlen wrote:
> Yes, just reran with the following
>
> (spark_b
> ation. You
> can run "zinc -J-Xmx4g..." in general, but in the provided script,
> ZINC_OPTS seems to be the equivalent, yes. It kind of looks like your
> mvn process isn't getting any special memory args there. Is MAVEN_OPTS
> really exported?
>
> FWIW I use my
>>> + return 1
>>> + exit 1
>>
>>
>> On Tue, Sep 8, 2015 at 10:03 AM, Sean Owen wrote:
>>
>>> It might need more memory in certain situations / running certain
>>> tests. If 3gb works for your relatively full build, yes you can open a
>>> PR to change any occurrences of lower recommendations to 3gb.
>
> On Tue, Sep 8, 2015 at 3:02 PM, Benjamin Zaitlen
> wrote:
> > Ah, right. Should've caught that.
> >
> > The docs seem to recommend 2gb. Should that be increased as well?
> >
> > --Ben
> >
Ah, right. Should've caught that.
The docs seem to recommend 2gb. Should that be increased as well?
--Ben
On Tue, Sep 8, 2015 at 9:33 AM, Sean Owen wrote:
> It shows you there that Maven is out of memory. Give it more heap. I use
> 3gb.
>
> On Tue, Sep 8, 2015 at 1:53 PM,
Hi All,
I'm trying to build a distribution off of the latest in master and I keep
getting errors on MQTT and the build fails. I'm running the build on a
m1.large which has 7.5 GB of RAM and no other major processes are running.
MAVEN_OPTS="-Xmx2g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=5
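Putting Sean's advice from the replies above together, a hedged sketch of the eventual fix: export MAVEN_OPTS (rather than leaving a stale value set) with 3gb of heap. The -Xmx3g value is Sean's recommendation; the other flag values are illustrative, not quoted from the thread.

```shell
# Export MAVEN_OPTS so the mvn child process actually inherits it;
# a stale or unexported value silently overrides what you pass on the line.
export MAVEN_OPTS="-Xmx3g -XX:MaxPermSize=512M -XX:ReservedCodeCacheSize=512m"
# Confirm the variable really is in the environment before building:
env | grep MAVEN_OPTS
```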
Hi All,
I'm not quite clear on whether submitting a Python application to Spark
standalone on EC2 is possible.
Am I reading this correctly:
*A common deployment strategy is to submit your application from a gateway
machine that is physically co-located with your worker machines (e.g.
Master node
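Submitting a Python app this way does work with a standalone master in client mode. A hedged sketch of what the quoted deployment strategy looks like in practice; the hostname and script name below are placeholders, not from the thread:

```shell
# Run from a gateway/master node co-located with the workers, per the quoted
# docs. spark://<master-hostname>:7077 is the standalone master URL shown on
# the master's web UI; my_app.py is a placeholder for your Python application.
./bin/spark-submit \
  --master spark://ec2-master-hostname:7077 \
  my_app.py
```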
Hi Andy,
I built an anaconda/spark AMI a few months ago. I'm still iterating on it,
so if things break please report them. If you want to give it a whirl:
./spark-ec2 -k my_key -i ~/.ssh/mykey.rsa -a ami-3ecd0c56
The nice thing about anaconda is that it comes pre-baked with
ipython-notebook, matp
I may have missed this, but is it possible to select on datetime in a
SparkSQL query?
jan1 = sqlContext.sql("SELECT * FROM Stocks WHERE datetime = '2014-01-01'")
Additionally, is there a guide as to what SQL is valid? The guide says,
"Note that Spark SQL currently uses a very basic SQL parser" It
> .html
>
> Hope that helps,
> -Jey
>
> On Thu, Jul 3, 2014 at 11:54 AM, Benjamin Zaitlen
> wrote:
> > Hi All,
> >
> > I'm a dev at Continuum and we are developing a fair amount of tooling
> > around
> > Spark. A few days ago someone expressed interest
Hi All,
I'm a dev at Continuum and we are developing a fair amount of tooling around
Spark. A few days ago someone expressed interest in numpy+pyspark and
Anaconda came up as a reasonable solution.
I spent a number of hours yesterday trying to rework the base Spark AMI on
EC2 but sadly was defeated