Did you try the Hive Context? Look under Hive Support here:
http://people.apache.org/~pwendell/catalyst-docs/sql-programming-guide.html
On Tue, May 27, 2014 at 2:09 AM, 정재부 itsjb.j...@samsung.com wrote:
Hi all,
I'm trying to compare functions available in Spark 1.0 hql to original
All:
In the pom.xml file I see the MapR repository, but it's not included in the
./project/SparkBuild.scala file. Is this expected? I know that to build I have
to add it there; otherwise sbt hates me with evil red messages and such.
John
On Fri, May 30, 2014 at 6:24 AM, Kousuke Saruta
wondered if there were other options I should consider
before building.
Thanks!
On Fri, May 30, 2014 at 6:52 AM, John Omernik j...@omernik.com wrote:
All:
In the pom.xml file I see the MapR repository, but it's not included in
the ./project/SparkBuild.scala file. Is this expected? I know
So Python is used in many of the Spark Ecosystem products, but not
Streaming at this point. Is there a roadmap to include Python APIs in Spark
Streaming? Any time frame on this?
Thanks!
John
On Thu, May 29, 2014 at 4:19 PM, Matei Zaharia matei.zaha...@gmail.com
wrote:
Quite a few people ask
Python integration/support, if possible, would be a home run.
On Wed, Jun 4, 2014 at 7:06 PM, Matei Zaharia matei.zaha...@gmail.com
wrote:
We are definitely investigating a Python API for Streaming, but no
announced deadline at this point.
Matei
On Jun 4, 2014, at 5:02 PM, John Omernik j
Michael -
Does Spark SQL support rlike and like yet? I am running into that same
error with a basic select * from table where field like '%foo%' using the
hql() function.
Thanks
On Wed, May 28, 2014 at 2:22 PM, Michael Armbrust mich...@databricks.com
wrote:
On Tue, May 27, 2014 at 6:08 PM,
I am trying to get my head around using Spark on Yarn from the perspective
of a cluster. I can start a Spark Shell in Yarn with no issues. Works easily. This
is done in yarn-client mode and it all works well.
In multiple examples, I see instances where people have setup Spark
Clusters in Stand Alone
9, 2014, at 8:31 AM, John Omernik j...@omernik.com wrote:
I am trying to get my head around using Spark on Yarn from the
perspective of a cluster. I can start a Spark Shell in Yarn with no issues.
Works easily. This is done in yarn-client mode and it all works well.
In multiple examples, I see
, Jul 9, 2014 at 12:41 PM, John Omernik j...@omernik.com wrote:
Thank you for the link. In that link the following is written:
For those familiar with the Spark API, an application corresponds to an
instance of the SparkContext class. An application can be used for a
single batch job
want to write a Spark application that fires off jobs on behalf of
remote processes, you would need to implement the communication between
those remote processes and your Spark application code yourself.
On Wed, Jul 9, 2014 at 10:41 AM, John Omernik j...@omernik.com wrote:
Thank you
So this is good information for standalone, but how is memory distributed
within Mesos? There's coarse-grained mode where the executor stays active, and
there's fine-grained mode where it appears each task is its own process in
Mesos; how do memory allocations work in these cases? Thanks!
On Thu,
I am using spark-1.1.0-SNAPSHOT right now and trying to get familiar with
the JDBC thrift server. I have everything compiled correctly, I can access
data in spark-shell on yarn from my hive installation. Cached tables, etc
all work.
When I execute ./sbin/start-thriftserver.sh
I get the error
I got things working on my cluster with the Spark SQL thrift server. (Thank
you Yin Huai at Databricks!)
That said, I was curious how I can cache a table via my instance here? I
tried the Shark-like create table table_cached as select * from table and
that did not create a cached table.
On Tue, Aug 5, 2014 at 9:02 AM, John Omernik j...@omernik.com wrote:
I got things working on my cluster with the Spark SQL thrift server.
(Thank you Yin Huai at Databricks!)
That said, I was curious how I can cache a table via my instance here? I
tried the Shark-like create table table_cached
server resides in its own build profile and needs
to be enabled explicitly by ./sbt/sbt -Phive-thriftserver assembly.
On Tue, Aug 5, 2014 at 4:54 AM, John Omernik j...@omernik.com wrote:
I am using spark-1.1.0-SNAPSHOT right now and trying to get familiar with
the JDBC thrift server. I
I am working with Spark SQL and the Thrift server. I ran into an
interesting bug, and I am curious on what information/testing I can provide
to help narrow things down.
My setup is as follows:
Hive 0.12 with a table that has lots of columns (50+) stored as rcfile.
Spark-1.1.0-SNAPSHOT with Hive
I am running the Thrift server in SparkSQL, and running it on the node I
compiled spark on. When I run it, tasks only work if they landed on that
node, other executors started on nodes I didn't compile spark on (and thus
don't have the compile directory) fail. Should spark be distributed
Any thoughts on this?
On Sat, Sep 20, 2014 at 12:16 PM, John Omernik j...@omernik.com wrote:
I am running the Thrift server in SparkSQL, and running it on the node I
compiled spark on. When I run it, tasks only work if they landed on that
node, other executors started on nodes I didn't
I am trying to do the sbt assembly for spark 1.2
sbt/sbt -Pmapr4 -Phive -Phive-thriftserver assembly
and I am getting the errors below. Any thoughts? Thanks in advance!
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn] ::              FAILED DOWNLOADS            ::
[warn] :: ^ see resolution messages for details  ^ ::
I am running Spark on Mesos and it works quite well. I have three
users, all of whom set up iPython notebooks to instantiate a Spark instance
to work with on the notebooks. I love it so far.
Since I am auto instantiating (I don't want a user to have to
think about instantiating and submitting a spark
sort of time frame I could possibly
communicate to my team? Anything I can do?
Thanks!
On Fri, Feb 20, 2015 at 4:36 AM, Iulian Dragoș
iulian.dra...@typesafe.com wrote:
On Thu, Feb 19, 2015 at 2:49 PM, John Omernik j...@omernik.com wrote:
I am running Spark on Mesos and it works quite well
I have been posting on the Mesos list, as I am looking to see if
it's possible or not to share Spark drivers. Obviously, in standalone
cluster mode, the Master handles requests, and you can
instantiate a new SparkContext to a currently running master. However
in Mesos (and perhaps Yarn) I
-with-the-Spark-Kernel
Signed,
Chip Senkbeil
On Tue Feb 24 2015 at 8:04:08 AM John Omernik j...@omernik.com wrote:
I have been posting on the Mesos list, as I am looking to see if
it's possible or not to share Spark drivers. Obviously, in standalone
cluster mode, the Master handles requests
that
this is happening at is way above my head. :)
On Fri, Jun 5, 2015 at 4:38 PM, John Omernik j...@omernik.com wrote:
Thanks all. The answers post is me too, I multi-thread. That, and Ted is
aware too, and MapR is helping me with it. I shall report the answer of that
investigation when we
Framework – while iterator.hasNext() ……
Also check whether this is not some sort of Python Spark API bug – Python
seems to be the foster child here – Scala and Java are the darlings
*From:* John Omernik [mailto:j...@omernik.com]
*Sent:* Friday, June 5, 2015 4:08 PM
*To:* user
*Subject:* Spark
http://twitter.com/deanwampler
http://polyglotprogramming.com
On Mon, Jun 1, 2015 at 2:49 PM, John Omernik j...@omernik.com wrote:
All -
I am facing an odd issue and I am not really sure where to go for
support at this point. I am running MapR
I am learning more about Spark (and in this case Spark Streaming) and am
getting that functions like dstream.map() take a function and apply it to
each element of the RDD, which in turn returns a new RDD based on the
original.
That's cool for the simple map functions in the
Edition
http://shop.oreilly.com/product/0636920033073.do (O'Reilly)
Typesafe http://typesafe.com
@deanwampler http://twitter.com/deanwampler
http://polyglotprogramming.com
On Mon, Jun 1, 2015 at 2:49 PM, John Omernik j...@omernik.com wrote:
All -
I am facing an odd issue and I am not really
Is there a pythonic/sparkonic way to test for an empty RDD before using
foreachRDD? Basically I am using the Python example at
https://spark.apache.org/docs/latest/streaming-programming-guide.html to
put records somewhere. When I have data, it works fine; when I don't, I
get an exception. I am not
Hey all, from my other post on Spark 1.3.1 issues, I think we found an
issue related to a previously closed JIRA (
https://issues.apache.org/jira/browse/SPARK-1403). Basically it looks like
the thread context class loader is NULL, which is causing the NPE in MapR,
and that's similar to the posted JIRA.
All -
I am facing an odd issue and I am not really sure where to go for support
at this point. I am running MapR, which complicates things as it relates to
Mesos; however, this HAS worked in the past with no issues, so I am stumped
here.
So for starters, here is what I am trying to run. This is a
Hey all,
I noticed today that if I take a tgz as my URI for Mesos, I have to
repackage it with my conf settings from where I execute, say, pyspark for
the executors to have the right configuration settings.
That is...
If I take a "stock" tgz from make-distribution.sh, unpack it, and then set
I have stumbled across an interesting (potential) bug. I have an
environment that is MapR FS and Mesos. I've posted a bit in the past
around getting this setup to work with Spark, Mesos, and MapR, and the Spark
community has been helpful.
In 1.4.1, I was able to get Spark working in this setup
Hey all -
Curious about the best way to include Python packages in my Spark
installation (such as NLTK). Basically I am running on Mesos, and would
like to find a way to include the package in the binary distribution, in
that I don't want to install packages on all nodes. We should be able to
I was searching in the 1.5.0 docs on the Docker on Mesos capabilities and
just found you CAN run it this way. Are there any user posts, blog posts,
etc on why and how you'd do this?
Basically, at first I was questioning why you'd run spark in a docker
container, i.e., if you run with tar balled
I have a happy healthy Mesos cluster (0.24) running in my lab. I've
compiled spark-1.5.0 and it seems to be working fine, except for one small
issue, my tasks all seem to run on one node. (I have 6 in the cluster).
Basically, I have a directory of compressed text files. Compressed, these 25
files
All, I received this today, is this appropriate list use? Note: This was
unsolicited.
Thanks
John
From: Pierce Lamb
11:57 AM (1 hour ago)
to me
Hi John,
I saw you on the Spark Mailing List and noticed you worked for * and
wanted to reach out. My company, SnappyData,
Were there any other creative solutions for this? I am running into the same
issue with submitting to yarn from a Docker container, and the solutions
provided don't work. (1. the host doesn't work, even if I use the
hostname of the physical node because when spark tries to bind to the
hostname
Hey all, are there any plans to implement the Mesos HTTP API rather than
native libs? The reason I ask is I am trying to run an application in a
docker container (Zeppelin) that would use Spark connecting to Mesos, but I
am finding that using the NATIVE_LIB from docker is difficult or would
The setting
spark.mesos.executor.docker.portmaps
is interesting to me. Without this setting, the docker executor uses
net=host, and thus port mappings are not needed.
With this setting (and just adding some random mappings), my executors fail
with less than helpful messages.
I guess some
Hey all,
I was wondering if there is a way to access/edit the command on Spark
Executors while using Docker on Mesos.
The reason is this: I am using the MapR File Client, and the Spark Driver
is trying to execute things as my user "user1" and since the executors are
running as root inside and
Hello all, I am running PySpark 2.1.1 as a user, jomernik. I am working
through some documentation here:
https://spark.apache.org/docs/latest/mllib-ensembles.html#random-forests
And was working on the Random Forest Classification, and found it to be
working! That said, when I try to save the