I am trying to do the sbt assembly for Spark 1.2:
sbt/sbt -Pmapr4 -Phive -Phive-thriftserver assembly
and I am getting the errors below. Any thoughts? Thanks in advance!
[warn] ::::::::::::::::::::::::::::::::::::::::::::::
[warn] ::              FAILED DOWNLOADS            ::
[warn] :: ^ see resolution messages for details  ^ ::
On Wed, Nov 19, 2014 at 5:35 PM, Yiming (John) Zhang sdi...@gmail.com wrote:
Thank you for your reply. I was wondering whether there is a method of
reusing locally-built components without installing them? That is, if I have
successfully built the spark project as a whole, how should I configure
are compiling all modules at once. If you want to
compile everything and reuse the local artifacts later, you need 'install' not
'package'.
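A minimal sketch of that workflow (not from Marcelo's reply; it just combines the commands quoted elsewhere in this thread with the standard Maven lifecycle):
# build everything once and install the artifacts into the local ~/.m2 repository
mvn -DskipTests install
# later, rebuild only the examples module against those installed artifacts
mvn -pl :spark-examples_2.10 -DskipTests package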
On Mon, Nov 17, 2014 at 12:27 AM, Yiming (John) Zhang sdi...@gmail.com wrote:
Thank you Marcelo. I tried your suggestion (# mvn -pl :spark-examples_2.10
compile
at 5:31 PM, Yiming (John) Zhang sdi...@gmail.com wrote:
Hi,
I have already successfully compiled and run the Spark examples. My problem
is that if I make some modifications (e.g., to SparkPi.scala or
LogQuery.scala) I have to use “mvn -DskipTests package” to rebuild the
whole Spark project
Hi,
I have already successfully compiled and run the Spark examples. My problem is
that if I make some modifications (e.g., to SparkPi.scala or LogQuery.scala)
I have to use mvn -DskipTests package to rebuild the whole Spark project
and wait a relatively long time.
I also tried mvn scala:cc as
You can also build a Play 2.2.x + Spark 1.1.0 fat jar with sbt-assembly
for, e.g., yarn-client support or for use with spark-shell for debugging:
play.Project.playScalaSettings
libraryDependencies ~= { _ map {
  case m if m.organization == "com.typesafe.play" =>
    m.exclude("commons-logging", "commons-logging") // second argument truncated in the archive; "commons-logging" assumed
  case m => m
}}
http://wiki.openstreetmap.org/wiki/Planet.osm
http://wiki.openstreetmap.org/wiki/OSM_XML
John
--
John S. Roberts
SigInt Technologies LLC, a Novetta Solutions Company
8830 Stanford Blvd, Suite 306; Columbia, MD 21045
Any thoughts on this?
On Sat, Sep 20, 2014 at 12:16 PM, John Omernik j...@omernik.com wrote:
I am running the Thrift server in SparkSQL, and running it on the node I
compiled Spark on. When I run it, tasks only work if they land on that
node; executors started on nodes I didn't
I am running the Thrift server in SparkSQL, and running it on the node I
compiled Spark on. When I run it, tasks only work if they land on that
node; executors started on nodes I didn't compile Spark on (and thus
don't have the compile directory) fail. Should Spark be distributed
In Spark 1.1, I'm seeing tasks with callbacks that don't involve my code at
all!
I'd seen something like this before in 1.0.0, but the behavior seems to be
back
apply at Option.scala:120
http://localhost:4040/stages/stage?id=52&attempt=0
If I don't cache the table through cache table table1
in thrift, I get results for all queries. If I uncache, I start getting
results again.
I hope I was clear enough here, I am happy to help however I can.
John
What's the correct way to use setCallSite to get the change to show up in
the spark logs?
I have something like
class RichRDD(rdd: RDD[MyThing]) {
  def mySpecialOperation() {
    rdd.context.setCallSite("bubbles and candy!")
    rdd.map(...)                  // arguments elided in the archive
    val result = rdd.groupBy(...) // arguments elided in the archive
not just build the thrift server in? (I am not a
programming expert, and not trying to judge the decision to have it in a
separate profile, I would just like to understand why it's done that way)
On Mon, Aug 11, 2014 at 11:47 AM, Cheng Lian lian.cs@gmail.com wrote:
Hi John, the JDBC Thrift
I have things working on my cluster with the SparkSQL Thrift server. (Thank
you Yin Huai at Databricks!)
That said, I was curious how I can cache a table via my instance here? I
tried the Shark-like "create table table_cached as select * from table" and
that did not create a cached table.
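A hedged aside, not from this thread: in Spark SQL 1.0/1.1 caching is done with the CACHE TABLE statement rather than Shark's CREATE TABLE ... AS trick. A minimal sketch, issued here through a HiveContext (the same statement can be sent over the Thrift/JDBC connection; the table name is just an example):
import org.apache.spark.sql.hive.HiveContext
val hiveContext = new HiveContext(sc)                            // sc: the existing SparkContext
hiveContext.hql("CACHE TABLE table_cached")                      // mark the table for in-memory caching
hiveContext.hql("SELECT COUNT(*) FROM table_cached").collect()   // the first scan materializes the cache
// hiveContext.hql("UNCACHE TABLE table_cached") reverses it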
On Tue, Aug 5, 2014 at 9:02 AM, John Omernik j...@omernik.com wrote:
I have things working on my cluster with the SparkSQL Thrift server.
(Thank you Yin Huai at Databricks!)
That said, I was curious how I can cache a table via my instance here? I
tried the Shark-like create table table_cached
I am using spark-1.1.0-SNAPSHOT right now and trying to get familiar with
the JDBC thrift server. I have everything compiled correctly, I can access
data in spark-shell on yarn from my hive installation. Cached tables, etc
all work.
When I execute ./sbin/start-thriftserver.sh
I get the error
Hi,
Have you checked out SchemaRDD?
There should be an example of writing to Parquet files there.
BTW/FYI, I was discussing this with the Spark SQL developers last week,
including possibly using Apache Gora [0] for achieving this.
HTH
Lewis
[0] http://gora.apache.org
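A minimal sketch of the SchemaRDD-to-Parquet path mentioned above (Spark 1.0-era API; the Record class and paths are made up):
import org.apache.spark.sql.SQLContext

case class Record(key: Int, value: String)

val sqlContext = new SQLContext(sc)
import sqlContext.createSchemaRDD                       // implicitly turns an RDD of case classes into a SchemaRDD

val records = sc.parallelize(Seq(Record(1, "a"), Record(2, "b")))
records.saveAsParquetFile("records.parquet")            // write out as Parquet
val loaded = sqlContext.parquetFile("records.parquet")  // read back as a SchemaRDD
loaded.registerAsTable("records")                       // and query it with SQL if desired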
On Wed, Jul 30, 2014 at 5:14 AM,
And there are so many INFO logs on stdout like this:
BlockManagerMasterActor$BlockManagerInfo: Removed taskresult_xxx on
SHXJ-Hx-HBxxx:44126 in memory
Thank you.
John Wu
晶赞广告(上海)有限公司
Zamplus Advertising (Shanghai) Co., Ltd.
Tel: +8621-6076 0818 Ext. 885
Fax: +8621-6076 0812
Mobile: +86
So this is good information for standalone, but how is memory distributed
within Mesos? There's coarse-grained mode where the executor stays active, and
there's fine-grained mode where it appears each task is its own process in
Mesos; how do memory allocations work in these cases? Thanks!
On Thu,
I am trying to get my head around using Spark on Yarn from a perspective of
a cluster. I can start a Spark Shell no issues in Yarn. Works easily. This
is done in yarn-client mode and it all works well.
In multiple examples, I see instances where people have set up Spark
clusters in standalone
9, 2014, at 8:31 AM, John Omernik j...@omernik.com wrote:
I am trying to get my head around using Spark on Yarn from a
perspective of a cluster. I can start a Spark Shell no issues in Yarn.
Works easily. This is done in yarn-client mode and it all works well.
In multiple examples, I see
, Jul 9, 2014 at 12:41 PM, John Omernik j...@omernik.com wrote:
Thank you for the link. In that link the following is written:
For those familiar with the Spark API, an application corresponds to an
instance of the SparkContext class. An application can be used for a
single batch job
want to write a Spark application that fires off jobs on behalf of
remote processes, you would need to implement the communication between
those remote processes and your Spark application code yourself.
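To illustrate that sentence with a tiny sketch (not from the thread; the master and app name are placeholders): an application is one SparkContext, from which any number of jobs can be submitted:
import org.apache.spark.{SparkConf, SparkContext}

val conf = new SparkConf().setAppName("my-app").setMaster("yarn-client")
val sc = new SparkContext(conf)              // this instance *is* the application
val nums = sc.parallelize(1 to 1000)
val total = nums.sum()                       // first job
val evens = nums.filter(_ % 2 == 0).count()  // second job, same application
sc.stop()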
On Wed, Jul 9, 2014 at 10:41 AM, John Omernik j...@omernik.com wrote:
Thank you
Michael -
Does Spark SQL support rlike and like yet? I am running into that same
error with a basic select * from table where field like '%foo%' using the
hql() function.
Thanks
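For reference, a hedged sketch of the queries in question (assumes an existing HiveContext named hiveContext; whether these parse without error in the version under discussion is exactly what's being asked):
hiveContext.hql("SELECT * FROM table WHERE field LIKE '%foo%'").collect()
hiveContext.hql("SELECT * FROM table WHERE field RLIKE '.*foo.*'").collect()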
On Wed, May 28, 2014 at 2:22 PM, Michael Armbrust mich...@databricks.com
wrote:
On Tue, May 27, 2014 at 6:08 PM,
I have a use case where I cannot figure out the spark streaming way to do
it.
Given two Kafka topics corresponding to two different types of events, A and
B. For each element from topic A there corresponds an element from topic B.
Unfortunately elements can arrive separately by hours.
The aggregation
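A rough sketch of one possible approach in Spark Streaming 1.x (not from the thread; Event, eventsA and eventsB are hypothetical names): key both topics by the shared id and buffer unmatched events with updateStateByKey until the partner arrives, even hours later.
import org.apache.spark.streaming.StreamingContext._   // pair-DStream operations (Spark 1.x)
import org.apache.spark.streaming.dstream.DStream

case class Event(id: String, payload: String)
type Sides = (Option[Event], Option[Event])

// keep whichever side arrived first; merge in new arrivals for the same id
def mergeSides(updates: Seq[Sides], state: Option[Sides]): Option[Sides] =
  Some(updates.foldLeft(state.getOrElse((None: Option[Event], None: Option[Event]))) {
    case ((a, b), (ua, ub)) => (ua.orElse(a), ub.orElse(b))
  })

// eventsA / eventsB: DStream[(String, Event)] keyed by the shared id, e.g. built
// from the two Kafka topics; ssc.checkpoint(...) must be set for updateStateByKey.
def pairEvents(eventsA: DStream[(String, Event)],
               eventsB: DStream[(String, Event)]): DStream[(String, (Event, Event))] = {
  // tag each event with the side it came from
  val tagged = eventsA.mapValues(e => (Some(e): Option[Event], None: Option[Event]))
    .union(eventsB.mapValues(e => (None: Option[Event], Some(e): Option[Event])))

  val buffered = tagged.updateStateByKey(mergeSides _)

  // emit once both sides are present; a real job would also expire completed
  // or very old keys, which this sketch omits
  buffered.flatMap {
    case (id, (Some(a), Some(b))) => Some((id, (a, b)))
    case _                        => None
  }
}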
Cool.
Looked at the Pull Requests, the upgrade to 1.0.0 was just merged
yesterday. https://github.com/Homebrew/homebrew/pull/30231
https://github.com/Homebrew/homebrew/blob/master/Library/Formula/apache-spark.rb
On Wed, Jun 18, 2014 at 1:57 PM, Matei Zaharia matei.zaha...@gmail.com
wrote:
So Python is used in many of the Spark Ecosystem products, but not
Streaming at this point. Is there a roadmap to include Python APIs in Spark
Streaming? Any time frame on this?
Thanks!
John
On Thu, May 29, 2014 at 4:19 PM, Matei Zaharia matei.zaha...@gmail.com
wrote:
Quite a few people ask
Python integration/support, if possible, would be a home run.
On Wed, Jun 4, 2014 at 7:06 PM, Matei Zaharia matei.zaha...@gmail.com
wrote:
We are definitely investigating a Python API for Streaming, but no
announced deadline at this point.
Matei
On Jun 4, 2014, at 5:02 PM, John Omernik j
I have created some extension methods for RDDs in RichRecordRDD and these
are working exceptionally well for me.
However, when looking at the logs, it's impossible to tell what's going on
because all the line number hints point to RichRecordRDD.scala rather than
the code that uses it. For example:
All:
In the pom.xml file I see the MapR repository, but it's not included in the
the ./project/SparkBuild.scala file. Is this expected? I know that to build I
have to add it there, otherwise sbt hates me with evil red messages and such.
John
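For the archives, a hedged sketch of the kind of line that has to be added (the resolver name and URL below are the commonly used MapR repository coordinates, not copied from John's build file):
// in project/SparkBuild.scala, alongside the existing resolvers
resolvers += "mapr-releases" at "http://repository.mapr.com/maven/"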
On Fri, May 30, 2014 at 6:24 AM, Kousuke Saruta saru
wondered if there were other options I should consider
before building.
Thanks!
On Fri, May 30, 2014 at 6:52 AM, John Omernik j...@omernik.com wrote:
All:
In the pom.xml file I see the MapR repository, but it's not included in
the ./project/SparkBuild.scala file. Is this expected? I know
Did you try the Hive Context? Look under Hive Support here:
http://people.apache.org/~pwendell/catalyst-docs/sql-programming-guide.html
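A minimal sketch of the HiveContext path being suggested (Spark 1.0-era API; the query is the one used in the linked guide):
import org.apache.spark.sql.hive.HiveContext

val hiveContext = new HiveContext(sc)    // sc: the existing SparkContext
hiveContext.hql("FROM src SELECT key, value").collect().foreach(println)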
On Tue, May 27, 2014 at 2:09 AM, 정재부 itsjb.j...@samsung.com wrote:
Hi all,
I'm trying to compare functions available in Spark 1.0 hql to original
Hi,
Spark newbie here with a general question: In a stream consisting of
several types of events, how can I detect if event X happened within Z
transactions of event Y? Is it just a matter of iterating through all the RDDs:
when event type Y is found, take the next Z transactions and check if
I'm just wondering are the SparkVector calculations really taking into
account the sparsity or just converting to dense?
On Fri, Apr 25, 2014 at 10:06 PM, John King usedforprinting...@gmail.com wrote:
I've been trying to use the Naive Bayes classifier. Each example in the
dataset is about 2
I've been trying to use the Naive Bayes classifier. Each example in the
dataset is about 2 million features, only about 20-50 of which are
non-zero, so the vectors are very sparse. I keep running out of memory
though, even for about 1000 examples on 30gb RAM while the entire dataset
is 4 million
./spark-shell: line 153: 17654 Killed
$FWDIR/bin/spark-class org.apache.spark.repl.Main "$@"
Any ideas?
Last command was:
val model = new NaiveBayes().run(points)
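Not from the thread, but for context: a sketch of keeping the 2M-dimensional features sparse when building the training RDD (MLlib 1.0-era API; the input path, format, and parsing below are made up):
import org.apache.spark.mllib.classification.NaiveBayes
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint

val numFeatures = 2000000
// hypothetical input format: "label idx1:val1 idx2:val2 ..."
val points = sc.textFile("data.txt").map { line =>
  val parts = line.split(' ')
  val features = parts.tail.map { t =>
    val Array(i, v) = t.split(':')
    (i.toInt, v.toDouble)
  }
  // Vectors.sparse keeps only the ~20-50 non-zero entries per example in memory
  LabeledPoint(parts.head.toDouble, Vectors.sparse(numFeatures, features.toSeq))
}
points.cache()
val model = NaiveBayes.train(points)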
On Thu, Apr 24, 2014 at 4:27 PM, Xiangrui Meng men...@gmail.com wrote:
Could you share the command you used and more of the error message?
Also, is it an MLlib specific problem? -Xiangrui
On Thu, Apr 24, 2014 at 11:49 AM, John King
, Apr 24, 2014 at 11:38 AM, John King
usedforprinting...@gmail.com wrote:
I receive this error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/ubuntu/spark-1.0.0-rc2/python/pyspark/mllib/classification.py", line 178, in train
ans = sc
the
hostnames, this should not happen.
Matei
On Apr 24, 2014, at 11:36 AM, John King usedforprinting...@gmail.com
wrote:
Same problem.
On Thu, Apr 24, 2014 at 10:54 AM, Shubhabrata mail2shu...@gmail.com wrote:
Moreover it seems all the workers are registered and have sufficient
memory
PM, Xiangrui Meng men...@gmail.com wrote:
I tried locally with the example described in the latest guide:
http://54.82.157.211:4000/mllib-naive-bayes.html , and it worked fine.
Do you mind sharing the code you used? -Xiangrui
On Thu, Apr 24, 2014 at 1:57 PM, John King usedforprinting
)
points.cache()
val model = new NaiveBayes().run(points)
On Thu, Apr 24, 2014 at 6:57 PM, Xiangrui Meng men...@gmail.com wrote:
Do you mind sharing more code and error messages? The information you
provided is too little to identify the problem. -Xiangrui
On Thu, Apr 24, 2014 at 1:55 PM, John King
Also when will the official 1.0 be released?
On Thu, Apr 24, 2014 at 7:04 PM, John King usedforprinting...@gmail.com wrote:
I was able to run simple examples as well.
Which version of Spark? Did you use the most recent commit or from
branch-1.0?
Some background: I tried to build both
examples you have? Also, make sure you don't
have negative feature values. The error message you sent did not say
NaiveBayes went wrong, but the Spark shell was killed. -Xiangrui
On Thu, Apr 24, 2014 at 4:05 PM, John King usedforprinting...@gmail.com
wrote:
In the other thread I had an issue
Yahoo made some changes that drive mailing list posts into spam
folders: http://www.virusbtn.com/blog/2014/04_15.xml
On Mon, Apr 21, 2014 at 2:50 PM, Marcelo Vanzin van...@cloudera.com wrote:
Hi Joe,
On Mon, Apr 21, 2014 at 11:23 AM, Joe L selme...@yahoo.com wrote:
And, I haven't gotten any
Right now we're seeing the task just re-tried over and
over again in an infinite loop because there's a value that always
generates an exception.
John
On Apr 4, 2014, at 10:40 AM, John Salvatier jsalvat...@gmail.com wrote:
I'm trying to get a clear idea about how exceptions are handled in
Spark? Is there somewhere I can read about this? I'm on Spark 0.7.
For some reason I was under the impression that such exceptions are
swallowed
Btw, thank you for your help.
On Fri, Apr 4, 2014 at 11:49 AM, John Salvatier jsalvat...@gmail.com wrote:
Is there a way to log exceptions inside a mapping function? logError and
logInfo seem to freeze things.
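An editorial sketch of one workaround, not from the thread: catch inside the closure and carry the failure in the return value instead of letting the task die and retry forever; rdd and process() are stand-ins for the real RDD and per-record work:
// wrap each record so one bad value cannot fail (and endlessly retry) the task
val results = rdd.map { record =>
  try Right(process(record))                            // process() is hypothetical
  catch { case e: Exception => Left((record, e.toString)) }
}
results.cache()
val failures = results.collect { case Left(f) => f }    // RDD.collect(PartialFunction) keeps only failures
failures.take(10).foreach(println)                      // inspect a sample on the driver
val good = results.collect { case Right(v) => v }       // continue with the successful records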
On Fri, Apr 4, 2014 at 11:02 AM, Matei Zaharia matei.zaha...@gmail.com wrote