Hi,
Can you try using the save method instead of write?
ex: out_df.save("path","parquet")
b0c1
--
Skype: boci13, Hangout: boci.b...@gmail.com
On Mon, Sep 7, 2015 at
/deanwampler
http://polyglotprogramming.com
On Wed, Jul 22, 2015 at 3:53 AM, boci boci.b...@gmail.com wrote:
Hi guys!
I'm new to Mesos. I have two Spark applications (one streaming and one
batch). I want to run both apps in a Mesos cluster. For now, for testing, I
want to run them in a Docker container, so I started a simple
redjack/mesos-master, but I think a lot of things are still unclear to me
(both about Mesos and Spark-on-Mesos).
Hi guys,
I'm using Spark Streaming with Kafka... On my local machine (started as a
plain Java application without spark-submit) it works: it connects to Kafka
and does the job (*). I tried to put it into a Spark Docker container
(Hadoop 2.6, Spark 1.3.1, tried spark-submit with local[5] and yarn-client
too) but I'm
SelectionPath
521 Mon Sep 29 12:05:36 PDT 2014 akka/actor/SelectionPathElement.class
Is the above jar in your classpath?
On Sat, May 23, 2015 at 5:05 PM, boci boci.b...@gmail.com wrote:
Hi guys!
I have a small Spark application. It queries some data from Postgres,
enriches it, and writes it to Elasticsearch. When I deployed it into the
Spark container I got a very frustrating error:
https://gist.github.com/b0c1/66527e00bada1e4c0dc3
Spark version: 1.3.1
Hadoop version: 2.6.0
Additional info:
25, 2015 at 11:05 PM, boci boci.b...@gmail.com wrote:
Thanks for your fast answer...
On Windows it's not working, because Hadoop (surprise, surprise) needs
winutils.exe. Without it nothing works, and if you don't set the Hadoop
directory you simply get:
15/02/26 00:03:16 ERROR Shell: Failed
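A minimal sketch of the workaround: on Windows, Hadoop's Shell class looks for winutils.exe under %HADOOP_HOME%\bin, or under the hadoop.home.dir system property; setting the property before any Spark/Hadoop code runs avoids the error above. The install path here is hypothetical.

```scala
// Hedged sketch: point Hadoop at a directory whose bin\ contains winutils.exe.
// "C:/hadoop" is a hypothetical install location, not taken from this thread.
// This must run before the first Spark/Hadoop class touches Shell.
def configureHadoopHome(hadoopHome: String): String = {
  System.setProperty("hadoop.home.dir", hadoopHome)
  System.getProperty("hadoop.home.dir")
}

val configured = configureHadoopHome("C:/hadoop")
```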
is the dataset you want to use in prediction? -Xiangrui
On Mon, Dec 22, 2014 at 1:47 PM, boci boci.b...@gmail.com wrote:
Hi!
I want to try out Spark MLlib in my Spark project, but I have a little
problem. I have training data (an external file), but the real data comes
from another RDD. How can I do that?
I tried simply using the same SparkContext to build the RDDs (first I create
an RDD using sc.textFile() and after
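For what it's worth, the usual pattern is exactly that: one SparkContext builds both RDDs, one from the training file and one from the other source. A minimal sketch, assuming a space-separated "label features" training file (the file format and names are assumptions; the Spark/MLlib calls are noted in comments, the parser below is a plain helper):

```scala
// Pure helper: parse a "label f1 f2 ..." training line into (label, features).
// In Spark this would feed sc.textFile("training.txt").map(parseLine); each
// pair is then wrapped in an MLlib LabeledPoint, a model is trained (e.g.
// LogisticRegressionWithSGD.train), and model.predict is mapped over the
// other RDD -- both RDDs can come from the same SparkContext.
def parseLine(line: String): (Double, Array[Double]) = {
  val parts = line.trim.split(' ').map(_.toDouble)
  (parts.head, parts.tail)
}

val (label, features) = parseLine("1.0 2.5 3.5")
```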
/
On Sat, Jul 19, 2014 at 2:39 PM, boci boci.b...@gmail.com wrote:
Hi Guys,
I'm trying to create a Spark uber jar with sbt, but I have a lot of
problems... I want to use the following:
- Spark Streaming
- Kafka
- Elasticsearch
- HBase
The current jar size is about 60 MB and it's not working.
- When I deploy with spark-submit: it runs and exits without any error
- When I
files has
more than 65536 files, and Java 6 has various issues with jars this
large. If possible, use Java 7 everywhere.
https://issues.apache.org/jira/browse/SPARK-1520
On Sat, Jul 19, 2014 at 2:30 PM, boci boci.b...@gmail.com wrote:
Hi guys!
I've run out of ideas... I created a Spark Streaming job (Kafka - Spark -
ES).
If I start my app on my local machine (inside the editor, but connected to
the real Kafka and ES) the application works correctly.
If I start it in my Docker container (same Kafka and ES, local mode
(local[4]) like inside
Hi guys,
I want to use Elasticsearch and HBase in my Spark project, and I want to
create a test. I pulled up ES and Zookeeper, but if I put val htest = new
HBaseTestingUtility() into my app I get a strange exception (at compilation
time, not runtime).
https://gist.github.com/b0c1/4a4b3f6350816090c3b5
Hi!
I'm trying to use Spark with Kafka; everything works, but I found a little
problem. I created a small test application which connects to the real
Kafka cluster, sends a message, and reads it back. It works, but when I run
my test a second time (send/read) it reads both the first and the second
stream (maybe
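One common cause of this (an assumption, since the message is cut off here): the consumer reuses the same group.id across runs, so the second run also sees what the first run produced. A sketch that gives each test run a fresh consumer group, using the Kafka 0.8-era high-level consumer property names:

```scala
import java.util.{Properties, UUID}

// Build consumer properties with a unique group.id per run, so a test never
// inherits or replays offsets from a previous run. Property names follow the
// Kafka 0.8 high-level consumer; adjust for newer clients.
def freshConsumerProps(zkConnect: String): Properties = {
  val props = new Properties()
  props.put("zookeeper.connect", zkConnect)
  props.put("group.id", s"test-${UUID.randomUUID()}")
  props.put("auto.offset.reset", "largest") // start at the end of the log
  props
}

val p = freshConsumerProps("localhost:2181")
```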
your experiences with Elasticsearch Spark go :)
On Thu, Jun 26, 2014 at 3:17 PM, boci boci.b...@gmail.com wrote:
Wow, thanks for your fast answer, it helps a lot...
b0c1
not called)
Any idea?
b0c1
On Fri, Jun 27, 2014 at 4:53 PM, boci boci.b...@gmail.com wrote:
Another question
2) When you say breakpoint, how are you setting this breakpoint? There is
a good chance your breakpoint mechanism doesn't work in a distributed
environment; could you instead cause a side effect (like writing to a file)?
Cheers,
Holden :)
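Holden's file-writing suggestion, sketched in plain Scala: replace the breakpoint with a side effect inside the task closure. In a real job the traced function would sit inside rdd.map { ... }; here a local Seq stands in for the RDD.

```scala
import java.nio.charset.StandardCharsets
import java.nio.file.{Files, StandardOpenOption}

// A side effect that survives running on a remote executor: append a marker
// line to a file every time the closure processes an element.
val log = Files.createTempFile("task-debug", ".log")

def traced(x: Int): Int = {
  Files.write(log, s"saw $x\n".getBytes(StandardCharsets.UTF_8),
    StandardOpenOption.APPEND)
  x * 2 // stand-in for the real transformation
}

val out = Seq(1, 2, 3).map(traced) // in Spark: rdd.map(traced)
val seen = new String(Files.readAllBytes(log), StandardCharsets.UTF_8)
```

(On a cluster the file lands on each executor's local disk, so an accumulator or a shared path may be more convenient, but the idea is the same.)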
On Fri, Jun 27, 2014 at 2:04 PM, boci boci.b
On Thu, Jun 26, 2014 at 1:20 AM, Holden Karau hol...@pigscanfly.ca wrote:
On Wed, Jun 25, 2014 at 4:16 PM, boci boci.b...@gmail.com wrote:
/elasticsearch) and use the default config (host = localhost, port =
9200).
On Thu, Jun 26, 2014 at 9:04 AM, boci boci.b...@gmail.com wrote:
That's okay, but Hadoop has ES integration. What happens if I run
saveAsHadoopFile without Hadoop (or do I need to pull up Hadoop
programmatically)? (if I
:)
On Thu, Jun 26, 2014 at 2:23 PM, boci boci.b...@gmail.com wrote:
Thanks. Without the local option I can connect to ES remotely; now I only
have one problem: how can I use elasticsearch-hadoop with Spark Streaming?
I mean, DStream doesn't have a saveAsHadoopFiles method; my second problem
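If it helps: saveAsHadoopFiles is added only to pair DStreams (DStream[(K, V)]) via the PairDStreamFunctions implicits, which is likely why a plain DStream doesn't show it. A non-pair stream can use foreachRDD instead, e.g. stream.foreachRDD { (rdd, time) => rdd.saveAsTextFile(pathFor("out", time.milliseconds)) }. A small helper building the per-batch path in Spark Streaming's prefix-<batch time>[.suffix] style (the helper's name is an assumption):

```scala
// Build a per-batch output path in the "prefix-<batchTimeMs>[.suffix]" style
// used by Spark Streaming's saveAs*Files methods.
def pathFor(prefix: String, batchTimeMs: Long, suffix: String = ""): String =
  if (suffix.isEmpty) s"$prefix-$batchTimeMs" else s"$prefix-$batchTimeMs.$suffix"
```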
Hi guys, thanks for the direction; now I have some problems/questions:
- in local (test) mode I want to use ElasticClient.local to create the ES
connection, but in production I want to use ElasticClient.remote; to do this
I want to pass the ElasticClient to mapPartitions, or what is the best
practice?
- my stream
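On passing the client to mapPartitions: the usual practice is to create the client inside mapPartitions, one per partition, because a live connection generally isn't serializable and can't be shipped in the closure. A sketch with a hypothetical stand-in Client (no Spark dependency; in a real job the body of processPartition goes into rdd.mapPartitions { iter => ... }):

```scala
// Hypothetical stand-in for an Elasticsearch client.
class Client {
  def enrich(doc: String): String = s"$doc+geo"
  def close(): Unit = ()
}

var clientsCreated = 0

// One client per partition: open, map the partition's iterator, close.
// Forcing the iterator (toList) before close() matters -- Iterator.map is lazy.
def processPartition(iter: Iterator[String]): Iterator[String] = {
  val client = new Client
  clientsCreated += 1
  val out = iter.map(client.enrich).toList
  client.close()
  out.iterator
}

// Simulate two partitions; Spark would call processPartition once per partition.
val partitions = Seq(Seq("a", "b"), Seq("c"))
val results = partitions.flatMap(p => processPartition(p.iterator).toList)
```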
Hi guys,
I have a small question. I want to create a Worker class which uses
ElasticClient to query Elasticsearch (I want to enrich my data with geo
search results).
How can I do that? I tried to create a worker instance with ES host/port
parameters but Spark throws an exception (my class
OK, but in this case where can I store the ES connection? Or should every
document create a new ES connection inside the worker?