Re: OutOfMemory error with Spark ML 1.5 logreg example

2015-09-07 Thread boci
Hi, can you try using the save method instead of write? E.g.: out_df.save("path","parquet") b0c1 -- Skype: boci13, Hangout: boci.b...@gmail.com On Mon, Sep 7, 2015 at

Re: Mesos + Spark

2015-07-24 Thread boci
/deanwampler http://polyglotprogramming.com On Wed, Jul 22, 2015 at 3:53 AM, boci boci.b...@gmail.com wrote: Hi guys! I'm new to Mesos. I have two Spark applications (one streaming and one batch). I want to run both apps in a Mesos cluster. For testing I want to run them in a Docker container, so I started

Mesos + Spark

2015-07-22 Thread boci
Hi guys! I'm new to Mesos. I have two Spark applications (one streaming and one batch). I want to run both apps in a Mesos cluster. For testing I want to run them in a Docker container, so I started a simple redjack/mesos-master, but a lot of things are unclear to me (both Mesos and spark-mesos).

Spark streaming with kafka

2015-05-28 Thread boci
Hi guys, I'm using Spark Streaming with Kafka... On my local machine (started as a Java application without using spark-submit) it works: it connects to Kafka and does the job (*). I tried to put it into a Spark Docker container (Hadoop 2.6, Spark 1.3.1, tried spark-submit with local[5] and yarn-client too) but I'm

Re: Strange ClassNotFound exeption

2015-05-24 Thread boci
SelectionPath 521 Mon Sep 29 12:05:36 PDT 2014 akka/actor/SelectionPathElement.class Is the above jar in your classpath? On Sat, May 23, 2015 at 5:05 PM, boci boci.b...@gmail.com wrote: Hi guys! I have a small Spark application. It queries some data from Postgres, enriches it and writes

Strange ClassNotFound exeption

2015-05-23 Thread boci
Hi guys! I have a small Spark application. It queries some data from Postgres, enriches it and writes it to Elasticsearch. When I deployed it into a Spark container I got a very frustrating error: https://gist.github.com/b0c1/66527e00bada1e4c0dc3 Spark version: 1.3.1 Hadoop version: 2.6.0 Additional info:

Re: Standalone spark

2015-02-25 Thread boci
25, 2015 at 11:05 PM, boci boci.b...@gmail.com wrote: Thanks for your fast answer... On Windows it's not working, because Hadoop (surprise, surprise) needs winutils.exe. Without it, it doesn't work, but if you don't set the Hadoop directory you simply get 15/02/26 00:03:16 ERROR Shell: Failed

Re: MLLib beginner question

2014-12-23 Thread boci
is the dataset you want to use in prediction? -Xiangrui On Mon, Dec 22, 2014 at 1:47 PM, boci boci.b...@gmail.com wrote: Hi! I want to try out Spark MLlib in my Spark project, but I have a little problem. I have training data (an external file), but the real data comes from another RDD. How can I do

MLLib beginner question

2014-12-22 Thread boci
Hi! I want to try out Spark MLlib in my Spark project, but I have a little problem. I have training data (an external file), but the real data comes from another RDD. How can I do that? I tried simply using the same SparkContext for both RDDs (first I create an RDD using sc.textFile() and after

Re: Out of any idea

2014-07-20 Thread boci
/ On Sat, Jul 19, 2014 at 2:39 PM, boci boci.b...@gmail.com wrote: Hi guys! I've run out of ideas... I created a Spark Streaming job (Kafka - Spark - ES). If I start my app on my local machine (inside the editor, but connected to the real Kafka and ES) the application works correctly. If I start

Uber jar with SBT

2014-07-19 Thread boci
Hi guys, I'm trying to create a Spark uber jar with sbt but I have a lot of problems... I want to use the following: - Spark Streaming - Kafka - Elasticsearch - HBase. The current jar size is about 60 MB and it's not working. - When I deploy with spark-submit: it runs and exits without any error - When I

Re: Uber jar with SBT

2014-07-19 Thread boci
files has more than 65536 files, and Java 6 has various issues with jars this large. If possible, use Java 7 everywhere. https://issues.apache.org/jira/browse/SPARK-1520 On Sat, Jul 19, 2014 at 2:30 PM, boci boci.b...@gmail.com wrote: Hi guys, I'm trying to create a Spark uber jar with sbt but I

Out of any idea

2014-07-19 Thread boci
Hi guys! I've run out of ideas... I created a Spark Streaming job (Kafka - Spark - ES). If I start my app on my local machine (inside the editor, but connected to the real Kafka and ES) the application works correctly. If I start it in my Docker container (same Kafka and ES, local mode (local[4]) like inside

sbt + idea + test

2014-07-14 Thread boci
Hi guys, I want to use Elasticsearch and HBase in my Spark project, and I want to create a test. I brought up ES and ZooKeeper, but if I put val htest = new HBaseTestingUtility() into my app I get a strange exception (at compile time, not runtime). https://gist.github.com/b0c1/4a4b3f6350816090c3b5

Kafka/ES question

2014-06-29 Thread boci
Hi! I'm trying to use Spark with Kafka; everything works, but I found a little problem. I created a small test application which connects to a real Kafka cluster, sends a message and reads it back. It works, but when I run my test a second time (send/read) it reads both the first and the second stream (maybe

Re: ElasticSearch enrich

2014-06-27 Thread boci
your experiences with Elasticsearch Spark go :) On Thu, Jun 26, 2014 at 3:17 PM, boci boci.b...@gmail.com wrote: Wow, thanks for your fast answer, it helps a lot... b0c1

Re: ElasticSearch enrich

2014-06-27 Thread boci
not called) Any idea? b0c1 -- Skype: boci13, Hangout: boci.b...@gmail.com On Fri, Jun 27, 2014 at 4:53 PM, boci boci.b...@gmail.com wrote: Another question

Re: ElasticSearch enrich

2014-06-27 Thread boci
]? 2) When you say breakpoint, how are you setting this break point? There is a good chance your breakpoint mechanism doesn't work in a distributed environment, could you instead cause a side effect (like writing to a file)? Cheers, Holden :) On Fri, Jun 27, 2014 at 2:04 PM, boci boci.b

Re: ElasticSearch enrich

2014-06-26 Thread boci
-- Skype: boci13, Hangout: boci.b...@gmail.com On Thu, Jun 26, 2014 at 1:20 AM, Holden Karau hol...@pigscanfly.ca wrote: On Wed, Jun 25, 2014 at 4:16 PM, boci boci.b...@gmail.com wrote: Hi guys, thanks for the direction. Now I have some problems/questions: - in local (test) mode I want to use

Re: ElasticSearch enrich

2014-06-26 Thread boci
/elasticsearch) and use the default config (host = localhost, port = 9200). On Thu, Jun 26, 2014 at 9:04 AM, boci boci.b...@gmail.com wrote: That's okay, but Hadoop has ES integration. What happens if I run saveAsHadoopFile without Hadoop (or must I bring up Hadoop programmatically? (if I

Re: ElasticSearch enrich

2014-06-26 Thread boci
:) On Thu, Jun 26, 2014 at 2:23 PM, boci boci.b...@gmail.com wrote: Thanks. Without the local option I can connect to the remote ES; now I only have one problem. How can I use elasticsearch-hadoop with Spark Streaming? I mean DStream doesn't have a saveAsHadoopFiles method; my second problem

Re: ElasticSearch enrich

2014-06-25 Thread boci
Hi guys, thanks for the direction. Now I have some problems/questions: - in local (test) mode I want to use ElasticClient.local to create the ES connection, but in production I want to use ElasticClient.remote; for this I want to pass the ElasticClient to mapPartitions, or what is the best practice? - my stream

ElasticSearch enrich

2014-06-24 Thread boci
Hi guys, I have a small question. I want to create a Worker class which uses ElasticClient to query Elasticsearch (I want to enrich my data with geo search results). How can I do that? I tried to create a worker instance with ES host/port parameters but Spark throws an exception (my class

Re: ElasticSearch enrich

2014-06-24 Thread boci
Ok, but in this case where can I store the ES connection? Or does every document create a new ES connection inside the worker? -- Skype: boci13, Hangout: boci.b...@gmail.com On