Re: android + spark streaming?

2014-10-04 Thread ll
any comment/feedback/advice on this is much appreciated! thanks. -- View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/android-spark-streaming-tp15661p15735.html Sent from the Apache Spark User List mailing list archive at Nabble.com.

scala Vector vs mllib Vector

2014-10-04 Thread ll
what are the pros/cons of each? when should we use mllib Vector, and when to use standard scala Vector? thanks.

Re: Spark Streaming writing to HDFS

2014-10-04 Thread Sean Owen
Are you importing the '.mapred.' version of TextOutputFormat instead of the new API '.mapreduce.' version? On Sat, Oct 4, 2014 at 1:08 AM, Abraham Jacob abe.jac...@gmail.com wrote: Hi All, Would really appreciate if someone in the community can help me with this. I have a simple Java spark

Re: scala Vector vs mllib Vector

2014-10-04 Thread Dean Wampler
Briefly, MLlib's Vector and the concrete subclasses DenseVector and SparseVector wrap Java arrays, which are mutable and maximize memory efficiency. To update one of these vectors, you mutate the elements of the underlying array. That's great for performance, but dangerous in multithreaded programs
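The trade-off described in this reply can be sketched in plain Java. This is only an illustration under stated assumptions, not MLlib's or Scala's actual implementation: `ArrayBackedVector` and `ImmutableVector` are hypothetical classes invented here. The first keeps a reference to the caller's array, so in-place writes are cheap but visible to every holder of that array; the second copies on construction and on each update, loosely analogous to `updated` on a Scala `Vector`.

```java
import java.util.Arrays;

final class VectorMutabilitySketch {

    /** Hypothetical array-backed vector: keeps the caller's array, no defensive copy. */
    static final class ArrayBackedVector {
        private final double[] values;

        ArrayBackedVector(double[] values) {
            this.values = values; // shares storage with the caller
        }

        double get(int i) {
            return values[i];
        }

        /** In-place update: no allocation, but every holder of the array sees it. */
        void set(int i, double x) {
            values[i] = x;
        }
    }

    /** Hypothetical immutable vector: copies on construction and on every update. */
    static final class ImmutableVector {
        private final double[] values;

        ImmutableVector(double[] values) {
            this.values = Arrays.copyOf(values, values.length);
        }

        double get(int i) {
            return values[i];
        }

        /** Returns a new vector with element i replaced; this instance is unchanged. */
        ImmutableVector updated(int i, double x) {
            ImmutableVector next = new ImmutableVector(values); // constructor copies
            next.values[i] = x; // safe: the copy has not been published yet
            return next;
        }
    }

    public static void main(String[] args) {
        double[] weights = {1.0, 2.0, 3.0};
        ArrayBackedVector shared = new ArrayBackedVector(weights);
        ImmutableVector frozen = new ImmutableVector(weights);

        weights[0] = 10.0; // mutate the underlying array directly

        System.out.println(shared.get(0)); // 10.0, the wrapper sees the change
        System.out.println(frozen.get(0)); // 1.0, the copy is isolated

        ImmutableVector next = frozen.updated(1, 20.0);
        System.out.println(frozen.get(1)); // 2.0, original untouched
        System.out.println(next.get(1));   // 20.0
    }
}
```

The array-backed form avoids an allocation per update, which matters when a vector is adjusted on every iteration, but it needs external synchronization if several threads can reach the same array. The copying form is safe to share at the cost of an O(n) copy per update.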

Re: spark 1.1.0 - hbase 0.98.6-hadoop2 version - py4j.protocol.Py4JJavaError java.lang.ClassNotFoundException

2014-10-04 Thread Nick Pentreath
forgot to copy user list On Sat, Oct 4, 2014 at 3:12 PM, Nick Pentreath nick.pentre...@gmail.com wrote: what version did you put in the pom.xml? it does seem to be in Maven central: http://search.maven.org/#artifactdetails%7Corg.apache.hbase%7Chbase%7C0.98.6-hadoop2%7Cpom dependency

Re: scala Vector vs mllib Vector

2014-10-04 Thread ll
thanks dean. thanks for the answer with great clarity! i'm working on an algorithm that has a weight vector W(w0, w1, .., wN). the elements of this weight vector are adjusted/updated frequently - every iteration of the algorithm. how would you recommend to implement this vector? what is the

Re: Spark Streaming writing to HDFS

2014-10-04 Thread Abraham Jacob
Hi Sean/All, I am importing among various other things the newer mapreduce version - import org.apache.hadoop.conf.Configuration; import org.apache.hadoop.io.IntWritable; import org.apache.hadoop.io.Text; import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat; import

Impala comparisons

2014-10-04 Thread Debasish Das
Hi, We write the output of models and other information as parquet files and later we let data APIs run SQL queries on the columnar data... SparkSQL is used to dump the data in parquet format and now we are considering whether to use SparkSQL or Impala to read it back... I came across this

Re: [ANN] SparkSQL support for Cassandra with Calliope

2014-10-04 Thread Rohit Rai
Hi Tian, We have published a build against Hadoop 2.0 with version *1.1.0-CTP-U2-H2* Let us know how your testing goes. Regards, Rohit *Founder CEO, **Tuplejump, Inc.* www.tuplejump.com *The Data Engineering Platform* On Sat, Oct 4, 2014 at 3:49 AM, tian zhang

Re: Worker with no Executor (YARN client-mode)

2014-10-04 Thread Sandy Ryza
Hey Jon, Since you're running on YARN, the Worker shouldn't be involved. Are you able to go to the YARN ResourceManager web UI and click on nodes in the top left? Does that node show up in the list? If you click on it, what's shown under Total Pmem allocated for Container? It also might be

org/apache/commons/math3/random/RandomGenerator issue

2014-10-04 Thread anny9699
Hi, I use the breeze.stats.distributions.Bernoulli in my code, however met this problem java.lang.NoClassDefFoundError: org/apache/commons/math3/random/RandomGenerator I read the posts about this problem before, and if I added <dependency> <groupId>org.apache.commons</groupId>

Re: org/apache/commons/math3/random/RandomGenerator issue

2014-10-04 Thread Ted Yu
Cycling bits: http://search-hadoop.com/m/JW1q5UX9S1/breeze+spark&subj=Build+error+when+using+spark+with+breeze On Sat, Oct 4, 2014 at 12:59 PM, anny9699 anny9...@gmail.com wrote: Hi, I use the breeze.stats.distributions.Bernoulli in my code, however met this problem

Re: org/apache/commons/math3/random/RandomGenerator issue

2014-10-04 Thread Ted Yu
See the last comment in that thread from Xiangrui: bq. include breeze in the dependency set of your project. Do not rely on transitive dependencies Cheers On Sat, Oct 4, 2014 at 1:48 PM, 陈韵竹 anny9...@gmail.com wrote: Hi Ted, So according to previous posts, the problem should be solved by

Re: org/apache/commons/math3/random/RandomGenerator issue

2014-10-04 Thread anny9699
Hi Ted, I tried including <dependency> <groupId>org.apache.commons</groupId> <artifactId>commons-math3</artifactId> <version>3.3</version> </dependency> in my pom file and adding this jar to my classpath. However this error still appears as Exception in thread "main"

Asynchronous Broadcast from driver to workers, is it possible?

2014-10-04 Thread Peng Cheng
While Spark already offers support for asynchronous reduce (collect data from workers, while not interrupting execution of a parallel transformation) through accumulator, I have made little progress on making this process reciprocal, namely, to broadcast data from driver to workers to be used by

Re: org/apache/commons/math3/random/RandomGenerator issue

2014-10-04 Thread anny9699
Thanks Ted this is working now! Previously I added another commons-math3 jar to my classpath and that one doesn't work. This one included by maven seems to work. Thanks a lot!