Hi All,
I hit performance issues running PCA on a matrix with a larger number of
features (2.5k x 15k):
import org.apache.spark.mllib.linalg.Matrix
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.mllib.linalg.DenseVector
import
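The snippet above is truncated in the archive. For reference, a minimal
sketch of how PCA is typically invoked on a RowMatrix; the random input and
k = 20 are assumptions, not values from the thread. Note that
computePrincipalComponents materializes the full covariance matrix of the
columns locally on the driver, so with ~15k features that is a 15k x 15k
dense matrix (~1.8 GB of doubles), which likely explains the slowdown.

import org.apache.spark.mllib.linalg.{Matrix, Vectors}
import org.apache.spark.mllib.linalg.distributed.RowMatrix

// Hypothetical input: 2.5k rows x 15k features, generated on the executors.
val rows = sc.parallelize(0 until 2500, numSlices = 50)
  .map(_ => Vectors.dense(Array.fill(15000)(scala.util.Random.nextDouble())))
val mat = new RowMatrix(rows)

// k = 20 is arbitrary. The covariance matrix is built and decomposed on the
// driver, which dominates the runtime for wide matrices.
val pc: Matrix = mat.computePrincipalComponents(20)
val projected: RowMatrix = mat.multiply(pc)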
Hi, you can take a look at:
https://github.com/hammerlab/spree
it's a bit outdated, but it may still be possible to use it with a more
recent Spark version.
M.
2016-08-25 11:55 GMT+02:00 Mich Talebzadeh :
> Hi,
>
> This may be already there.
>
> A Spark job opens up a
Hi All - I am trying to use the new SQLContext API for populating a
DataFrame from a JDBC data source.
like this:
val jdbcDF = sqlContext.jdbc(
  url = "jdbc:postgresql://localhost:5430/dbname?user=user&password=111",
  table = "se_staging.exp_table3",
  columnName = "cs_id",
  lowerBound = 1,
  upperBound = 1,
.
Cheers
On Sun, Mar 22, 2015 at 8:44 AM, Marek Wiewiorka
marek.wiewio...@gmail.com wrote:
Hi All - I am trying to use the new SQLContext API for populating a
DataFrame from a JDBC data source.
like this:
val jdbcDF = sqlContext.jdbc(url =
  "jdbc:postgresql://localhost:5430/dbname?user=user&password=111"
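For reference, a hedged reconstruction of a complete call to the Spark
1.3-era SQLContext.jdbc; the upper bound and partition count below are
placeholders, since the original values are truncated in the archive:

import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)
// columnName must be a numeric column; the scan is split into numPartitions
// ranges between lowerBound and upperBound.
val jdbcDF = sqlContext.jdbc(
  "jdbc:postgresql://localhost:5430/dbname?user=user&password=111",
  "se_staging.exp_table3",
  "cs_id",  // partitioning column
  1L,       // lowerBound
  1000000L, // upperBound (placeholder)
  8)        // numPartitions (placeholder)

The PostgreSQL JDBC driver jar also has to be on the classpath of both the
driver and the executors for this to work.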
Hi,
here you can find some info regarding 1.0:
https://github.com/amplab/docker-scripts
Marek
2014-10-24 23:38 GMT+02:00 Josh J joshjd...@gmail.com:
Hi,
Are there Dockerfiles available that allow setting up a Docker Spark 1.1.0
cluster?
Thanks,
Josh
This is actually what I've already mentioned - with rainbow tables kept in
memory it could be really fast!
Marek
2014-06-12 9:25 GMT+02:00 Michael Cutler mich...@tumra.com:
Hi Nick,
The great thing about any *unsalted* hashes is that you can precompute them
ahead of time; then it is just a lookup
What about rainbow tables?
http://en.wikipedia.org/wiki/Rainbow_table
M.
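To make the precompute-then-look-up idea concrete, a minimal Spark sketch;
the file names are hypothetical and MD5 is just an example of an unsalted
hash. A real rainbow table additionally compresses the table with hash
chains and reduction functions, which this sketch does not do.

import java.security.MessageDigest

def md5(s: String): String =
  MessageDigest.getInstance("MD5").digest(s.getBytes("UTF-8"))
    .map("%02x".format(_)).mkString

// Precompute (hash, plaintext) pairs for a candidate dictionary and keep
// them cached in memory; cracking then degenerates to a join/lookup.
val table = sc.textFile("dictionary.txt").map(w => (md5(w), w)).cache()
val targets = sc.textFile("hashes.txt").map(h => (h, ()))
val cracked = targets.join(table).map { case (hash, (_, plain)) => (hash, plain) }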
2014-06-12 2:41 GMT+02:00 DB Tsai dbt...@stanford.edu:
I think creating the samples in the search space within an RDD will be
too expensive, and the amount of data will probably be larger than any
cluster.
However, you
I was also thinking of using Tachyon to store Parquet files - maybe
tomorrow I will give it a try as well.
2014-06-07 20:01 GMT+02:00 Michael Armbrust mich...@databricks.com:
Not a stupid question! I would like to be able to do this. For now, you
might try writing the data to Tachyon
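For what it's worth, a hedged sketch of that suggestion against the Spark
1.0-era SQL API; the Tachyon master address and paths are made up:

import org.apache.spark.sql.SQLContext

val sqlContext = new SQLContext(sc)
import sqlContext.createSchemaRDD // implicit RDD -> SchemaRDD conversion

case class Record(id: Int, value: String)
val records = sc.parallelize(1 to 100).map(i => Record(i, "v" + i))

// Write Parquet to a Tachyon URI and read it back from there.
records.saveAsParquetFile("tachyon://tachyon-master:19998/tmp/records.parquet")
val back = sqlContext.parquetFile("tachyon://tachyon-master:19998/tmp/records.parquet")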
Exactly the same story - it used to work with 0.9.1 and does not work
anymore with 1.0.0.
I ran tests using spark-shell as well as my application (so I tested turning
on coarse mode via an env variable and via SparkContext properties explicitly).
M.
2014-06-04 18:12 GMT+02:00 ajatix
Hi All,
the Spark 1.0.0 documentation states that there is an option --cores
that one can use to set the number of cores
that spark-shell uses on the cluster:
You can also pass an option --cores numCores to control the number of
cores that spark-shell uses on the cluster.
This
That used to work with version 0.9.1 and earlier but does not seem to work
with 1.0.0.
M.
2014-06-03 17:53 GMT+02:00 Mikhail Strebkov streb...@gmail.com:
Try -c numCores instead; it works for me, e.g.
bin/spark-shell -c 88
On Tue, Jun 3, 2014 at 8:15 AM, Marek Wiewiorka marek.wiewio
Hi All,
I'm trying to run my code that used to work with mesos-0.14 and spark-0.9.0
with mesos-0.18.2 and spark-1.0.0, and I'm getting a weird error when I use
coarse mode (see below).
If I use the fine-grained mode everything is OK.
Has any of you experienced a similar error?
more stderr
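For context, the way coarse-grained mode was typically switched on around
Spark 1.0 (the Mesos master URL and app name below are placeholders):

import org.apache.spark.{SparkConf, SparkContext}

// spark.mesos.coarse=true asks Mesos for a small number of long-running
// tasks instead of launching one Mesos task per Spark task (fine-grained).
val conf = new SparkConf()
  .setMaster("mesos://zk://zk1:2181/mesos")
  .setAppName("coarse-mode-test")
  .set("spark.mesos.coarse", "true")
val sc = new SparkContext(conf)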
Hi All,
I've been experiencing a very strange error after upgrading from Spark 0.9 to
1.0 - it seems that the saveAsTextFile function is throwing a
java.lang.UnsupportedOperationException that I have never seen before.
Any hints appreciated.
scheduler.TaskSetManager: Loss was due to
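For reference, the failing call is just the plain RDD text save; a minimal
form of it, with a placeholder output path:

// The exception in the thread surfaced during the write stage after the
// 0.9 -> 1.0 upgrade; the saveAsTextFile API itself did not change.
val rdd = sc.parallelize(1 to 1000)
rdd.saveAsTextFile("hdfs://namenode:8020/tmp/test-output")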
at 8:46 PM, Marek Wiewiorka marek.wiewio...@gmail.com
wrote:
Hi All,
I've been experiencing a very strange error after upgrading from Spark 0.9
to 1.0 - it seems that the saveAsTextFile function is throwing a
java.lang.UnsupportedOperationException that I have never seen before.
Any hints appreciated.