PCA slow in comparison with single-threaded R version

2017-02-06 Thread Marek Wiewiorka
Hi All, I hit performance issues when running PCA on a matrix with a larger number of features (2.5k x 15k): import org.apache.spark.mllib.linalg.Matrix import org.apache.spark.mllib.linalg.distributed.RowMatrix import org.apache.spark.mllib.linalg.DenseVector import
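A minimal sketch of the computation in question, assuming a spark-shell session (sc in scope) and the standard MLlib RowMatrix API; the dimensions are scaled down from the 2.5k x 15k case and the data is random:

    import org.apache.spark.mllib.linalg.{Matrix, Vectors}
    import org.apache.spark.mllib.linalg.distributed.RowMatrix

    // Random tall matrix standing in for the real feature matrix.
    val numRows = 2500
    val numCols = 1500 // the original post has ~15k columns
    val rows = sc.parallelize(1 to numRows)
      .map(_ => Vectors.dense(Array.fill(numCols)(math.random)))
    val mat = new RowMatrix(rows)

    // computePrincipalComponents forms a numCols x numCols Gramian and
    // then runs a *local*, single-threaded eigendecomposition on the
    // driver. With ~15k columns that local step dominates, which is
    // consistent with Spark not beating a single-threaded R run here.
    val pc: Matrix = mat.computePrincipalComponents(10)

The local driver-side decomposition is the usual suspect: only the Gramian computation is distributed, so wide matrices gain little from the cluster.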

Re: Is there anyway Spark UI is set to poll and refreshes itself

2016-08-25 Thread Marek Wiewiorka
Hi, you can take a look at https://github.com/hammerlab/spree - it's a bit outdated, but it may still be possible to use it with a more recent Spark version. M. 2016-08-25 11:55 GMT+02:00 Mich Talebzadeh : > Hi, > > This may already be there. > > A Spark job opens up a

lowerupperBound not working/spark 1.3

2015-03-22 Thread Marek Wiewiorka
Hi All - I'm trying to use the new SQLContext API to populate a DataFrame from a JDBC data source, like this: val jdbcDF = sqlContext.jdbc(url = "jdbc:postgresql://localhost:5430/dbname?user=user&password=111", table = "se_staging.exp_table3", columnName = "cs_id", lowerBound = 1, upperBound = 1,
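For reference, a cleaned-up sketch of that call, assuming the Spark 1.3 SQLContext.jdbc(url, table, columnName, lowerBound, upperBound, numPartitions) signature; the credentials and bounds are placeholders, since the original values were truncated:

    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)

    // Note: lowerBound/upperBound do NOT filter rows. They only define
    // the range that Spark splits into numPartitions WHERE clauses on
    // columnName; every row of the table is still read.
    val jdbcDF = sqlContext.jdbc(
      "jdbc:postgresql://localhost:5430/dbname?user=user&password=111",
      "se_staging.exp_table3",
      "cs_id",   // partitioning column
      1L,        // lowerBound
      10000L,    // upperBound (placeholder)
      10)        // numPartitions

The usual resolution of this confusion is exactly the comment above: the bounds control partition stride, not the result set.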

Re: lowerupperBound not working/spark 1.3

2015-03-22 Thread Marek Wiewiorka
Cheers On Sun, Mar 22, 2015 at 8:44 AM, Marek Wiewiorka marek.wiewio...@gmail.com wrote: Hi All - I'm trying to use the new SQLContext API to populate a DataFrame from a JDBC data source, like this: val jdbcDF = sqlContext.jdbc(url = "jdbc:postgresql://localhost:5430/dbname?user=user&password=111"

Re: docker spark 1.1.0 cluster

2014-10-24 Thread Marek Wiewiorka
Hi, here you can find some info regarding 1.0: https://github.com/amplab/docker-scripts Marek 2014-10-24 23:38 GMT+02:00 Josh J joshjd...@gmail.com: Hi, Are there Dockerfiles available that allow setting up a Docker Spark 1.1.0 cluster? Thanks, Josh

Re: Using Spark to crack passwords

2014-06-12 Thread Marek Wiewiorka
This is actually what I've already mentioned - with rainbow tables kept in memory it could be really fast! Marek 2014-06-12 9:25 GMT+02:00 Michael Cutler mich...@tumra.com: Hi Nick, The great thing about any *unsalted* hashes is that you can precompute them ahead of time; then it is just a lookup
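A toy sketch of the precomputed-lookup idea the thread describes (a plain in-memory hash table rather than a true rainbow table, which compresses the table into hash chains); the wordlist path is hypothetical:

    import java.security.MessageDigest

    def md5hex(s: String): String =
      MessageDigest.getInstance("MD5")
        .digest(s.getBytes("UTF-8"))
        .map("%02x".format(_)).mkString

    // Precompute hash -> candidate once and keep it cached; each unsalted
    // target hash then costs only a scan/lookup, not a recomputation.
    val candidates = sc.textFile("wordlist.txt") // hypothetical path
    val table = candidates.map(w => (md5hex(w), w)).cache()

    val targets = Set("5f4dcc3b5aa765d61d8327deb882cf99") // md5("password")
    val cracked = table.filter { case (h, _) => targets.contains(h) }.collect()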

Re: Using Spark to crack passwords

2014-06-11 Thread Marek Wiewiorka
What about rainbow tables? http://en.wikipedia.org/wiki/Rainbow_table M. 2014-06-12 2:41 GMT+02:00 DB Tsai dbt...@stanford.edu: I think creating the samples in the search space within an RDD will be too expensive, and the amount of data will probably be larger than any cluster. However, you

Re: cache spark sql parquet file in memory?

2014-06-07 Thread Marek Wiewiorka
I was also thinking of using Tachyon to store Parquet files - maybe tomorrow I will give it a try as well. 2014-06-07 20:01 GMT+02:00 Michael Armbrust mich...@databricks.com: Not a stupid question! I would like to be able to do this. For now, you might try writing the data to Tachyon
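A sketch of what that experiment might look like with the Spark 1.0-era SchemaRDD API, assuming a local Tachyon master on the default port; path and schema are illustrative:

    import org.apache.spark.sql.SQLContext

    val sqlContext = new SQLContext(sc)
    import sqlContext.createSchemaRDD

    case class Record(id: Int, value: String)
    val data = sc.parallelize(1 to 100).map(i => Record(i, "val_" + i))

    // Write Parquet into Tachyon instead of HDFS; subsequent reads are
    // then served from Tachyon's in-memory tier.
    data.saveAsParquetFile("tachyon://localhost:19998/records.parquet")
    val cached = sqlContext.parquetFile("tachyon://localhost:19998/records.parquet")
    cached.registerAsTable("records")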

Re: Spark 1.0.0 fails if mesos.coarse set to true

2014-06-04 Thread Marek Wiewiorka
Exactly the same story - it used to work with 0.9.1 and does not work anymore with 1.0.0. I ran tests using spark-shell as well as my application (so I tested turning on coarse mode both via an env variable and explicitly via SparkContext properties). M. 2014-06-04 18:12 GMT+02:00 ajatix
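For context, a minimal sketch of the explicit SparkContext route mentioned above, using the spark.mesos.coarse property; the Mesos master URL is illustrative:

    import org.apache.spark.{SparkConf, SparkContext}

    // Coarse-grained mode: one long-running Mesos task per node instead
    // of one Mesos task per Spark task. The same flag can also come from
    // the environment / defaults rather than being set in code.
    val conf = new SparkConf()
      .setMaster("mesos://mesos-master:5050")
      .setAppName("coarse-mode-test")
      .set("spark.mesos.coarse", "true")
    val sc = new SparkContext(conf)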

--cores option in spark-shell

2014-06-03 Thread Marek Wiewiorka
Hi All, the Spark 1.0.0 documentation says there is an option --cores that one can use to set the number of cores that spark-shell uses on the cluster: You can also pass an option --cores numCores to control the number of cores that spark-shell uses on the cluster. This

Re: --cores option in spark-shell

2014-06-03 Thread Marek Wiewiorka
That used to work with version 0.9.1 and earlier, and does not seem to work with 1.0.0. M. 2014-06-03 17:53 GMT+02:00 Mikhail Strebkov streb...@gmail.com: Try -c numCores instead; it works for me, e.g. bin/spark-shell -c 88 On Tue, Jun 3, 2014 at 8:15 AM, Marek Wiewiorka marek.wiewio
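If neither flag spelling works, the same cap can be set programmatically; a sketch assuming the spark.cores.max property honored by the standalone and Mesos modes, with the core count borrowed from the reply above:

    import org.apache.spark.{SparkConf, SparkContext}

    // spark.cores.max limits the total cores the application claims on
    // the cluster -- the same knob the --cores / -c shell options expose.
    val conf = new SparkConf()
      .setAppName("shell-with-capped-cores")
      .set("spark.cores.max", "88")
    val sc = new SparkContext(conf)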

Spark 1.0.0 fails if mesos.coarse set to true

2014-06-03 Thread Marek Wiewiorka
Hi All, I'm trying to run code that used to work with mesos-0.14 and spark-0.9.0 against mesos-0.18.2 and spark-1.0.0, and I'm getting a weird error when I use coarse mode (see below). If I use fine-grained mode, everything is OK. Has any of you experienced a similar error? more stderr

Strange problem with saveAsTextFile after upgrade Spark 0.9.0-1.0.0

2014-06-03 Thread Marek Wiewiorka
Hi All, I've been experiencing a very strange error after upgrading from Spark 0.9 to 1.0 - it seems that the saveAsTextFile function is throwing a java.lang.UnsupportedOperationException that I have never seen before. Any hints appreciated. scheduler.TaskSetManager: Loss was due to
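For readers trying to reproduce, the failing pattern is just a transformation followed by a text write; the paths here are hypothetical:

    // Hypothetical minimal reproduction -- any RDD written out as text.
    val data = sc.textFile("hdfs:///input/sample.txt")   // hypothetical path
    val upper = data.map(_.toUpperCase)
    upper.saveAsTextFile("hdfs:///output/sample_upper")  // the call reported to throw
                                                         // java.lang.UnsupportedOperationException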

Re: Strange problem with saveAsTextFile after upgrade Spark 0.9.0-1.0.0

2014-06-03 Thread Marek Wiewiorka
at 8:46 PM, Marek Wiewiorka marek.wiewio...@gmail.com wrote: Hi All, I've been experiencing a very strange error after upgrading from Spark 0.9 to 1.0 - it seems that the saveAsTextFile function is throwing a java.lang.UnsupportedOperationException that I have never seen before. Any hints appreciated