[Phpmyadmin-git] [phpmyadmin/localized_docs] 4c1de7: Translated using Weblate (Turkish)

2015-02-17 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: 4c1de76c08153d1db8f3132a2c994a27e89a4701 https://github.com/phpmyadmin/localized_docs/commit/4c1de76c08153d1db8f3132a2c994a27e89a4701 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-02

[jira] [Commented] (SPARK-5811) Documentation for --packages and --repositories on Spark Shell

2015-02-17 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14323795#comment-14323795 ] Burak Yavuz commented on SPARK-5811: The documentation is not really blocked, but I

[jira] [Created] (SPARK-5857) pyspark PYTHONPATH not properly set up?

2015-02-16 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-5857: -- Summary: pyspark PYTHONPATH not properly set up? Key: SPARK-5857 URL: https://issues.apache.org/jira/browse/SPARK-5857 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-5810) Maven Coordinate Inclusion failing in pySpark

2015-02-16 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14323474#comment-14323474 ] Burak Yavuz commented on SPARK-5810: Makes sense to add a regression test. I'll add

[jira] [Created] (SPARK-5810) Maven Coordinate Inclusion failing in pySpark

2015-02-13 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-5810: -- Summary: Maven Coordinate Inclusion failing in pySpark Key: SPARK-5810 URL: https://issues.apache.org/jira/browse/SPARK-5810 Project: Spark Issue Type: Bug

[jira] [Created] (SPARK-5811) Documentation for --packages and --repositories on Spark Shell

2015-02-13 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-5811: -- Summary: Documentation for --packages and --repositories on Spark Shell Key: SPARK-5811 URL: https://issues.apache.org/jira/browse/SPARK-5811 Project: Spark

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 633212: Translated using Weblate (Turkish)

2015-02-12 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 63321222cd0e528555a6353d2d4e937216ef391c https://github.com/phpmyadmin/phpmyadmin/commit/63321222cd0e528555a6353d2d4e937216ef391c Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-02-12 (Thu

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 96b710: Translated using Weblate (Turkish)

2015-02-12 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 96b710be3d018132ab8c2cf9501ccb31d6ad2e68 https://github.com/phpmyadmin/phpmyadmin/commit/96b710be3d018132ab8c2cf9501ccb31d6ad2e68 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-02-12 (Thu

Re: generate a random matrix with uniform distribution

2015-02-09 Thread Burak Yavuz
wrote: Thanks a lot! Can I ask why this code generates a uniform distribution? If dist is N(0,1) data should be N(-1, 2). Let me know. Thanks, Luca 2015-02-07 3:00 GMT+00:00 Burak Yavuz brk...@gmail.com: Hi, You can do the following: ``` import

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 09539d: Translated using Weblate (Turkish)

2015-02-09 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 09539d41e6ce4eca62ff02b0ecd47bcbfe3c2fee https://github.com/phpmyadmin/phpmyadmin/commit/09539d41e6ce4eca62ff02b0ecd47bcbfe3c2fee Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-02-09 (Mon

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] d53e06: Translated using Weblate (Turkish)

2015-02-08 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: d53e064ee7a01f8768e7453d0eef73b2921b44be https://github.com/phpmyadmin/phpmyadmin/commit/d53e064ee7a01f8768e7453d0eef73b2921b44be Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-02-08 (Sun

Re: matrix of random variables with spark.

2015-02-06 Thread Burak Yavuz
Forgot to add the more recent training material: https://databricks-training.s3.amazonaws.com/index.html On Fri, Feb 6, 2015 at 12:12 PM, Burak Yavuz brk...@gmail.com wrote: Hi Luca, You can tackle this using RowMatrix (spark-shell example): ``` import

Re: matrix of random variables with spark.

2015-02-06 Thread Burak Yavuz
Hi Luca, You can tackle this using RowMatrix (spark-shell example): ``` import org.apache.spark.mllib.linalg.distributed.RowMatrix import org.apache.spark.mllib.random._ // sc is the spark context, numPartitions is the number of partitions you want the RDD to be in val data: RDD[Vector] =

Re: generate a random matrix with uniform distribution

2015-02-06 Thread Burak Yavuz
Hi, You can do the following: ``` import org.apache.spark.mllib.linalg.distributed.RowMatrix import org.apache.spark.mllib.random._ // sc is the spark context, numPartitions is the number of partitions you want the RDD to be in val dist: RDD[Vector] = RandomRDDs.normalVectorRDD(sc, n, k,

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 13d1c0: Translated using Weblate (Turkish)

2015-02-04 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 13d1c0dacda739d0c6af60097be3788f01ca2964 https://github.com/phpmyadmin/phpmyadmin/commit/13d1c0dacda739d0c6af60097be3788f01ca2964 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-02-04 (Wed

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] b4d7a5: Translated using Weblate (Turkish)

2015-01-28 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: b4d7a519fa2825bf91611d98c3112679b1b5cba9 https://github.com/phpmyadmin/phpmyadmin/commit/b4d7a519fa2825bf91611d98c3112679b1b5cba9 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-01-28 (Wed

[jira] [Created] (SPARK-5341) Support maven coordinates in spark-shell and spark-submit

2015-01-20 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-5341: -- Summary: Support maven coordinates in spark-shell and spark-submit Key: SPARK-5341 URL: https://issues.apache.org/jira/browse/SPARK-5341 Project: Spark Issue

[jira] [Created] (SPARK-5322) Add transpose() to BlockMatrix

2015-01-19 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-5322: -- Summary: Add transpose() to BlockMatrix Key: SPARK-5322 URL: https://issues.apache.org/jira/browse/SPARK-5322 Project: Spark Issue Type: New Feature

[jira] [Created] (SPARK-5321) Add transpose() method to Matrix

2015-01-19 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-5321: -- Summary: Add transpose() method to Matrix Key: SPARK-5321 URL: https://issues.apache.org/jira/browse/SPARK-5321 Project: Spark Issue Type: New Feature

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 22e00e: Translated using Weblate (Turkish)

2015-01-09 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 22e00e6a3578de1aede0ce06ef9e327c4bbe3f28 https://github.com/phpmyadmin/phpmyadmin/commit/22e00e6a3578de1aede0ce06ef9e327c4bbe3f28 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-01-09 (Fri

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 6f8431: Translated using Weblate (Turkish)

2015-01-08 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 6f8431a71d935b9710d8f5148b3941f21408052d https://github.com/phpmyadmin/phpmyadmin/commit/6f8431a71d935b9710d8f5148b3941f21408052d Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-01-08 (Thu

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 2eddd0: Translated using Weblate (Turkish)

2015-01-01 Thread Burak Yavuz
Branch: refs/heads/QA_4_3 Home: https://github.com/phpmyadmin/phpmyadmin Commit: 2eddd0dc06e3f5ce3899fd2436b6b5541fcbcbfc https://github.com/phpmyadmin/phpmyadmin/commit/2eddd0dc06e3f5ce3899fd2436b6b5541fcbcbfc Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-01-01 (Thu

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 530c04: Translated using Weblate (Turkish)

2015-01-01 Thread Burak Yavuz
Branch: refs/heads/QA_4_3 Home: https://github.com/phpmyadmin/phpmyadmin Commit: 530c04d14a9de6ba9b287b2a98306a09d04ee055 https://github.com/phpmyadmin/phpmyadmin/commit/530c04d14a9de6ba9b287b2a98306a09d04ee055 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-01-01 (Thu

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] f492a2: Translated using Weblate (Turkish)

2015-01-01 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: f492a2197d598a1618836719a47beaf16874ecfd https://github.com/phpmyadmin/phpmyadmin/commit/f492a2197d598a1618836719a47beaf16874ecfd Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-01-01 (Thu

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] d26bff: Translated using Weblate (Turkish)

2015-01-01 Thread Burak Yavuz
Branch: refs/heads/QA_4_3 Home: https://github.com/phpmyadmin/phpmyadmin Commit: d26bffd0ae44354c4f47e6852368c48166e1ab1f https://github.com/phpmyadmin/phpmyadmin/commit/d26bffd0ae44354c4f47e6852368c48166e1ab1f Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-01-01 (Thu

Re: null Error in ALS model predict

2014-12-24 Thread Burak Yavuz
Hi, The MatrixFactorizationModel consists of two RDDs. When you use the second method, Spark tries to serialize both RDDs for the .map() function, which is not possible, because RDDs are not serializable. Therefore you receive the NullPointerException. You must use the first method. Best,

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 0e0eda: Translated using Weblate (Turkish)

2014-12-15 Thread Burak Yavuz
Branch: refs/heads/QA_4_3 Home: https://github.com/phpmyadmin/phpmyadmin Commit: 0e0eda5ff1f54eb07b26e9c46db734ff1eee966c https://github.com/phpmyadmin/phpmyadmin/commit/0e0eda5ff1f54eb07b26e9c46db734ff1eee966c Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-12-16 (Tue

[Phpmyadmin-git] [phpmyadmin/localized_docs] 3e6f0e: Translated using Weblate (Turkish)

2014-12-09 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: 3e6f0edfc6e9be3c8cd45c4cb82b8d39afe8c9e6 https://github.com/phpmyadmin/localized_docs/commit/3e6f0edfc6e9be3c8cd45c4cb82b8d39afe8c9e6 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-12

Re: How can I make Spark Streaming count the words in a file in a unit test?

2014-12-08 Thread Burak Yavuz
Hi, https://github.com/databricks/spark-perf/tree/master/streaming-tests/src/main/scala/streaming/perf contains some performance tests for streaming. There are examples of how to generate synthetic files during the test in that repo, maybe you can find some code snippets that you can use there.

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] eed0ff: Translated using Weblate (Turkish)

2014-11-26 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: eed0ffa96b6ee739036175912c32fca25985bead https://github.com/phpmyadmin/phpmyadmin/commit/eed0ffa96b6ee739036175912c32fca25985bead Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-11-26 (Wed

[jira] [Created] (SPARK-4409) Additional (but limited) Linear Algebra Utils

2014-11-14 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-4409: -- Summary: Additional (but limited) Linear Algebra Utils Key: SPARK-4409 URL: https://issues.apache.org/jira/browse/SPARK-4409 Project: Spark Issue Type

[jira] [Updated] (SPARK-4409) Additional (but limited) Linear Algebra Utils

2014-11-14 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-4409: --- Description: This ticket is to discuss the addition of a very limited number of local matrix

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 346b62: Translated using Weblate (Turkish)

2014-11-04 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 346b62740ab25f5d325f4aa74aeadd8aad7236c4 https://github.com/phpmyadmin/phpmyadmin/commit/346b62740ab25f5d325f4aa74aeadd8aad7236c4 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-11-04 (Tue

[jira] [Commented] (SPARK-3974) Block matrix abstractions and partitioners

2014-10-31 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192731#comment-14192731 ] Burak Yavuz commented on SPARK-3974: Hi everyone, The design doc for Block Matrix

Re: MLLib ALS ArrayIndexOutOfBoundsException with Scala Spark 1.1.0

2014-10-27 Thread Burak Yavuz
Hi, I've come across this multiple times, but not in a consistent manner. I found it hard to reproduce. I have a jira for it: SPARK-3080 Do you observe this error every single time? Where do you load your data from? Which version of Spark are you running? Figuring out the similarities may

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 7180bb: Translated using Weblate (Turkish)

2014-10-16 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 7180bb0f150e81dc6ceb0ff1e582bd85fdb69306 https://github.com/phpmyadmin/phpmyadmin/commit/7180bb0f150e81dc6ceb0ff1e582bd85fdb69306 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-10-16 (Thu

Re: Spark KMeans hangs at reduceByKey / collectAsMap

2014-10-14 Thread Burak Yavuz
Hi Ray, The reduceByKey / collectAsMap does a lot of calculations. Therefore it can take a very long time if: 1) The parameter number of runs is set very high 2) k is set high (you have observed this already) 3) data is not properly repartitioned It seems that it is hanging, but there is a lot

[Phpmyadmin-git] [phpmyadmin/localized_docs] ac28de: Translated using Weblate (Turkish)

2014-10-12 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: ac28dedf064d6f2064afb17d3311a929edd95dad https://github.com/phpmyadmin/localized_docs/commit/ac28dedf064d6f2064afb17d3311a929edd95dad Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-10

[jira] [Commented] (SPARK-3434) Distributed block matrix

2014-10-10 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167152#comment-14167152 ] Burak Yavuz commented on SPARK-3434: [~ConcreteVitamin], any updates? Anything I can

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 2079e9: Translated using Weblate (Turkish)

2014-10-06 Thread Burak Yavuz
Branch: refs/heads/QA_4_2 Home: https://github.com/phpmyadmin/phpmyadmin Commit: 2079e9cd9abf4d76e50494ce4bf8f7c1d4999164 https://github.com/phpmyadmin/phpmyadmin/commit/2079e9cd9abf4d76e50494ce4bf8f7c1d4999164 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-10-06 (Mon

[Phpmyadmin-git] [phpmyadmin/localized_docs] 38df14: Translated using Weblate (Turkish)

2014-10-02 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: 38df143ca748c7a5236c70cb0c715ea948195184 https://github.com/phpmyadmin/localized_docs/commit/38df143ca748c7a5236c70cb0c715ea948195184 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-10

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 5288df: Translated using Weblate (Turkish)

2014-10-02 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 5288df43097df61237fe4d9320a56b0886ed11db https://github.com/phpmyadmin/phpmyadmin/commit/5288df43097df61237fe4d9320a56b0886ed11db Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-10-02 (Thu

[Phpmyadmin-git] [phpmyadmin/localized_docs] 1169c4: Translated using Weblate (Turkish)

2014-10-02 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: 1169c49661f124d4d617d1316d62404d598d30bf https://github.com/phpmyadmin/localized_docs/commit/1169c49661f124d4d617d1316d62404d598d30bf Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-10

Re: MLlib Linear Regression Mismatch

2014-10-01 Thread Burak Yavuz
Hi, It appears that the step size is so high that the model is diverging with the added noise. Could you try setting the step size to 0.1 or 0.01? Best, Burak - Original Message - From: Krishna Sankar ksanka...@gmail.com To: user@spark.apache.org Sent: Wednesday, October 1,
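
The effect of the step size can be seen on a toy problem. The sketch below is not MLlib code: it runs plain gradient descent on the hypothetical one-dimensional loss f(w) = 0.5 * a * (w - target)^2, where `a`, `target`, and the function name `descend` are all assumptions made for illustration. Each update multiplies the error by (1 - stepSize * a), so the iteration only converges when that factor has magnitude below 1.

```scala
object StepSizeDemo {
  // Gradient descent on the toy loss f(w) = 0.5 * a * (w - target)^2,
  // whose gradient is a * (w - target). All parameters are illustrative.
  def descend(stepSize: Double, iterations: Int, a: Double = 4.0, target: Double = 3.0): Double = {
    var w = 0.0
    for (_ <- 1 to iterations) {
      w -= stepSize * a * (w - target)
    }
    w
  }

  def main(args: Array[String]): Unit = {
    // With a = 4 the error factor is 1 - 4 * stepSize per iteration.
    println(descend(stepSize = 1.0, iterations = 20))  // factor -3: the error grows
    println(descend(stepSize = 0.1, iterations = 50))  // factor 0.6: settles near the target
    println(descend(stepSize = 0.01, iterations = 50)) // factor 0.96: converges, but slowly
  }
}
```

On real data the curvature is not known in advance, which is why trying a few smaller step sizes, as suggested in the message, is a reasonable first check.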

[Phpmyadmin-git] [phpmyadmin/localized_docs] 1c004d: Translated using Weblate (Turkish)

2014-09-22 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: 1c004d7e341e8e0d4b5c17dcdc64181220725193 https://github.com/phpmyadmin/localized_docs/commit/1c004d7e341e8e0d4b5c17dcdc64181220725193 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-09

[jira] [Commented] (SPARK-3631) Add docs for checkpoint usage

2014-09-22 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14143484#comment-14143484 ] Burak Yavuz commented on SPARK-3631: Thanks for setting this up [~aash]! [~pwendell

Re: Python version of kmeans

2014-09-18 Thread Burak Yavuz
Hi, spark-1.0.1/examples/src/main/python/kmeans.py = Naive example for users to understand how to code in Spark spark-1.0.1/python/pyspark/mllib/clustering.py = Use this!!! Bonus: spark-1.0.1/examples/src/main/python/mllib/kmeans.py = Example on how to call KMeans. Feel free to use it as a

Re: Odd error when using a rdd map within a stream map

2014-09-18 Thread Burak Yavuz
Hi, I believe it's because you're trying to use a Function of an RDD, in an RDD, which is not possible. Instead of using a `FunctionJavaRDDFloat`, could you try FunctionFloat, and `public Void call(Float arg0) throws Exception { ` and `System.out.println(arg0)` instead. I'm not perfectly sure

Re: Spark on EC2

2014-09-18 Thread Burak Yavuz
Hi Gilberto, Could you please attach the driver logs as well, so that we can pinpoint what's going wrong? Could you also add the flag `--driver-memory 4g` while submitting your application and try that as well? Best, Burak - Original Message - From: Gilberto Lira g...@scanboo.com.br

[Phpmyadmin-git] [phpmyadmin/localized_docs] a01814: Translated using Weblate (Turkish)

2014-09-17 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: a01814147d950fa4fa4a4a9006a7c5690a9701b6 https://github.com/phpmyadmin/localized_docs/commit/a01814147d950fa4fa4a4a9006a7c5690a9701b6 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-09

Re: Spark and disk usage.

2014-09-17 Thread Burak Yavuz
Hi, The files you mentioned are temporary files written by Spark during shuffling. ALS will write a LOT of those files as it is a shuffle-heavy algorithm. Those files are deleted only after your program completes, because Spark looks for them in case a fault occurs. Having those files ready

Re: Spark and disk usage.

2014-09-17 Thread Burak Yavuz
the directory will not be enough. Best, Burak - Original Message - From: Andrew Ash and...@andrewash.com To: Burak Yavuz bya...@stanford.edu Cc: Макар Красноперов connector@gmail.com, user user@spark.apache.org Sent: Wednesday, September 17, 2014 10:19:42 AM Subject: Re: Spark and disk usage. Hi

Re: Spark and disk usage.

2014-09-17 Thread Burak Yavuz
Streaming, and some MLlib algorithms. If you can help with the guide, I think it would be a nice feature to have! Burak - Original Message - From: Andrew Ash and...@andrewash.com To: Burak Yavuz bya...@stanford.edu Cc: Макар Красноперов connector@gmail.com, user user@spark.apache.org

Re: Size exceeds Integer.MAX_VALUE in BlockFetcherIterator

2014-09-17 Thread Burak Yavuz
Hi, Could you try repartitioning the data by .repartition(# of cores on machine) or while reading the data, supply the number of minimum partitions as in sc.textFile(path, # of cores on machine). It may be that the whole data is stored in one block? If it is billions of rows, then the indexing

Re: MLLib: LIBSVM issue

2014-09-17 Thread Burak Yavuz
Hi, The spacing between the inputs should be a single space, not a tab. I feel like your inputs have tabs between them instead of a single space. Therefore the parser cannot parse the input. Best, Burak - Original Message - From: Sameer Tilak ssti...@live.com To: user@spark.apache.org
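
A minimal sketch of why tabs break parsing, assuming (as the message states) that fields are split on a single space. This is a hypothetical stand-alone parser written for illustration, not the MLlib loader itself; the object and method names are assumptions.

```scala
object LibSvmLine {
  // Parses one LIBSVM-style line: "<label> <index>:<value> <index>:<value> ..."
  // Fields are split on a single space only, which is the assumption being illustrated.
  def parse(line: String): (Double, List[(Int, Double)]) = {
    val items = line.trim.split(' ')
    val label = items.head.toDouble
    val features = items.tail.filter(_.nonEmpty).map { item =>
      val Array(index, value) = item.split(':')
      (index.toInt, value.toDouble)
    }
    (label, features.toList)
  }

  def main(args: Array[String]): Unit = {
    // Space-separated input parses into a label and (index, value) pairs.
    println(parse("1 1:0.5 3:2.0"))
    // Tab-separated input is never split, so the whole line reaches toDouble and fails.
    println(scala.util.Try(parse("1\t1:0.5\t3:2.0")).isFailure)
  }
}
```

If the input file does contain tabs, replacing each tab with a single space before loading is one way to test this diagnosis.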

Re: [mllib] State of Multi-Model training

2014-09-16 Thread Burak Yavuz
Hi Kyle, I'm actively working on it now. It's pretty close to completion, I'm just trying to figure out bottlenecks and optimize as much as possible. As Phase 1, I implemented multi model training on Gradient Descent. Instead of performing Vector-Vector operations on rows (examples) and

Re: Spark SQL

2014-09-14 Thread Burak Yavuz
Hi, I'm not a master on SparkSQL, but from what I understand, the problem is that you're trying to access an RDD inside an RDD here: val xyz = file.map(line => *** extractCurRate(sqlContext.sql(select rate ... *** and here: xyz = file.map(line => *** extractCurRate(sqlContext.sql(select rate

Re: Filter function problem

2014-09-09 Thread Burak Yavuz
Hi, val test = persons.value .map{tuple => (tuple._1, tuple._2 .filter{event => *inactiveIDs.filter(event2 => event2._1 == tuple._1).count() != 0})} Your problem is right between the asterisks. You can't make an RDD operation inside an RDD operation, because RDDs can't be serialized.
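
One common way around a nested RDD operation is to collect the smaller RDD on the driver and broadcast it. The sketch below assumes the set of inactive IDs is small enough to fit in driver memory. The names `events`, `inactiveIds`, and `inactiveEvents`, and the sample data, are hypothetical and only loosely modelled on the snippet above. It needs Spark on the classpath and runs in local mode.

```scala
import org.apache.spark.{SparkConf, SparkContext}

object NestedRddWorkaround {
  // Returns the events whose ID appears in the (small) inactive-ID RDD.
  def inactiveEvents(sc: SparkContext): Array[(Int, String)] = {
    val events = sc.parallelize(Seq((1, "login"), (2, "click"), (3, "logout")))
    val inactiveIds = sc.parallelize(Seq(2, 3))

    // Not allowed: calling inactiveIds.filter(...) inside events.filter { ... }.
    // Instead, materialise the small RDD on the driver and broadcast the result.
    val inactiveSet = sc.broadcast(inactiveIds.collect().toSet)

    events
      .filter { case (id, _) => inactiveSet.value.contains(id) }
      .collect()
      .sorted
  }

  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("nested-rdd-workaround").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try println(inactiveEvents(sc).mkString(", "))
    finally sc.stop()
  }
}
```

When both sides are too large to collect, a join keyed on the ID is the usual alternative.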

[Phpmyadmin-git] [phpmyadmin/localized_docs] 6d551e: Translated using Weblate (Turkish)

2014-09-07 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: 6d551e2fce7ea6e02e1194acc6a800a1af836b5b https://github.com/phpmyadmin/localized_docs/commit/6d551e2fce7ea6e02e1194acc6a800a1af836b5b Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-09

[jira] [Created] (SPARK-3418) Additional BLAS and Local Sparse Matrix support

2014-09-05 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-3418: -- Summary: Additional BLAS and Local Sparse Matrix support Key: SPARK-3418 URL: https://issues.apache.org/jira/browse/SPARK-3418 Project: Spark Issue Type: New

[jira] [Updated] (SPARK-3418) [MLlib] Additional BLAS and Local Sparse Matrix support

2014-09-05 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-3418: --- Summary: [MLlib] Additional BLAS and Local Sparse Matrix support (was: Additional BLAS and Local

[Phpmyadmin-git] [phpmyadmin/localized_docs] 1e0179: Translated using Weblate (Turkish)

2014-08-30 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: 1e0179a5b88ed87450de23a73fac265a686d0476 https://github.com/phpmyadmin/localized_docs/commit/1e0179a5b88ed87450de23a73fac265a686d0476 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-08

[jira] [Updated] (SPARK-3280) Made sort-based shuffle the default implementation

2014-08-28 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-3280: --- Attachment: hash-sort-comp.png Made sort-based shuffle the default implementation

[jira] [Commented] (SPARK-3280) Made sort-based shuffle the default implementation

2014-08-28 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114873#comment-14114873 ] Burak Yavuz commented on SPARK-3280: I don't have as detailed a comparison as Josh

Re: [VOTE] Release Apache Spark 1.1.0 (RC2)

2014-08-28 Thread Burak Yavuz
+1. Tested MLlib algorithms on Amazon EC2, algorithms show speed-ups between 1.5-5x compared to the 1.0.2 release. - Original Message - From: Patrick Wendell pwend...@gmail.com To: dev@spark.apache.org Sent: Thursday, August 28, 2014 8:32:11 PM Subject: Re: [VOTE] Release Apache Spark

Re: OutofMemoryError when generating output

2014-08-28 Thread Burak Yavuz
Yeah, saveAsTextFile is an RDD-specific method. If you really want to use that method, just turn the map into an RDD: `sc.parallelize(x.toSeq).saveAsTextFile(...)` Reading through the api-docs will show you many more alternative solutions! Best, Burak - Original Message - From: SK

Re: Memory statistics in the Application detail UI

2014-08-28 Thread Burak Yavuz
Hi, Spark uses by default approximately 60% of the executor heap memory to store RDDs. That's why you have 8.6GB instead of 16GB. 95.5 is therefore the sum of all the 8.6 GB of executor memory + the driver memory. Best, Burak - Original Message - From: SK skrishna...@gmail.com To:
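
The 8.6 GB figure can be checked with a short calculation. This assumes the Spark 1.x defaults `spark.storage.memoryFraction = 0.6` and `spark.storage.safetyFraction = 0.9`; the safety fraction is not mentioned in the message and is added here as an assumption to explain why the result is below 60% of 16 GB.

```latex
\[
16\,\text{GB} \times \underbrace{0.6}_{\text{memoryFraction}} \times \underbrace{0.9}_{\text{safetyFraction}}
  = 8.64\,\text{GB} \approx 8.6\,\text{GB}
\]
```

The heap size the JVM reports is usually a little below the configured value, so the number shown in the UI is approximate.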

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] d0e0ed: Translated using Weblate (Turkish)

2014-08-27 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: d0e0ed047816fa84ce213df88be75670c765eeb5 https://github.com/phpmyadmin/phpmyadmin/commit/d0e0ed047816fa84ce213df88be75670c765eeb5 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-08-27 (Wed

Re: Amplab: big-data-benchmark

2014-08-27 Thread Burak Yavuz
Hi Sameer, I've faced this issue before. They don't show up on http://s3.amazonaws.com/big-data-benchmark/. But you can directly use: `sc.textFile("s3n://big-data-benchmark/pavlo/text/tiny/crawl")` The gotcha is that you also need to supply which dataset you want: crawl, uservisits, or rankings

Re: OutofMemoryError when generating output

2014-08-26 Thread Burak Yavuz
Hi, The error doesn't occur during saveAsTextFile but rather during the groupByKey as far as I can tell. We strongly urge users not to use groupByKey if they don't have to. What I would suggest is the following work-around: sc.textFile(baseFile).map { line => val fields = line.split("\t")

Re: saveAsTextFile hangs with hdfs

2014-08-26 Thread Burak Yavuz
Hi David, Your job is probably hanging on the groupByKey process. Probably GC is kicking in and the process starts to hang or the data is unbalanced and you end up with stragglers (Once GC kicks in you'll start to get the connection errors you shared). If you don't care about the list of

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 9cedf0: Translated using Weblate (Turkish)

2014-08-25 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 9cedf0a58feaad7604cdf7d09828854c11c630e6 https://github.com/phpmyadmin/phpmyadmin/commit/9cedf0a58feaad7604cdf7d09828854c11c630e6 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-08-25 (Mon

Re: Finding Rank in Spark

2014-08-23 Thread Burak Yavuz
Spearman's Correlation requires the calculation of ranks for columns. You can check out the code here and slice the part you need! https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/SpearmanCorrelation.scala Best, Burak - Original

Re: LDA example?

2014-08-22 Thread Burak Yavuz
You can check out this pull request: https://github.com/apache/spark/pull/476 LDA is on the roadmap for the 1.2 release, hopefully we will officially support it then! Best, Burak - Original Message - From: Denny Lee denny.g@gmail.com To: user@spark.apache.org Sent: Thursday, August

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] b574ff: Translated using Weblate (Romanian)

2014-08-16 Thread Burak Yavuz
https://github.com/phpmyadmin/phpmyadmin/commit/21a01002926cd479b2e2592b4fbea827509fed14 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-08-16 (Sat, 16 Aug 2014) Changed paths: M po/tr.po Log Message: --- Translated using Weblate (Turkish) Currently

[jira] [Created] (SPARK-3080) ArrayIndexOutOfBoundsException in ALS for Large datasets

2014-08-15 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-3080: -- Summary: ArrayIndexOutOfBoundsException in ALS for Large datasets Key: SPARK-3080 URL: https://issues.apache.org/jira/browse/SPARK-3080 Project: Spark Issue

[jira] [Updated] (SPARK-3080) ArrayIndexOutOfBoundsException in ALS for Large datasets

2014-08-15 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-3080: --- Description: The stack trace is below: {quote} java.lang.ArrayIndexOutOfBoundsException: 2716

[jira] [Updated] (SPARK-3080) ArrayIndexOutOfBoundsException in ALS for Large datasets

2014-08-15 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-3080: --- Description: The stack trace is below: {quote} java.lang.ArrayIndexOutOfBoundsException: 2716

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 2b61fb: Translated using Weblate (Turkish)

2014-08-13 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 2b61fb1281e094a885f580f83d8381c7cca8bb04 https://github.com/phpmyadmin/phpmyadmin/commit/2b61fb1281e094a885f580f83d8381c7cca8bb04 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-08-13 (Wed

[jira] [Resolved] (SPARK-2833) performance tests for linear regression

2014-08-12 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-2833. Resolution: Fixed performance tests for linear regression

[jira] [Resolved] (SPARK-2837) performance tests for ALS

2014-08-12 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-2837. Resolution: Done performance tests for ALS - Key

[jira] [Closed] (SPARK-2836) performance tests for k-means

2014-08-12 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz closed SPARK-2836. -- Resolution: Fixed performance tests for k-means - Key

[jira] [Resolved] (SPARK-2834) performance tests for linear algebra functions

2014-08-12 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-2834. Resolution: Fixed performance tests for linear algebra functions

[jira] [Resolved] (SPARK-2829) Implement MLlib performance tests in spark-perf

2014-08-12 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-2829. Resolution: Fixed Implement MLlib performance tests in spark-perf

[jira] [Resolved] (SPARK-2831) performance tests for linear classification methods

2014-08-12 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-2831. Resolution: Fixed performance tests for linear classification methods

Re: [MLLib]:choosing the Loss function

2014-08-11 Thread Burak Yavuz
Hi, // Initialize the optimizer using logistic regression as the loss function with L2 regularization val lbfgs = new LBFGS(new LogisticGradient(), new SquaredL2Updater()) // Set the hyperparameters

[jira] [Commented] (SPARK-2916) [MLlib] While running regression tests with dense vectors of length greater than 1000, the treeAggregate blows up after several iterations

2014-08-08 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14090498#comment-14090498 ] Burak Yavuz commented on SPARK-2916: will do [MLlib] While running regression tests

[jira] [Updated] (SPARK-2916) [MLlib] While running regression tests with dense vectors of length greater than 1000, the treeAggregate blows up after several iterations

2014-08-08 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-2916: --- Description: While running any of the regression algorithms with gradient descent

[jira] [Updated] (SPARK-2916) [MLlib] While running regression tests with dense vectors of length greater than 1000, the treeAggregate blows up after several iterations

2014-08-08 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-2916: --- Component/s: Spark Core [MLlib] While running regression tests with dense vectors of length greater

[jira] [Updated] (SPARK-2916) [MLlib] While running regression tests with dense vectors of length greater than 1000, the treeAggregate blows up after several iterations

2014-08-07 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-2916: --- Summary: [MLlib] While running regression tests with dense vectors of length greater than 1000

[jira] [Created] (SPARK-2916) While running regression tests with dense vectors of length greater than 1000, the treeAggregate blows up after several iterations

2014-08-07 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-2916: -- Summary: While running regression tests with dense vectors of length greater than 1000, the treeAggregate blows up after several iterations Key: SPARK-2916 URL: https

[jira] [Updated] (SPARK-2916) [MLlib] While running regression tests with dense vectors of length greater than 1000, the treeAggregate blows up after several iterations

2014-08-07 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-2916: --- Description: While running any of the regression algorithms with gradient descent

Re: KMeans Input Format

2014-08-07 Thread Burak Yavuz
Hi, Could you try running spark-shell with the flag --driver-memory 2g or more if you have more RAM available and try again? Thanks, Burak - Original Message - From: AlexanderRiggers alexander.rigg...@gmail.com To: u...@spark.incubator.apache.org Sent: Thursday, August 7, 2014 7:37:40

Re: questions about MLLib recommendation models

2014-08-07 Thread Burak Yavuz
Hi Jay, I've had the same problem you've been having in Question 1 with a synthetic dataset. I thought I wasn't producing the dataset well enough. This seems to be a bug. I will open a JIRA for it. Instead of using: ratings.map{ case Rating(u,m,r) => { val pred = model.predict(u, m) (r

Re: [MLLib]:choosing the Loss function

2014-08-07 Thread Burak Yavuz
The following code will allow you to run Logistic Regression using L-BFGS: val lbfgs = new LBFGS(new LogisticGradient(), new SquaredL2Updater()) lbfgs.setMaxNumIterations(numIterations).setRegParam(regParam).setConvergenceTol(tol).setNumCorrections(numCor) val weights = lbfgs.optimize(data,

[Phpmyadmin-git] [phpmyadmin/localized_docs] 96fe35: Translated using Weblate (Turkish)

2014-08-06 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: 96fe3575d28d71967ff6d906c4cc1c720014427e https://github.com/phpmyadmin/localized_docs/commit/96fe3575d28d71967ff6d906c4cc1c720014427e Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-08

Re: Regularization parameters

2014-08-06 Thread Burak Yavuz
Hi, That is interesting. Would you please share some code on how you are setting the regularization type, regularization parameters and running Logistic Regression? Thanks, Burak - Original Message - From: SK skrishna...@gmail.com To: u...@spark.incubator.apache.org Sent: Wednesday,

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] eab70e: Translated using Weblate (Turkish)

2014-08-05 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: eab70ea7b0e1e17034ffb90fe246cc836e76fd97 https://github.com/phpmyadmin/phpmyadmin/commit/eab70ea7b0e1e17034ffb90fe246cc836e76fd97 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-08-05 (Tue

Re: Hello All

2014-08-05 Thread Burak Yavuz
Hi Guru, Take a look at: https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark It has all the information you need on how to contribute to Spark. Also take a look at: https://issues.apache.org/jira/browse/SPARK/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] bc01c1: Translated using Weblate (Turkish)

2014-08-04 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: bc01c12eefc26e03088e30f36fe84cd1e727379c https://github.com/phpmyadmin/phpmyadmin/commit/bc01c12eefc26e03088e30f36fe84cd1e727379c Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2014-08-04 (Mon
