[jira] [Resolved] (SPARK-8095) Spark package dependencies not resolved when package is in local-ivy-cache

2015-06-24 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8095?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-8095. Resolution: Fixed Spark package dependencies not resolved when package is in local-ivy-cache

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 3309cd: Translated using Weblate (Turkish)

2015-06-22 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 3309cd0128fad9b207ec4dcf38638ceb7c684054 https://github.com/phpmyadmin/phpmyadmin/commit/3309cd0128fad9b207ec4dcf38638ceb7c684054 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-06-22 (Mon

Re: Confusion matrix for binary classification

2015-06-22 Thread Burak Yavuz
Hi, In Spark 1.4, you may use DataFrame.stat.crosstab to generate the confusion matrix. This would be very simple if you are using the ML Pipelines API and working with DataFrames. Best, Burak On Mon, Jun 22, 2015 at 4:21 AM, CD Athuraliya cdathural...@gmail.com wrote: Hi, I am looking
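As a rough illustration of what cross-tabulating labels against predictions produces, here is a minimal plain-Python sketch that does not use Spark. The sample labels, the sample predictions and the helper name `confusion_matrix` are made up for illustration; in Spark 1.4 the counting itself would be done by DataFrame.stat.crosstab, as the reply suggests.

```python
from collections import Counter

def confusion_matrix(labels, predictions):
    """Count (actual, predicted) pairs, which is what a cross-tabulation does."""
    counts = Counter(zip(labels, predictions))
    classes = sorted(set(labels) | set(predictions))
    # Rows are actual classes, columns are predicted classes.
    return {actual: {pred: counts[(actual, pred)] for pred in classes}
            for actual in classes}

# Hypothetical binary-classification output.
labels      = [1, 0, 1, 1, 0, 0, 1]
predictions = [1, 0, 0, 1, 1, 0, 1]
matrix = confusion_matrix(labels, predictions)
for actual, row in matrix.items():
    print(actual, row)
```

Pairs that never occur get a count of 0 here because a Counter returns 0 for missing keys.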

Re: unsafe/compile error

2015-06-21 Thread Burak Yavuz
In addition, if you want to run a single suite, you may use: mllib/testOnly $SUITE_NAME with sbt. On Jun 21, 2015 10:32 AM, Burak Yavuz brk...@gmail.com wrote: You need to build an assembly jar for the cluster tests to pass. You may use 'sbt assembly/assembly'. Best, Burak On Jun 21, 2015 3

Re: unsafe/compile error

2015-06-21 Thread Burak Yavuz
You need to build an assembly jar for the cluster tests to pass. You may use 'sbt assembly/assembly'. Best, Burak On Jun 21, 2015 3:43 AM, acidghost andreajemm...@gmail.com wrote: After an sbt update the tests run. But all the cluster ones fail on task size should be small in both training and

[jira] [Commented] (SPARK-8475) SparkSubmit with Ivy jars is very slow to load with no internet access

2015-06-21 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14595363#comment-14595363 ] Burak Yavuz commented on SPARK-8475: Me too. I prefer option 1 as well. SparkSubmit

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 02bf44: Translated using Weblate (Turkish)

2015-06-20 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 02bf444cc2fc57fc7129c5eeea275929b7a212ac https://github.com/phpmyadmin/phpmyadmin/commit/02bf444cc2fc57fc7129c5eeea275929b7a212ac Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-06-20 (Sat

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] a43aa4: Translated using Weblate (Turkish)

2015-06-19 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: a43aa4d24a93b97c660d76083aa1add76f073099 https://github.com/phpmyadmin/phpmyadmin/commit/a43aa4d24a93b97c660d76083aa1add76f073099 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-06-19 (Fri

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 204871: Translated using Weblate (Turkish)

2015-06-18 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 2048714b5ccee7f3e718cdbebcd290eba46155db https://github.com/phpmyadmin/phpmyadmin/commit/2048714b5ccee7f3e718cdbebcd290eba46155db Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-06-18 (Thu

[jira] [Updated] (SPARK-8475) SparkSubmit with Ivy jars is very slow to load with no internet access

2015-06-18 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8475?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-8475: --- Issue Type: Improvement (was: Bug) SparkSubmit with Ivy jars is very slow to load with no internet

Re: SparkSubmit with Ivy jars is very slow to load with no internet access

2015-06-18 Thread Burak Yavuz
Hey Nathan, I like the first idea better. Let's see what others think. I'd be happy to review your PR afterwards! Best, Burak On Thu, Jun 18, 2015 at 9:53 PM, Nathan McCarthy nathan.mccar...@quantium.com.au wrote: Hey, Spark Submit adds Maven Central and Spark Bintray to the ChainResolver

Re: --packages Failed to load class for data source v1.4

2015-06-14 Thread Burak Yavuz
Hi Don, This seems related to a known issue, where the classpath on the driver is missing the related classes. This is a bug in py4j as py4j uses the System Classloader rather than Spark's Context Classloader. However, this problem existed in 1.3.0 as well, therefore I'm curious whether it's the

Re: How to read avro in SparkR

2015-06-13 Thread Burak Yavuz
Hi, Not sure if this is it, but could you please try com.databricks.spark.avro instead of just avro? Thanks, Burak On Jun 13, 2015 9:55 AM, Shing Hing Man mat...@yahoo.com.invalid wrote: Hi, I am trying to read an Avro file in SparkR (in Spark 1.4.0). I started R using the following.

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 148cd1: Translated using Weblate (Turkish)

2015-06-13 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 148cd13b284b50185ba5fed479f79950a7873de5 https://github.com/phpmyadmin/phpmyadmin/commit/148cd13b284b50185ba5fed479f79950a7873de5 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-06-13 (Sat

[jira] [Created] (SPARK-8313) Support Spark Packages containing R code with --packages

2015-06-11 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-8313: -- Summary: Support Spark Packages containing R code with --packages Key: SPARK-8313 URL: https://issues.apache.org/jira/browse/SPARK-8313 Project: Spark Issue

Re: [VOTE] Release Apache Spark 1.4.0 (RC4)

2015-06-05 Thread Burak Yavuz
+1 Tested on Mac OS X Burak On Thu, Jun 4, 2015 at 6:35 PM, Calvin Jia jia.cal...@gmail.com wrote: +1 Tested with input from Tachyon and persist off heap. On Thu, Jun 4, 2015 at 6:26 PM, Timothy Chen tnac...@gmail.com wrote: +1 Been testing cluster mode and client mode with mesos with

Re: Ivy support in Spark vs. sbt

2015-06-04 Thread Burak Yavuz
Hi Marcelo, This is interesting. Can you please send me links to any failing builds if you see that problem? For now you can set the conf `spark.jars.ivy` to make Spark use a path other than `~/.ivy2`. Thanks, Burak On Thu, Jun 4, 2015 at 4:29 AM, Sean Owen so...@cloudera.com wrote: I've

[jira] [Commented] (SPARK-8095) Spark package dependencies not resolved when package is in local-ivy-cache

2015-06-03 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8095?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14572128#comment-14572128 ] Burak Yavuz commented on SPARK-8095: In the local ivy cache, it should use

[jira] [Commented] (SPARK-8023) Random Number Generation inconsistent in projections in DataFrame

2015-06-01 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-8023?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14568265#comment-14568265 ] Burak Yavuz commented on SPARK-8023: cc [~yhuai] Random Number Generation

[jira] [Created] (SPARK-8023) Random Number Generation inconsistent in projections in DataFrame

2015-06-01 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-8023: -- Summary: Random Number Generation inconsistent in projections in DataFrame Key: SPARK-8023 URL: https://issues.apache.org/jira/browse/SPARK-8023 Project: Spark

[jira] [Commented] (SPARK-7944) Spark-Shell 2.11 1.4.0-RC-03 does not add jars to class path

2015-05-31 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566700#comment-14566700 ] Burak Yavuz commented on SPARK-7944: I saw this issue with Yarn when using Scala 2.11

[jira] [Comment Edited] (SPARK-7944) Spark-Shell 2.11 1.4.0-RC-03 does not add jars to class path

2015-05-31 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7944?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566700#comment-14566700 ] Burak Yavuz edited comment on SPARK-7944 at 5/31/15 7:46 PM

[jira] [Commented] (SPARK-7982) crosstab should use 0 instead of null for pairs that don't appear

2015-05-31 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14566680#comment-14566680 ] Burak Yavuz commented on SPARK-7982: The reason we used nulls instead of 0L

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 4dcd08: Translated using Weblate (Turkish)

2015-05-31 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 4dcd0848fe8ec1757665caf36b35279c2186f8cd https://github.com/phpmyadmin/phpmyadmin/commit/4dcd0848fe8ec1757665caf36b35279c2186f8cd Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-05-31 (Sun

[jira] [Created] (SPARK-7957) Preserve partitioning in randomSplit in RDD.scala

2015-05-29 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-7957: -- Summary: Preserve partitioning in randomSplit in RDD.scala Key: SPARK-7957 URL: https://issues.apache.org/jira/browse/SPARK-7957 Project: Spark Issue Type

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] bcefc2: Translated using Weblate (Turkish)

2015-05-27 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: bcefc28a4858f763c9d1f27b7a8d09a58458db89 https://github.com/phpmyadmin/phpmyadmin/commit/bcefc28a4858f763c9d1f27b7a8d09a58458db89 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-05-27 (Wed

[jira] [Commented] (SPARK-7287) Flaky test: o.a.s.deploy.SparkSubmitSuite --packages

2015-05-23 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14557631#comment-14557631 ] Burak Yavuz commented on SPARK-7287: I don't understand why that's failing. It's

[jira] [Commented] (SPARK-7785) Add missing items to pyspark.mllib.linalg.Matrices

2015-05-21 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555313#comment-14555313 ] Burak Yavuz commented on SPARK-7785: My belief on the Python linalg API so far has

Re: foreach plus accumulator Vs mapPartitions performance

2015-05-21 Thread Burak Yavuz
Or you can simply use `reduceByKeyLocally` if you don't want to worry about implementing accumulators and such, and assuming that the reduced values will fit in memory of the driver (which you are assuming by using accumulators). Best, Burak On Thu, May 21, 2015 at 2:46 PM, ben
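To make the trade-off concrete, the sketch below mimics in plain Python what a reduce-by-key that hands its result back to the driver does: each partition is reduced to a small map, and the partial maps are merged into one dictionary held in local memory. The partition contents and the helper name `reduce_by_key_locally` are hypothetical, and this is not Spark's implementation; it only illustrates why the reduced values must fit in the driver's memory.

```python
from functools import reduce
from operator import add

def reduce_by_key_locally(partitions, func):
    """Reduce (key, value) pairs per partition, then merge the partial maps."""
    def reduce_partition(pairs):
        partial = {}
        for key, value in pairs:
            partial[key] = func(partial[key], value) if key in partial else value
        return partial

    def merge(left, right):
        merged = dict(left)
        for key, value in right.items():
            merged[key] = func(merged[key], value) if key in merged else value
        return merged

    # The final dictionary lives entirely in the caller's memory.
    return reduce(merge, (reduce_partition(p) for p in partitions), {})

# Hypothetical data spread over three partitions.
partitions = [[("a", 1), ("b", 2)], [("a", 3)], [("b", 4), ("c", 5)]]
totals = reduce_by_key_locally(partitions, add)
print(totals)
```

The result has one entry per distinct key, so its size is bounded by the number of keys rather than the number of input records.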

[jira] [Commented] (SPARK-7785) Add pretty printing to pyspark.mllib.linalg.Matrices

2015-05-21 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7785?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14555440#comment-14555440 ] Burak Yavuz commented on SPARK-7785: For operations with BlockMatrix, you will need

Re: GradientBoostedTrees.trainRegressor with categoricalFeaturesInfo

2015-05-20 Thread Burak Yavuz
Could you please open a JIRA for it? The maxBins input is missing for the Python API. Would it be possible for you to use the current master? In the current master, you should be able to use trees with the Pipeline API and DataFrames. Best, Burak On Wed, May 20, 2015 at 2:44 PM, Don Drake

[jira] [Created] (SPARK-7745) Replace assertions with requires (IllegalArgumentException) and modify other state checks

2015-05-19 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-7745: -- Summary: Replace assertions with requires (IllegalArgumentException) and modify other state checks Key: SPARK-7745 URL: https://issues.apache.org/jira/browse/SPARK-7745

[jira] [Updated] (SPARK-7381) Missing Python API for o.a.s.ml

2015-05-08 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-7381: --- Summary: Missing Python API for o.a.s.ml (was: Python API for Transformers) Missing Python API

[jira] [Created] (SPARK-7488) Python API for ml.recommendation

2015-05-08 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-7488: -- Summary: Python API for ml.recommendation Key: SPARK-7488 URL: https://issues.apache.org/jira/browse/SPARK-7488 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-7487) Python API for ml.regression

2015-05-08 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-7487: -- Summary: Python API for ml.regression Key: SPARK-7487 URL: https://issues.apache.org/jira/browse/SPARK-7487 Project: Spark Issue Type: Sub-task

[jira] [Updated] (SPARK-7492) Convert LocalDataFrame to LocalMatrix

2015-05-08 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-7492: --- Description: Having a method like, {code: java} Matrices.fromDataFrame(df) {code} would provide

[jira] [Created] (SPARK-7492) Convert LocalDataFrame to LocalMatrix

2015-05-08 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-7492: -- Summary: Convert LocalDataFrame to LocalMatrix Key: SPARK-7492 URL: https://issues.apache.org/jira/browse/SPARK-7492 Project: Spark Issue Type: New Feature

[jira] [Updated] (SPARK-7492) Convert LocalDataFrame to LocalMatrix

2015-05-08 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-7492: --- Description: Having a method like, {code:java} Matrices.fromDataFrame(df) {code} would provide users

[jira] [Reopened] (SPARK-7245) Spearman correlation for DataFrames

2015-05-08 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz reopened SPARK-7245: Sorry, mixed this with Pearson correlation Spearman correlation for DataFrames

[jira] [Resolved] (SPARK-7245) Spearman correlation for DataFrames

2015-05-08 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7245?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-7245. Resolution: Done Fix Version/s: 1.4.0 Spearman correlation for DataFrames

[jira] [Commented] (SPARK-7486) Add the streaming implementation for estimating quantiles and median

2015-05-08 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7486?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14535616#comment-14535616 ] Burak Yavuz commented on SPARK-7486: Yes, this is a clone of SPARK-6760 and SPARK-7246

[Phpmyadmin-git] [phpmyadmin/localized_docs] fa89f7: Translated using Weblate (Turkish)

2015-05-07 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: fa89f7e5243ea2b1f3778b5dc76d3f711f57fb45 https://github.com/phpmyadmin/localized_docs/commit/fa89f7e5243ea2b1f3778b5dc76d3f711f57fb45 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-05

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] aa060d: Translated using Weblate (Turkish)

2015-05-06 Thread Burak Yavuz
Branch: refs/heads/QA_4_4 Home: https://github.com/phpmyadmin/phpmyadmin Commit: aa060d37188c28cfefe2ebb9da0a3a1b779b987b https://github.com/phpmyadmin/phpmyadmin/commit/aa060d37188c28cfefe2ebb9da0a3a1b779b987b Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-05-06 (Wed

[jira] [Created] (SPARK-7388) Python Api for Param[Array[T]]

2015-05-05 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-7388: -- Summary: Python Api for Param[Array[T]] Key: SPARK-7388 URL: https://issues.apache.org/jira/browse/SPARK-7388 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-7381) Python API for Transformers

2015-05-05 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-7381: -- Summary: Python API for Transformers Key: SPARK-7381 URL: https://issues.apache.org/jira/browse/SPARK-7381 Project: Spark Issue Type: Umbrella

[jira] [Created] (SPARK-7382) Python API for ml.classification

2015-05-05 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-7382: -- Summary: Python API for ml.classification Key: SPARK-7382 URL: https://issues.apache.org/jira/browse/SPARK-7382 Project: Spark Issue Type: Sub-task

[jira] [Created] (SPARK-7383) Python API for ml.feature

2015-05-05 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-7383: -- Summary: Python API for ml.feature Key: SPARK-7383 URL: https://issues.apache.org/jira/browse/SPARK-7383 Project: Spark Issue Type: Sub-task

Re: ReduceByKey and sorting within partitions

2015-05-04 Thread Burak Yavuz
I think this Spark Package may be what you're looking for! http://spark-packages.org/package/tresata/spark-sorted Best, Burak On Mon, May 4, 2015 at 12:56 PM, Imran Rashid iras...@cloudera.com wrote: oh wow, that is a really interesting observation, Marco Jerry. I wonder if this is worth

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 6204da: Translated using Weblate (Turkish)

2015-05-01 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 6204da5a3dba59b95591bd04c1c36b084ca69a48 https://github.com/phpmyadmin/phpmyadmin/commit/6204da5a3dba59b95591bd04c1c36b084ca69a48 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-05-01 (Fri

[jira] [Commented] (SPARK-7306) SPARK-7224 broke build with jdk6

2015-05-01 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7306?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14523592#comment-14523592 ] Burak Yavuz commented on SPARK-7306: I'll submit a patch using Guava within an hour

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] c5ad2b: Translated using Weblate (Turkish)

2015-04-30 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: c5ad2b542d5237badc0a041e10691fe9212e6df8 https://github.com/phpmyadmin/phpmyadmin/commit/c5ad2b542d5237badc0a041e10691fe9212e6df8 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-04-30 (Thu

Re: DataFrame filter referencing error

2015-04-30 Thread Burak Yavuz
Is new a reserved word for MySQL? On Thu, Apr 30, 2015 at 2:41 PM, Francesco Bigarella francesco.bigare...@gmail.com wrote: Do you know how I can check that? I googled a bit but couldn't find a clear explanation about it. I also tried to use explain() but it doesn't really help. I still

[jira] [Created] (SPARK-7224) Mock repositories for testing with --packages

2015-04-29 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-7224: -- Summary: Mock repositories for testing with --packages Key: SPARK-7224 URL: https://issues.apache.org/jira/browse/SPARK-7224 Project: Spark Issue Type: Test

[jira] [Created] (SPARK-7215) Make repartition and coalesce a part of the query plan

2015-04-28 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-7215: -- Summary: Make repartition and coalesce a part of the query plan Key: SPARK-7215 URL: https://issues.apache.org/jira/browse/SPARK-7215 Project: Spark Issue Type

[jira] [Created] (SPARK-7205) Support local ivy cache in --packages

2015-04-28 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-7205: -- Summary: Support local ivy cache in --packages Key: SPARK-7205 URL: https://issues.apache.org/jira/browse/SPARK-7205 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-7185) Python API for math functions in DataFrames

2015-04-28 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-7185?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz resolved SPARK-7185. Resolution: Duplicate Python API for math functions in DataFrames

Fwd: Change ivy cache for spark on Windows

2015-04-27 Thread Burak Yavuz
+user -- Forwarded message -- From: Burak Yavuz brk...@gmail.com Date: Mon, Apr 27, 2015 at 1:59 PM Subject: Re: Change ivy cache for spark on Windows To: mj jone...@gmail.com Hi, In your conf file (SPARK_HOME\conf\spark-defaults.conf) you can set: `spark.jars.ivy \your\path

[jira] [Created] (SPARK-7185) Python API for math functions in DataFrames

2015-04-27 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-7185: -- Summary: Python API for math functions in DataFrames Key: SPARK-7185 URL: https://issues.apache.org/jira/browse/SPARK-7185 Project: Spark Issue Type: New

Re: Understanding Spark/MLlib failures

2015-04-23 Thread Burak Yavuz
Hi Andrew, I observed similar behavior under high GC pressure when running ALS. What happened to me was that there would be very long Full GC pauses (over 600 seconds at times). These would prevent the executors from sending heartbeats to the driver. Then the driver would think that the

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 1f6747: Translated using Weblate (Turkish)

2015-04-19 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 1f67478ac7928a8610a5f6f613ebd4a87c86597d https://github.com/phpmyadmin/phpmyadmin/commit/1f67478ac7928a8610a5f6f613ebd4a87c86597d Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-04-19 (Sun

Re: Benchmaking col vs row similarities

2015-04-10 Thread Burak Yavuz
Depends... The heartbeat you received happens due to GC pressure (probably due to Full GC). If you increase the memory too much, the GCs may be less frequent, but the Full GCs may take longer. Try increasing the following confs: spark.executor.heartbeatInterval

[jira] [Commented] (SPARK-6407) Streaming ALS for Collaborative Filtering

2015-04-06 Thread Burak Yavuz (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-6407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14481874#comment-14481874 ] Burak Yavuz commented on SPARK-6407: I actually worked on this over the weekend

Re: Spark 2.0: Rearchitecting Spark for Mobile, Local, Social

2015-04-01 Thread Burak Yavuz
This is awesome! I can write the apps for it, to make the Web UI more functional! On Wed, Apr 1, 2015 at 12:37 AM, Tathagata Das tathagata.das1...@gmail.com wrote: This is a significant effort that Reynold has undertaken, and I am super glad to see that it's finally taking a concrete form.

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 9a3cad: Translated using Weblate (Greek)

2015-03-31 Thread Burak Yavuz
: 1ebe9b20148b305f03ad5eff0af0a60d851e8eb8 https://github.com/phpmyadmin/phpmyadmin/commit/1ebe9b20148b305f03ad5eff0af0a60d851e8eb8 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-03-31 (Tue, 31 Mar 2015) Changed paths: M po/tr.po Log Message: --- Translated using

Re: Query REST web service with Spark?

2015-03-31 Thread Burak Yavuz
Hi, If I recall correctly, I've read on the user list about people integrating REST calls into Spark Streaming jobs. I can't think of any reason why it shouldn't be possible. Best, Burak On Tue, Mar 31, 2015 at 1:46 PM, Minnow Noir minnown...@gmail.com wrote: We have some data on Hadoop that

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 428dec: Translated using Weblate (Turkish)

2015-03-30 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 428dec6350904d0024cda60991bddc407526b6d7 https://github.com/phpmyadmin/phpmyadmin/commit/428dec6350904d0024cda60991bddc407526b6d7 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-03-30 (Mon

Re: Why KMeans with mllib is so slow ?

2015-03-28 Thread Burak Yavuz
Hi David, Can you also try with Spark 1.3 if possible? I believe there was a 2x improvement on K-Means between 1.2 and 1.3. Thanks, Burak On Sat, Mar 28, 2015 at 9:04 PM, davidshen84 davidshe...@gmail.com wrote: Hi Jao, Sorry to pop up this old thread. I am having the same problem as you

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 0cbca7: Translated using Weblate (Turkish)

2015-03-26 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 0cbca7a0225fc13726580ab11f1bd0881de739ca https://github.com/phpmyadmin/phpmyadmin/commit/0cbca7a0225fc13726580ab11f1bd0881de739ca Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-03-26 (Thu

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 7074a9: Translated using Weblate (Turkish)

2015-03-26 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 7074a980569f797a8cd67bf20d916c2d127f5adb https://github.com/phpmyadmin/phpmyadmin/commit/7074a980569f797a8cd67bf20d916c2d127f5adb Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-03-26 (Thu

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 25d7d2: Translated using Weblate (Turkish)

2015-03-22 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 25d7d23d2b3840f6c998522b277c992e3e438685 https://github.com/phpmyadmin/phpmyadmin/commit/25d7d23d2b3840f6c998522b277c992e3e438685 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-03-22 (Sun

Re: netlib-java cannot load native lib in Windows when using spark-submit

2015-03-22 Thread Burak Yavuz
Did you build Spark with: -Pnetlib-lgpl? Ref: https://spark.apache.org/docs/latest/mllib-guide.html Burak On Sun, Mar 22, 2015 at 7:37 AM, Ted Yu yuzhih...@gmail.com wrote: How about pointing LD_LIBRARY_PATH to native lib folder ? You need Spark 1.2.0 or higher for the above to work. See

Re: Which linear algebra interface to use within Spark MLlib?

2015-03-20 Thread Burak Yavuz
Hi, We plan to add a more comprehensive local linear algebra package for MLlib 1.4. This local linear algebra package can then easily be extended to BlockMatrix to support the same operations in a distributed fashion. You may find the JIRA to track this here: SPARK-6442

[jira] [Created] (SPARK-6442) MLlib 1.4 Local Linear Algebra Package

2015-03-20 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-6442: -- Summary: MLlib 1.4 Local Linear Algebra Package Key: SPARK-6442 URL: https://issues.apache.org/jira/browse/SPARK-6442 Project: Spark Issue Type: New Feature

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 3c0978: Translated using Weblate (Turkish)

2015-03-18 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 3c0978daa901b478d64db109ccde23ec81c8fe3d https://github.com/phpmyadmin/phpmyadmin/commit/3c0978daa901b478d64db109ccde23ec81c8fe3d Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-03-18 (Wed

Re: RDD ordering after map

2015-03-18 Thread Burak Yavuz
Hi, Yes, ordering is preserved with map. Shuffles break ordering. Burak On Wed, Mar 18, 2015 at 2:02 PM, sergunok ser...@gmail.com wrote: Does map(...) preserve ordering of original RDD? -- View this message in context:
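The difference between the two cases can be sketched in plain Python by modelling a dataset as a list of partitions. This is only a simplified model with made-up data and hypothetical helper names (`map_partitions`, `shuffle_by_key`), not how Spark is implemented: an element-wise map leaves every element in its partition and position, while a hash-style redistribution moves elements according to their key.

```python
def map_partitions(partitions, func):
    """Apply func element-wise; each element stays in its partition and position."""
    return [[func(x) for x in part] for part in partitions]

def shuffle_by_key(partitions, num_partitions, key):
    """Redistribute elements by key(x) % num_partitions, as a hash shuffle would."""
    out = [[] for _ in range(num_partitions)]
    for part in partitions:
        for x in part:
            out[key(x) % num_partitions].append(x)
    return out

def flatten(partitions):
    """Concatenate the partitions in order."""
    return [x for part in partitions for x in part]

data = [[5, 2, 9], [4, 7], [1, 8]]               # hypothetical partitions
mapped = map_partitions(data, lambda x: x * 10)
shuffled = shuffle_by_key(data, 2, key=lambda x: x)

print(flatten(mapped))    # same order as the input, each value times 10
print(flatten(shuffled))  # same elements, grouped by parity instead
```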

[Phpmyadmin-git] [phpmyadmin/localized_docs] b42374: Translated using Weblate (Turkish)

2015-03-15 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: b42374d63b0bfb19e1d7a65557d12833c3537ebb https://github.com/phpmyadmin/localized_docs/commit/b42374d63b0bfb19e1d7a65557d12833c3537ebb Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-03

Re: Getting incorrect weights for LinearRegression

2015-03-13 Thread Burak Yavuz
Hi, I would suggest you use LBFGS, as I think the step size is hurting you. You can run the same thing in LBFGS as: ``` val algorithm = new LBFGS(new LeastSquaresGradient(), new SimpleUpdater()) val initialWeights = Vectors.dense(Array.fill(3)( scala.util.Random.nextDouble())) val weights =

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 984a0e: Translated using Weblate (Turkish)

2015-03-09 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 984a0e9b87af45cf705e4c57d2818371300f1a90 https://github.com/phpmyadmin/phpmyadmin/commit/984a0e9b87af45cf705e4c57d2818371300f1a90 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-03-09 (Mon

Re: Solve least square problem of the form min norm(A x - b)^2^ + lambda * n * norm(x)^2 ?

2015-03-09 Thread Burak Yavuz
Hi Jaonary, The RowPartitionedMatrix is a special case of the BlockMatrix, where the colsPerBlock = nCols. I hope that helps. Burak On Mar 6, 2015 9:13 AM, Jaonary Rabarisoa jaon...@gmail.com wrote: Hi Shivaram, Thank you for the link. I'm trying to figure out how can I port this to mllib.
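The relationship described in this reply can be illustrated with a small plain-Python sketch. The helper `to_blocks` and the 4x4 sample matrix are hypothetical and unrelated to the actual MLlib classes; the point is only that choosing a column block width equal to the number of columns leaves a single block column, so every block is a band of full rows.

```python
def to_blocks(matrix, rows_per_block, cols_per_block):
    """Split a dense matrix (list of rows) into a dict keyed by (block_row, block_col)."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    blocks = {}
    for r0 in range(0, n_rows, rows_per_block):
        for c0 in range(0, n_cols, cols_per_block):
            blocks[(r0 // rows_per_block, c0 // cols_per_block)] = [
                row[c0:c0 + cols_per_block]
                for row in matrix[r0:r0 + rows_per_block]
            ]
    return blocks

matrix = [[1, 2, 3, 4],
          [5, 6, 7, 8],
          [9, 10, 11, 12],
          [13, 14, 15, 16]]

# General case: a 2x2 grid of 2x2 blocks.
general = to_blocks(matrix, rows_per_block=2, cols_per_block=2)
# Special case: the column block width equals the number of columns,
# so each block is a band of complete rows.
row_partitioned = to_blocks(matrix, rows_per_block=2, cols_per_block=len(matrix[0]))

print(sorted(general))
print(sorted(row_partitioned))
```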

Re: what are the types of tasks when running ALS iterations

2015-03-09 Thread Burak Yavuz
+user On Mar 9, 2015 8:47 AM, Burak Yavuz brk...@gmail.com wrote: Hi, In the web UI, you don't see every single task. You see the name of the last task before the stage boundary (which is a shuffle like a groupByKey), which in your case is a flatMap. Therefore you only see flatMap in the UI

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] f2e0b8: Translated using Weblate (Turkish)

2015-03-08 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: f2e0b860751476246e9d0e4d350370d4327b96a3 https://github.com/phpmyadmin/phpmyadmin/commit/f2e0b860751476246e9d0e4d350370d4327b96a3 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-03-08 (Sun

Re: How to reuse a ML trained model?

2015-03-07 Thread Burak Yavuz
Hi, There is model import/export for some of the ML algorithms on the current master (and they'll be shipped with the 1.3 release). Burak On Mar 7, 2015 4:17 AM, Xi Shen davidshe...@gmail.com wrote: Wait...it seems SparkContext does not provide a way to save/load object files. It can only

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 2ce287: Translated using Weblate (Turkish)

2015-03-06 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 2ce287a4ad329e2a80f83cf81b3c506e3acfb245 https://github.com/phpmyadmin/phpmyadmin/commit/2ce287a4ad329e2a80f83cf81b3c506e3acfb245 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-03-06 (Fri

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] a4a0fd: Translated using Weblate (Turkish)

2015-03-06 Thread Burak Yavuz
Branch: refs/heads/QA_4_4 Home: https://github.com/phpmyadmin/phpmyadmin Commit: a4a0fd66888665f8416f6b5063f5cbb33d4fb71f https://github.com/phpmyadmin/phpmyadmin/commit/a4a0fd66888665f8416f6b5063f5cbb33d4fb71f Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-03-06 (Fri

Re: spark-sorted, or secondary sort and streaming reduce for spark

2015-03-06 Thread Burak Yavuz
Hi Koert, Would you like to register this on spark-packages.org? Burak On Fri, Mar 6, 2015 at 8:53 AM, Koert Kuipers ko...@tresata.com wrote: currently spark provides many excellent algorithms for operations per key as long as the data sent to the reducers per key fits in memory. operations

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 86198f: Translated using Weblate (Turkish)

2015-03-04 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 86198fc45d0ffaf5b59ba3402ef59ed6b7d2cffd https://github.com/phpmyadmin/phpmyadmin/commit/86198fc45d0ffaf5b59ba3402ef59ed6b7d2cffd Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-03-04 (Wed

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 457f11: Translated using Weblate (Turkish)

2015-03-03 Thread Burak Yavuz
Branch: refs/heads/QA_4_4 Home: https://github.com/phpmyadmin/phpmyadmin Commit: 457f117af7b7b96f6d6392a2cf2d46696f7ba550 https://github.com/phpmyadmin/phpmyadmin/commit/457f117af7b7b96f6d6392a2cf2d46696f7ba550 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-03-03 (Tue

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] c7fecf: Translated using Weblate (Bulgarian)

2015-03-02 Thread Burak Yavuz
(Hungarian) Currently translated at 99.9% (3031 of 3032 strings) [CI skip] Commit: a826f7c3e2e2592da8cf14b43b201f96c55d4b6a https://github.com/phpmyadmin/phpmyadmin/commit/a826f7c3e2e2592da8cf14b43b201f96c55d4b6a Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-03-02 (Mon, 02

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 2117b4: Translated using Weblate (Interlingua)

2015-03-01 Thread Burak Yavuz
/60b5dacf0408c96137271ce731e9147b6771017c Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-03-01 (Sun, 01 Mar 2015) Changed paths: M po/tr.po Log Message: --- Translated using Weblate (Turkish) Currently translated at 99.9% (3029 of 3030 strings) [CI skip] Compare: https

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 4e956a: Translated using Weblate (Turkish)

2015-02-27 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 4e956a574b158ca7999ddf2a588b80acea7f5a51 https://github.com/phpmyadmin/phpmyadmin/commit/4e956a574b158ca7999ddf2a588b80acea7f5a51 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-02-27 (Fri

[Phpmyadmin-git] [phpmyadmin/localized_docs] 817ab0: Translated using Weblate (Turkish)

2015-02-27 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/localized_docs Commit: 817ab08393bcc769c2e70ea57aad1c37b6f60315 https://github.com/phpmyadmin/localized_docs/commit/817ab08393bcc769c2e70ea57aad1c37b6f60315 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-02

Re: Problem getting program to run on 15TB input

2015-02-27 Thread Burak Yavuz
Hi, Not sure if it can help, but `StorageLevel.MEMORY_AND_DISK_SER` generates many small objects that lead to very long GC time, causing the "executor lost", "heartbeat not received", and "GC overhead limit exceeded" messages. Could you try using `StorageLevel.MEMORY_AND_DISK` instead? You can also

[jira] [Created] (SPARK-6047) pyspark - class loading on driver failing with --jars and --packages

2015-02-26 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-6047: -- Summary: pyspark - class loading on driver failing with --jars and --packages Key: SPARK-6047 URL: https://issues.apache.org/jira/browse/SPARK-6047 Project: Spark

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] cf2055: Translated using Weblate (Turkish)

2015-02-25 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: cf2055db0a2662eac217829c7f01972bed6b2c9f https://github.com/phpmyadmin/phpmyadmin/commit/cf2055db0a2662eac217829c7f01972bed6b2c9f Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-02-25 (Wed

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] 553795: Translated using Weblate (Turkish)

2015-02-25 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: 553795327e14bdf1bfb550c85869b05ac1a6c122 https://github.com/phpmyadmin/phpmyadmin/commit/553795327e14bdf1bfb550c85869b05ac1a6c122 Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-02-25 (Wed

[jira] [Created] (SPARK-6032) Move ivy logging to System.err in --packages

2015-02-25 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-6032: -- Summary: Move ivy logging to System.err in --packages Key: SPARK-6032 URL: https://issues.apache.org/jira/browse/SPARK-6032 Project: Spark Issue Type

[jira] [Created] (SPARK-6031) Refactor --packages to work inside the DriverBootstrapper so that the jars can be added to the driver classpath

2015-02-25 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-6031: -- Summary: Refactor --packages to work inside the DriverBootstrapper so that the jars can be added to the driver classpath Key: SPARK-6031 URL: https://issues.apache.org/jira/browse

[jira] [Created] (SPARK-5979) `--packages` should not exclude spark streaming assembly jars for kafka and flume

2015-02-24 Thread Burak Yavuz (JIRA)
Burak Yavuz created SPARK-5979: -- Summary: `--packages` should not exclude spark streaming assembly jars for kafka and flume Key: SPARK-5979 URL: https://issues.apache.org/jira/browse/SPARK-5979 Project

[Phpmyadmin-git] [phpmyadmin/phpmyadmin] e1b049: Translated using Weblate (Turkish)

2015-02-22 Thread Burak Yavuz
Branch: refs/heads/master Home: https://github.com/phpmyadmin/phpmyadmin Commit: e1b049c9b9122967f06ec20c51681b4b06057def https://github.com/phpmyadmin/phpmyadmin/commit/e1b049c9b9122967f06ec20c51681b4b06057def Author: Burak Yavuz hitowerdi...@hotmail.com Date: 2015-02-23 (Mon

Re: Why is RDD lookup slow?

2015-02-19 Thread Burak Yavuz
If your dataset is large, there is a Spark Package called IndexedRDD optimized for lookups. Feel free to check that out. Burak On Feb 19, 2015 7:37 AM, Ilya Ganelin ilgan...@gmail.com wrote: Hi Shahab - if your data structures are small enough a broadcasted Map is going to provide faster
