Branch: refs/heads/master
Home: https://github.com/phpmyadmin/localized_docs
Commit: 4c1de76c08153d1db8f3132a2c994a27e89a4701
https://github.com/phpmyadmin/localized_docs/commit/4c1de76c08153d1db8f3132a2c994a27e89a4701
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-02
[
https://issues.apache.org/jira/browse/SPARK-5811?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14323795#comment-14323795
]
Burak Yavuz commented on SPARK-5811:
The documentation is not really blocked, but I
Burak Yavuz created SPARK-5857:
--
Summary: pyspark PYTHONPATH not properly set up?
Key: SPARK-5857
URL: https://issues.apache.org/jira/browse/SPARK-5857
Project: Spark
Issue Type: Bug
[
https://issues.apache.org/jira/browse/SPARK-5810?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14323474#comment-14323474
]
Burak Yavuz commented on SPARK-5810:
Makes sense to add a regression test. I'll add
Burak Yavuz created SPARK-5810:
--
Summary: Maven Coordinate Inclusion failing in pySpark
Key: SPARK-5810
URL: https://issues.apache.org/jira/browse/SPARK-5810
Project: Spark
Issue Type: Bug
Burak Yavuz created SPARK-5811:
--
Summary: Documentation for --packages and --repositories on Spark
Shell
Key: SPARK-5811
URL: https://issues.apache.org/jira/browse/SPARK-5811
Project: Spark
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 63321222cd0e528555a6353d2d4e937216ef391c
https://github.com/phpmyadmin/phpmyadmin/commit/63321222cd0e528555a6353d2d4e937216ef391c
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-02-12 (Thu
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 96b710be3d018132ab8c2cf9501ccb31d6ad2e68
https://github.com/phpmyadmin/phpmyadmin/commit/96b710be3d018132ab8c2cf9501ccb31d6ad2e68
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-02-12 (Thu
wrote:
Thanks a lot!
Can I ask why this code generates a uniform distribution?
If dist is N(0,1) data should be N(-1, 2).
Let me know.
Thanks,
Luca
2015-02-07 3:00 GMT+00:00 Burak Yavuz brk...@gmail.com:
Hi,
You can do the following:
```
import
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 09539d41e6ce4eca62ff02b0ecd47bcbfe3c2fee
https://github.com/phpmyadmin/phpmyadmin/commit/09539d41e6ce4eca62ff02b0ecd47bcbfe3c2fee
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-02-09 (Mon
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: d53e064ee7a01f8768e7453d0eef73b2921b44be
https://github.com/phpmyadmin/phpmyadmin/commit/d53e064ee7a01f8768e7453d0eef73b2921b44be
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-02-08 (Sun
Forgot to add the more recent training material:
https://databricks-training.s3.amazonaws.com/index.html
On Fri, Feb 6, 2015 at 12:12 PM, Burak Yavuz brk...@gmail.com wrote:
Hi Luca,
You can tackle this using RowMatrix (spark-shell example):
```
import
Hi Luca,
You can tackle this using RowMatrix (spark-shell example):
```
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.mllib.random._
// sc is the spark context, numPartitions is the number of partitions you want the RDD to be in
val data: RDD[Vector] =
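// --- Editor's note: the snippet above is truncated here. What follows is a
// hedged sketch of how such a spark-shell session might continue, assuming the
// goal is simply to wrap an RDD[Vector] in a RowMatrix; the sizes and the
// column-statistics call are illustrative, not from the original message. ---
import org.apache.spark.mllib.linalg.Vector
import org.apache.spark.rdd.RDD

val n = 10000L          // number of rows (illustrative)
val k = 50              // number of columns (illustrative)
val numPartitions = 8   // e.g. the number of cores available

// Generate random vectors and wrap them in a distributed RowMatrix
val data: RDD[Vector] = RandomRDDs.normalVectorRDD(sc, n, k, numPartitions)
val mat = new RowMatrix(data)

// One possible next step: per-column summary statistics
val summary = mat.computeColumnSummaryStatistics()
println(summary.mean)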
Hi,
You can do the following:
```
import org.apache.spark.mllib.linalg.distributed.RowMatrix
import org.apache.spark.mllib.random._
// sc is the spark context, numPartitions is the number of partitions you want the RDD to be in
val dist: RDD[Vector] = RandomRDDs.normalVectorRDD(sc, n, k,
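// --- Editor's note: the reply is truncated at the normalVectorRDD call above.
// A hedged sketch of the pattern discussed in this thread follows; n, k and
// numPartitions are illustrative, and the affine transform is only one way to
// turn N(0,1) samples into N(-1, 2) samples (reading the 2 as a standard
// deviation), not necessarily what the original code did. ---
import org.apache.spark.mllib.linalg.{Vector, Vectors}
import org.apache.spark.rdd.RDD

val n = 10000L
val k = 10
val numPartitions = 8

// Each entry is drawn from N(0, 1)
val dist: RDD[Vector] = RandomRDDs.normalVectorRDD(sc, n, k, numPartitions)

// Element-wise shift and scale: x -> sigma * x + mu is distributed N(mu, sigma^2)
val mu = -1.0
val sigma = 2.0
val data: RDD[Vector] = dist.map(v => Vectors.dense(v.toArray.map(x => sigma * x + mu)))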
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 13d1c0dacda739d0c6af60097be3788f01ca2964
https://github.com/phpmyadmin/phpmyadmin/commit/13d1c0dacda739d0c6af60097be3788f01ca2964
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-02-04 (Wed
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: b4d7a519fa2825bf91611d98c3112679b1b5cba9
https://github.com/phpmyadmin/phpmyadmin/commit/b4d7a519fa2825bf91611d98c3112679b1b5cba9
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-01-28 (Wed
Burak Yavuz created SPARK-5341:
--
Summary: Support maven coordinates in spark-shell and spark-submit
Key: SPARK-5341
URL: https://issues.apache.org/jira/browse/SPARK-5341
Project: Spark
Issue
Burak Yavuz created SPARK-5322:
--
Summary: Add transpose() to BlockMatrix
Key: SPARK-5322
URL: https://issues.apache.org/jira/browse/SPARK-5322
Project: Spark
Issue Type: New Feature
Burak Yavuz created SPARK-5321:
--
Summary: Add transpose() method to Matrix
Key: SPARK-5321
URL: https://issues.apache.org/jira/browse/SPARK-5321
Project: Spark
Issue Type: New Feature
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 22e00e6a3578de1aede0ce06ef9e327c4bbe3f28
https://github.com/phpmyadmin/phpmyadmin/commit/22e00e6a3578de1aede0ce06ef9e327c4bbe3f28
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-01-09 (Fri
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 6f8431a71d935b9710d8f5148b3941f21408052d
https://github.com/phpmyadmin/phpmyadmin/commit/6f8431a71d935b9710d8f5148b3941f21408052d
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-01-08 (Thu
Branch: refs/heads/QA_4_3
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 2eddd0dc06e3f5ce3899fd2436b6b5541fcbcbfc
https://github.com/phpmyadmin/phpmyadmin/commit/2eddd0dc06e3f5ce3899fd2436b6b5541fcbcbfc
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-01-01 (Thu
Branch: refs/heads/QA_4_3
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 530c04d14a9de6ba9b287b2a98306a09d04ee055
https://github.com/phpmyadmin/phpmyadmin/commit/530c04d14a9de6ba9b287b2a98306a09d04ee055
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-01-01 (Thu
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: f492a2197d598a1618836719a47beaf16874ecfd
https://github.com/phpmyadmin/phpmyadmin/commit/f492a2197d598a1618836719a47beaf16874ecfd
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-01-01 (Thu
Branch: refs/heads/QA_4_3
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: d26bffd0ae44354c4f47e6852368c48166e1ab1f
https://github.com/phpmyadmin/phpmyadmin/commit/d26bffd0ae44354c4f47e6852368c48166e1ab1f
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-01-01 (Thu
Hi,
The MatrixFactorizationModel consists of two RDDs. When you use the second
method, Spark tries to serialize both RDDs for the .map() function,
which is not possible, because RDDs are not serializable. Therefore you
receive the NullPointerException. You must use the first method.
Best,
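A minimal sketch of the distinction being described, assuming `model` is a trained MatrixFactorizationModel and `ratings` is an RDD[Rating]; the variable names are illustrative and the failing variant is shown only as a comment:
```
import org.apache.spark.mllib.recommendation.{MatrixFactorizationModel, Rating}
import org.apache.spark.rdd.RDD

// Batch prediction: the model stays on the driver and only (user, product)
// pairs are shipped to the executors.
val usersProducts: RDD[(Int, Int)] = ratings.map(r => (r.user, r.product))
val predictions: RDD[Rating] = model.predict(usersProducts)

// The variant below fails: the closure captures `model`, whose internal
// user/product factor RDDs cannot be serialized.
// ratings.map { r => model.predict(r.user, r.product) }
```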
Branch: refs/heads/QA_4_3
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 0e0eda5ff1f54eb07b26e9c46db734ff1eee966c
https://github.com/phpmyadmin/phpmyadmin/commit/0e0eda5ff1f54eb07b26e9c46db734ff1eee966c
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-12-16 (Tue
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/localized_docs
Commit: 3e6f0edfc6e9be3c8cd45c4cb82b8d39afe8c9e6
https://github.com/phpmyadmin/localized_docs/commit/3e6f0edfc6e9be3c8cd45c4cb82b8d39afe8c9e6
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-12
Hi,
https://github.com/databricks/spark-perf/tree/master/streaming-tests/src/main/scala/streaming/perf
contains some performance tests for streaming. There are examples of how to
generate synthetic files during the test in that repo, maybe you
can find some code snippets that you can use there.
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: eed0ffa96b6ee739036175912c32fca25985bead
https://github.com/phpmyadmin/phpmyadmin/commit/eed0ffa96b6ee739036175912c32fca25985bead
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-11-26 (Wed
Burak Yavuz created SPARK-4409:
--
Summary: Additional (but limited) Linear Algebra Utils
Key: SPARK-4409
URL: https://issues.apache.org/jira/browse/SPARK-4409
Project: Spark
Issue Type
[
https://issues.apache.org/jira/browse/SPARK-4409?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Burak Yavuz updated SPARK-4409:
---
Description:
This ticket is to discuss the addition of a very limited number of local matrix
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 346b62740ab25f5d325f4aa74aeadd8aad7236c4
https://github.com/phpmyadmin/phpmyadmin/commit/346b62740ab25f5d325f4aa74aeadd8aad7236c4
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-11-04 (Tue
[
https://issues.apache.org/jira/browse/SPARK-3974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14192731#comment-14192731
]
Burak Yavuz commented on SPARK-3974:
Hi everyone,
The design doc for Block Matrix
Hi,
I've come across this multiple times, but not in a consistent manner. I found
it hard to reproduce. I have a jira for it: SPARK-3080
Do you observe this error every single time? Where do you load your data from?
Which version of Spark are you running?
Figuring out the similarities may
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 7180bb0f150e81dc6ceb0ff1e582bd85fdb69306
https://github.com/phpmyadmin/phpmyadmin/commit/7180bb0f150e81dc6ceb0ff1e582bd85fdb69306
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-10-16 (Thu
Hi Ray,
The reduceByKey / collectAsMap step does a lot of calculation. Therefore it can
take a very long time if:
1) The number-of-runs parameter is set very high
2) k is set high (you have observed this already)
3) data is not properly repartitioned
It seems that it is hanging, but there is a lot
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/localized_docs
Commit: ac28dedf064d6f2064afb17d3311a929edd95dad
https://github.com/phpmyadmin/localized_docs/commit/ac28dedf064d6f2064afb17d3311a929edd95dad
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-10
[
https://issues.apache.org/jira/browse/SPARK-3434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14167152#comment-14167152
]
Burak Yavuz commented on SPARK-3434:
[~ConcreteVitamin], any updates? Anything I can
Branch: refs/heads/QA_4_2
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 2079e9cd9abf4d76e50494ce4bf8f7c1d4999164
https://github.com/phpmyadmin/phpmyadmin/commit/2079e9cd9abf4d76e50494ce4bf8f7c1d4999164
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-10-06 (Mon
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/localized_docs
Commit: 38df143ca748c7a5236c70cb0c715ea948195184
https://github.com/phpmyadmin/localized_docs/commit/38df143ca748c7a5236c70cb0c715ea948195184
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-10
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 5288df43097df61237fe4d9320a56b0886ed11db
https://github.com/phpmyadmin/phpmyadmin/commit/5288df43097df61237fe4d9320a56b0886ed11db
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-10-02 (Thu
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/localized_docs
Commit: 1169c49661f124d4d617d1316d62404d598d30bf
https://github.com/phpmyadmin/localized_docs/commit/1169c49661f124d4d617d1316d62404d598d30bf
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-10
Hi,
It appears that the step size is so high that the model is diverging with the
added noise.
Could you try by setting the step size to be 0.1 or 0.01?
Best,
Burak
- Original Message -
From: Krishna Sankar ksanka...@gmail.com
To: user@spark.apache.org
Sent: Wednesday, October 1,
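For concreteness, a hedged example of lowering the step size when training with SGD; LinearRegressionWithSGD is used only as a stand-in, since the original thread does not show which algorithm was being run, and the input path and iteration count are illustrative:
```
import org.apache.spark.mllib.regression.LinearRegressionWithSGD
import org.apache.spark.mllib.util.MLUtils

// training: RDD[LabeledPoint] loaded from a hypothetical LIBSVM file
val training = MLUtils.loadLibSVMFile(sc, "data/training.libsvm").cache()

val numIterations = 100
val stepSize = 0.01   // the smaller step size suggested above
val model = LinearRegressionWithSGD.train(training, numIterations, stepSize)
```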
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/localized_docs
Commit: 1c004d7e341e8e0d4b5c17dcdc64181220725193
https://github.com/phpmyadmin/localized_docs/commit/1c004d7e341e8e0d4b5c17dcdc64181220725193
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-09
[
https://issues.apache.org/jira/browse/SPARK-3631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14143484#comment-14143484
]
Burak Yavuz commented on SPARK-3631:
Thanks for setting this up [~aash]! [~pwendell
Hi,
spark-1.0.1/examples/src/main/python/kmeans.py => Naive example for users to
understand how to code in Spark
spark-1.0.1/python/pyspark/mllib/clustering.py => Use this!!!
Bonus: spark-1.0.1/examples/src/main/python/mllib/kmeans.py => Example on how
to call KMeans. Feel free to use it as a
Hi,
I believe it's because you're trying to use a Function of an RDD, in an RDD,
which is not possible. Instead of using a
`Function<JavaRDD<Float>, Void>`, could you try `Function<Float, Void>`, and
`public Void call(Float arg0) throws Exception { `
and
`System.out.println(arg0)`
instead. I'm not perfectly sure
Hi Gilberto,
Could you please attach the driver logs as well, so that we can pinpoint what's
going wrong? Could you also add the flag
`--driver-memory 4g` while submitting your application and try that as well?
Best,
Burak
- Original Message -
From: Gilberto Lira g...@scanboo.com.br
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/localized_docs
Commit: a01814147d950fa4fa4a4a9006a7c5690a9701b6
https://github.com/phpmyadmin/localized_docs/commit/a01814147d950fa4fa4a4a9006a7c5690a9701b6
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-09
Hi,
The files you mentioned are temporary files written by Spark during shuffling.
ALS will write a LOT of those files as it is a shuffle-heavy algorithm.
Those files will be deleted after your program completes as Spark looks for
those files in case a fault occurs. Having those files ready in
the directory will not be enough.
Best,
Burak
- Original Message -
From: Andrew Ash and...@andrewash.com
To: Burak Yavuz bya...@stanford.edu
Cc: Макар Красноперов connector@gmail.com, user
user@spark.apache.org
Sent: Wednesday, September 17, 2014 10:19:42 AM
Subject: Re: Spark and disk usage.
Hi
Streaming, and some MLlib algorithms.
If you can help with the guide, I think it would be a nice feature to have!
Burak
- Original Message -
From: Andrew Ash and...@andrewash.com
To: Burak Yavuz bya...@stanford.edu
Cc: Макар Красноперов connector@gmail.com, user
user@spark.apache.org
Hi,
Could you try repartitioning the data by .repartition(# of cores on machine) or
while reading the data, supply the number of minimum partitions as in
sc.textFile(path, # of cores on machine).
It may be that the whole data is stored in one block? If it is billions of
rows, then the indexing
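A short sketch of the two suggestions above, with a hypothetical input path and core count:
```
// Illustrative: an 8-core machine and a hypothetical input path
val minPartitions = 8
val rdd = sc.textFile("hdfs:///data/input.txt", minPartitions)  // partition hint at read time

// Or repartition an already-loaded RDD
val balanced = rdd.repartition(minPartitions)
```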
Hi,
The spacing between the inputs should be a single space, not a tab. I feel like
your inputs have tabs between them instead of a single space. Therefore the
parser cannot parse the input.
Best,
Burak
- Original Message -
From: Sameer Tilak ssti...@live.com
To: user@spark.apache.org
Hi Kyle,
I'm actively working on it now. It's pretty close to completion, I'm just
trying to figure out bottlenecks and optimize as much as possible.
As Phase 1, I implemented multi model training on Gradient Descent. Instead of
performing Vector-Vector operations on rows (examples) and
Hi,
I'm not a master on SparkSQL, but from what I understand, the problem ıs that
you're trying to access an RDD
inside an RDD here: val xyz = file.map(line => ***
extractCurRate(sqlContext.sql(select rate ... *** and
here: xyz = file.map(line => *** extractCurRate(sqlContext.sql(select rate
Hi,
val test = persons.value
.map{tuple => (tuple._1, tuple._2
.filter{event => *inactiveIDs.filter(event2 => event2._1 ==
tuple._1).count() != 0})}
Your problem is right between the asterisks. You can't make an RDD operation
inside an RDD operation, because RDDs can't be serialized.
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/localized_docs
Commit: 6d551e2fce7ea6e02e1194acc6a800a1af836b5b
https://github.com/phpmyadmin/localized_docs/commit/6d551e2fce7ea6e02e1194acc6a800a1af836b5b
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-09
Burak Yavuz created SPARK-3418:
--
Summary: Additional BLAS and Local Sparse Matrix support
Key: SPARK-3418
URL: https://issues.apache.org/jira/browse/SPARK-3418
Project: Spark
Issue Type: New
[
https://issues.apache.org/jira/browse/SPARK-3418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Burak Yavuz updated SPARK-3418:
---
Summary: [MLlib] Additional BLAS and Local Sparse Matrix support (was:
Additional BLAS and Local
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/localized_docs
Commit: 1e0179a5b88ed87450de23a73fac265a686d0476
https://github.com/phpmyadmin/localized_docs/commit/1e0179a5b88ed87450de23a73fac265a686d0476
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-08
[
https://issues.apache.org/jira/browse/SPARK-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Burak Yavuz updated SPARK-3280:
---
Attachment: hash-sort-comp.png
Made sort-based shuffle the default implementation
[
https://issues.apache.org/jira/browse/SPARK-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14114873#comment-14114873
]
Burak Yavuz commented on SPARK-3280:
I don't have as detailed a comparison as Josh
+1. Tested MLlib algorithms on Amazon EC2, algorithms show speed-ups between
1.5-5x compared to the 1.0.2 release.
- Original Message -
From: Patrick Wendell pwend...@gmail.com
To: dev@spark.apache.org
Sent: Thursday, August 28, 2014 8:32:11 PM
Subject: Re: [VOTE] Release Apache Spark
Yeah, saveAsTextFile is an RDD specific method. If you really want to use that
method, just turn the map into an RDD:
`sc.parallelize(x.toSeq).saveAsTextFile(...)`
Reading through the api-docs will present you many more alternate solutions!
Best,
Burak
- Original Message -
From: SK
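A self-contained version of the one-liner above, assuming x is a small local Map on the driver (for example the result of collectAsMap()) and a hypothetical output path:
```
// x lives on the driver; parallelize turns it back into an RDD for saving
val x = Map("a" -> 1, "b" -> 2)
sc.parallelize(x.toSeq).saveAsTextFile("hdfs:///tmp/map-output")
```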
Hi,
By default, Spark uses approximately 60% of the executor heap memory to store
RDDs. That's why you have 8.6GB instead of 16GB. 95.5 is therefore the sum of
all the 8.6 GB of executor memory + the driver memory.
Best,
Burak
- Original Message -
From: SK skrishna...@gmail.com
To:
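For reference, with the legacy Spark 1.x memory defaults (spark.storage.memoryFraction = 0.6 and spark.storage.safetyFraction = 0.9), a 16 GB executor heap yields roughly 16 GB × 0.6 × 0.9 ≈ 8.6 GB of storage memory, which matches the figure quoted above; this assumes default settings and pre-1.6 memory management.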
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: d0e0ed047816fa84ce213df88be75670c765eeb5
https://github.com/phpmyadmin/phpmyadmin/commit/d0e0ed047816fa84ce213df88be75670c765eeb5
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-08-27 (Wed
Hi Sameer,
I've faced this issue before. They don't show up on
http://s3.amazonaws.com/big-data-benchmark/. But you can directly use:
`sc.textFile("s3n://big-data-benchmark/pavlo/text/tiny/crawl")`
The gotcha is that you also need to supply which dataset you want: crawl,
uservisits, or rankings
Hi,
The error doesn't occur during saveAsTextFile but rather during the groupByKey
as far as I can tell. We strongly urge users to not use groupByKey
if they don't have to. What I would suggest is the following work-around:
sc.textFile(baseFile).map { line =>
val fields = line.split("\t")
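The work-around itself is cut off by the archive. A hedged sketch of one common groupByKey replacement (a per-key aggregation with reduceByKey); the field layout, the aggregate, and the output path are illustrative, not from the original message, while baseFile is the input path from the snippet above:
```
// Aggregate per key instead of materializing each group's full value list
sc.textFile(baseFile).map { line =>
  val fields = line.split("\t")
  (fields(0), fields(1).toDouble)
}.reduceByKey(_ + _)
 .saveAsTextFile("hdfs:///tmp/aggregated")
```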
Hi David,
Your job is probably hanging on the groupByKey process. Probably GC is kicking
in and the process starts to hang or the data is unbalanced and you end up with
stragglers (Once GC kicks in you'll start to get the connection errors you
shared). If you don't care about the list of
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 9cedf0a58feaad7604cdf7d09828854c11c630e6
https://github.com/phpmyadmin/phpmyadmin/commit/9cedf0a58feaad7604cdf7d09828854c11c630e6
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-08-25 (Mon
Spearman's Correlation requires the calculation of ranks for columns. You can
checkout the code here and slice the part you need!
https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/mllib/stat/correlation/SpearmanCorrelation.scala
Best,
Burak
- Original
You can check out this pull request: https://github.com/apache/spark/pull/476
LDA is on the roadmap for the 1.2 release, hopefully we will officially support
it then!
Best,
Burak
- Original Message -
From: Denny Lee denny.g@gmail.com
To: user@spark.apache.org
Sent: Thursday, August
https://github.com/phpmyadmin/phpmyadmin/commit/21a01002926cd479b2e2592b4fbea827509fed14
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-08-16 (Sat, 16 Aug 2014)
Changed paths:
M po/tr.po
Log Message:
---
Translated using Weblate (Turkish)
Currently
Burak Yavuz created SPARK-3080:
--
Summary: ArrayIndexOutOfBoundsException in ALS for Large datasets
Key: SPARK-3080
URL: https://issues.apache.org/jira/browse/SPARK-3080
Project: Spark
Issue
[
https://issues.apache.org/jira/browse/SPARK-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Burak Yavuz updated SPARK-3080:
---
Description:
The stack trace is below:
{quote}
java.lang.ArrayIndexOutOfBoundsException: 2716
[
https://issues.apache.org/jira/browse/SPARK-3080?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Burak Yavuz updated SPARK-3080:
---
Description:
The stack trace is below:
{quote}
java.lang.ArrayIndexOutOfBoundsException: 2716
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 2b61fb1281e094a885f580f83d8381c7cca8bb04
https://github.com/phpmyadmin/phpmyadmin/commit/2b61fb1281e094a885f580f83d8381c7cca8bb04
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-08-13 (Wed
[
https://issues.apache.org/jira/browse/SPARK-2833?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Burak Yavuz resolved SPARK-2833.
Resolution: Fixed
performance tests for linear regression
[
https://issues.apache.org/jira/browse/SPARK-2837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Burak Yavuz resolved SPARK-2837.
Resolution: Done
performance tests for ALS
-
Key
[
https://issues.apache.org/jira/browse/SPARK-2836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Burak Yavuz closed SPARK-2836.
--
Resolution: Fixed
performance tests for k-means
-
Key
[
https://issues.apache.org/jira/browse/SPARK-2834?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Burak Yavuz resolved SPARK-2834.
Resolution: Fixed
performance tests for linear algebra functions
[
https://issues.apache.org/jira/browse/SPARK-2829?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Burak Yavuz resolved SPARK-2829.
Resolution: Fixed
Implement MLlib performance tests in spark-perf
[
https://issues.apache.org/jira/browse/SPARK-2831?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Burak Yavuz resolved SPARK-2831.
Resolution: Fixed
performance tests for linear classification methods
Hi,
// Initialize the optimizer using logistic regression as the loss function with L2 regularization
val lbfgs = new LBFGS(new LogisticGradient(), new SquaredL2Updater())
// Set the hyperparameters
[
https://issues.apache.org/jira/browse/SPARK-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14090498#comment-14090498
]
Burak Yavuz commented on SPARK-2916:
will do
[MLlib] While running regression tests
[
https://issues.apache.org/jira/browse/SPARK-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Burak Yavuz updated SPARK-2916:
---
Description:
While running any of the regression algorithms with gradient descent
[
https://issues.apache.org/jira/browse/SPARK-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Burak Yavuz updated SPARK-2916:
---
Component/s: Spark Core
[MLlib] While running regression tests with dense vectors of length greater
[
https://issues.apache.org/jira/browse/SPARK-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Burak Yavuz updated SPARK-2916:
---
Summary: [MLlib] While running regression tests with dense vectors of
length greater than 1000
Burak Yavuz created SPARK-2916:
--
Summary: While running regression tests with dense vectors of
length greater than 1000, the treeAggregate blows up after several iterations
Key: SPARK-2916
URL: https
[
https://issues.apache.org/jira/browse/SPARK-2916?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Burak Yavuz updated SPARK-2916:
---
Description:
While running any of the regression algorithms with gradient descent
Hi,
Could you try running spark-shell with the flag --driver-memory 2g or more if
you have more RAM available and try again?
Thanks,
Burak
- Original Message -
From: AlexanderRiggers alexander.rigg...@gmail.com
To: u...@spark.incubator.apache.org
Sent: Thursday, August 7, 2014 7:37:40
Hi Jay,
I've had the same problem you've been having in Question 1 with a synthetic
dataset. I thought I wasn't producing the dataset well enough. This seems to
be a bug. I will open a JIRA for it.
Instead of using:
ratings.map{ case Rating(u,m,r) => {
val pred = model.predict(u, m)
(r
The following code will allow you to run Logistic Regression using L-BFGS:
val lbfgs = new LBFGS(new LogisticGradient(), new SquaredL2Updater())
lbfgs.setMaxNumIterations(numIterations).setRegParam(regParam).setConvergenceTol(tol).setNumCorrections(numCor)
val weights = lbfgs.optimize(data,
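The snippet is cut off at the optimize call. A fuller hedged sketch of the same pattern, with illustrative hyperparameter values, a sample LIBSVM input path, and a zero initial-weights vector, none of which are from the original message:
```
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.optimization.{LBFGS, LogisticGradient, SquaredL2Updater}
import org.apache.spark.mllib.util.MLUtils

// (label, features) pairs, the input format LBFGS.optimize expects
val data = MLUtils.loadLibSVMFile(sc, "data/sample_libsvm_data.txt")
  .map(lp => (lp.label, lp.features))
  .cache()

val numFeatures = data.first()._2.size
val lbfgs = new LBFGS(new LogisticGradient(), new SquaredL2Updater())
lbfgs.setMaxNumIterations(100).setRegParam(0.1).setConvergenceTol(1e-4).setNumCorrections(10)

// Returns the weight vector that minimizes the regularized logistic loss
val weights = lbfgs.optimize(data, Vectors.zeros(numFeatures))
```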
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/localized_docs
Commit: 96fe3575d28d71967ff6d906c4cc1c720014427e
https://github.com/phpmyadmin/localized_docs/commit/96fe3575d28d71967ff6d906c4cc1c720014427e
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-08
Hi,
That is interesting. Would you please share some code on how you are setting
the regularization type, regularization parameters and running Logistic
Regression?
Thanks,
Burak
- Original Message -
From: SK skrishna...@gmail.com
To: u...@spark.incubator.apache.org
Sent: Wednesday,
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: eab70ea7b0e1e17034ffb90fe246cc836e76fd97
https://github.com/phpmyadmin/phpmyadmin/commit/eab70ea7b0e1e17034ffb90fe246cc836e76fd97
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-08-05 (Tue
Hi Guru,
Take a look at:
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark
It has all the information you need on how to contribute to Spark. Also take a
look at:
https://issues.apache.org/jira/browse/SPARK/?selectedTab=com.atlassian.jira.jira-projects-plugin:summary-panel
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: bc01c12eefc26e03088e30f36fe84cd1e727379c
https://github.com/phpmyadmin/phpmyadmin/commit/bc01c12eefc26e03088e30f36fe84cd1e727379c
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2014-08-04 (Mon