Not sure if it would be the most efficient, but maybe you can think of the
filesystem as a key-value store, and write each batch to a sub-directory,
where the directory name is the batch time. If the directory already
exists, then you shouldn't write it. Then you may have a following batch
job
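A minimal plain-Python sketch of the batch-directory idea described above (the helper name and file layout are hypothetical, not from the thread): the directory name is the key, and an existing directory marks the batch as already written, making the write idempotent.

```python
import json
import os
import tempfile

def write_batch(root, batch_time, records):
    """Write a batch under root/<batch_time>/, skipping it if already present.

    The directory name acts as the key; the batch contents are the value.
    Returns True if the batch was written, False if it already existed.
    (A production version would write to a temp dir and rename atomically.)
    """
    batch_dir = os.path.join(root, str(batch_time))
    if os.path.exists(batch_dir):
        return False  # batch already written: skip, making the write idempotent
    os.makedirs(batch_dir)
    with open(os.path.join(batch_dir, "data.json"), "w") as f:
        json.dump(records, f)
    return True

root = tempfile.mkdtemp()
print(write_batch(root, 1447430400, [{"id": 1}]))  # True: first write
print(write_batch(root, 1447430400, [{"id": 1}]))  # False: skipped
```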
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 7d4aa69470082c559894fe89f47d594071f30f77
https://github.com/phpmyadmin/phpmyadmin/commit/7d4aa69470082c559894fe89f47d594071f30f77
Author: Burak Yavuz <hitowerdi...@hotmail.com>
Date: 2015
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: a039b6232ae3f45e0115411587f088b419cb3b63
https://github.com/phpmyadmin/phpmyadmin/commit/a039b6232ae3f45e0115411587f088b419cb3b63
Author: Burak Yavuz <hitowerdi...@hotmail.com>
Date: 2015
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: c7378b15cff94f1780969ce4c181d7a5a1c00f2c
https://github.com/phpmyadmin/phpmyadmin/commit/c7378b15cff94f1780969ce4c181d7a5a1c00f2c
Author: Burak Yavuz <hitowerdi...@hotmail.com>
Date: 2015
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/localized_docs
Commit: abb554fa28ccaee33437b0b2d2b1da52a81b912e
https://github.com/phpmyadmin/localized_docs/commit/abb554fa28ccaee33437b0b2d2b1da52a81b912e
Author: Burak Yavuz <hitowerdi...@hotmail.com>
Date:
Burak Yavuz created SPARK-11731:
---
Summary: Enable batching on Driver WriteAheadLog by default
Key: SPARK-11731
URL: https://issues.apache.org/jira/browse/SPARK-11731
Project: Spark
Issue Type
Hi,
The BlockMatrix multiplication should be much more efficient on the current
master (and will be available with Spark 1.6). Could you please give that a
try if you have the chance?
Thanks,
Burak
On Fri, Nov 13, 2015 at 10:11 AM, Sabarish Sasidharan <
sabarish.sasidha...@manthan.com> wrote:
15)
Changed paths:
M po/es.mo
M po/es.po
Log Message:
---
Translated using Weblate (Spanish)
Currently translated at 87.2% (1667 of 1911 strings)
[CI skip]
Commit: 72697ec297bb8c4f31ac3db3df37117aa2feaaeb
https://github.com/phpmyadmin/localized_docs/commit/72697ec
Hi Jakob,
> As another, general question, are spark packages the go-to way of
extending spark functionality?
Definitely. There are ~150 Spark Packages out there on spark-packages.org.
I use a lot of them in everyday Spark work.
The number of released packages has increased at a steady rate over
Burak Yavuz created SPARK-11639:
---
Summary: Flaky test: BatchedWriteAheadLog - name log with
aggregated entries with the timestamp of last entry
Key: SPARK-11639
URL: https://issues.apache.org/jira/browse/SPARK
[
https://issues.apache.org/jira/browse/SPARK-11198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14985701#comment-14985701
]
Burak Yavuz commented on SPARK-11198:
-
Just tested this. It works during regular operation
Burak Yavuz created SPARK-11419:
---
Summary: WriteAheadLog recovery improvements for when
closeFileAfterWrite is enabled
Key: SPARK-11419
URL: https://issues.apache.org/jira/browse/SPARK-11419
Project
[
https://issues.apache.org/jira/browse/SPARK-11198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14982926#comment-14982926
]
Burak Yavuz commented on SPARK-11198:
-
[~boneill42], did you need to do anything special for de
Burak Yavuz created SPARK-11324:
---
Summary: Flag to close Write Ahead Log after writing
Key: SPARK-11324
URL: https://issues.apache.org/jira/browse/SPARK-11324
Project: Spark
Issue Type
Branch: refs/heads/QA_4_5
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 6b5b3b395bb73ff3e6bb1948b1c69d379b78ae7d
https://github.com/phpmyadmin/phpmyadmin/commit/6b5b3b395bb73ff3e6bb1948b1c69d379b78ae7d
Author: Burak Yavuz <hitowerdi...@hotmail.com>
Date: 2015
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 2a1db0a27fcfdfe90e496a8a316642c404f620bf
https://github.com/phpmyadmin/phpmyadmin/commit/2a1db0a27fcfdfe90e496a8a316642c404f620bf
Author: Burak Yavuz <hitowerdi...@hotmail.com>
Date: 2015
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: a2f0f17cab64d953172fed279cf76a104f5f99bd
https://github.com/phpmyadmin/phpmyadmin/commit/a2f0f17cab64d953172fed279cf76a104f5f99bd
Author: Burak Yavuz <hitowerdi...@hotmail.com>
Date: 2015
Burak Yavuz created SPARK-11141:
---
Summary: Batching of ReceivedBlockTrackerLogEvents for efficient
WAL writes
Key: SPARK-11141
URL: https://issues.apache.org/jira/browse/SPARK-11141
Project: Spark
.po
Log Message:
---
Translated using Weblate (Italian)
Currently translated at 100.0% (3211 of 3211 strings)
[CI skip]
Commit: 7aa6cf23ea5035be48a1b8c03548882cd363afcb
https://github.com/phpmyadmin/phpmyadmin/commit/7aa6cf23ea5035be48a1b8c03548882cd363afcb
Author: Burak Yavuz <hitowerdi..
Hi Jerry,
The --packages feature doesn't support private repositories right now.
However, in the case of S3, it might work. Could you please try using
the --repositories flag and provide the address:
`$ spark-submit --packages my:awesome:package --repositories
2b17630752
https://github.com/phpmyadmin/localized_docs/commit/e9ee84ea686d81b08bac7cb1b4e0622b17630752
Author: Burak Yavuz <hitowerdi...@hotmail.com>
Date: 2015-09-30 (Wed, 30 Sep 2015)
Changed paths:
M po/tr.mo
M po/tr.po
Log Message:
---
Translated usi
Burak Yavuz created SPARK-10891:
---
Summary: Add MessageHandler to KinesisUtils.createStream similar
to Direct Kafka
Key: SPARK-10891
URL: https://issues.apache.org/jira/browse/SPARK-10891
Project: Spark
[
https://issues.apache.org/jira/browse/SPARK-10889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14939023#comment-14939023
]
Burak Yavuz commented on SPARK-10889:
-
In addition, KCL 1.4.0 supports de-aggregation of records
[
https://issues.apache.org/jira/browse/SPARK-10599?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Burak Yavuz updated SPARK-10599:
Description:
The BlockMatrix multiply sends each block to all the corresponding columns
Burak Yavuz created SPARK-10599:
---
Summary: Decrease communication in BlockMatrix multiply and
increase performance
Key: SPARK-10599
URL: https://issues.apache.org/jira/browse/SPARK-10599
Project: Spark
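The communication pattern behind this can be seen in a plain-Python sketch of block matrix multiplication (illustrative only, not Spark's BlockMatrix code): every block A[i][k] contributes to every output column j, which is why a naive distributed multiply ships each block of A to all corresponding columns of B.

```python
def mat_mult(a, b):
    """Multiply two dense matrices stored as lists of rows."""
    n, m, p = len(a), len(b), len(b[0])
    return [[sum(a[i][k] * b[k][j] for k in range(m)) for j in range(p)]
            for i in range(n)]

def mat_add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def block_multiply(a_blocks, b_blocks):
    """a_blocks[i][k] and b_blocks[k][j] are dense blocks; returns C's blocks.

    Note that each a_blocks[i][k] is read once per output column j: this is
    the communication a smarter partitioning scheme tries to reduce.
    """
    rows, inner, cols = len(a_blocks), len(b_blocks), len(b_blocks[0])
    c = [[None] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = None
            for k in range(inner):
                prod = mat_mult(a_blocks[i][k], b_blocks[k][j])
                acc = prod if acc is None else mat_add(acc, prod)
            c[i][j] = acc
    return c

# A 2x2 matrix split into 1x1 blocks: A = [[1, 2], [3, 4]], B = [[5, 6], [7, 8]]
a = [[[[1]], [[2]]], [[[3]], [[4]]]]
b = [[[[5]], [[6]]], [[[7]], [[8]]]]
print(block_multiply(a, b))  # [[[[19]], [[22]]], [[[43]], [[50]]]]
```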
ES PLEASE!
>
> :)))
>
> On Tue, Aug 25, 2015 at 1:57 PM, Burak Yavuz <brk...@gmail.com> wrote:
>
>> Hmm. I have a lot of code on the local linear algebra operations using
>> Spark's Matrix and Vector representations
>> done for https://issues.apache.org/jira/bro
+1. Tested complex R package support (Scala + R code); the BLAS and DataFrame
fixes look good.
Burak
On Thu, Sep 3, 2015 at 8:56 AM, mkhaitman wrote:
> Built and tested on CentOS 7, Hadoop 2.7.1 (Built for 2.6 profile),
> Standalone without any problems. Re-tested dynamic
.po
Log Message:
---
Translated using Weblate (Italian)
Currently translated at 100.0% (3209 of 3209 strings)
[CI skip]
Commit: ed21004fdc9a62ca57955780c00fefc198768cde
https://github.com/phpmyadmin/phpmyadmin/commit/ed21004fdc9a62ca57955780c00fefc198768cde
Author: Burak Yavuz <hitowerdi..
Branch: refs/heads/QA_4_5
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 9656d600b992c275a7179414a70990e63a828823
https://github.com/phpmyadmin/phpmyadmin/commit/9656d600b992c275a7179414a70990e63a828823
Author: Burak Yavuz <hitowerdi...@hotmail.com>
Date: 2015
[
https://issues.apache.org/jira/browse/SPARK-10353?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Burak Yavuz updated SPARK-10353:
Affects Version/s: 1.5.0
MLlib BLAS gemm outputs wrong result when beta = 0.0 for transpose
Burak Yavuz created SPARK-10353:
---
Summary: MLlib BLAS gemm outputs wrong result when beta = 0.0 for
transpose transpose matrix multiplication
Key: SPARK-10353
URL: https://issues.apache.org/jira/browse/SPARK-10353
Or you can just call describe() on the DataFrame. In addition to min and max,
you'll also get the mean and the count of non-null, non-NA elements.
Burak
On Fri, Aug 28, 2015 at 10:09 AM, java8964 java8...@hotmail.com wrote:
Or RDD.max() and RDD.min() won't work for you?
Yong
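For reference, a plain-Python sketch of the summary that describe() reports per numeric column (count, mean, stddev, min, max); this is an illustration of the statistics, not Spark code:

```python
import math

def describe(values):
    """Summary stats in the spirit of DataFrame.describe().

    Nulls (None) are skipped; stddev is the sample standard deviation.
    """
    xs = [v for v in values if v is not None]
    n = len(xs)
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1) if n > 1 else 0.0
    return {"count": n, "mean": mean, "stddev": math.sqrt(var),
            "min": min(xs), "max": max(xs)}

print(describe([1.0, 2.0, None, 3.0]))
```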
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/localized_docs
Commit: ee9be6b3473bb3b71f639bfb36be1dba2e55c0a2
https://github.com/phpmyadmin/localized_docs/commit/ee9be6b3473bb3b71f639bfb36be1dba2e55c0a2
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-08
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/localized_docs
Commit: e6440d74dd9e8d50d45de6055a984de582d29684
https://github.com/phpmyadmin/localized_docs/commit/e6440d74dd9e8d50d45de6055a984de582d29684
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-08
Hmm. I have a lot of code on the local linear algebra operations using
Spark's Matrix and Vector representations
done for https://issues.apache.org/jira/browse/SPARK-6442.
I can make a Spark package with that code if people are interested.
Best,
Burak
On Tue, Aug 25, 2015 at 10:54 AM, Kristina
textFile is a lazy operation. It doesn't evaluate until you call an action
on it, such as .count(). Therefore, you won't catch the exception there.
Best,
Burak
On Mon, Aug 24, 2015 at 9:09 AM, Roberto Coluccio
roberto.coluc...@gmail.com wrote:
Hello folks,
I'm experiencing an unexpected
to evaluate the actual result), and there I can observe and
catch the exception. Even considering Spark's laziness, shouldn't I catch
the exception while occurring in the try..catch statement that encloses the
textFile invocation?
Best,
Roberto
On Mon, Aug 24, 2015 at 7:38 PM, Burak Yavuz brk
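The laziness discussed in this thread can be mimicked with a Python generator (an analogy only, not Spark code): building the generator succeeds even for a missing file, and the error only surfaces when the result is consumed, just as a textFile error only surfaces at the first action.

```python
def read_lines(path):
    # The generator body does not run until the first element is requested,
    # much like a Spark transformation does not run until an action is called.
    for line in open(path):
        yield line.rstrip("\n")

gen = read_lines("/no/such/file")   # no exception yet: nothing has executed
try:
    total = sum(1 for _ in gen)     # the "action": iteration finally opens the file
except FileNotFoundError as e:
    print("exception surfaces at the action:", e.filename)
```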
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 70b5c9ddeb96b00090e24de363464b3058dffa41
https://github.com/phpmyadmin/phpmyadmin/commit/70b5c9ddeb96b00090e24de363464b3058dffa41
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-08-22 (Sat
:50 PM, Burak Yavuz wrote:
Matrix.toBreeze is a private method. MLlib matrices have the same
structure as Breeze Matrices. Just create a new Breeze matrix like this
https://github.com/apache/spark/blob/43e0135421b2262cbb0e06aae53523f663b4f959/mllib/src/main/scala/org/apache/spark/mllib/linalg
Matrix.toBreeze is a private method. MLlib matrices have the same structure
as Breeze Matrices. Just create a new Breeze matrix like this
https://github.com/apache/spark/blob/43e0135421b2262cbb0e06aae53523f663b4f959/mllib/src/main/scala/org/apache/spark/mllib/linalg/Matrices.scala#L270
.
Best,
If you would like to try using spark-csv, please use
`pyspark --packages com.databricks:spark-csv_2.11:1.2.0`
You're missing a dependency.
Best,
Burak
On Thu, Aug 20, 2015 at 1:08 PM, Charlie Hack charles.t.h...@gmail.com
wrote:
Hi,
I'm new to spark and am trying to create a Spark df from a
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: eb4b3a9ce6d83d3a990b0a8cf0def9d92de0af1e
https://github.com/phpmyadmin/phpmyadmin/commit/eb4b3a9ce6d83d3a990b0a8cf0def9d92de0af1e
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-08-15 (Sat
I would recommend this spark package for your unit testing needs (
http://spark-packages.org/package/holdenk/spark-testing-base).
Best,
Burak
On Thu, Aug 13, 2015 at 5:51 AM, jay vyas jayunit100.apa...@gmail.com
wrote:
yes there certainly is, so long as eclipse has the right plugins and so on
Burak Yavuz created SPARK-9916:
--
Summary: Clear leftover sparkr.zip copies and creations (e.g.
make-distribution.sh)
Key: SPARK-9916
URL: https://issues.apache.org/jira/browse/SPARK-9916
Project: Spark
Commit: cd80cb61a09618d14a60f6d2f719494924624190
https://github.com/phpmyadmin/phpmyadmin/commit/cd80cb61a09618d14a60f6d2f719494924624190
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-08-09 (Sun, 09 Aug 2015)
Changed paths:
M po/tr.po
Log Message:
---
Translated using Weblate
[
https://issues.apache.org/jira/browse/SPARK-9742?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14662260#comment-14662260
]
Burak Yavuz commented on SPARK-9742:
Did the behavior of Option's change for some
[
https://issues.apache.org/jira/browse/SPARK-9614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14661309#comment-14661309
]
Burak Yavuz commented on SPARK-9614:
It used to work in Spark 1.4, without Tungsten. I
Burak Yavuz created SPARK-9615:
--
Summary: Use rdd.aggregate in FrequentItems
Key: SPARK-9615
URL: https://issues.apache.org/jira/browse/SPARK-9615
Project: Spark
Issue Type: Sub-task
Burak Yavuz created SPARK-9614:
--
Summary: InternalRow representation during
executionPlan.toRdd.aggregate possibly problematic
Key: SPARK-9614
URL: https://issues.apache.org/jira/browse/SPARK-9614
[
https://issues.apache.org/jira/browse/SPARK-9614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14654402#comment-14654402
]
Burak Yavuz commented on SPARK-9614:
cc [~joshrosen]
InternalRow representation
Burak Yavuz created SPARK-9616:
--
Summary: Erroneous result in Frequent Items (SQL) when merging
FrequentItemCounters
Key: SPARK-9616
URL: https://issues.apache.org/jira/browse/SPARK-9616
Project: Spark
Burak Yavuz created SPARK-9603:
--
Summary: Re-enable complex R package test in SparkSubmitSuite
Key: SPARK-9603
URL: https://issues.apache.org/jira/browse/SPARK-9603
Project: Spark
Issue Type
Hi, there was this issue for Scala 2.11.
https://issues.apache.org/jira/browse/SPARK-7944
It should be fixed on master branch. You may be hitting that.
Best,
Burak
On Sun, Aug 2, 2015 at 9:06 PM, Ted Yu yuzhih...@gmail.com wrote:
I tried the following command on master branch:
bin/spark-shell
In addition, you do not need to use --jars with --packages. --packages will
get the jar for you.
Best,
Burak
On Mon, Aug 3, 2015 at 9:01 AM, Burak Yavuz brk...@gmail.com wrote:
Hi, there was this issue for Scala 2.11.
https://issues.apache.org/jira/browse/SPARK-7944
It should be fixed
Hi Yucheng,
Thanks for pointing out the issue. You are correct: in the case that the
final map is completely empty after the merge, we do need to add the final
element to the map with the correct count (decremented by the max count
that was already in the map). I'll submit a fix for
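The decrement logic in question comes from Misra-Gries-style frequent-items counting. A minimal Python sketch of that counting scheme (an illustration of the underlying algorithm, not Spark's actual FrequentItems code or the patch itself):

```python
def mg_add(counters, item, k):
    """Misra-Gries-style frequent-items update with at most k counters."""
    if item in counters:
        counters[item] += 1
    elif len(counters) < k:
        counters[item] = 1
    else:
        # All counters are taken: decrement every counter and drop the ones
        # that reach zero. The new item cancels one occurrence of each.
        for key in list(counters):
            counters[key] -= 1
            if counters[key] == 0:
                del counters[key]
    return counters

c = {}
for x in [1, 1, 1, 2, 2, 3, 4]:
    mg_add(c, x, k=2)
print(c)  # {1: 1}: only the heaviest item survives the decrements
```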
Hey Stephen,
In case these libraries exist on the client as a form of maven library, you
can use --packages to ship the library and all its dependencies, without
building an uber jar.
Best,
Burak
On Tue, Jul 28, 2015 at 10:23 AM, Marcelo Vanzin van...@cloudera.com
wrote:
Hi Stephen,
There
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: cb77ab1e50fb5e3daa92faf96e34b26a1d2d109b
https://github.com/phpmyadmin/phpmyadmin/commit/cb77ab1e50fb5e3daa92faf96e34b26a1d2d109b
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-07-23 (Thu
Hi Jonathan,
I believe calling persist with StorageLevel.NONE doesn't do anything.
That's why the unpersist has an if statement before it.
Could you give more information about your setup please? Number of cores,
memory, number of partitions of ratings_train?
Thanks,
Burak
On Wed, Jul 22, 2015
Burak Yavuz created SPARK-9263:
--
Summary: Add Spark Submit flag to exclude dependencies when using
--packages
Key: SPARK-9263
URL: https://issues.apache.org/jira/browse/SPARK-9263
Project: Spark
Hi,
Could you please decrease your step size to 0.1, and also try 0.01? You
could also try running L-BFGS, which doesn't have step size tuning, to get
better results.
Best,
Burak
On Tue, Jul 21, 2015 at 2:59 AM, Naveen nav...@formcept.com wrote:
Hi ,
I am trying to use
Would monotonicallyIncreasingId
https://github.com/apache/spark/blob/d4c7a7a3642a74ad40093c96c4bf45a62a470605/sql/core/src/main/scala/org/apache/spark/sql/functions.scala#L637
work for you?
Best,
Burak
On Tue, Jul 21, 2015 at 4:55 PM, Srikanth srikanth...@gmail.com wrote:
Hello,
I'm
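The scaladoc for monotonicallyIncreasingId describes its IDs as the partition ID in the upper 31 bits and the per-partition record number in the lower 33 bits. A small Python sketch of that bit layout (illustrative, not Spark code):

```python
def monotonic_ids(partition_id, num_records):
    """Sketch of monotonicallyIncreasingId's layout: partition id in the
    upper 31 bits, row offset within the partition in the lower 33 bits.
    IDs are unique and increasing, but not consecutive across partitions."""
    base = partition_id << 33
    return [base + i for i in range(num_records)]

print(monotonic_ids(0, 3))  # [0, 1, 2]
print(monotonic_ids(1, 3))  # [8589934592, 8589934593, 8589934594]
```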
shuffling given the blocks co-location?
Best regards, Alexander
*From:* Burak Yavuz [mailto:brk...@gmail.com]
*Sent:* Wednesday, July 15, 2015 3:29 PM
*To:* Ulanov, Alexander
*Cc:* Rakesh Chalasani; dev@spark.apache.org
*Subject:* Re: BlockMatrix multiplication
Hi Alexander,
I just
Hi,
I believe the HiveContext uses a different class loader. It then falls back
to the system class loader if it can't find the classes in the context
class loader. The system class loader contains the classpath passed
through --driver-class-path
and spark.executor.extraClassPath. The JVM is
Hi,
Is this in LibSVM format? If so, the indices should be sorted in increasing
order. It seems like they are not sorted.
Best,
Burak
On Tue, Jul 14, 2015 at 7:31 PM, Vi Ngo Van ngovi.se@gmail.com wrote:
Hi All,
I've met an issue with MLlib when I use LogisticRegressionWithLBFGS
my
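A small Python parser illustrating the index-ordering requirement mentioned above (a hypothetical helper for checking input files, not MLlib's loader):

```python
def parse_libsvm_line(line):
    """Parse one LibSVM line ('label idx:val idx:val ...') and verify that
    feature indices are strictly increasing, as MLlib expects."""
    parts = line.split()
    label = float(parts[0])
    indices, values = [], []
    for feat in parts[1:]:
        idx, val = feat.split(":")
        indices.append(int(idx))
        values.append(float(val))
    if any(a >= b for a, b in zip(indices, indices[1:])):
        raise ValueError("feature indices must be strictly increasing")
    return label, indices, values

print(parse_libsvm_line("1 3:0.5 7:1.2"))  # (1.0, [3, 7], [0.5, 1.2])
# parse_libsvm_line("1 7:1.2 3:0.5")       # would raise ValueError
```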
Hi Swetha,
IndexedRDD is available as a package on Spark Packages
http://spark-packages.org/package/amplab/spark-indexedrdd.
Best,
Burak
On Tue, Jul 14, 2015 at 5:23 PM, swetha swethakasire...@gmail.com wrote:
Hi Ankur,
Is IndexedRDD available in Spark 1.4.0? We would like to use this in
() - t) / 1e9)
Best regards, Alexander
*From:* Ulanov, Alexander
*Sent:* Tuesday, July 14, 2015 6:24 PM
*To:* 'Burak Yavuz'
*Cc:* Rakesh Chalasani; dev@spark.apache.org
*Subject:* RE: BlockMatrix multiplication
Hi Burak,
Thank you for explanation! I will try to make a diagonal
Hi,
There is no MLlib support in SparkR in 1.4. There will be some support in
1.5. You can check these JIRAs for progress:
https://issues.apache.org/jira/browse/SPARK-6805
https://issues.apache.org/jira/browse/SPARK-6823
Best,
Burak
On Wed, Jul 15, 2015 at 6:00 AM, madhu phatak
Hi Alexander,
From your example code, using the GridPartitioner, you will have 1 column,
and 5 rows. When you perform an A^T^A multiplication, you will generate a
separate GridPartitioner with 5 columns and 5 rows. Therefore you are
observing a huge shuffle. If you would generate a diagonal-block
Hi Dan,
You could zip the indices with the values if you like.
```
val sVec = sparseVector(1)
  .asInstanceOf[org.apache.spark.mllib.linalg.SparseVector]
val map = sVec.indices.zip(sVec.values).toMap
```
Best,
Burak
On Tue, Jul 14, 2015 at 12:23 PM, Dan Dong dongda...@gmail.com wrote:
Hi,
On Mon, Jul 13, 2015 at 10:28 PM, Burak Yavuz brk...@gmail.com wrote:
Hi,
How are you running K-Means? What is your k? What is the dimension of
your dataset (columns)? Which Spark version are you using?
Thanks,
Burak
On Mon, Jul 13, 2015 at 2:53 AM, Nirmal Fernando nir...@wso2.com wrote
, Nirmal Fernando nir...@wso2.com wrote:
I'm using;
org.apache.spark.mllib.clustering.KMeans.train(data.rdd(), 3, 20);
Cpu cores: 8 (using default Spark conf thought)
On partitions, I'm not sure how to find that.
On Mon, Jul 13, 2015 at 11:30 PM, Burak Yavuz brk...@gmail.com wrote:
What
Hi,
How are you running K-Means? What is your k? What is the dimension of your
dataset (columns)? Which Spark version are you using?
Thanks,
Burak
On Mon, Jul 13, 2015 at 2:53 AM, Nirmal Fernando nir...@wso2.com wrote:
Hi,
For a fairly large dataset, 30MB, KMeansModel.computeCost takes lot
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/localized_docs
Commit: e58f0054418b0442a88effe9437e503d7efc339f
https://github.com/phpmyadmin/localized_docs/commit/e58f0054418b0442a88effe9437e503d7efc339f
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-07
I can +1 Holden's spark-testing-base package.
Burak
On Fri, Jul 10, 2015 at 12:23 PM, Holden Karau hol...@pigscanfly.ca wrote:
Somewhat biased of course, but you can also use spark-testing-base from
spark-packages.org as a basis for your unittests.
On Fri, Jul 10, 2015 at 12:03 PM, Daniel
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 92a63d2cc3cff46b9a5ce087162557c2bb5c729e
https://github.com/phpmyadmin/phpmyadmin/commit/92a63d2cc3cff46b9a5ce087162557c2bb5c729e
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-07-10 (Fri
If you use the Pipelines Api with DataFrames, you select which columns you
would like to train on using the VectorAssembler. While using the
VectorAssembler, you can choose not to select some features if you like.
Best,
Burak
On Thu, Jul 9, 2015 at 10:38 AM, Arun Luthra arun.lut...@gmail.com
+1 nonbinding.
On Thu, Jul 9, 2015 at 7:38 AM, Sean Owen so...@cloudera.com wrote:
+1 nonbinding. All previous RC issues appear resolved. All tests pass
with the -Pyarn -Phadoop-2.6 -Phive -Phive-thriftserver invocation.
Signatures et al are OK.
On Thu, Jul 9, 2015 at 6:55 AM, Patrick
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 3a3c4969ada9bb641abc492e497710287ce3241c
https://github.com/phpmyadmin/phpmyadmin/commit/3a3c4969ada9bb641abc492e497710287ce3241c
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-07-08 (Wed
Branch: refs/heads/QA_4_4
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: f4f94fc4fcac477b229feb3eaf30e0aa74d03fb7
https://github.com/phpmyadmin/phpmyadmin/commit/f4f94fc4fcac477b229feb3eaf30e0aa74d03fb7
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-07-08 (Wed
spark-hive is excluded when using --packages, because it can be included in
the spark-assembly by adding -Phive during mvn package or sbt assembly.
Best,
Burak
On Tue, Jul 7, 2015 at 8:06 AM, Hao Ren inv...@gmail.com wrote:
I want to add spark-hive as a dependence to submit my job, but it
[
https://issues.apache.org/jira/browse/SPARK-6442?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Burak Yavuz updated SPARK-6442:
---
Description:
MLlib's local linear algebra package doesn't have any support for any type of
matrix
How many partitions do you have? It might be that one partition is too
large, and there is an integer overflow. Could you double your number of
partitions?
Burak
On Fri, Jul 3, 2015 at 4:41 AM, Danny kont...@dannylinden.de wrote:
Hi,
I want to run a multiclass classification with 390 classes
Burak Yavuz created SPARK-8803:
--
Summary: Crosstab element's can't contain null's and back ticks
Key: SPARK-8803
URL: https://issues.apache.org/jira/browse/SPARK-8803
Project: Spark
Issue Type
You can use df.repartition(1) in Spark 1.4. See here
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala#L1396
.
Best,
Burak
On Wed, Jul 1, 2015 at 3:05 AM, Olivier Girardot ssab...@gmail.com wrote:
PySpark or Spark (scala) ?
When you use
, Jun 29, 2015 at 11:33 PM, SLiZn Liu sliznmail...@gmail.com wrote:
Hi Burak,
Is `--package` flag only available for maven, no sbt support?
On Tue, Jun 30, 2015 at 2:26 PM Burak Yavuz brk...@gmail.com wrote:
You can pass `--packages your:comma-separated:maven-dependencies` to
spark submit
You can pass `--packages your:comma-separated:maven-dependencies` to spark
submit if you have Spark 1.3 or greater.
Best regards,
Burak
On Mon, Jun 29, 2015 at 10:46 PM, SLiZn Liu sliznmail...@gmail.com wrote:
Hey Spark Users,
I'm writing a demo with Spark and HBase. What I've done is
[
https://issues.apache.org/jira/browse/SPARK-8599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14605777#comment-14605777
]
Burak Yavuz commented on SPARK-8599:
It would be great if it works for this case
[
https://issues.apache.org/jira/browse/SPARK-8410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14605977#comment-14605977
]
Burak Yavuz commented on SPARK-8410:
Hi Joe,
Is it possible to delete those files
[
https://issues.apache.org/jira/browse/SPARK-8475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14605968#comment-14605968
]
Burak Yavuz commented on SPARK-8475:
ping. I think you can go ahead with a PR
[
https://issues.apache.org/jira/browse/SPARK-8410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14606080#comment-14606080
]
Burak Yavuz commented on SPARK-8410:
Hi Joe,
Could you please check whether
https
Burak Yavuz created SPARK-8715:
--
Summary: ArrayOutOfBoundsException for DataFrameStatSuite.crosstab
Key: SPARK-8715
URL: https://issues.apache.org/jira/browse/SPARK-8715
Project: Spark
Issue
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/phpmyadmin
Commit: 68116ae860346108aa0850ffcd49c5241759562c
https://github.com/phpmyadmin/phpmyadmin/commit/68116ae860346108aa0850ffcd49c5241759562c
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-06-29 (Mon
Burak Yavuz created SPARK-8681:
--
Summary: crosstab column names in wrong order
Key: SPARK-8681
URL: https://issues.apache.org/jira/browse/SPARK-8681
Project: Spark
Issue Type: Sub-task
Branch: refs/heads/master
Home: https://github.com/phpmyadmin/localized_docs
Commit: 1e0bb6be29e91e6e47ad7f95169c4d1c9d92cfe7
https://github.com/phpmyadmin/localized_docs/commit/1e0bb6be29e91e6e47ad7f95169c4d1c9d92cfe7
Author: Burak Yavuz hitowerdi...@hotmail.com
Date: 2015-06
Burak Yavuz created SPARK-8608:
--
Summary: After initializing a DataFrame with random columns and a
seed, df.show should return same value
Key: SPARK-8608
URL: https://issues.apache.org/jira/browse/SPARK-8608
Burak Yavuz created SPARK-8609:
--
Summary: After initializing a DataFrame with random columns and a
seed, ordering by that random column should return same sorted order
Key: SPARK-8609
URL: https://issues.apache.org
[
https://issues.apache.org/jira/browse/SPARK-8599?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14600315#comment-14600315
]
Burak Yavuz commented on SPARK-8599:
cc [~marmbrus] [~rxin]
Use a Random operator
Hi Wei,
For example, when a straggler executor gets killed in the middle of a map
operation and its task is restarted at a different instance, the
accumulator will be updated more than once.
Best,
Burak
On Wed, Jun 24, 2015 at 1:08 PM, Wei Zhou zhweisop...@gmail.com wrote:
Quoting from Spark
the transformation ended up updating accumulator more than
once?
Best,
Wei
2015-06-24 13:23 GMT-07:00 Burak Yavuz brk...@gmail.com:
Hi Wei,
For example, when a straggler executor gets killed in the middle of a map
operation and its task is restarted at a different instance, the
accumulator
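A toy simulation of why accumulator updates inside transformations are not exactly-once (illustrative Python, not Spark's scheduler): if a task attempt updates the accumulator and then fails, the retry updates it again.

```python
class Accumulator:
    """Toy driver-side accumulator: every task attempt's update is merged,
    whether or not the attempt is a retry, mirroring the at-least-once
    semantics of accumulators used inside transformations."""
    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n

def run_task(acc, data, fail_first_attempt):
    attempts = 0
    while True:
        attempts += 1
        acc.add(len(data))          # side effect happens before the failure
        if fail_first_attempt and attempts == 1:
            continue                # executor died; scheduler retries the task
        return attempts

acc = Accumulator()
run_task(acc, [1, 2, 3], fail_first_attempt=False)
run_task(acc, [4, 5], fail_first_attempt=True)   # retried once
print(acc.value)  # 3 + 2 + 2 = 7, though only 5 elements were processed
```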
Hi Ryan,
If you can get past the paperwork, I'm sure this can make a great Spark
Package (http://spark-packages.org). People then can use it for
benchmarking purposes, and I'm sure people will be looking for graph
generators!
Best,
Burak
On Wed, Jun 24, 2015 at 7:55 AM, Carr, J. Ryan