Is there any way to select columns of Dataset in addition to the combination of `expr` and `as`?

2015-12-18 Thread Yu Ishikawa
").show :34: error: type mismatch; found : String("id") required: org.apache.spark.sql.TypedColumn[Person,?] ds.select("id").show ``` Best, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Is-there-any-way-to-selec

How do we convert a Dataset includes timestamp columns to RDD?

2015-12-16 Thread Yu Ishikawa
erialize(JavaSerializer.scala:100) at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:301) ... 68 more ``` Thanks, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/How-do-we-convert-a-Dataset-includes

Re: [ANNOUNCE] Announcing Spark 1.5.0

2015-09-09 Thread Yu Ishikawa
Great work, everyone! - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/ANNOUNCE-Announcing-Spark-1-5-0-tp14013p14015.html Sent from the Apache Spark Developers List mailing list archive at Nabble.com

[SparkR] lint script for SpakrR

2015-09-01 Thread Yu Ishikawa
Shivaram and Josh, I couldn't have done it without you. Thanks Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/SparkR-lint-script-for-SpakrR-tp13923.html

Is `dev/lint-python` broken?

2015-07-27 Thread Yu Ishikawa
: `easy_install -d $PYLINT_HOME pylint==1.4.4 $PYLINT_INSTALL_INFO' ``` If the redirect is a syntax error, I'll send a PR to fix. Thanks, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Is-dev-lint-python-broken-tp13439.html

Re: Is `dev/lint-python` broken?

2015-07-27 Thread Yu Ishikawa
Hi Sean, Thank you for answering my question. It seems that I was using an old version of bash, the default one on Mac. ``` $ bash --version GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin14) Copyright (C) 2007 Free Software Foundation, Inc. share_history ``` Thanks, Yu - -- Yu
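A minimal sketch of how a script could detect this situation before it fails. It assumes the script depends on syntax introduced in bash 4, such as the `&>>` append redirect; the excerpt does not confirm which construct failed. The helper name `bash_is_modern` is hypothetical and not part of Spark.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of Spark): returns success when the given
# major version, or that of the running shell by default, is at least 4.
bash_is_modern() {
  major="${1:-${BASH_VERSION%%.*}}"
  [ "${major:-0}" -ge 4 ]
}

# Report whether the running shell is new enough (assumption: the script
# needs bash 4+ features such as '&>>').
if bash_is_modern; then
  echo "bash ${BASH_VERSION}: version 4 or newer"
else
  echo "bash ${BASH_VERSION:-unknown}: older than 4; newer syntax such as '&>>' may fail" >&2
fi
```

On the default Mac bash shown above (3.2.57), the check takes the second branch and prints the warning to standard error.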

Re: Is `dev/lint-python` broken?

2015-07-27 Thread Yu Ishikawa
I'm using 10.10.4, and Xcode is version 6.4, so the system itself probably isn't old. I guess the old bash version causes the problem. I'll try installing another bash with brew. - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Is-dev-lint-python

Re: What is the difference between SlowSparkPullRequestBuilder and SparkPullRequestBuilder?

2015-07-22 Thread Yu Ishikawa
Hi Andrew, I understand that there is no difference currently. Thanks, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/What-is-the-difference-between-SlowSparkPullRequestBuilder-and-SparkPullRequestBuilder-tp13377p13380.html

What is the difference between SlowSparkPullRequestBuilder and SparkPullRequestBuilder?

2015-07-21 Thread Yu Ishikawa
Hi all, When we send a PR, it seems that two requests to run tests are sometimes sent to Jenkins. What is the difference between SparkPullRequestBuilder and SlowSparkPullRequestBuilder? Thanks, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers

Re: [pyspark] What is the best way to run a minimum unit testing related to our developing module?

2015-07-01 Thread Yu Ishikawa
Thanks! --Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/pyspark-What-is-the-best-way-to-run-a-minimum-unit-testing-related-to-our-developing-module-tp12987p12989.html

Re: [pyspark] What is the best way to run a minimum unit testing related to our developing module?

2015-07-01 Thread Yu ISHIKAWA
Thanks! --Yu 2015-07-02 13:13 GMT+09:00 Reynold Xin r...@databricks.com: Run ./python/run-tests --help and you will see. :) On Wed, Jul 1, 2015 at 9:10 PM, Yu Ishikawa yuu.ishikawa+sp...@gmail.com wrote: Hi all, When I develop pyspark modules, such as adding a spark.ml API in Python

[pyspark] What is the best way to run a minimum unit testing related to our developing module?

2015-07-01 Thread Yu Ishikawa
is the best way to run a minimal set of unit tests for the modules we are developing under the current version? Of course, I think it would be nice to be able to select test targets from the script, as with Scala's sbt. Thanks, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark

[jenkins] ERROR: Publisher 'Publish JUnit test result report' failed: No test report files were found. Configuration error?

2015-06-21 Thread Yu Ishikawa
``` It seems that the unit tests related to the PR passed. However, Jenkins posted "Merged build finished. Test FAILed." to GitHub. https://github.com/apache/spark/pull/6926 Thanks Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com

[pyspark][mllib] What is the best way to treat int and long int between python2.6/python3.4 and Java?

2015-06-20 Thread Yu Ishikawa
are tackling [SPARK-6259] Python API for LDA. We wonder whether we should create a wrapper class for LDA documents. Do you have any ideas on how to implement it? https://issues.apache.org/jira/browse/SPARK-6259 Thanks, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark

Re: [mllib] Refactoring some spark.mllib model classes in Python not inheriting JavaModelWrapper

2015-06-19 Thread Yu Ishikawa
Hi Xiangrui, I got it. I will try to refactor the model classes that do not inherit JavaModelWrapper and show you the result. Thanks, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Refactoring-some-spark-mllib-model-classes-in-Python

Re: Workaround for problems with OS X + JIRA Client

2015-06-19 Thread Yu Ishikawa
Hi Sean, That sounds interesting. I didn't know about that client. I will try it later. Thank you for sharing the information. Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Workaround-for-problems-with-OS-X-JIRA-Client

[mllib] Refactoring some spark.mllib model classes in Python not inheriting JavaModelWrapper

2015-06-17 Thread Yu Ishikawa
- -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Refactoring-some-spark-mllib-model-classes-in-Python-not-inheriting-JavaModelWrapper-tp12781.html

[mllib] Deprecate static train and use builder instead for Scala/Java

2015-04-06 Thread Yu Ishikawa
and use builder instead for Scala/Java https://issues.apache.org/jira/browse/SPARK-6682 Thanks Yu Ishikawa - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Deprecate-static-train-and-use-builder-instead-for-Scala-Java-tp11438

Re: [mllib] Is there any bugs to divide a Breeze sparse vectors at Spark v1.3.0-rc3?

2015-03-18 Thread Yu Ishikawa
Sorry for the delay in replying. I moved from Tokyo to New York in order to attend Spark Summit East. I verified the snapshot and the difference. https://github.com/scalanlp/breeze/commit/f61d2f61137807651fc860404a244640e213f6d3 Thank you for your great work! Yu Ishikawa - -- Yu Ishikawa

[mllib] Is there any bugs to divide a Breeze sparse vectors at Spark v1.3.0-rc3?

2015-03-15 Thread Yu Ishikawa
, 0.0) org.scalatest.exceptions.TestFailedException: DenseVector(0.0, 0.0, 0.0, 0.0, 0.0, 0.0) did not equal DenseVector(0.0, 0.0, 0.0, 0.0, 0.1, 0.0) ``` Thanks, Yu Ishikawa - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3

Re: [mllib] Is there any bugs to divide a Breeze sparse vectors at Spark v1.3.0-rc3?

2015-03-15 Thread Yu Ishikawa
- ASF JIRA https://issues.apache.org/jira/browse/SPARK-6341 Thanks, Yu Ishikawa - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Is-there-any-bugs-to-divide-a-Breeze-sparse-vectors-at-Spark-v1-3-0-rc3-tp11056p11058.html

Re: [mllib] Which is the correct package to add a new algorithm?

2014-11-30 Thread Yu Ishikawa
algorithms to spark.mllib. thanks, Yu - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Which-is-the-correct-package-to-add-a-new-algorithm-tp9540p9575.html

[mllib] Which is the correct package to add a new algorithm?

2014-11-27 Thread Yu Ishikawa
Hi all, An alpha version of Spark ML exists in the current master branch on GitHub. If we want to add new machine learning algorithms or modify algorithms which already exist, in which package should we implement them: org.apache.spark.mllib or org.apache.spark.ml? thanks, Yu - -- Yu

Re: [VOTE] Designating maintainers for some Spark components

2014-11-11 Thread Yu Ishikawa
- Streaming: TD, Matei - GraphX: Ankur, Joey, Reynold I'd like to formally call a [VOTE] on this model, to last 72 hours. The [VOTE] will end on Nov 8, 2014 at 6 PM PST. Matei - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3

Re: JIRA + PR backlog

2014-11-11 Thread Yu Ishikawa
Great job! I didn't know about the Spark PR Dashboard. Thanks Yu Ishikawa - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/JIRA-PR-backlog-tp9157p9282.html

[mllib] Share the simple benchmark result about the cast cost from Spark vector to Breeze vector

2014-10-15 Thread Yu Ishikawa
had expected. For more information, please read the report below if you are interested. https://github.com/yu-iskw/benchmark-breeze-on-spark/blob/master/doc%2Fbenchmark-result.md Best, Yu Ishikawa - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list

Standardized Distance Functions in MLlib

2014-10-08 Thread Yu Ishikawa
. https://github.com/apache/spark/pull/1964#issuecomment-54953348 Best, - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Standardized-Distance-Functions-in-MLlib-tp8697.html

Re: What is the best way to build my developing Spark for testing on EC2?

2014-10-06 Thread Yu Ishikawa
running my program and collecting output. It's just as you thought. I agree with you. You could have a look at the spark-perf repo if you want something a little better principled/automatic. I overlooked this. I will give it a try. best, - -- Yu Ishikawa -- View this message in context

What is the best way to build my developing Spark for testing on EC2?

2014-10-02 Thread Yu Ishikawa
script for a development version, like the spark-ec2 script? Or, if you have any good ideas for evaluating the performance of an MLlib algorithm under development on a Spark cluster such as EC2, could you tell me? Best, - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list

Re: MLlib enable extension of the LabeledPoint class

2014-09-25 Thread Yu Ishikawa
on it. For example, ``` abstract class LabeledPoint[T](label: T, features: Vector) ``` thanks - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/MLlib-enable-extension-of-the-LabeledPoint-class-tp8546p8549.html

Re: MLlib enable extension of the LabeledPoint class

2014-09-25 Thread Yu Ishikawa
Hi Egor Pahomov, Thank you for your comment! - -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/MLlib-enable-extension-of-the-LabeledPoint-class-tp8546p8551.html

Re: [mllib] Add multiplying large scale matrices

2014-09-08 Thread Yu Ishikawa
Hi Xiangrui Meng, Thank you for your comment and for creating the tickets. The ticket which I created can be moved under yours. I will close my ticket and then link it to yours later. Best, Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3

Re: [mllib] Add multiplying large scale matrices

2014-09-06 Thread Yu Ishikawa
Hi Jeremy, Great work! I'm interested in your work. If your code is on GitHub, could you let me know? -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Add-multiplying-large-scale-matrices-tp8291p8309.html

Re: [mllib] Add multiplying large scale matrices

2014-09-06 Thread Yu Ishikawa
Hi Rong, Great job! Thank you for letting me know about your work. I will read the source code of saury later. Although AMPLab is working to implement them, would you like to merge it into Spark? Best, -- Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3

[mllib] Add multiplying large scale matrices

2014-09-05 Thread Yu Ishikawa
Hi all, It seems that there is a method to multiply a RowMatrix by a (local) Matrix. However, there is no method in Spark to multiply one large-scale matrix by another. It would be helpful. Does anyone have a plan to add multiplication of large-scale matrices? Or shouldn't we support it in

Re: [mllib] Add multiplying large scale matrices

2014-09-05 Thread Yu Ishikawa
Hi RJ, Thank you for your comment. I am interested in having other matrix operations too. I will create a JIRA issue first. thanks,

Re: [mllib] Add multiplying large scale matrices

2014-09-05 Thread Yu Ishikawa
Hi Evan, That sounds interesting. Here is the ticket which I created. https://issues.apache.org/jira/browse/SPARK-3416 thanks, -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/mllib-Add-multiplying-large-scale-matrices-tp8291p8296.html

Re: Contributing to MLlib: Proposal for Clustering Algorithms

2014-08-13 Thread Yu Ishikawa
, please let me know. best, Yu Ishikawa -- View this message in context: http://apache-spark-developers-list.1001551.n3.nabble.com/Contributing-to-MLlib-Proposal-for-Clustering-Algorithms-tp7212p7822.html

Re: Can I translate the documentations of Spark in Japanese?

2014-07-31 Thread Yu Ishikawa
Hi Kenichi Takagiwa, Thank you for commenting. I am going to proceed with the translation. Would you please help me? Further details will be sent later. Best, Yu

Re: Can I translate the documentations of Spark in Japanese?

2014-07-31 Thread Yu Ishikawa
Hi Nick, I know some projects get translations crowdsourced via one website or other. Thank you for your comments. I think crowdsourced translation is a good fit for a translation project on GitHub. Best, Yu

Can I translate the documentations of Spark in Japanese?

2014-07-27 Thread Yu Ishikawa
Hi all, I'm Yu Ishikawa, from Japan. I would like to officially translate the Spark 1.0.x documentation. If I translate it and send a pull request, can you merge it? And which directory is the best place to put the Japanese documentation? Best, Yu