[jira] [Comment Edited] (TOREE-407) Improve Branding on Site

2017-05-02 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993861#comment-15993861 ] Jakob Odersky edited comment on TOREE-407 at 5/2/17 9:55 PM: - That's a valid

[jira] [Commented] (TOREE-407) Improve Branding on Site

2017-05-02 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15993861#comment-15993861 ] Jakob Odersky commented on TOREE-407: - That's a valid point, I also think that the description

[jira] [Commented] (TOREE-402) Installer should support parameterized kernel names

2017-04-05 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957571#comment-15957571 ] Jakob Odersky commented on TOREE-402: - Good point; that's an aspect I didn't think about. Currently

[jira] [Comment Edited] (TOREE-402) Installer should support parameterized kernel names

2017-04-05 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15957443#comment-15957443 ] Jakob Odersky edited comment on TOREE-402 at 4/5/17 6:59 PM: - Thanks

[jira] [Commented] (TOREE-399) Make Spark Kernel work on Windows

2017-03-30 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-399?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15949642#comment-15949642 ] Jakob Odersky commented on TOREE-399: - Hi Aldo, the run.sh script is a launcher script that basically

Re: want to join toree mailing list

2017-03-28 Thread Jakob Odersky
Hi, the list system is automated. Send an email to dev-subscr...@toree.incubator.apache.org to subscribe. Check this page for more information http://toree.apache.org/community/get-involved --Jakob On Sun, Mar 26, 2017 at 2:33 PM, Rajkumar Natarajan wrote: > Hi Team, > >

[jira] [Commented] (TOREE-375) Incorrect fully qualified name for spark context

2017-03-10 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15905637#comment-15905637 ] Jakob Odersky commented on TOREE-375: - Closing this as it is not related to Toree. I'll do some more

[jira] [Closed] (TOREE-375) Incorrect fully qualified name for spark context

2017-03-10 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky closed TOREE-375. --- Resolution: Done > Incorrect fully qualified name for spark cont

[jira] [Resolved] (TOREE-386) spark kernel `--name test` or `--conf spark.app.name=test` parameter to spark_opts is not applied

2017-03-02 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky resolved TOREE-386. - Resolution: Fixed > spark kernel `--name test` or `--conf spark.app.name=test` parame

[jira] [Resolved] (TOREE-383) Fix flaky tests

2017-02-27 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky resolved TOREE-383. - Resolution: Fixed > Fix flaky tests > --- > > Ke

[jira] [Commented] (TOREE-377) When magic fails, the error is swallowed

2017-02-27 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-377?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15886512#comment-15886512 ] Jakob Odersky commented on TOREE-377: - Fixed in PR, thanks! > When magic fails, the error is swallo

[jira] [Resolved] (TOREE-377) When magic fails, the error is swallowed

2017-02-27 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-377?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky resolved TOREE-377. - Resolution: Fixed > When magic fails, the error is swallo

[jira] [Closed] (TOREE-379) Tab completion doesn't replace partial words

2017-02-27 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-379?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky closed TOREE-379. --- Resolution: Fixed Fixed in PR, thanks! > Tab completion doesn't replace partial wo

[jira] [Resolved] (TOREE-387) Kernel should not store SparkSession

2017-02-27 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky resolved TOREE-387. - Resolution: Fixed fixed in pr > Kernel should not store SparkSess

[jira] [Assigned] (TOREE-386) spark kernel `--name test` or `--conf spark.app.name=test` parameter to spark_opts is not applied

2017-02-27 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-386?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky reassigned TOREE-386: --- Assignee: Jakob Odersky > spark kernel `--name test` or `--conf spark.app.name=t

[jira] [Comment Edited] (TOREE-375) Incorrect fully qualified name for spark context

2017-02-21 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15876892#comment-15876892 ] Jakob Odersky edited comment on TOREE-375 at 2/21/17 10:37 PM: --- [~fschueler

Supported Scala versions

2017-02-16 Thread Jakob Odersky
Hi everyone, during a recent discussion with Marius https://github.com/apache/incubator-toree/pull/93, I found out that Toree does not support being built with Scala 2.10 anymore. I am all in favor of dropping the EOL scala version, however considering that there are still various references to

[jira] [Resolved] (TOREE-382) Revamp the sbt build and consolidate dependencies

2017-02-16 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky resolved TOREE-382. - Resolution: Fixed > Revamp the sbt build and consolidate dependenc

[jira] [Commented] (TOREE-386) toree spark kernel --name parameter to spark-submit is not applied

2017-02-15 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-386?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15868606#comment-15868606 ] Jakob Odersky commented on TOREE-386: - Hmm, I'm not sure where the name is coming from, a grep over

[jira] [Updated] (TOREE-375) Incorrect fully qualified name for spark context

2017-02-15 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-375?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky updated TOREE-375: Priority: Critical (was: Major) > Incorrect fully qualified name for spark cont

[jira] [Assigned] (TOREE-383) Fix flaky tests

2017-02-14 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-383?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky reassigned TOREE-383: --- Assignee: Jakob Odersky > Fix flaky tests > --- > >

[jira] [Created] (TOREE-385) Refactor travis build to be runnable as a container (as opposed to a vm)

2017-02-14 Thread Jakob Odersky (JIRA)
Jakob Odersky created TOREE-385: --- Summary: Refactor travis build to be runnable as a container (as opposed to a vm) Key: TOREE-385 URL: https://issues.apache.org/jira/browse/TOREE-385 Project: TOREE

[jira] [Created] (TOREE-383) Fix flaky tests

2017-02-14 Thread Jakob Odersky (JIRA)
Jakob Odersky created TOREE-383: --- Summary: Fix flaky tests Key: TOREE-383 URL: https://issues.apache.org/jira/browse/TOREE-383 Project: TOREE Issue Type: Sub-task Reporter: Jakob

[jira] [Updated] (TOREE-382) Revamp the sbt build and consolidate dependencies

2017-02-14 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky updated TOREE-382: Priority: Minor (was: Major) > Revamp the sbt build and consolidate dependenc

[jira] [Created] (TOREE-382) Revamp the sbt build and consolidate dependencies

2017-02-14 Thread Jakob Odersky (JIRA)
Jakob Odersky created TOREE-382: --- Summary: Revamp the sbt build and consolidate dependencies Key: TOREE-382 URL: https://issues.apache.org/jira/browse/TOREE-382 Project: TOREE Issue Type: Sub

[jira] [Created] (TOREE-381) Revamp the build

2017-02-14 Thread Jakob Odersky (JIRA)
Jakob Odersky created TOREE-381: --- Summary: Revamp the build Key: TOREE-381 URL: https://issues.apache.org/jira/browse/TOREE-381 Project: TOREE Issue Type: Improvement Reporter

[jira] [Closed] (TOREE-372) stream corruption cased by big-endian and little-endian

2017-02-11 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky closed TOREE-372. --- Resolution: Won't Fix The issue isn't related to Toree. However a mixed endian environment may

Linking JIRA with GitHub

2017-02-11 Thread Jakob Odersky
Hi, does anyone know how we can link JIRA with GitHub, so that pull requests with a title of the form [TOREE-X] will close the issue X when merged? Basically I'm looking for something similar to the way Spark handles issues and pull requests. cheers, --Jakob

[jira] [Closed] (TOREE-361) Spark examples that use Spark 2 fail because docker image contains 1.6

2017-02-11 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky closed TOREE-361. --- Resolution: Fixed fixed in pr > Spark examples that use Spark 2 fail because docker image conta

[jira] [Commented] (TOREE-354) Scala Error with Apache Spark when run in Jupyter

2017-02-09 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15860179#comment-15860179 ] Jakob Odersky commented on TOREE-354: - Can you try a `pip install --pre <toree.tar.gz>

[jira] [Closed] (TOREE-354) Scala Error with Apache Spark when run in Jupyter

2017-02-07 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-354?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky closed TOREE-354. --- Resolution: Not A Problem There is a mismatch between the scala versions used by Toree and Spark

[jira] [Commented] (TOREE-363) Syntax Highlighting Breaks

2017-02-07 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15857140#comment-15857140 ] Jakob Odersky commented on TOREE-363: - AFAIK syntax highlighting is not handled by the kernels

[jira] [Closed] (TOREE-281) Fix some typos in comments/docs/testnames.

2017-02-07 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-281?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky closed TOREE-281. --- Resolution: Fixed > Fix some typos in comments/docs/testna

[jira] [Closed] (TOREE-373) scala 2.11 library incompatible with pyspark 2.1.0

2017-02-07 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-373?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky closed TOREE-373. --- Resolution: Not A Bug +1 to Marius' answer. I would be surprised if it worked with 2.12 though

[jira] [Comment Edited] (TOREE-374) Variables declared on the Notebook are not garbage collected

2017-02-07 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856880#comment-15856880 ] Jakob Odersky edited comment on TOREE-374 at 2/7/17 9:57 PM: - Thanks

[jira] [Commented] (TOREE-375) Incorrect fully qualified name for spark context

2017-02-07 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15856668#comment-15856668 ] Jakob Odersky commented on TOREE-375: - I checked the implementation of `valueOfTerm` in the Scala

[jira] [Closed] (TOREE-365) Certain interpreter evaluations do not return result strings

2017-02-06 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky closed TOREE-365. --- Resolution: Won't Fix Closing as discussed in the pull request. See TOREE-368 for continuation

[jira] [Commented] (TOREE-374) Variables declared on the Notebook are not garbage collected

2017-02-06 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15855081#comment-15855081 ] Jakob Odersky commented on TOREE-374: - [~dtaieb] Could you provide some steps to reproduce

[jira] [Comment Edited] (TOREE-374) Variables declared on the Notebook are not garbage collected

2017-02-06 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15855055#comment-15855055 ] Jakob Odersky edited comment on TOREE-374 at 2/7/17 12:47 AM: -- Hmm, I wonder

[jira] [Commented] (TOREE-374) Variables declared on the Notebook are not garbage collected

2017-02-06 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15855055#comment-15855055 ] Jakob Odersky commented on TOREE-374: - Hmm, I wonder if this is related to the Yrepl-class-based

[jira] [Comment Edited] (TOREE-375) Incorrect fully qualified name for spark context

2017-02-06 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-375?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15855019#comment-15855019 ] Jakob Odersky edited comment on TOREE-375 at 2/7/17 12:16 AM: -- -Yrepl-class

[jira] [Commented] (TOREE-371) $SPARK_HOME environment variable not recognised

2017-02-06 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-371?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854859#comment-15854859 ] Jakob Odersky commented on TOREE-371: - I'm not sure how changing the default value fixes the issue

[jira] [Commented] (TOREE-372) stream corruption cased by big-endian and little-endian

2017-02-06 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-372?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15854877#comment-15854877 ] Jakob Odersky commented on TOREE-372: - Spark requires its driver app (SparkContext) and all executors

Development model

2017-01-30 Thread Jakob Odersky
Hi everyone, I was wondering how everyone usually develops toree, specifically how changes to toree are tested with a jupyter notebook? I couldn't find any documentation on the website so I thought I'd ask here. I tried running the various makefile targets, including `make dev` and `make

[jira] [Updated] (TOREE-365) Certain interpreter evaluations do not return result strings

2017-01-27 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky updated TOREE-365: Description: The scala interpreter currently only returns results for expressions. Import

[jira] [Updated] (TOREE-365) Certain interpreter evaluations do not return result strings

2017-01-27 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky updated TOREE-365: Description: The scala interpreter currently only returns results for expressions. Import

[jira] [Updated] (TOREE-365) Certain interpreter evaluations do not return result strings

2017-01-27 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky updated TOREE-365: Description: The scala interpreter currently only returns results for expressions. Import

[jira] [Created] (TOREE-365) Certain interpreter evaluations do not return result strings

2017-01-27 Thread Jakob Odersky (JIRA)
Jakob Odersky created TOREE-365: --- Summary: Certain interpreter evaluations do not return result strings Key: TOREE-365 URL: https://issues.apache.org/jira/browse/TOREE-365 Project: TOREE

[jira] [Updated] (TOREE-333) make sbt-publishM2

2017-01-27 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky updated TOREE-333: Priority: Minor (was: Major) > make sbt-publishM2 > -- > >

[jira] [Updated] (TOREE-340) Output of the form "a = b" returns "b"

2017-01-27 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-340?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky updated TOREE-340: Priority: Major (was: Minor) > Output of the form "a = b"

[jira] [Comment Edited] (TOREE-262) Resolve LGPL Dependency in project

2017-01-27 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15843653#comment-15843653 ] Jakob Odersky edited comment on TOREE-262 at 1/27/17 11:35 PM: --- The JeroMQ

[jira] [Commented] (TOREE-262) Resolve LGPL Dependency in project

2017-01-27 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15843653#comment-15843653 ] Jakob Odersky commented on TOREE-262: - The JeroMQ project is now released under the Mozilla Public

[jira] [Resolved] (TOREE-322) Building Error with github snapshot

2017-01-27 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-322?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky resolved TOREE-322. - Resolution: Fixed > Building Error with github snaps

[jira] [Commented] (TOREE-322) Building Error with github snapshot

2017-01-27 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-322?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15843649#comment-15843649 ] Jakob Odersky commented on TOREE-322: - Latest master works, I assume it must have been something

[jira] [Updated] (TOREE-333) make sbt-publishM2

2017-01-27 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky updated TOREE-333: Priority: Major (was: Critical) > make sbt-publishM2 > -- > >

[jira] [Commented] (TOREE-333) make sbt-publishM2

2017-01-27 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-333?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15843642#comment-15843642 ] Jakob Odersky commented on TOREE-333: - The good news is that despite the error message, the publish

[jira] [Commented] (TOREE-362) How do I define ports specifically?

2017-01-27 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/TOREE-362?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15843631#comment-15843631 ] Jakob Odersky commented on TOREE-362: - I'm not sure I understand your question. The ports always fall

[jira] [Commented] (SPARK-14280) Update change-version.sh and pom.xml to add Scala 2.12 profiles

2017-01-10 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15816398#comment-15816398 ] Jakob Odersky commented on SPARK-14280: --- Twitter chill for scala 2.12 is finally out and I'm

Re: Third party library

2016-12-13 Thread Jakob Odersky
Hi Vineet, great to see you solved the problem! Since this just appeared in my inbox, I wanted to take the opportunity for a shameless plug: https://github.com/jodersky/sbt-jni. In case you're using sbt and also developing the native library, this plugin may help with the pains of building and

Re: Optimization for Processing a million of HTML files

2016-12-12 Thread Jakob Odersky
Assuming the bottleneck is IO, you could try saving your files to HDFS. This will distribute your data and allow for better concurrent reads. On Mon, Dec 12, 2016 at 3:06 PM, Reth RM wrote: > Hi, > > I have millions of html files in a directory, using "wholeTextFiles" api

Re: wholeTextFiles()

2016-12-12 Thread Jakob Odersky
Also, in case the issue was not due to the string length (however it is still valid and may get you later), the issue may be due to some other indexing issues which are currently being worked on here https://issues.apache.org/jira/browse/SPARK-6235 On Mon, Dec 12, 2016 at 8:18 PM, Jakob Odersky

Re: wholeTextFiles()

2016-12-12 Thread Jakob Odersky
Hi Pradeep, I'm afraid you're running into a hard Java issue. Strings are indexed with signed integers and can therefore not be longer than approximately 2 billion characters. Could you use `textFile` as a workaround? It will give you an RDD of the files' lines instead. In general, this guide
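The limit mentioned in the message above can be illustrated with a small sketch. It only shows where the "approximately 2 billion" figure comes from; the class and method names are hypothetical, and the practical limit on a given JVM may be slightly lower than the theoretical one.

```java
// Hypothetical helper, for illustration only: it is not part of Spark.
final class StringLimit {
    // String lengths and indices are signed 32-bit ints, so this is the ceiling.
    static final long MAX_STRING_LENGTH = Integer.MAX_VALUE; // 2^31 - 1

    // True if a text of charCount characters could be addressed by int indices.
    static boolean fitsInOneString(long charCount) {
        return charCount >= 0 && charCount <= MAX_STRING_LENGTH;
    }

    public static void main(String[] args) {
        System.out.println(MAX_STRING_LENGTH);               // 2147483647
        System.out.println(fitsInOneString(1_500_000_000L)); // true
        System.out.println(fitsInOneString(3_000_000_000L)); // false
    }
}
```

A file whose contents exceed that bound cannot be returned as a single string value, which is why reading it line by line, as `textFile` does, sidesteps the problem.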

[jira] [Updated] (SPARK-14519) Cross-publish Kafka for Scala 2.12

2016-12-12 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14519?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Jakob Odersky updated SPARK-14519: -- Summary: Cross-publish Kafka for Scala 2.12 (was: Cross-publish Kafka for Scala 2.12.0-M4

[jira] [Commented] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2016-12-09 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15736107#comment-15736107 ] Jakob Odersky commented on SPARK-17647: --- I rebased the PR and resolved the conflict. However

Re: Can I add a new method to RDD class?

2016-12-06 Thread Jakob Odersky
" add > new RDD methods in. > > How can I specify a custom version? modify version numbers in all the > pom.xml file? > > > > On Dec 5, 2016, at 9:12 PM, Jakob Odersky <ja...@odersky.com> wr

Re: Can I add a new method to RDD class?

2016-12-05 Thread Jakob Odersky
It looks like you're having issues with including your custom spark version (with the extensions) in your test project. To use your local spark version: 1) make sure it has a custom version (let's call it 2.1.0-CUSTOM) 2) publish it to your local machine with `sbt publishLocal` 3) include the

Re: custom generate spark application id

2016-12-05 Thread Jakob Odersky
The app ID is assigned internally by spark's task scheduler https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/TaskScheduler.scala#L35. You could probably change the naming, however I'm pretty sure that the ID will always have to be unique for a context on a

[jira] [Commented] (SPARK-14280) Update change-version.sh and pom.xml to add Scala 2.12 profiles

2016-12-05 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723814#comment-15723814 ] Jakob Odersky commented on SPARK-14280: --- You're welcome to pull the changes back into your repo

[jira] [Commented] (SPARK-14280) Update change-version.sh and pom.xml to add Scala 2.12 profiles

2016-12-05 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15723807#comment-15723807 ] Jakob Odersky commented on SPARK-14280: --- Hi [~joshrosen], I rebased your initial work onto

Re: SparkILoop doesn't run

2016-11-21 Thread Jakob Odersky
there are libraries of multiple scala versions on the same classpath. You mention that it worked before, can you recall what libraries you upgraded before it broke? --Jakob On Mon, Nov 21, 2016 at 2:34 PM, Jakob Odersky <ja...@odersky.com> wrote: > Trying it out locally gave me an NPE.

Re: SparkILoop doesn't run

2016-11-21 Thread Jakob Odersky
Trying it out locally gave me an NPE. I'll look into it in more detail, however the SparkILoop.run() method is dead code. It's used nowhere in spark and can be removed without any issues. On Thu, Nov 17, 2016 at 11:16 AM, Mohit Jaggi wrote: > Thanks Holden. I did post to

[jira] [Comment Edited] (SPARK-14222) Cross-publish jackson-module-scala for Scala 2.12

2016-11-03 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634143#comment-15634143 ] Jakob Odersky edited comment on SPARK-14222 at 11/3/16 8:33 PM: Thanks

[jira] [Comment Edited] (SPARK-14222) Cross-publish jackson-module-scala for Scala 2.12

2016-11-03 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634143#comment-15634143 ] Jakob Odersky edited comment on SPARK-14222 at 11/3/16 8:30 PM: Thanks

[jira] [Commented] (SPARK-14222) Cross-publish jackson-module-scala for Scala 2.12

2016-11-03 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634143#comment-15634143 ] Jakob Odersky commented on SPARK-14222: --- Thanks Sean, however I realized that the dependency

[jira] [Commented] (SPARK-14222) Cross-publish jackson-module-scala for Scala 2.12

2016-11-03 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14222?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634117#comment-15634117 ] Jakob Odersky commented on SPARK-14222: --- A newer version of the module (version 2.8.4) is available

[jira] [Commented] (SPARK-14220) Build and test Spark against Scala 2.12

2016-11-03 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634027#comment-15634027 ] Jakob Odersky commented on SPARK-14220: --- at least most dependencies will probably make 2.12 builds

[jira] [Comment Edited] (SPARK-14220) Build and test Spark against Scala 2.12

2016-11-03 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15634027#comment-15634027 ] Jakob Odersky edited comment on SPARK-14220 at 11/3/16 7:54 PM: At least

[jira] [Commented] (SPARK-14220) Build and test Spark against Scala 2.12

2016-11-03 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15633762#comment-15633762 ] Jakob Odersky commented on SPARK-14220: --- Scala 2.12 was just officially announced :) > Bu

Re: why spark driver program is creating so many threads? How can I limit this number?

2016-10-31 Thread Jakob Odersky
> how do I tell my spark driver program to not create so many? This may depend on your driver program. Do you spawn any threads in it? Could you share some more information on the driver program, spark version and your environment? It would greatly help others to help you On Mon, Oct 31, 2016

Re: [Spark 2] BigDecimal and 0

2016-10-24 Thread Jakob Odersky
se I > need to go digging further then. Thanks for the quick help. > > On Mon, Oct 24, 2016 at 7:34 PM Jakob Odersky <ja...@odersky.com> wrote: >> >> What you're seeing is merely a strange representation, 0E-18 is zero. >> The E-18 represents the precision

Re: [Spark 2] BigDecimal and 0

2016-10-24 Thread Jakob Odersky
What you're seeing is merely a strange representation, 0E-18 is zero. The E-18 represents the precision that Spark uses to store the decimal On Mon, Oct 24, 2016 at 7:32 PM, Jakob Odersky <ja...@odersky.com> wrote: > An even smaller example that demonstrates the same behaviour: > >
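The point about 0E-18 can be checked outside Spark with plain `java.math.BigDecimal`. The sketch below assumes the value is stored with 18 fractional digits (what the message calls precision is the scale in `BigDecimal` terms); the class and method names are hypothetical.

```java
import java.math.BigDecimal;

// Hypothetical demo class, for illustration only: it does not use Spark.
final class ZeroScaleDemo {
    // Zero carried with the given number of fractional digits.
    static BigDecimal zeroWithScale(int scale) {
        return BigDecimal.ZERO.setScale(scale);
    }

    public static void main(String[] args) {
        BigDecimal zero18 = zeroWithScale(18);
        System.out.println(zero18);                                 // 0E-18
        System.out.println(zero18.toPlainString());                 // 0.000000000000000000
        System.out.println(zero18.compareTo(BigDecimal.ZERO) == 0); // true: numerically zero
        System.out.println(zero18.equals(BigDecimal.ZERO));         // false: equals also compares scale
    }
}
```

If a comparison against zero appears to fail, `compareTo` or `signum() == 0` is the safer check, since `equals` treats values with different scales as different.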

Re: [Spark 2] BigDecimal and 0

2016-10-24 Thread Jakob Odersky
An even smaller example that demonstrates the same behaviour: Seq(Data(BigDecimal(0))).toDS.head On Mon, Oct 24, 2016 at 7:03 PM, Efe Selcuk wrote: > I’m trying to track down what seems to be a very slight imprecision in our > Spark application; two of our columns, which

[jira] [Commented] (SPARK-18018) Specify alternate escape character in 'LIKE' expression

2016-10-19 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-18018?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15590136#comment-15590136 ] Jakob Odersky commented on SPARK-18018: --- I've started a very early prototype [here|https

[jira] [Created] (SPARK-18018) Specify alternate escape character in 'LIKE' expression

2016-10-19 Thread Jakob Odersky (JIRA)
Jakob Odersky created SPARK-18018: - Summary: Specify alternate escape character in 'LIKE' expression Key: SPARK-18018 URL: https://issues.apache.org/jira/browse/SPARK-18018 Project: Spark

Re: Why the json file used by sparkSession.read.json must be a valid json object per line

2016-10-19 Thread Jakob Odersky
Another reason I could imagine is that files are often read from HDFS, which by default uses line terminators to separate records. It is possible to implement your own hdfs delimiter finder, however for arbitrary json data, finding that delimiter would require stateful parsing of the file and

[jira] [Commented] (SPARK-17368) Scala value classes create encoder problems and break at runtime

2016-10-17 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17368?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15582882#comment-15582882 ] Jakob Odersky commented on SPARK-17368: --- [~arisofala...@gmail.com] Let me explain the fix to what I

[jira] [Commented] (SPARK-15577) Java can't import DataFrame type alias

2016-10-10 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15564302#comment-15564302 ] Jakob Odersky commented on SPARK-15577: --- this cleaning of jiras is really good to see

[jira] [Commented] (SPARK-15577) Java can't import DataFrame type alias

2016-10-10 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563910#comment-15563910 ] Jakob Odersky commented on SPARK-15577: --- This was considered and trade-offs were actively discussed

[jira] [Comment Edited] (SPARK-15577) Java can't import DataFrame type alias

2016-10-10 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-15577?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15563910#comment-15563910 ] Jakob Odersky edited comment on SPARK-15577 at 10/10/16 11:41 PM

Re: ClassCastException while running a simple wordCount

2016-10-10 Thread Jakob Odersky
Just thought of another potential issue: you should use the "provided" scope when depending on spark. I.e in your project's pom: <dependency> <groupId>org.apache.spark</groupId> <artifactId>spark-core_2.11</artifactId> <version>2.0.1</version> <scope>provided</scope> </dependency> On Mon, Oct 10, 2016 at 2:00 PM, Jakob Odersky <ja..

Re: ClassCastException while running a simple wordCount

2016-10-10 Thread Jakob Odersky
How do you submit the application? A version mismatch between the launcher, driver and workers could lead to the bug you're seeing. A common reason for a mismatch is if the SPARK_HOME environment variable is set. This will cause the spark-submit script to use the launcher determined by that

[jira] [Commented] (SPARK-17647) SQL LIKE does not handle backslashes correctly

2016-10-06 Thread Jakob Odersky (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17647?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15553517#comment-15553517 ] Jakob Odersky commented on SPARK-17647: --- Xiao pointed me to this issue, I can take a look

Re: StructuredStreaming Custom Sinks (motivated by Structured Streaming Machine Learning)

2016-10-04 Thread Jakob Odersky
Hi everyone, is there any ongoing discussion/documentation on the redesign of sinks? I think it could be a good thing to abstract away the underlying streaming model, however that isn't directly related to Holden's first point. The way I understand it, is to slightly change the DataStreamWriter

Re: Package org.apache.spark.annotation no longer exist in Spark 2.0?

2016-10-04 Thread Jakob Odersky
It's still there on master. It is in the "spark-tags" module however (under common/tags), maybe something changed in the build environment and it isn't made available as a dependency to your project? What happens if you include the module as a direct dependency? --Jakob On Tue, Oct 4, 2016 at

Re: Running Spark master/slave instances in non Daemon mode

2016-10-03 Thread Jakob Odersky
> command and binds to the output fds from that process, so daemonizing is > causing us minor hardship and seems like an easy thing to make optional. > We'd be happy to make the PR as well. > > --Mike > > On Thu, Sep 29, 2016 at 5:25 PM, Jakob Odersky <ja...@odersky

Re: java.util.NoSuchElementException when serializing Map with default value

2016-10-03 Thread Jakob Odersky
Hi Kabeer, which version of Spark are you using? I can't reproduce the error in latest Spark master. regards, --Jakob - To unsubscribe e-mail: dev-unsubscr...@spark.apache.org

Re: get different results when debugging and running scala program

2016-09-30 Thread Jakob Odersky
There is no image attached, I'm not sure how the apache mailing lists handle them. Can you provide the output as text? best, --Jakob On Fri, Sep 30, 2016 at 8:25 AM, chen yong wrote: > Hello All, > > > > I am using IDEA 15.0.4 to debug a scala program. It is strange to me

Re: Running Spark master/slave instances in non Daemon mode

2016-09-29 Thread Jakob Odersky
I'm curious, what kind of container solutions require foreground processes? Most init systems work fine with "starter" processes that run other processes. IIRC systemd and start-stop-daemon have an option called "fork", that will expect the main process to run another one in the background and

Re: java.util.NoSuchElementException when serializing Map with default value

2016-09-28 Thread Jakob Odersky
I agree with Sean's answer, you can check out the relevant serializer here https://github.com/twitter/chill/blob/develop/chill-scala/src/main/scala/com/twitter/chill/Traversable.scala On Wed, Sep 28, 2016 at 3:11 AM, Sean Owen wrote: > My guess is that Kryo specially handles

Re: Apache Spark JavaRDD pipe() need help

2016-09-22 Thread Jakob Odersky
.pipe() > API. If there is any other way let me know. This code will be executed in > all the nodes in a cluster. > > Hope my requirement is now clear. How to do this? > > Regards, > Shash > > On Thu, Sep 22, 2016 at 4:13 AM, Jakob Odersky <ja...@odersky.com> wrote: >> >
