[jira] [Commented] (SPARK-21727) Operating on an ArrayType in a SparkR DataFrame throws error

2018-01-07 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16315434#comment-16315434 ] Felix Cheung commented on SPARK-21727: -- I think we should use is.atomic(object) ? > Operat

Re: Kubernetes backend and docker images

2018-01-06 Thread Felix Cheung
+1 Thanks for taking this on. That was my feedback on one of the long comment threads as well: I think we should have one docker image instead of 3 (Python and R variants are also pending in the fork, so we should consider having one that we officially release instead of 9, for example)

[jira] [Commented] (SPARK-16693) Remove R deprecated methods

2018-01-03 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16310872#comment-16310872 ] Felix Cheung commented on SPARK-16693: -- I thought we did but I couldn't find any record. I suppose

[jira] [Resolved] (SPARK-22933) R Structured Streaming API for withWatermark, trigger, partitionBy

2018-01-03 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22933?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-22933. -- Resolution: Fixed Fix Version/s: 2.3.0 Target Version/s: 2.3.0 > R Structu

[jira] [Commented] (SPARK-16693) Remove R deprecated methods

2018-01-02 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16308367#comment-16308367 ] Felix Cheung commented on SPARK-16693: -- These are all non public methods, so officially not public

[jira] [Updated] (SPARK-14037) count(df) is very slow for dataframe constructed using SparkR::createDataFrame

2018-01-01 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-14037?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-14037: - Summary: count(df) is very slow for dataframe constructed using SparkR::createDataFrame

[jira] [Commented] (SPARK-16366) Time comparison failures in SparkR unit tests

2018-01-01 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16366?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307728#comment-16307728 ] Felix Cheung commented on SPARK-16366: -- not 100% sure, but a similar timestamp test failure

[jira] [Commented] (SPARK-16693) Remove R deprecated methods

2018-01-01 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-16693?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307725#comment-16307725 ] Felix Cheung commented on SPARK-16693: -- [~shivaram]we didn't do this, not sure if we should to get

[jira] [Commented] (SPARK-17762) invokeJava fails when serialized argument list is larger than INT_MAX (2,147,483,647) bytes

2018-01-01 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17762?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16307724#comment-16307724 ] Felix Cheung commented on SPARK-17762: -- is this still needed after SPARK-17790 is fixed

[jira] [Created] (SPARK-22933) R Structured Streaming API for withWatermark, trigger, partitionBy

2017-12-31 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-22933: Summary: R Structured Streaming API for withWatermark, trigger, partitionBy Key: SPARK-22933 URL: https://issues.apache.org/jira/browse/SPARK-22933 Project: Spark

Re: [Discussion] 0.8.0 Release

2017-12-30 Thread Felix Cheung
+1 From: Jeff Zhang Sent: Wednesday, December 27, 2017 3:36:20 PM To: dev@zeppelin.apache.org Subject: Re: [Discussion] 0.8.0 Release I will update that jira, and anyone can link jiras that they think are critical for 0.8.0, but it is not

Re: [DISCUSS] Increase a few numbers in source code

2017-12-30 Thread Felix Cheung
Thanks! They look reasonable to me. Please feel free to open a PR From: Belousov Maksim Eduardovich Sent: Saturday, December 30, 2017 5:20:02 AM To: dev@zeppelin.apache.org Subject: [DISCUSS] Increase a few numbers in source code Hello,

[jira] [Updated] (SPARK-22925) ml model persistence creates a lot of small files

2017-12-29 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22925: - Description: Today when calling model.save(), for some ML models we do makeRDD(data, 1

[jira] [Updated] (SPARK-22925) ml model persistence creates a lot of small files

2017-12-29 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22925: - Description: Today when calling model.save(), for some ML models we do makeRDD(data, 1

[jira] [Updated] (SPARK-22925) ml model persistence creates a lot of small files

2017-12-29 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22925: - Issue Type: Improvement (was: Bug) > ml model persistence creates a lot of small fi

[jira] [Updated] (SPARK-22925) ml model persistence creates a lot of small files

2017-12-29 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22925: - Description: Today when calling model.save(), for some ML models we do makeRDD(data, 1

[jira] [Commented] (SPARK-22925) ml model persistence creates a lot of small files

2017-12-29 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22925?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16306520#comment-16306520 ] Felix Cheung commented on SPARK-22925: -- [~josephkb][~holdenkarau][~nick.pentre...@gmail.com

[jira] [Created] (SPARK-22925) ml model persistence creates a lot of small files

2017-12-29 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-22925: Summary: ml model persistence creates a lot of small files Key: SPARK-22925 URL: https://issues.apache.org/jira/browse/SPARK-22925 Project: Spark Issue Type

[jira] [Created] (SPARK-22924) R DataFrame API for sortWithinPartitions

2017-12-29 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-22924: Summary: R DataFrame API for sortWithinPartitions Key: SPARK-22924 URL: https://issues.apache.org/jira/browse/SPARK-22924 Project: Spark Issue Type: Bug

[jira] [Resolved] (SPARK-22920) R sql functions for current_date, current_timestamp, rtrim/ltrim/trim with trimString

2017-12-29 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-22920. -- Resolution: Fixed Assignee: Felix Cheung > R sql functions for current_d

[jira] [Updated] (SPARK-22920) R sql functions for current_date, current_timestamp, rtrim/ltrim/trim with trimString

2017-12-29 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22920: - Target Version/s: 2.3.0 Fix Version/s: 2.3.0 > R sql functions for current_d

[jira] [Created] (SPARK-22920) R sql functions for current_date, current_timestamp, rtrim/ltrim/trim with trimString

2017-12-28 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-22920: Summary: R sql functions for current_date, current_timestamp, rtrim/ltrim/trim with trimString Key: SPARK-22920 URL: https://issues.apache.org/jira/browse/SPARK-22920

[jira] [Commented] (SPARK-21616) SparkR 2.3.0 migration guide, release note

2017-12-28 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21616?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305192#comment-16305192 ] Felix Cheung commented on SPARK-21616: -- SPARK-22315 > SparkR 2.3.0 migration guide, release n

[jira] [Commented] (SPARK-21727) Operating on an ArrayType in a SparkR DataFrame throws error

2017-12-28 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16305157#comment-16305157 ] Felix Cheung commented on SPARK-21727: -- [~neilalex] How is it going? > Operating on an ArrayT

Re: [VOTE] Spark 2.2.1 (RC2)

2017-12-27 Thread Felix Cheung
ct: Re: [VOTE] Spark 2.2.1 (RC2) Hi Felix Cheung: When will the new version 2.2.1 of the Spark docs be published to the website? It still shows version 2.2.0. -- Sent from: http://apache-spark-developers-list.1001551.n3.nabb

Re: Passing an array of more than 22 elements in a UDF

2017-12-26 Thread Felix Cheung
7 9:13 PM Subject: Re: Passing an array of more than 22 elements in a UDF To: Felix Cheung <felixcheun...@hotmail.com> Cc: ayan guha <guha.a...@gmail.com>, user <user@spark.apache.org> What's the privilege of using that specific version for this? Please throw some light onto i

Re: Spark 2.2.1 worker invocation

2017-12-26 Thread Felix Cheung
I think you are looking for spark.executor.extraJavaOptions https://spark.apache.org/docs/latest/configuration.html#runtime-environment From: Christopher Piggott Sent: Tuesday, December 26, 2017 8:00:56 AM To: user@spark.apache.org Subject:
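As a minimal sketch of the suggestion above, the snippet below builds a `spark-submit` command line that passes JVM options to executors through `spark.executor.extraJavaOptions` using the `--conf key=value` form. The helper name `spark_submit_command`, the application file `my_app.py`, and the JVM flags are hypothetical placeholders, not taken from the thread.

```python
import shlex


def spark_submit_command(app, executor_java_opts):
    """Return a spark-submit argument list that sets executor JVM options.

    `app` and `executor_java_opts` are illustrative inputs only.
    """
    return [
        "spark-submit",
        "--conf",
        # The whole key=value pair is one argument, so spaces inside the
        # option string stay attached to the property value.
        f"spark.executor.extraJavaOptions={executor_java_opts}",
        app,
    ]


# Hypothetical application and flags, for illustration only.
cmd = spark_submit_command("my_app.py", "-XX:+UseG1GC -Dmy.flag=true")
print(shlex.join(cmd))  # shell-quoted form that can be pasted into a terminal
```

Keeping the command as a list avoids quoting mistakes when the option string contains spaces; `shlex.join` is only used to display it.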

Re: Passing an array of more than 22 elements in a UDF

2017-12-24 Thread Felix Cheung
Or use it with Scala 2.11? From: ayan guha Sent: Friday, December 22, 2017 3:15:14 AM To: Aakash Basu Cc: user Subject: Re: Passing an array of more than 22 elements in a UDF Hi I think you are on the correct track. You can stuff all your param

[jira] [Updated] (SPARK-22889) CRAN checks can fail if older Spark install exists

2017-12-23 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22889: - Fix Version/s: 2.3.0 2.2.2 > CRAN checks can fail if older Spark inst

[jira] [Resolved] (SPARK-22889) CRAN checks can fail if older Spark install exists

2017-12-23 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22889?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung resolved SPARK-22889. -- Resolution: Fixed Assignee: Shivaram Venkataraman Target Version/s: 2.2.2

Re: [DISCUSS] Change some default settings for avoiding unintended usages

2017-12-23 Thread Felix Cheung
but I don't think it's part of this issue directly. Let's talk about this issue in another email. I want to talk about enabling authentication by default. If it's enabled, we should log in as admin/password1 at the beginning. What do you think of it? On Sat, Dec 2, 2017 at 1:57 AM, Felix Cheung

Re: [DISCUSS] Review process

2017-12-23 Thread Felix Cheung
meant is we don't have to wait for all kinds of PRs. On Wed, Dec 20, 2017 at 2:11 AM, Felix Cheung <felixcheun...@hotmail.com> wrote: > +1 > What would be the rough heuristic people will be comfortable with- > what is small and what is big? > > _

[jira] [Commented] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2017-12-22 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301717#comment-16301717 ] Felix Cheung commented on SPARK-22683: -- I think the challenge here is the ability to determine

[jira] [Comment Edited] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2017-12-22 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301709#comment-16301709 ] Felix Cheung edited comment on SPARK-22683 at 12/22/17 5:35 PM: I

[jira] [Commented] (SPARK-22683) DynamicAllocation wastes resources by allocating containers that will barely be used

2017-12-22 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301709#comment-16301709 ] Felix Cheung commented on SPARK-22683: -- I couldn't find the exact source line, but from running

[jira] [Commented] (SPARK-22870) Dynamic allocation should allow 0 idle time

2017-12-22 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301701#comment-16301701 ] Felix Cheung commented on SPARK-22870: -- +1 yes there is more than the check for the value 0

[jira] [Commented] (SPARK-20007) Make SparkR apply() functions robust to workers that return empty data.frame

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20007?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301015#comment-16301015 ] Felix Cheung commented on SPARK-20007: -- any taker on this for 2.3.0? > Make SparkR ap

[jira] [Commented] (SPARK-21076) R dapply doesn't return array or raw columns when array have different length

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301011#comment-16301011 ] Felix Cheung commented on SPARK-21076: -- any taker on this for 2.3.0? > R dapply doesn't ret

[jira] [Updated] (SPARK-21076) R dapply doesn't return array or raw columns when array have different length

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-21076: - Target Version/s: 2.3.0 > R dapply doesn't return array or raw columns when array have differ

[jira] [Updated] (SPARK-21291) R bucketBy partitionBy API

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-21291: - Target Version/s: 2.3.0 > R bucketBy partitionBy

[jira] [Commented] (SPARK-22632) Fix the behavior of timestamp values for R's DataFrame to respect session timezone

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301006#comment-16301006 ] Felix Cheung commented on SPARK-22632: -- how are we on this for 2.3? > Fix the behavior of timest

[jira] [Updated] (SPARK-21940) Support timezone for timestamps in SparkR

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21940?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-21940: - Target Version/s: 2.3.0 > Support timezone for timestamps in Spa

[jira] [Updated] (SPARK-22632) Fix the behavior of timestamp values for R's DataFrame to respect session timezone

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22632?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22632: - Target Version/s: 2.3.0 > Fix the behavior of timestamp values for R's DataFrame to resp

[jira] [Updated] (SPARK-20007) Make SparkR apply() functions robust to workers that return empty data.frame

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20007?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-20007: - Target Version/s: 2.3.0 > Make SparkR apply() functions robust to workers that return em

[jira] [Updated] (SPARK-21208) Ability to "setLocalProperty" from sc, in sparkR

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21208?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-21208: - Target Version/s: 2.3.0 > Ability to "setLocalProperty" from

[jira] [Commented] (SPARK-21208) Ability to "setLocalProperty" from sc, in sparkR

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21208?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301012#comment-16301012 ] Felix Cheung commented on SPARK-21208: -- any taker on this for 2.3.0? > Ability to "setLocal

[jira] [Updated] (SPARK-21030) extend hint syntax to support any expression for Python and R

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-21030: - Target Version/s: 2.3.0 > extend hint syntax to support any expression for Python an

[jira] [Commented] (SPARK-21030) extend hint syntax to support any expression for Python and R

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301014#comment-16301014 ] Felix Cheung commented on SPARK-21030: -- any taker on this for 2.3.0? > extend hint syn

[jira] [Commented] (SPARK-21291) R bucketBy partitionBy API

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301010#comment-16301010 ] Felix Cheung commented on SPARK-21291: -- any taker on this for 2.3.0? > R bucketBy partitionBy

[jira] [Commented] (SPARK-21940) Support timezone for timestamps in SparkR

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301007#comment-16301007 ] Felix Cheung commented on SPARK-21940: -- any taker on this for 2.3.0? > Support timez

[jira] [Updated] (SPARK-22843) R localCheckpoint API

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22843?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22843: - Target Version/s: 2.3.0 > R localCheckpoint API > - > >

[jira] [Commented] (SPARK-22843) R localCheckpoint API

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22843?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301005#comment-16301005 ] Felix Cheung commented on SPARK-22843: -- any taker on this for 2.3.0? > R localCheckpoint

[jira] [Commented] (SPARK-22766) Install R linter package in spark lib directory

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22766?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16301003#comment-16301003 ] Felix Cheung commented on SPARK-22766: -- how is this vs SPARK-22063 ? > Install R lin

[jira] [Commented] (SPARK-21727) Operating on an ArrayType in a SparkR DataFrame throws error

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21727?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16300965#comment-16300965 ] Felix Cheung commented on SPARK-21727: -- Neil McQuarrie is going to work on this > Operat

[jira] [Assigned] (SPARK-21727) Operating on an ArrayType in a SparkR DataFrame throws error

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-21727?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung reassigned SPARK-21727: Assignee: Neil McQuarrie > Operating on an ArrayType in a SparkR DataFrame throws er

[jira] [Commented] (SPARK-22851) Download mirror for spark-2.2.1-bin-hadoop2.7.tgz has file with incorrect checksum

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16300599#comment-16300599 ] Felix Cheung commented on SPARK-22851: -- Ah then it’s the browser - I know Safari does unpack

[jira] [Commented] (SPARK-22851) Download mirror for spark-2.2.1-bin-hadoop2.7.tgz has file with incorrect checksum

2017-12-21 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22851?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16300338#comment-16300338 ] Felix Cheung commented on SPARK-22851: -- That’s odd. Mirror replication is handled transparently

Re: Timeline for Spark 2.3

2017-12-20 Thread Felix Cheung
+1 I think the earlier we cut a branch the better. From: Michael Armbrust Sent: Tuesday, December 19, 2017 4:41:44 PM To: Holden Karau Cc: Sameer Agarwal; Erik Erlandson; dev Subject: Re: Timeline for Spark 2.3 Do people really need to be

[jira] [Created] (SPARK-22843) R localCheckpoint API

2017-12-20 Thread Felix Cheung (JIRA)
Felix Cheung created SPARK-22843: Summary: R localCheckpoint API Key: SPARK-22843 URL: https://issues.apache.org/jira/browse/SPARK-22843 Project: Spark Issue Type: Bug Components

Docker images

2017-12-19 Thread Felix Cheung
Hi! Is there a reason the official docker images https://hub.docker.com/r/_/flink/ have their sources in a different repo https://github.com/docker-flink/docker-flink ?

Re: [DISCUSS] Review process

2017-12-19 Thread Felix Cheung
manage the delay time of > > merging a pull request > > > > El 18 dic. 2017 18:03, "Felix Cheung" <felixcheun...@hotmail.com> > > escribió: > > > > > I think it is still useful to have a time delay after one approve since > > &g

Re: [DISCUSS] Review process

2017-12-18 Thread Felix Cheung
I think it is still useful to have a time delay after one approval, since there is often more feedback and further updates after one committer's approval. Also, GitHub has a tab for all PRs you are subscribed to, so it shouldn’t be very hard to review all the approved ones again.

Re: Accessing Spark UI from Zeppelin

2017-12-16 Thread Felix Cheung
You could set to replace http://masternode with your custom http hostname. Perhaps you want that to be set to a known, public (and authenticated?) IP/url? If you do have it, it can be set in the Zeppelin config before Zeppelin starts. From: ankit jain

Re: zeppelin build fails with DependencyConvergence error

2017-12-16 Thread Felix Cheung
Instead of exclusion, would it be better to use the version in the cloudera repo? Please do consider contributing these changes back to Zeppelin source. Thanks! _ From: Ruslan Dautkhanov Sent: Monday, December 11, 2017 3:42 PM Subject: Re:

[jira] [Commented] (SPARK-22812) Failing cran-check on master

2017-12-15 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16293648#comment-16293648 ] Felix Cheung commented on SPARK-22812: -- Not exactly... what’s the environment? Seems like something

Re: [RESULT][VOTE] Spark 2.2.1 (RC2)

2017-12-14 Thread Felix Cheung
; access? We definitely need to give you all necessary access if you're the > release manager! > > > On Thu, Dec 14, 2017 at 6:32 AM Felix Cheung <felixche...@apache.org> > wrote: > >> And I don’t have access to publish python. >> >> On Wed, De

Re: [RESULT][VOTE] Spark 2.2.1 (RC2)

2017-12-14 Thread Felix Cheung
pretty ready. >> We should announce the release officially too then. >> >> On Wed, Dec 6, 2017 at 5:00 PM Felix Cheung <felixche...@apache.org> >> wrote: >> >>> I saw the svn move on Monday so I’m working on the website updates. >>> >>> I will lo

Re: [RESULT][VOTE] Spark 2.2.1 (RC2)

2017-12-06 Thread Felix Cheung
ome question about getting a hand in finishing the release process, > including copying artifacts in svn. Was there anything else you're waiting > on someone to do? > > > On Fri, Dec 1, 2017 at 2:10 AM Felix Cheung <felixche...@apache.org> > wrote: > >> This vote pas

[jira] [Updated] (SPARK-20201) Flaky Test: org.apache.spark.sql.catalyst.expressions.OrderingSuite

2017-12-02 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-20201?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-20201: - Target Version/s: (was: 2.2.1) > Flaky T

Re: [DISCUSS] Change some default settings for avoiding unintended usages

2017-12-01 Thread Felix Cheung
cript through python and Scala. Shell is just an example. Using docker looks good but it cannot avoid unintended usage of resources like coin mining. On Fri, Dec 1, 2017 at 2:36 PM, Felix Cheung <felixcheun...@hotmail.com> wrote: > I don’t

Re: [DISCUSS] Change some default settings for avoiding unintended usages

2017-12-01 Thread Felix Cheung
run any script through python and Scala. Shell is just an example. Using docker looks good but it cannot avoid unintended usage of resources like coin mining. On Fri, Dec 1, 2017 at 2:36 PM, Felix Cheung <felixcheun...@hotmail.com> wrote: > I don’t

[RESULT][VOTE] Spark 2.2.1 (RC2)

2017-12-01 Thread Felix Cheung
This vote passes. Thanks everyone for testing this release. +1: Sean Owen (binding) Herman van Hövell tot Westerflier (binding) Wenchen Fan (binding) Shivaram Venkataraman (binding) Felix Cheung Henry Robinson Hyukjin Kwon Dongjoon Hyun Kazuaki Ishizaki Holden Karau Weichen Xu 0

[jira] [Comment Edited] (SPARK-22472) Datasets generate random values for null primitive types

2017-11-30 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16274042#comment-16274042 ] Felix Cheung edited comment on SPARK-22472 at 12/1/17 7:09 AM: --- I guess

[jira] [Commented] (SPARK-22472) Datasets generate random values for null primitive types

2017-11-30 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16274042#comment-16274042 ] Felix Cheung commented on SPARK-22472: -- I guess it's too late to add to http://spark.apache.org

[jira] [Commented] (SPARK-22632) Fix the behavior of timestamp values for R's DataFrame to respect session timezone

2017-11-30 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16274029#comment-16274029 ] Felix Cheung commented on SPARK-22632: -- interesting re: timezone on macOS https://cran.r-project.org

Re: [DISCUSS] Change some default settings for avoiding unintended usages

2017-11-30 Thread Felix Cheung
I don’t think that’s limited to the shell interpreter. You can run any arbitrary program or script from python or Scala (or java) as well. _ From: Jeff Zhang Sent: Wednesday, November 29, 2017 4:00 PM Subject: Re: [DISCUSS] Change some default

[jira] [Updated] (SPARK-22637) CatalogImpl.refresh() has quadratic complexity for a view

2017-11-28 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22637: - Target Version/s: 2.2.2, 2.3.0 (was: 2.3.0) > CatalogImpl.refresh() has quadratic complex

[jira] [Commented] (SPARK-22472) Datasets generate random values for null primitive types

2017-11-28 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22472?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270178#comment-16270178 ] Felix Cheung commented on SPARK-22472: -- This is going out in 2.2.1 - do we need a rel note

[jira] [Commented] (SPARK-22627) Fix formatting of headers in configuration.html page

2017-11-28 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22627?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16270169#comment-16270169 ] Felix Cheung commented on SPARK-22627: -- quite possibly this is related to jekyll changes.. yap

Re: [Spark R]: dapply only works for very small datasets

2017-11-28 Thread Felix Cheung
; Sent: Tuesday, November 28, 2017 3:11 AM Subject: AW: [Spark R]: dapply only works for very small datasets To: Felix Cheung <felixcheun...@hotmail.com>, <user@spark.apache.org> Thanks for the fast reply. I tried it locally, with 1 - 8 slots on a 8 core machine w/ 25GB memory as w

Re: [VOTE] Spark 2.2.1 (RC2)

2017-11-28 Thread Felix Cheung
the latest Debian, so +1 for this release. > > (I committed the change to set -Xss4m for tests consistently, but this > shouldn't block a release.) > > > On Sat, Nov 25, 2017 at 12:47 PM Felix Cheung <felixche...@apache.org> > wrote: > >> Ah sorry digging through the

Re: [Spark R]: dapply only works for very small datasets

2017-11-27 Thread Felix Cheung
What's the number of executors and/or number of partitions you are working with? I'm afraid most of the problem is with the serialization/deserialization overhead between JVM and R... From: Kunft, Andreas Sent: Monday, November 27,

Re: [VOTE] Spark 2.2.1 (RC2)

2017-11-25 Thread Felix Cheung
, Nov 25, 2017 at 10:36 AM Felix Cheung <felixche...@apache.org> wrote: > Thanks Sean. > > For the second one, it looks like the HiveExternalCatalogVersionsSuite is > trying to download the release tgz from the official Apache mirror, which > won’t work unless the release

Re: [VOTE] Spark 2.2.1 (RC2)

2017-11-25 Thread Felix Cheung
d status 1 > > tar: Error is not recoverable: exiting now > > *** RUN ABORTED *** > > java.io.IOException: Cannot run program "./bin/spark-submit" (in > directory "/tmp/test-spark/spark-2.0.2"): error=2, No such file or directory > > On Sat, Nov 25

[VOTE] Spark 2.2.1 (RC2)

2017-11-24 Thread Felix Cheung
Please vote on releasing the following candidate as Apache Spark version 2.2.1. The vote is open until Friday December 1, 2017 at 8:00:00 am UTC and passes if a majority of at least 3 PMC +1 votes are cast. [ ] +1 Release this package as Apache Spark 2.2.1 [ ] -1 Do not release this package
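The pass condition quoted above ("a majority of at least 3 PMC +1 votes") can be read as a small predicate. The sketch below is one interpretation, assuming that only binding (PMC) votes count toward the threshold and that binding +1 votes must outnumber binding -1 votes. The function name and the sample counts are hypothetical and are not the tally of this vote.

```python
def release_vote_passes(binding_plus, binding_minus, minimum_binding=3):
    """Return True if a release vote passes under the assumed reading.

    Assumption: at least `minimum_binding` binding +1 votes are cast and
    binding +1 votes strictly outnumber binding -1 votes.
    """
    return binding_plus >= minimum_binding and binding_plus > binding_minus


# Hypothetical tallies, for illustration only.
print(release_vote_passes(4, 0))  # enough binding +1 votes, no -1
print(release_vote_passes(2, 0))  # too few binding +1 votes
print(release_vote_passes(3, 3))  # no majority among binding votes
```

Non-binding votes are deliberately left out of the predicate; they inform the outcome but, under this reading, do not change it.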

[jira] [Updated] (SPARK-22402) Allow fetcher URIs to be downloaded to specific locations relative to Mesos Sandbox

2017-11-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22402?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22402: - Affects Version/s: (was: 2.2.2) 2.2.0 > Allow fetcher U

[jira] [Commented] (SPARK-22402) Allow fetcher URIs to be downloaded to specific locations relative to Mesos Sandbox

2017-11-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22402?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16265584#comment-16265584 ] Felix Cheung commented on SPARK-22402: -- Any update on this one? > Allow fetcher U

[jira] [Updated] (SPARK-22495) Fix setup of SPARK_HOME variable on Windows

2017-11-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22495?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22495: - Fix Version/s: 2.2.1 > Fix setup of SPARK_HOME variable on Wind

[jira] [Updated] (SPARK-22595) flaky test: CastSuite.SPARK-22500: cast for struct should not generate codes beyond 64KB

2017-11-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22595?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22595: - Target Version/s: 2.2.1, 2.3.0 (was: 2.3.0, 2.2.2) > flaky test: CastSuite.SPARK-22500: c

[jira] [Updated] (SPARK-22591) GenerateOrdering shouldn't change ctx.INPUT_ROW

2017-11-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22591?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22591: - Target Version/s: 2.2.1, 2.3.0 (was: 2.3.0, 2.2.2) > GenerateOrdering shouldn't cha

[jira] [Updated] (SPARK-22548) Incorrect nested AND expression pushed down to JDBC data source

2017-11-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22548?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22548: - Target Version/s: 2.2.1, 2.3.0 (was: 2.3.0, 2.2.2) > Incorrect nested AND expression pus

[jira] [Updated] (SPARK-17920) HiveWriterContainer passes null configuration to serde.initialize, causing NullPointerException in AvroSerde when using avro.schema.url

2017-11-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-17920?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-17920: - Target Version/s: 2.2.1, 2.3.0 (was: 2.3.0, 2.2.2) > HiveWriterContainer passes n

[jira] [Updated] (SPARK-22500) 64KB JVM bytecode limit problem with cast

2017-11-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22500?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22500: - Target Version/s: 2.2.1, 2.3.0 (was: 2.2.0, 2.3.0) > 64KB JVM bytecode limit problem with c

[jira] [Updated] (SPARK-22549) 64KB JVM bytecode limit problem with concat_ws

2017-11-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22549?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22549: - Target Version/s: 2.2.1, 2.3.0 (was: 2.3.0, 2.2.2) > 64KB JVM bytecode limit prob

[jira] [Updated] (SPARK-22550) 64KB JVM bytecode limit problem with elt

2017-11-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22550: - Target Version/s: 2.2.1, 2.3.0 > 64KB JVM bytecode limit problem with

[jira] [Updated] (SPARK-22508) 64KB JVM bytecode limit problem with GenerateUnsafeRowJoiner.create()

2017-11-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22508?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22508: - Target Version/s: 2.2.1, 2.3.0 (was: 2.3.0, 2.2.2) > 64KB JVM bytecode limit prob

[jira] [Updated] (SPARK-22544) FileStreamSource should use its own hadoop conf to call globPathIfNecessary

2017-11-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22544?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22544: - Target Version/s: 2.2.1, 2.3.0 (was: 2.3.0, 2.2.2) Fix Version/s: (was: 2.2.2

[jira] [Updated] (SPARK-22538) SQLTransformer.transform(inputDataFrame) uncaches inputDataFrame

2017-11-24 Thread Felix Cheung (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-22538?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Felix Cheung updated SPARK-22538: - Target Version/s: 2.2.1, 2.3.0 (was: 2.3.0, 2.2.2) > SQLTransformer.transform(inputDataFr
