[GitHub] spark pull request #13666: [SPARK-15934] [SQL] Return binary mode in ThriftS...
Github user epahomov closed the pull request at: https://github.com/apache/spark/pull/13666 --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request #13667: [SPARK-15934] [SQL] Return binary mode in ThriftS...
GitHub user epahomov opened a pull request:

    https://github.com/apache/spark/pull/13667

    [SPARK-15934] [SQL] Return binary mode in ThriftServer

    Returning binary mode to ThriftServer for backward compatibility.

    Tested with Squirrel and Tableau.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/epahomov/spark SPARK-15095-2.0

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13667.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13667

commit 2213b205a0efa04350d8fa24033c43a34cb235a6
Author: Egor Pakhomov <e...@anchorfree.com>
Date:   2016-06-14T17:49:33Z

    [SPARK-15934] [SQL] Return binary mode in ThriftServer
[GitHub] spark issue #13666: [SPARK-15934] [SQL] Return binary mode in ThriftServer
Github user epahomov commented on the issue: https://github.com/apache/spark/pull/13666 What is the best way to do this for branch-2.0? Check out branch-2.0, create a branch, cherry-pick the commit, and open a pull request? Or is there an easier way?
[GitHub] spark pull request #13666: [SPARK-15934] [SQL] Return binary mode in ThriftS...
GitHub user epahomov opened a pull request:

    https://github.com/apache/spark/pull/13666

    [SPARK-15934] [SQL] Return binary mode in ThriftServer

    Returning binary mode to ThriftServer for backward compatibility.

    Tested with Squirrel and Tableau.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/epahomov/spark SPARK-15934

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13666.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13666

commit 7340c042d3a75615a576075583dd42df9fc47c82
Author: Egor Pakhomov <e...@anchorfree.com>
Date:   2016-06-14T17:49:33Z

    [SPARK-15934] [SQL] Return binary mode in ThriftServer
[GitHub] spark issue #12876: [SPARK-15095] [SQL] drop binary mode in ThriftServer
Github user epahomov commented on the issue: https://github.com/apache/spark/pull/12876 I've created a ticket to revert these changes - https://issues.apache.org/jira/browse/SPARK-15934
[GitHub] spark pull request: [SPARK-3878][MLlib] Benchmarks and common test...
GitHub user epahomov opened a pull request:

    https://github.com/apache/spark/pull/4963

    [SPARK-3878][MLlib] Benchmarks and common tests for mllib algorithm

    Motivation:
    * A set of benchmarks that helps users understand which kinds of cases an algorithm works on
    * Eliminate duplicated test code
    * Declarative testing style
    * Make it easy for developers to understand how to test new code

    I'm not insisting on exactly this architecture, but I would like to start a discussion about how to implement it.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/epahomov/spark SPARK-3878

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4963.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #4963

commit 6110a474cc622e73c84596f252d82d8ffd24679d
Author: epahomov <pahomov.e...@gmail.com>
Date:   2015-03-10T14:52:38Z

    [SPARK-3878][MLlib] Benchmarks and common tests for mllib algorithm
[GitHub] spark pull request: [SPARK-3830][MLlib] Implement genetic algorith...
Github user epahomov closed the pull request at: https://github.com/apache/spark/pull/2731
[GitHub] spark pull request: [SPARK-3830][MLlib] Implement genetic algorith...
Github user epahomov commented on the pull request: https://github.com/apache/spark/pull/2731#issuecomment-77836797 My PR is too old for the current architecture, and I have already found too much to improve in it. I'll do better and resubmit.
[GitHub] spark pull request: [SPARK-3830] Implement genetic algorithms in M...
GitHub user epahomov opened a pull request:

    https://github.com/apache/spark/pull/2731

    [SPARK-3830] Implement genetic algorithms in MLLib

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/epahomov/spark SPARK-3830

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2731.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2731

commit bb00e96ed10d4d01d35f4506babe967bc438e877
Author: epahomov <pahomov.e...@gmail.com>
Date:   2014-10-09T10:15:01Z

    [SPARK-3830] Implement genetic algorithms in MLLib
[GitHub] spark pull request: [Spark-3525] Adding gradient boosting
Github user epahomov closed the pull request at: https://github.com/apache/spark/pull/2394
[GitHub] spark pull request: [Spark-3525] Adding gradient boosting
Github user epahomov commented on the pull request: https://github.com/apache/spark/pull/2394#issuecomment-57077001 Sorry for such a messy pull request; I didn't review my student's code closely enough. I'll try my best next time. We'll fix everything by the middle of the week.
[GitHub] spark pull request: [SPARK-3690] Closing shuffle writers we swallo...
GitHub user epahomov opened a pull request:

    https://github.com/apache/spark/pull/2537

    [SPARK-3690] Closing shuffle writers we swallow more important exception

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/epahomov/spark SPARK-3690

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2537.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2537

commit a0b7de4d52bf2aa23d6b183d180263b21f933ef9
Author: epahomov <pahomov.e...@gmail.com>
Date:   2014-09-25T15:31:44Z

    [SPARK-3690] Closing shuffle writers we swallow more important exception
[GitHub] spark pull request: [SPARK-3507] Adding RegressionLearner
Github user epahomov commented on the pull request: https://github.com/apache/spark/pull/2371#issuecomment-55565128 Closed, because similar work is currently under way at Databricks.
[GitHub] spark pull request: [Spark-3525] Adding gradient boosting
GitHub user epahomov opened a pull request:

    https://github.com/apache/spark/pull/2394

    [Spark-3525] Adding gradient boosting

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/epahomov/spark SPARK-3525

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2394.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2394

commit d0dfb7b632715c60ef78964ea4d20aaa7712d2e2
Author: olgaoskina <olgaosk...@yandex-team.ru>
Date:   2014-09-04T06:51:45Z

    Added stochastic gradient boosting algorithm

commit 11c247a72e1681661cef4314fec5d1b4283b087f
Author: olgaoskina <olgaosk...@yandex-team.ru>
Date:   2014-09-04T06:52:05Z

    Added stochastic gradient boosting algorithm

commit fdfc88e046a29202058b8f45168d624ed91f6d16
Author: olgaoskina <olgaosk...@yandex-team.ru>
Date:   2014-09-05T12:25:41Z

    Code refactor

commit b91b372c951db8bd1be6bd4d2308bc509bc1b44f
Author: olgaoskina <olgaosk...@yandex-team.ru>
Date:   2014-09-06T09:02:51Z

    Added test 'StochasticGradientBoostingSuite'

commit 223f0907b6accaa0bf08c7948b2e6c1d728dab18
Author: olgaoskina <olgaosk...@yandex-team.ru>
Date:   2014-09-10T08:08:30Z

    Added new test

commit da13706bd8101ec8a2b648ce6ddc9777516e121f
Author: olgaoskina <olgaosk...@yandex-team.ru>
Date:   2014-09-14T15:33:52Z

    Refactor tests

commit eafa0b75785b2ac570ddbc26a80b08b328f7b29c
Author: Egor Pakhomov <pahomov.e...@gmail.com>
Date:   2014-09-15T07:42:53Z

    Merge branch 'gradient_boosting' of https://github.com/olgaoskina/spark into olgaoskina-gradient_boosting

commit 3c56f4ef65fb0df80804b0f4b9436f0623582be7
Author: Egor Pakhomov <pahomov.e...@gmail.com>
Date:   2014-09-15T08:46:43Z

    Merge branch 'olgaoskina-gradient_boosting' into SPARK-3525

commit ce1934a329783629a12f615cbeac3d7e1a05a791
Author: Egor Pakhomov <pahomov.e...@gmail.com>
Date:   2014-09-15T08:32:48Z

    [SPARK-3525] Fixing GradientBoostingSuite
[GitHub] spark pull request: [SPARK-3507] Adding RegressionLearner
GitHub user epahomov opened a pull request:

    https://github.com/apache/spark/pull/2371

    [SPARK-3507] Adding RegressionLearner

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/epahomov/spark SPARK-3507

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/2371.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #2371

commit 9827e9a6ad509f9b66b66534a4830e732663b038
Author: Egor Pakhomov <pahomov.e...@gmail.com>
Date:   2014-09-12T15:04:54Z

    [SPARK-3507] Adding RegressionLearner
[GitHub] spark pull request: [SPARK-3507] Adding RegressionLearner
Github user epahomov commented on the pull request:

    https://github.com/apache/spark/pull/2371#issuecomment-55416626

    The purpose of this patch is to set the architecture for further RegressionLearner work and for creating tests for regression algorithms. This patch allows the following to be done in parallel:
    * Creating new test cases for regression
    * Creating new suites for measuring quality
    * Implementing regression testing of a new algorithm in 3 lines of code.
[GitHub] spark pull request: [WIP] SPARK-2157 Ability to write tight firewa...
Github user epahomov commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1107#discussion_r13908570

    --- Diff: repl/src/main/scala/org/apache/spark/repl/SparkIMain.scala ---
    @@ -102,7 +102,8 @@ import org.apache.spark.util.Utils
       val virtualDirectory = new PlainFile(outputDir) // directory for classfiles

       /** Jetty server that will serve our classes to worker nodes */
    -  val classServer = new HttpServer(outputDir, new SecurityManager(conf))
    +  val classServerListenPort: Int= conf.getInt("spark.replClassServer.port", 0)
    --- End diff --

    val classServerListenPort: Int - the word "port" makes it obvious that it's an Int. There is no need for this type annotation.
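The point of the comment above can be illustrated with a small, self-contained sketch. The `Conf` class below is a hypothetical stand-in for `SparkConf` (it is not Spark's API and exists only so the snippet runs on its own); the only thing it shares with the diff is a `getInt(key, default)` method that returns `Int`.

```scala
// Hypothetical stand-in for SparkConf, used only to keep this sketch self-contained.
final class Conf(settings: Map[String, String]) {
  def getInt(key: String, default: Int): Int =
    settings.get(key).map(_.toInt).getOrElse(default)
}

object InferredPortType {
  val conf = new Conf(Map("spark.replClassServer.port" -> "50001"))

  // No ": Int" annotation: the compiler infers Int from getInt's return type.
  val classServerListenPort = conf.getInt("spark.replClassServer.port", 0)

  // This line compiles only because the inferred type above is Int.
  val sameValueAsInt: Int = classServerListenPort
}
```

Whether to keep an explicit annotation on a local or private `val` is a style decision; the sketch only shows that the compiler does not need it.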
[GitHub] spark pull request: [WIP] SPARK-2157 Ability to write tight firewa...
Github user epahomov commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1107#discussion_r13908643

    --- Diff: core/src/main/scala/org/apache/spark/network/ConnectionManager.scala ---
    @@ -102,7 +102,24 @@ private[spark] class ConnectionManager(port: Int, conf: SparkConf,
       serverChannel.socket.setReuseAddress(true)
       serverChannel.socket.setReceiveBufferSize(256 * 1024)

    -  serverChannel.socket.bind(new InetSocketAddress(port))
    +  def bindWithIncrement(port: Int, maxTries: Int = 3) {
    +    for( offset <- 0 until maxTries ) {
    +      try {
    +        serverChannel.socket.bind(new InetSocketAddress(port + offset))
    +        return
    --- End diff --

    there is no need for return
[GitHub] spark pull request: [WIP] SPARK-2157 Ability to write tight firewa...
Github user epahomov commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1107#discussion_r13908739

    --- Diff: core/src/main/scala/org/apache/spark/HttpServer.scala ---
    @@ -41,45 +41,73 @@ private[spark] class ServerStateException(message: String) extends Exception(mes
      * as well as classes created by the interpreter when the user types in code. This is just a wrapper
      * around a Jetty server.
      */
    -private[spark] class HttpServer(resourceBase: File, securityManager: SecurityManager)
    -    extends Logging {
    +private[spark] class HttpServer(resourceBase: File,
    +                                securityManager: SecurityManager,
    +                                localPort: Int = 0) extends Logging {
       private var server: Server = null
    -  private var port: Int = -1
    +  private var port: Int = localPort
    +
    +  private def startOnPort(startPort: Int): Tuple2[Server,Int] = {
    +    val server = new Server()
    +    val connector = new SocketConnector
    +    connector.setMaxIdleTime(60*1000)
    +    connector.setSoLingerTime(-1)
    +    connector.setPort(startPort)
    +    server.addConnector(connector)
    +
    +    val threadPool = new QueuedThreadPool
    +    threadPool.setDaemon(true)
    +    server.setThreadPool(threadPool)
    +    val resHandler = new ResourceHandler
    +    resHandler.setResourceBase(resourceBase.getAbsolutePath)
    +
    +    val handlerList = new HandlerList
    +    handlerList.setHandlers(Array(resHandler, new DefaultHandler))
    +
    +    if (securityManager.isAuthenticationEnabled()) {
    +      logDebug("HttpServer is using security")
    +      val sh = setupSecurityHandler(securityManager)
    +      // make sure we go through security handler to get resources
    +      sh.setHandler(handlerList)
    +      server.setHandler(sh)
    +    } else {
    +      logDebug("HttpServer is not using security")
    +      server.setHandler(handlerList)
    +    }
    +
    +    server.start()
    +    val actualPort = server.getConnectors()(0).getLocalPort()
    +
    +    return (server, actualPort)
    +  }
    +
    +  private def startWithIncrements(startPort: Int, maxRetries: Int): Tuple2[Server,Int] = {
    +    for( offset <- 0 to maxRetries) {
    --- End diff --

    I can see the port-selection code twice. Let's encapsulate it in a single method.
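One way the suggestion above could be realized is a single generic helper that both the HTTP server and the connection manager call with their own "start on this port" function. The sketch below is an assumption about what such a helper might look like, not code from the pull request: the object name `PortRetry`, the curried `start` parameter, and the choice to rethrow the last failure are all illustrative.

```scala
import scala.util.Try

object PortRetry {
  /**
   * Calls start(port) for startPort, startPort + 1, ..., startPort + maxRetries
   * and returns the first successful result together with the port that worked.
   * If every attempt fails, the failure from the last attempt is rethrown.
   */
  def startWithIncrements[T](startPort: Int, maxRetries: Int)(start: Int => T): (T, Int) = {
    require(maxRetries >= 0, "maxRetries must be non-negative")
    // Iterator.map is lazy, so a port is only tried when next() is called.
    val attempts = (0 to maxRetries).iterator.map { offset =>
      val port = startPort + offset
      Try(start(port)).map(result => (result, port))
    }
    var last: Try[(T, Int)] = attempts.next()
    while (last.isFailure && attempts.hasNext) last = attempts.next()
    last.get
  }
}
```

With a helper of this shape, the Jetty startup and the socket bind would each pass their own function, for example `PortRetry.startWithIncrements(port, 3)(p => bindOn(p))`, where `bindOn` is a placeholder for whatever the caller needs to do on a given port.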
[GitHub] spark pull request: [WIP] SPARK-2157 Ability to write tight firewa...
Github user epahomov commented on a diff in the pull request:

    https://github.com/apache/spark/pull/1107#discussion_r13908948

    --- Diff: core/src/main/scala/org/apache/spark/network/ConnectionManager.scala ---
    @@ -102,7 +102,24 @@ private[spark] class ConnectionManager(port: Int, conf: SparkConf,
       serverChannel.socket.setReuseAddress(true)
       serverChannel.socket.setReceiveBufferSize(256 * 1024)

    -  serverChannel.socket.bind(new InetSocketAddress(port))
    +  def bindWithIncrement(port: Int, maxTries: Int = 3) {
    +    for( offset <- 0 until maxTries ) {
    +      try {
    +        serverChannel.socket.bind(new InetSocketAddress(port + offset))
    +        return
    --- End diff --

    Agree. My bad.
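Whatever is decided about the `return` in the hunk above, a `for` loop over port offsets needs some way to stop after the first successful bind; in Scala 2, a `return` inside a `for` body is compiled as a non-local return that is implemented with a control-flow exception. The following is a minimal sketch of the same try-the-next-port idea written with a tail-recursive helper so that no `return` is needed. It is not the pull request's code: here the socket is passed in as a plain `java.net.ServerSocket` and the bound port is returned, both of which are assumptions made to keep the example self-contained.

```scala
import java.io.IOException
import java.net.{InetSocketAddress, ServerSocket}
import scala.annotation.tailrec

object BindWithIncrement {
  /**
   * Tries port, port + 1, ... for at most maxTries candidates and returns the
   * port that was bound. The IOException from the final attempt propagates.
   */
  def bindWithIncrement(socket: ServerSocket, port: Int, maxTries: Int = 3): Int = {
    @tailrec
    def attempt(offset: Int): Int = {
      val bound =
        try {
          socket.bind(new InetSocketAddress(port + offset))
          true
        } catch {
          // Swallow the failure only if another candidate port remains.
          case _: IOException if offset + 1 < maxTries => false
        }
      // The recursive call sits outside the try block, so it is a tail call.
      if (bound) socket.getLocalPort else attempt(offset + 1)
    }
    attempt(0)
  }
}
```

The loop ends as soon as `bind` succeeds because the recursion simply stops, which makes the exit condition explicit instead of relying on an early exit from inside a closure.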
[GitHub] spark pull request: [SPARK-1174] Adding port configuration for Htt...
Github user epahomov commented on the pull request: https://github.com/apache/spark/pull/81#issuecomment-40873951 Yep. In our organization, to connect two machines on the network you need to specify the target IP and host (security issues). If the host is constantly changing, you cannot run Spark.
[GitHub] spark pull request: [SPARK-1259] Make RDD locally iterable
Github user epahomov commented on the pull request: https://github.com/apache/spark/pull/156#issuecomment-39630047 Sure, I like this approach, I will change it on Sunday.
[GitHub] spark pull request: [SPARK-1259] Make RDD locally iterable
Github user epahomov commented on the pull request: https://github.com/apache/spark/pull/156#issuecomment-39538362 Doc's changed.
[GitHub] spark pull request: [SPARK-1259] Make RDD locally iterable
Github user epahomov commented on the pull request: https://github.com/apache/spark/pull/156#issuecomment-38143969 Hi, what do you think about new changes?