[GitHub] spark pull request: [SPARK-4697][YARN]System properties should ove...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3557#issuecomment-69538386 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25402/ Test PASSed. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. --- - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] spark pull request: [SPARK-4697][YARN]System properties should ove...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3557#issuecomment-69538383 [Test build #25402 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25402/consoleFull) for PR 3557 at commit [`836b9ef`](https://github.com/apache/spark/commit/836b9ef13ef442109ba978da23309f7679405e2f). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5165][SQL] Add support for rollup and c...
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/3964#issuecomment-69539148 Hi @chenghao-intel, this syntax actually follows the TPC-DS benchmark. If we want to follow HiveQL syntax in SQLContext, I agree with you, but I am not sure that is a good idea.
[GitHub] spark pull request: [SPARK-5202] [SQL] Add hql variable substituti...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4003#issuecomment-69539130 [Test build #25407 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25407/consoleFull) for PR 4003 at commit [`70c3508`](https://github.com/apache/spark/commit/70c3508796e7a47f018734b3b7c1ecafc7151ae3). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5204][SQL] Column case need to be consi...
Github user OopsOutOfMemory commented on the pull request: https://github.com/apache/spark/pull/4005#issuecomment-69543215 @marmbrus /cc @scwf
[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3120#discussion_r22777883
    --- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala ---
    @@ -44,7 +44,14 @@ private[spark] class CacheManager(blockManager: BlockManager) extends Logging {
         blockManager.get(key) match {
           case Some(blockResult) =>
             // Partition is already materialized, so just return its values
    +        val existingMetrics = context.taskMetrics.inputMetrics
    +        val prevBytesRead = existingMetrics
    +          .filter(_.readMethod == blockResult.inputMetrics.readMethod)
    +          .map(_.bytesRead)
    +          .getOrElse(0L)
    --- End diff --
So what happens if we have input types that intermix here? For instance, what if they interleave between two input sources... will they just keep clobbering each other? It might be better to just choose a single input metric and stick with it, i.e. if we happen to be reading a block that wasn't derived from the same input as the one before it, just ignore it.
```
val blockInput = blockResult.inputMetrics
context.taskMetrics.inputMetrics match {
  case Some(existingInput) =>
    if (existingInput.readMethod == blockInput.readMethod) {
      existingInput.bytesRead += blockInput.bytesRead
    }
    // NOTE: If we have interleaving of two input types in one task, we currently ignore blocks associated
    // with all but one type (whichever type was seen first). See SPARK-XXX.
  case None =>
    context.taskMetrics.inputMetrics = Some(blockInput)
}
```
It's easier to document that behavior and also add a unit test for it.
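The "first input type wins" policy suggested above can be tried out in isolation. The following is a minimal, self-contained sketch, not Spark's actual code: `ReadMethod`, `InputMetrics`, `TaskMetrics`, and `recordBlockRead` are simplified stand-ins introduced here purely for illustration.

```scala
// Hypothetical, simplified stand-ins for Spark's metrics classes (assumptions, not the real API).
object InputMetricsSketch {
  sealed trait ReadMethod
  case object Memory extends ReadMethod
  case object Hadoop extends ReadMethod

  final class InputMetrics(val readMethod: ReadMethod) {
    var bytesRead: Long = 0L
  }

  final class TaskMetrics {
    var inputMetrics: Option[InputMetrics] = None
  }

  /** Fold one block's bytes into the task metrics, keeping only the first read method seen. */
  def recordBlockRead(task: TaskMetrics, method: ReadMethod, bytes: Long): Unit =
    task.inputMetrics match {
      case Some(existing) =>
        // Same input type: accumulate. Different input type: the block is ignored.
        if (existing.readMethod == method) existing.bytesRead += bytes
      case None =>
        val fresh = new InputMetrics(method)
        fresh.bytesRead = bytes
        task.inputMetrics = Some(fresh)
    }
}
```

With this policy, reads interleaved as Memory, Hadoop, Memory count only the two Memory blocks, which is the behavior a unit test would pin down.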
[GitHub] spark pull request: [SPARK-5202] [SQL] Add hql variable substituti...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4003#issuecomment-69544784 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25407/ Test PASSed.
[GitHub] spark pull request: [SPARK-5204][SQL] Column case need to be consi...
Github user scwf commented on the pull request: https://github.com/apache/spark/pull/4005#issuecomment-69545578 We can't normalize the column name to lower case; SQLContext is actually case-sensitive by default.
[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3711#discussion_r22777939
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala ---
    @@ -344,12 +354,20 @@ private[spark] class MesosSchedulerBackend(
       override def frameworkMessage(d: SchedulerDriver, e: ExecutorID, s: SlaveID, b: Array[Byte]) {}
    +  /**
    +   * Remove executor associated with slaveId in a thread safe manner.
    +   */
    +  private def removeExecutor(slaveId: String) = {
    +    synchronized {
    --- End diff --
Gotcha, can you just add a TODO in the comment saying to review the synchronization?
[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...
Github user pwendell commented on the pull request: https://github.com/apache/spark/pull/3711#issuecomment-69538284 LGTM - just had a minor comment that can also be addressed on merge. Jenkins, test this please.
[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3120#discussion_r22778368
    --- Diff: core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala ---
    @@ -153,34 +157,19 @@ class NewHadoopRDD[K, V](
               throw new java.util.NoSuchElementException("End of stream")
             }
             havePair = false
    -
    -        // Update bytes read metric every few records
    -        if (recordsSinceMetricsUpdate == HadoopRDD.RECORDS_BETWEEN_BYTES_READ_METRIC_UPDATES
    --- End diff --
This was done intentionally to keep the callback updates out of the `InputMetrics` class and isolate them to HadoopRDD. This notion of callbacks makes the InputMetrics class more complicated and mutable. Since it's an exposed class, we really wanted to keep the interface clean and simple, even if it meant some extra engineering in HadoopRDD. So could this part of the change be reverted to how it was before (so that you don't change the InputMetrics/TaskMetrics classes)?
[GitHub] spark pull request: [SPARK-4908][SQL][hotfix]narrow the scope of s...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4001#issuecomment-69539617 [Test build #25403 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25403/consoleFull) for PR 4001 at commit [`4bfa306`](https://github.com/apache/spark/commit/4bfa3067f6d1494c770d49375498cf1b4adbaa45). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SQL] some comments fix for GROUPING SETS
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4000#issuecomment-69541868 [Test build #25408 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25408/consoleFull) for PR 4000 at commit [`9c24fc4`](https://github.com/apache/spark/commit/9c24fc4e521b1eec86ebe9c55ec22a5a3dadcb8a). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3711#issuecomment-69543920 [Test build #25406 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25406/consoleFull) for PR 3711 at commit [`946d2c5`](https://github.com/apache/spark/commit/946d2c52f0e1db32c4a041b2f62e8b0a71fd9fec). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3711#issuecomment-69543929 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25406/ Test PASSed.
[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3120#issuecomment-69538897 [Test build #25404 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25404/consoleFull) for PR 3120 at commit [`a2ca793`](https://github.com/apache/spark/commit/a2ca793a97fb1fc0edafe417151bf674701fe7a4). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3120#issuecomment-69538904 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25404/ Test FAILed.
[GitHub] spark pull request: [SPARK-5203][SQL] fix union with different dec...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4004#issuecomment-69541015 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-4945] [SQL] Add overwrite option suppor...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3780#issuecomment-69541870 [Test build #25409 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25409/consoleFull) for PR 3780 at commit [`72c4a4b`](https://github.com/apache/spark/commit/72c4a4b0109845033070d243e5544d02a551cd8b). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5204][SQL] Column case need to be consi...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4005#issuecomment-69543517 Can one of the admins verify this patch?
[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3120#discussion_r22778546
    --- Diff: core/src/main/scala/org/apache/spark/CacheManager.scala ---
    @@ -44,7 +44,14 @@ private[spark] class CacheManager(blockManager: BlockManager) extends Logging {
         blockManager.get(key) match {
           case Some(blockResult) =>
             // Partition is already materialized, so just return its values
    +        val existingMetrics = context.taskMetrics.inputMetrics
    +        val prevBytesRead = existingMetrics
    +          .filter(_.readMethod == blockResult.inputMetrics.readMethod)
    +          .map(_.bytesRead)
    +          .getOrElse(0L)
    --- End diff --
Actually, after looking at Hadoop RDD, it might be necessary to just clobber here to preserve consistency with that case. But it could still be nicer to write this with a match.
[GitHub] spark pull request: [SPARK-4092] [CORE] Fix InputMetrics for coale...
Github user pwendell commented on a diff in the pull request: https://github.com/apache/spark/pull/3120#discussion_r22778563
    --- Diff: core/src/main/scala/org/apache/spark/rdd/NewHadoopRDD.scala ---
    @@ -109,18 +109,22 @@ class NewHadoopRDD[K, V](
           logInfo("Input split: " + split.serializableHadoopSplit)
           val conf = confBroadcast.value.value
    -      val inputMetrics = new InputMetrics(DataReadMethod.Hadoop)
    +      val readMethod = DataReadMethod.Hadoop
    --- End diff --
Minor, but I find it slightly easier to follow as a match:
```
val hadoopReadMethod = DataReadMethod.Hadoop
val newMetrics = context.taskMetrics.inputMetrics match {
  case Some(InputMetrics(hadoopReadMethod)) => context.taskMetrics.inputMetrics
  case _ =>
    // Note that this may clobber some other input metric (see SPARK-XXX)
    hadoopReadMethod
}
context.taskMetrics.inputMetrics = Some(newMetrics)
```
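One thing to watch when turning a match like the one above into compiling code: in a Scala pattern, a lowercase identifier such as `hadoopReadMethod` binds a fresh variable rather than comparing against the existing value, so a guard (or a backquoted identifier) is needed to test equality, and both branches should produce the same type. A minimal self-contained sketch follows; `ReadMethod` and `InputMetrics` are simplified stand-ins rather than Spark's real classes, and `metricsFor` is a hypothetical helper.

```scala
// Simplified stand-ins for illustration only; not Spark's actual classes.
object ReadMethodMatchSketch {
  sealed trait ReadMethod
  case object Memory extends ReadMethod
  case object Hadoop extends ReadMethod

  final case class InputMetrics(readMethod: ReadMethod)

  /** Reuse the existing metrics if they have the wanted read method; otherwise start fresh. */
  def metricsFor(existing: Option[InputMetrics], wanted: ReadMethod): InputMetrics =
    existing match {
      // The guard compares values; a bare lowercase name in the pattern would match anything.
      case Some(m) if m.readMethod == wanted => m
      case _ =>
        // This replaces ("clobbers") metrics recorded for any other read method.
        InputMetrics(wanted)
    }
}
```

The caller would then store the result back, for example as `Some(metricsFor(current, Hadoop))`.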
[GitHub] spark pull request: [SPARK-5202] [SQL] Add hql variable substituti...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4003#issuecomment-69544779 [Test build #25407 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25407/consoleFull) for PR 4003 at commit [`70c3508`](https://github.com/apache/spark/commit/70c3508796e7a47f018734b3b7c1ecafc7151ae3). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3711#issuecomment-69538514 [Test build #25406 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25406/consoleFull) for PR 3711 at commit [`946d2c5`](https://github.com/apache/spark/commit/946d2c52f0e1db32c4a041b2f62e8b0a71fd9fec). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4908][SQL][hotfix]narrow the scope of s...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4001#issuecomment-69539620 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25403/ Test PASSed.
[GitHub] spark pull request: [SPARK-5203][SQL] fix union with different dec...
GitHub user guowei2 opened a pull request: https://github.com/apache/spark/pull/4004
[SPARK-5203][SQL] fix union with different decimal type
When unioning non-decimal types with decimals, we use the following rules:
- First apply `intTypeToFixed`; then fixed-precision decimals with precision/scale p1/s1 and p2/s2 are promoted to DecimalType(max(p1, p2), max(s1, s2))
- FLOAT and DOUBLE cause fixed-length decimals to turn into DOUBLE (this is the same as Hive, but note that unlimited decimals are considered bigger than doubles in WidenTypes)
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/guowei2/spark SPARK-5203
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/4004.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4004
commit cb14704214b3d114d3fe4341885637c04db0f944
Author: guowei2 guow...@asiainfo.com
Date: 2015-01-12T08:32:43Z
    fix union with different decimal type
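The promotion rule for two fixed-precision decimals, as stated in the description above, can be written down as a small function. This is only a sketch of the rule as described, not the code in the pull request; `Fixed` and `widen` are hypothetical names standing in for Spark's decimal type and coercion logic.

```scala
object DecimalUnionSketch {
  /** Hypothetical stand-in for a fixed-precision decimal type (precision, scale). */
  final case class Fixed(precision: Int, scale: Int)

  /** Widening rule as stated in the description: take the max of each component. */
  def widen(a: Fixed, b: Fixed): Fixed =
    Fixed(math.max(a.precision, b.precision), math.max(a.scale, b.scale))
}
```

For example, under this rule a union of a (10, 2) decimal with a (5, 4) decimal would be typed as (10, 4).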
[GitHub] spark pull request: [SQL] some comments fix for GROUPING SETS
Github user adrian-wang commented on the pull request: https://github.com/apache/spark/pull/4000#issuecomment-69541549 retest this please.
[GitHub] spark pull request: [SPARK-4857] [CORE] Adds Executor membership e...
Github user nitin2goyal commented on a diff in the pull request: https://github.com/apache/spark/pull/3711#discussion_r22779931
    --- Diff: core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala ---
    @@ -213,6 +216,7 @@ class CoarseGrainedSchedulerBackend(scheduler: TaskSchedulerImpl, val actorSyste
               totalCoreCount.addAndGet(-executorInfo.totalCores)
               totalRegisteredExecutors.addAndGet(-1)
               scheduler.executorLost(executorId, SlaveLost(reason))
    +          listenerBus.post(SparkListenerExecutorRemoved(executorId))
    --- End diff --
Any reason why we are doing this here instead of in the `executorLost` method of DAGScheduler.scala? (Similarly for the SparkListenerExecutorAdded event above.)
[GitHub] spark pull request: [SPARK-5204][SQL] Column case need to be consi...
GitHub user OopsOutOfMemory opened a pull request: https://github.com/apache/spark/pull/4005
[SPARK-5204][SQL] Column case need to be consistent with Hive
You can merge this pull request into a Git repository by running:
    $ git pull https://github.com/OopsOutOfMemory/spark column_lowercase
Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/4005.patch
To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4005
commit 0b9a7e0f684010c395235612e33e0d611952b450
Author: OopsOutOfMemory victorshen...@126.com
Date: 2015-01-12T08:47:49Z
    lower case column name
commit 7c1c245c880ba39f9baf4f260d393eb68de85ca6
Author: OopsOutOfMemory victorshen...@126.com
Date: 2015-01-12T08:55:41Z
    add test suite
[GitHub] spark pull request: [MLLIB][SPARK-3278] Monotone (Isotonic) regres...
Github user zapletal-martin commented on the pull request: https://github.com/apache/spark/pull/3519#issuecomment-69544311 @mengxr I have updated the PR with the requested changes. Because of the issue with Scala primitives used as generic type parameters, discussed above, I had to implement two functions that simplify Java interop: `def train(input: JavaPairRDD[java.lang.Double, java.lang.Double], isotonic: Boolean): IsotonicRegressionModel` and `def predict(testData: JavaRDD[java.lang.Double]): JavaRDD[java.lang.Double] = testData.rdd.map(_.doubleValue()).map(predict).map(new java.lang.Double(_))`. Let me know if we want those, or if you can think of a better way to make the Java API less obscure. There is one test failing (a task exceeding the max allowed frame size). I am looking at ways to resolve it, but please comment on the rest of the PR.
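The unbox, apply, re-box pattern behind the `predict` overload above can be illustrated without Spark. In this minimal sketch a plain `Seq` stands in for `JavaRDD`, and the `predict(x: Double)` body is a made-up placeholder model; both are assumptions for illustration, not the code in the PR.

```scala
object JavaBoxingSketch {
  /** Placeholder model (hypothetical): a plain function on Scala's primitive Double. */
  def predict(x: Double): Double = 2.0 * x

  /** Java-facing overload: unbox each element, call the Scala method, then re-box. */
  def predict(testData: Seq[java.lang.Double]): Seq[java.lang.Double] =
    testData
      .map(boxed => boxed.doubleValue())
      .map(x => predict(x))
      .map(y => java.lang.Double.valueOf(y))
}
```

Explicit lambdas are used here instead of `map(predict)` so that overload resolution between the two `predict` methods is unambiguous to the reader.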
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3841#issuecomment-69555409 [Test build #25410 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25410/consoleFull) for PR 3841 at commit [`67bcb46`](https://github.com/apache/spark/commit/67bcb46c99e645e158ca1f67122ce943daa7030b). * This patch **does not merge cleanly**.
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3841#issuecomment-69570069 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25414/ Test PASSed.
[GitHub] spark pull request: [SPARK-5102][Core]subclass of MapStatus needs ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4007#issuecomment-69570985 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25416/ Test FAILed.
[GitHub] spark pull request: [SPARK-5102][Core]subclass of MapStatus needs ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4007#issuecomment-69570978 [Test build #25416 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25416/consoleFull) for PR 4007 at commit [`9d2238a`](https://github.com/apache/spark/commit/9d2238a052186b5ce9396fa9afc4084b5ee3eb7e). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user WangTaoTheTonic commented on the pull request: https://github.com/apache/spark/pull/3841#issuecomment-69577651 @andrewor14 Updated and ran a simple test; it works.
[GitHub] spark pull request: [SPARK-4697][YARN]System properties should ove...
Github user tgravescs commented on the pull request: https://github.com/apache/spark/pull/3557#issuecomment-69578562 In 0.8 and 0.9, Spark on YARN used to be run like this, and we kept it working in the 1.0 release as well for backwards compatibility:

    spark-class org.apache.spark.deploy.yarn.Client \
      --jar spark-examples.jar \
      --class org.apache.spark.examples.SparkPi \
      --args yarn-standalone \
      --num-workers 3 \
      --worker-memory 2g \
      --queue unfunded

Does moving those to spark-submit break the app name for this? I don't know of anyone still using that, but if we break it we should consciously decide that it is unsupported.
[GitHub] spark pull request: [SPARK-1825] Make Windows Spark client work fi...
Github user sarutak commented on the pull request: https://github.com/apache/spark/pull/3943#issuecomment-69576680 @tsudukim The YARN module was recently refactored, so could you rebase your change?
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Refactor LiveLis...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4006#issuecomment-69577579 [Test build #25418 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25418/consoleFull) for PR 4006 at commit [`a9dccd3`](https://github.com/apache/spark/commit/a9dccd372328fd7de739ab2529935563b7c2165d). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4799] Use IP address instead of local h...
Github user smola commented on the pull request: https://github.com/apache/spark/pull/3645#issuecomment-69569760 @pwendell Right. The problem is that there is no way to force the use of a given IP (ignoring reverse lookups or any other hostname/IP detection mechanisms). I get this on Docker, where the default setup is something like this:

Spark worker:
- IP: 172.17.0.11
- Hostname: hashone

Spark driver:
- IP: 172.17.0.12
- Hostname: hashtwo

The Spark worker cannot resolve `hashtwo` and the Spark driver cannot resolve `hashone`. At some point, the Spark worker throws an exception because it is trying to resolve `hashtwo` instead of just contacting `172.17.0.12`.
[GitHub] spark pull request: [SPARK-4962] [CORE] Put TaskScheduler.start ba...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3810#issuecomment-69569742 [Test build #25417 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25417/consoleFull) for PR 3810 at commit [`577614f`](https://github.com/apache/spark/commit/577614f40702d6c2aea0d57bf8b52b744e69def3). * This patch **does not merge cleanly**.
[GitHub] spark pull request: [SPARK-5102][Core]subclass of MapStatus needs ...
Github user darabos commented on the pull request: https://github.com/apache/spark/pull/4007#issuecomment-69580933 Thanks for the quick fix!
[GitHub] spark pull request: [SPARK-4945] [SQL] Add overwrite option suppor...
Github user chenghao-intel commented on a diff in the pull request: https://github.com/apache/spark/pull/3780#discussion_r22795027
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/SchemaRDDLike.scala ---
@@ -68,12 +68,29 @@ private[sql] trait SchemaRDDLike {
   /**
    * Saves the contents of this `SchemaRDD` as a parquet file, preserving the schema. Files that
    * are written out using this method can be read back in as a SchemaRDD using the `parquetFile`
-   * function.
+   * function. It will raise exception if the specified path already existed.
    *
+   * @param path The destination path.
    * @group schema
    */
   def saveAsParquetFile(path: String): Unit = {
-    sqlContext.executePlan(WriteToFile(path, logicalPlan)).toRdd
+    // We provide override functions for the ability of default function argument value,
+    // which is not naturely supported by Java
+    saveAsParquetFile(path, false)
+  }
+
+  /**
+   * Saves the contents of this `SchemaRDD` as a parquet file, preserving the schema. Files that
+   * are written out using this method can be read back in as a SchemaRDD using the `parquetFile`
+   * function.
+   * @param path The destination path.
+   * @param overwrite If it's false, an exception will raise if the path already existed,
--- End diff --
@marmbrus This comment actually reflects my intention. Unlike the `insertInto` API, which works in either `overwrite` or `append` mode, `saveAsParquetFile` originally throws an exception when the specified path already exists. In some ETL applications, however, we need to overwrite the existing path unconditionally, which is why I added a new API for this purpose. And you're right, the implementation you were reviewing contained a bug; it has now been fixed.
@rxin We save the entire SchemaRDD as multiple files under the specified path. As for the API design, I am not sure whether we need to merge the two APIs `saveAsParquetFile` and `insertInto` by specifying the mode (`append`, `overwrite`) and the file type (Parquet, RCFile, etc.). What do you think?
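The overload-instead-of-default-argument pattern in the diff above can be sketched in isolation. Everything here is hypothetical: `SketchWriter`, its `existing` set of paths, and the returned status strings stand in for the real Parquet writer, which this sketch does not call. The point is only that Java callers cannot use Scala default arguments, so a one-argument overload forwards to the two-argument one.

```scala
// Hypothetical writer that tracks which paths already exist instead of touching a file system.
final class SketchWriter(existing: Set[String]) {

  // Java-friendly overload: same behaviour as a default argument `overwrite = false`.
  def save(path: String): String = save(path, overwrite = false)

  // Fails when the path exists and overwrite is false; otherwise reports what it would do.
  def save(path: String, overwrite: Boolean): String = {
    val exists = existing.contains(path)
    if (exists && !overwrite) {
      throw new IllegalStateException(s"Path already exists: $path")
    } else if (exists) {
      s"overwrote $path"
    } else {
      s"created $path"
    }
  }
}
```

Both overloads carry explicit return types, which Scala requires when overloaded methods call each other.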
[GitHub] spark pull request: [SPARK-5174][SPARK-5175] provide more APIs in ...
Github user CodingCat commented on the pull request: https://github.com/apache/spark/pull/3984#issuecomment-69567317 Could anyone review this?
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3841#issuecomment-69570061 [Test build #25414 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25414/consoleFull) for PR 3841 at commit [`bc6e1ec`](https://github.com/apache/spark/commit/bc6e1ec379671e1b0b7edc6bc465dd0d890e0c7c). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5196][SQL] Support `comment` in Create ...
Github user OopsOutOfMemory commented on the pull request: https://github.com/apache/spark/pull/3999#issuecomment-69569261 ping.
[GitHub] spark pull request: [SPARK-5102][Core]subclass of MapStatus needs ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4007#issuecomment-69572528 [Test build #25415 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25415/consoleFull) for PR 4007 at commit [`05a285d`](https://github.com/apache/spark/commit/05a285dee2cc91ea12a814a254b1024bd00ae3cb). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5102][Core]subclass of MapStatus needs ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4007#issuecomment-69572536 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25415/ Test PASSed.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Refactor LiveLis...
GitHub user zsxwing opened a pull request: https://github.com/apache/spark/pull/4006 [SPARK-4859][Core][Streaming] Refactor LiveListenerBus and StreamingListenerBus This PR refactors LiveListenerBus and StreamingListenerBus and extracts the common code into a parent class `ListenerBus`. It also includes the bug fixes from #3710. You can merge this pull request into a Git repository by running: $ git pull https://github.com/zsxwing/spark SPARK-4859-refactor Alternatively you can review and apply these changes as the patch at: https://github.com/apache/spark/pull/4006.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #4006 commit 7cc04c385945b490369f9ab0b5831c68ee45a0dc Author: zsxwing zsxw...@gmail.com Date: 2015-01-12T12:04:25Z Refactor LiveListenerBus and StreamingListenerBus and make them share same code base
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3841#issuecomment-69562956 [Test build #25414 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25414/consoleFull) for PR 3841 at commit [`bc6e1ec`](https://github.com/apache/spark/commit/bc6e1ec379671e1b0b7edc6bc465dd0d890e0c7c). * This patch merges cleanly.
[GitHub] spark pull request: SPARK-4159 [BUILD] Addendum: improve running o...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3993#issuecomment-69563556 @sryza SBT won't run the Java tests, so that is unchanged. Wiki edit looks good to me. We do need this PR merged too or else excluding all Java tests will cause Surefire to barf.
[GitHub] spark pull request: [SPARK-5102][Core]subclass of MapStatus needs ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4007#issuecomment-69564862 [Test build #25415 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25415/consoleFull) for PR 4007 at commit [`05a285d`](https://github.com/apache/spark/commit/05a285dee2cc91ea12a814a254b1024bd00ae3cb). * This patch merges cleanly.
[GitHub] spark pull request: [MLLib]SPARK-5027:add SVMWithLBFGS interface i...
Github user loachli commented on the pull request: https://github.com/apache/spark/pull/3890#issuecomment-69555498 I have tested QWLQN in Spark 1.1. Based on org.apache.spark.mllib.optimization.LBFGS, I created another class, org.apache.spark.mllib.optimization.QWLQN. The main change is as follows:

// val lbfgs = new BreezeLBFGS[BDV[Double]](maxNumIterations, numCorrections, convergenceTol)
val lbfgs = new BreezeOWLQN[BDV[Double]](maxNumIterations, numCorrections, convergenceTol)

I used the same environment and the same logic as the SPARK-5027 comparison test, changing only the optimizer, and got the following result:

algorithm      time    accuracy
SVMWithLBFGS   1441s   86.22%
SVMWithQWLQN   1678s   86.5%

In this test, SVMWithQWLQN in Spark 1.1 increases the accuracy by 0.32% relative to SVMWithLBFGS (86.22% to 86.5%), but the run time increases by 16.4%.

I also tested SVMWithQWLQN in Spark 1.2. Spark 1.2 uses a different version of Breeze, and the API of QWLQN has changed:

// val lbfgs = new BreezeLBFGS[BDV[Double]](maxNumIterations, numCorrections, convergenceTol)
val lbfgs = new BreezeOWLQN[Int, BDV[Double]](maxNumIterations, numCorrections, convergenceTol)

In Spark 1.2, SVMWithQWLQN gets the same accuracy as in Spark 1.1.
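The two percentages quoted in the comment above appear to be relative changes against the LBFGS baseline rather than differences in percentage points. A minimal sketch of that arithmetic follows; the helper name `relativeChangePercent` is hypothetical, and the inputs are the figures reported in the comment.

```scala
object RelativeChange {
  // Relative change of `candidate` against `baseline`, expressed in percent.
  def relativeChangePercent(baseline: Double, candidate: Double): Double =
    (candidate - baseline) / baseline * 100.0

  // Accuracy: 86.22% (LBFGS) vs 86.5% (the alternative optimizer).
  val accuracyGain: Double = relativeChangePercent(86.22, 86.5)

  // Run time: 1441s (LBFGS) vs 1678s (the alternative optimizer).
  val timeIncrease: Double = relativeChangePercent(1441.0, 1678.0)
}
```

Under this reading the accuracy gain is about 0.28 percentage points, which is roughly 0.32% of the baseline, and the extra 237 seconds are roughly 16.4% of the baseline run time.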
[GitHub] spark pull request: [SPARK-5201][CORE] deal with int overflow in t...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/4002#discussion_r22786476
--- Diff: core/src/main/scala/org/apache/spark/rdd/ParallelCollectionRDD.scala ---
@@ -127,18 +127,12 @@ private object ParallelCollectionRDD {
       })
     }
     seq match {
-      case r: Range.Inclusive => {
-        val sign = if (r.step < 0) {
-          -1
-        } else {
-          1
-        }
-        slice(new Range(
-          r.start, r.end + sign, r.step).asInstanceOf[Seq[T]], numSlices)
-      }
       case r: Range => {
-        positions(r.length, numSlices).map({
-          case (start, end) =>
+        val sign = r.isInclusive && (r.end == Int.MaxValue || r.end == Int.MinValue)
--- End diff --
Maybe I'm missing it, but why is `Int.MinValue` a special case here? Also, the `({` on the next line is redundant. Just one is needed.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Refactor LiveLis...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/4006#issuecomment-69561690 cc @JoshRosen @tdas @andrewor14
[GitHub] spark pull request: SPARK-5136 [DOCS] Improve documentation around...
Github user srowen commented on the pull request: https://github.com/apache/spark/pull/3952#issuecomment-69562599 I merged my proposed wiki edits into the wiki page. I hope that's OK. https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark#ContributingtoSpark-IntelliJ
[GitHub] spark pull request: [SPARK-5102][Core]subclass of MapStatus needs ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4007#issuecomment-69565338 [Test build #25416 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25416/consoleFull) for PR 4007 at commit [`9d2238a`](https://github.com/apache/spark/commit/9d2238a052186b5ce9396fa9afc4084b5ee3eb7e). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5204][SQL] Column case need to be consi...
Github user OopsOutOfMemory closed the pull request at: https://github.com/apache/spark/pull/4005
[GitHub] spark pull request: [SQL] some comments fix for GROUPING SETS
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4000#issuecomment-69548342 [Test build #25408 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25408/consoleFull) for PR 4000 at commit [`9c24fc4`](https://github.com/apache/spark/commit/9c24fc4e521b1eec86ebe9c55ec22a5a3dadcb8a). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5204][SQL] Column case need to be consi...
Github user OopsOutOfMemory commented on the pull request: https://github.com/apache/spark/pull/4005#issuecomment-69548251 Thanks, Got it.
[GitHub] spark pull request: [SPARK-4945] [SQL] Add overwrite option suppor...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3780#issuecomment-69551534 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25409/ Test PASSed.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Improve LiveList...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/3710#issuecomment-69561617 @andrewor14 Instead of making the behavior of StreamingListenerBus match that of LiveListenerBus, I sent #4006 to make them share the same code in the parent class `ListenerBus`.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Improve LiveList...
Github user zsxwing closed the pull request at: https://github.com/apache/spark/pull/3710
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Refactor LiveLis...
Github user zsxwing commented on the pull request: https://github.com/apache/spark/pull/4006#issuecomment-69562564 Should I open another JIRA, or just update the SPARK-4859 JIRA?
[GitHub] spark pull request: [SPARK-5201][CORE] deal with int overflow in t...
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/4002#discussion_r22787232
--- Diff: core/src/main/scala/org/apache/spark/rdd/ParallelCollectionRDD.scala ---
@@ -127,18 +127,12 @@ private object ParallelCollectionRDD {
       })
     }
     seq match {
-      case r: Range.Inclusive => {
-        val sign = if (r.step < 0) {
-          -1
-        } else {
-          1
-        }
-        slice(new Range(
-          r.start, r.end + sign, r.step).asInstanceOf[Seq[T]], numSlices)
-      }
       case r: Range => {
-        positions(r.length, numSlices).map({
-          case (start, end) =>
+        val sign = r.isInclusive && (r.end == Int.MaxValue || r.end == Int.MinValue)
--- End diff --
Ah right, the range can go backwards. Yeah, something like `needsInclusiveRange` or `exceptionalBoundary` or something.
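The overflow under discussion comes from converting an inclusive range into an exclusive one by computing `r.end + sign`: for a range ending at `Int.MaxValue` (or, going backwards, at `Int.MinValue`) that addition wraps around. One way to avoid it, sketched below, is to compute slice offsets in `Long` and keep the last slice inclusive instead of shifting its end. The object and method names are hypothetical, and this is a sketch under those assumptions rather than the code that was merged; it does not guard against every possible overflow for exclusive ranges with large steps.

```scala
object RangeSliceSketch {
  // Slice boundaries as offsets into the range; Long arithmetic avoids overflow in i * length.
  def positions(length: Long, numSlices: Int): Seq[(Int, Int)] =
    (0 until numSlices).map { i =>
      val start = ((i * length) / numSlices).toInt
      val end = (((i + 1) * length) / numSlices).toInt
      (start, end)
    }

  // Split a range into numSlices sub-ranges without ever computing r.end + 1 or r.end - 1.
  def sliceRange(r: Range, numSlices: Int): Seq[Range] =
    positions(r.length.toLong, numSlices).zipWithIndex.map { case ((start, end), index) =>
      if (r.isInclusive && index == numSlices - 1) {
        // Last slice of an inclusive range: reuse r.end directly so the boundary never overflows.
        Range.inclusive(r.start + start * r.step, r.end, r.step)
      } else {
        Range(r.start + start * r.step, r.start + end * r.step, r.step)
      }
    }
}
```

For example, `RangeSliceSketch.sliceRange((Int.MaxValue - 5) to Int.MaxValue, 2)` yields two slices of three elements each, the second ending exactly at `Int.MaxValue`.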
[GitHub] spark pull request: [SPARK-4945] [SQL] Add overwrite option suppor...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3780#issuecomment-69551518 [Test build #25409 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25409/consoleFull) for PR 3780 at commit [`72c4a4b`](https://github.com/apache/spark/commit/72c4a4b0109845033070d243e5544d02a551cd8b). * This patch **passes all tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5186] [MLLIB] Vector.equals and Vector....
Github user srowen commented on a diff in the pull request: https://github.com/apache/spark/pull/3997#discussion_r22783753
--- Diff: mllib/src/main/scala/org/apache/spark/mllib/linalg/Vectors.scala ---
@@ -449,6 +449,16 @@ class SparseVector(
   override def toString: String =
     "(%s,%s,%s)".format(size, indices.mkString("[", ",", "]"), values.mkString("[", ",", "]"))
--- End diff --
You must override `hashCode` too!
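The reviewer's point is the usual JVM contract: objects that compare equal must return the same hash code, otherwise hash-based collections misbehave. A minimal sketch of a consistent pair follows. `SketchSparseVector` is a hypothetical class that compares its stored arrays structurally; it is not the `SparseVector` from the PR, whose equality semantics may differ.

```scala
import java.util.Arrays

// Hypothetical sparse vector: size plus parallel arrays of indices and values.
final class SketchSparseVector(
    val size: Int,
    val indices: Array[Int],
    val values: Array[Double]) {

  // Structural equality on size, indices and values (arrays compare by content here).
  override def equals(other: Any): Boolean = other match {
    case v: SketchSparseVector =>
      size == v.size &&
        Arrays.equals(indices, v.indices) &&
        Arrays.equals(values, v.values)
    case _ => false
  }

  // Hash code built from exactly the fields used in equals, so equal vectors hash alike.
  override def hashCode: Int =
    31 * (31 * size + Arrays.hashCode(indices)) + Arrays.hashCode(values)
}
```

With both methods defined, two separately constructed but equal vectors land in the same bucket of a `Set` or `Map`.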
[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2765#issuecomment-69556749 [Test build #25411 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25411/consoleFull) for PR 2765 at commit [`4c0e261`](https://github.com/apache/spark/commit/4c0e261485139e3c24ff6a942bb646d3766f8d9d). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-2199] [mllib] topic modeling
Github user akopich commented on the pull request: https://github.com/apache/spark/pull/1269#issuecomment-69560296 @jkbradley, @mengxr, please include @IlyaKozlov as an author too. He has helped a lot with the implementation.
[GitHub] spark pull request: [SPARK-4962] [CORE] Put TaskScheduler.start ba...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3810#issuecomment-69561948 [Test build #25413 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25413/consoleFull) for PR 3810 at commit [`bf18f5f`](https://github.com/apache/spark/commit/bf18f5f67937348ffc55370b0bf0bc6a62adb063). * This patch **does not merge cleanly**.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Improve LiveList...
Github user zsxwing commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3710#discussion_r22786673

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/LiveListenerBus.scala ---
    @@ -118,7 +127,7 @@ private[spark] class LiveListenerBus extends SparkListenerBus with Logging {
        * Log an error message to indicate that the event queue is full. Do this only once.
        */
       private def logQueueFullErrorMessage(): Unit = {
    -    if (!queueFullErrorMessageLogged) {
    +    if (queueFullErrorMessageLogged.compareAndSet(false, true)) {
    --- End diff --

    Right, to avoid outputting multiple logs.
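The `compareAndSet(false, true)` call in the diff above makes the check and the flag update a single atomic step, so only one caller can ever win. A minimal sketch of that log-once pattern follows, assuming a hypothetical `OnceLogger` helper that is not part of `LiveListenerBus`.

```scala
import java.util.concurrent.atomic.AtomicBoolean

// Hypothetical helper for illustration; not the LiveListenerBus code.
class OnceLogger {
  private val logged = new AtomicBoolean(false)

  // Returns true only for the single caller that flips the flag from
  // false to true; every later (or concurrent) caller gets false.
  def logOnce(message: String): Boolean = {
    if (logged.compareAndSet(false, true)) {
      System.err.println(message)
      true
    } else {
      false
    }
  }
}

object OnceLoggerDemo {
  def main(args: Array[String]): Unit = {
    val logger = new OnceLogger
    logger.logOnce("event queue is full") // printed
    logger.logOnce("event queue is full") // suppressed
  }
}
```

With a plain `Boolean` flag and a separate read followed by a write, two threads could both observe `false` and both log.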
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Improve LiveList...
Github user zsxwing commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3710#discussion_r22786686

    --- Diff: core/src/main/scala/org/apache/spark/scheduler/LiveListenerBus.scala ---
    @@ -49,7 +52,8 @@ private[spark] class LiveListenerBus extends SparkListenerBus with Logging {
               // Atomically remove and process this event
               LiveListenerBus.this.synchronized {
                 val event = eventQueue.poll
    -            if (event == SparkListenerShutdown) {
    +            if (event == null) {
    +              assert(stopped)
    --- End diff --

    Added it in #4006
[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/2765#issuecomment-69563246 [Test build #25411 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25411/consoleFull) for PR 2765 at commit [`4c0e261`](https://github.com/apache/spark/commit/4c0e261485139e3c24ff6a942bb646d3766f8d9d). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-3586][streaming]Support nested director...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/2765#issuecomment-69563254 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25411/ Test FAILed.
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3841#issuecomment-69564800 [Test build #25410 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25410/consoleFull) for PR 3841 at commit [`67bcb46`](https://github.com/apache/spark/commit/67bcb46c99e645e158ca1f67122ce943daa7030b). * This patch **passes all tests**. * This patch **does not merge cleanly**. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5102][Core]subclass of MapStatus needs ...
GitHub user lianhuiwang opened a pull request:

    https://github.com/apache/spark/pull/4007

    [SPARK-5102][Core]subclass of MapStatus needs to be registered with Kryo

    CompressedMapStatus and HighlyCompressedMapStatus need to be registered with Kryo, because they are subclasses of MapStatus.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/lianhuiwang/spark SPARK-5102

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4007.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #4007

commit 05a285dee2cc91ea12a814a254b1024bd00ae3cb
Author: lianhuiwang lianhuiwan...@gmail.com
Date: 2015-01-12T12:42:00Z

    subclass of MapStatus needs to be registered with Kryo
[GitHub] spark pull request: [SPARK-5006][Deploy]spark.port.maxRetries does...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3841#issuecomment-69564804 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25410/ Test PASSed.
[GitHub] spark pull request: [SPARK-4994][network]Cleanup removed executors...
Github user lianhuiwang commented on the pull request: https://github.com/apache/spark/pull/3828#issuecomment-69558032 @aarondav can you look at this PR? thanks.
[GitHub] spark pull request: [SPARK-4962] [CORE] Put TaskScheduler.start ba...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3810#issuecomment-69562038 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25413/ Test FAILed.
[GitHub] spark pull request: [SPARK-4962] [CORE] Put TaskScheduler.start ba...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3810#issuecomment-69562036 [Test build #25413 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25413/consoleFull) for PR 3810 at commit [`bf18f5f`](https://github.com/apache/spark/commit/bf18f5f67937348ffc55370b0bf0bc6a62adb063). * This patch **fails Scala style tests**. * This patch **does not merge cleanly**. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Improve LiveList...
Github user zsxwing commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3710#discussion_r22786638

    --- Diff: streaming/src/main/scala/org/apache/spark/streaming/scheduler/StreamingListenerBus.scala ---
    @@ -17,20 +17,28 @@
     package org.apache.spark.streaming.scheduler
    +import java.util.concurrent.atomic.AtomicBoolean
    +import java.util.concurrent.{LinkedBlockingQueue, CopyOnWriteArrayList}
    +
    +import scala.util.control.NonFatal
    +
     import org.apache.spark.Logging
    -import scala.collection.mutable.{SynchronizedBuffer, ArrayBuffer}
    -import java.util.concurrent.LinkedBlockingQueue
    +import org.apache.spark.util.Utils
     /** Asynchronously passes StreamingListenerEvents to registered StreamingListeners. */
     private[spark] class StreamingListenerBus() extends Logging {
    -  private val listeners = new ArrayBuffer[StreamingListener]()
    -    with SynchronizedBuffer[StreamingListener]
    +  // `listeners` will be set up during the initialization of the whole system and the number of
    +  // listeners is small, so the copying cost of CopyOnWriteArrayList will be little. With the help
    +  // of CopyOnWriteArrayList, we can eliminate a lock during processing every event comparing to
    +  // SynchronizedBuffer.
    +  private val listeners = new CopyOnWriteArrayList[StreamingListener]()
    --- End diff --

    It's usually not safe to hold a lock when posting events to outer listeners. I eliminated all locks during posting events to listeners in #4006
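The trade-off described in the diff comment above (cheap lock-free reads, a full array copy on each write) can be sketched as follows. This is an illustration under stated assumptions, not the code from the PR: `DemoListener` and `DemoBus` are hypothetical names, and only `java.util.concurrent.CopyOnWriteArrayList` is a real API.

```scala
import java.util.concurrent.CopyOnWriteArrayList

// Hypothetical listener type for illustration; not Spark's StreamingListener.
trait DemoListener {
  def onEvent(event: String): Unit
}

// Hypothetical bus; only the listener storage mirrors the idea in the diff.
class DemoBus {
  private val listeners = new CopyOnWriteArrayList[DemoListener]()

  // Each add copies the backing array, which is cheap while listeners are few.
  def addListener(listener: DemoListener): Unit = {
    listeners.add(listener)
    ()
  }

  // The iterator walks an immutable snapshot, so no lock is held while
  // callbacks run, and a callback may register listeners without causing
  // a ConcurrentModificationException.
  def post(event: String): Unit = {
    val it = listeners.iterator()
    while (it.hasNext) {
      it.next().onEvent(event)
    }
  }
}

object DemoBusExample {
  def main(args: Array[String]): Unit = {
    val bus = new DemoBus
    bus.addListener(new DemoListener {
      def onEvent(event: String): Unit = println("got " + event)
    })
    bus.post("batch-started")
  }
}
```

A listener registered while `post` is running is not visited in that pass; it only sees events posted afterwards.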
[GitHub] spark pull request: [SPARK-5201][CORE] deal with int overflow in t...
Github user advancedxy commented on a diff in the pull request:

    https://github.com/apache/spark/pull/4002#discussion_r22787018

    --- Diff: core/src/main/scala/org/apache/spark/rdd/ParallelCollectionRDD.scala ---
    @@ -127,18 +127,12 @@ private object ParallelCollectionRDD {
           })
         }
         seq match {
    -      case r: Range.Inclusive => {
    -        val sign = if (r.step < 0) {
    -          -1
    -        } else {
    -          1
    -        }
    -        slice(new Range(
    -          r.start, r.end + sign, r.step).asInstanceOf[Seq[T]], numSlices)
    -      }
           case r: Range => {
    -        positions(r.length, numSlices).map({
    -          case (start, end) =>
    +        val sign = r.isInclusive && (r.end == Int.MaxValue || r.end == Int.MinValue)
    --- End diff --

    Converting this inclusive range
    ```
    -2 to Int.MinValue by -1
    ```
    to an exclusive range gives
    ```
    -2L until -1L + Int.MinValue by -1
    ```
    and `-1 + Int.MinValue` overflows in Int arithmetic. As for `sign`, which name would you recommend? How about `inclusiveRangeWithIntBoundary`?
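The overflow described above can be reproduced without Spark. The sketch below assumes plain Scala; `naiveExclusiveEnd` and `safeExclusiveEnd` are hypothetical helper names, with the first mirroring the `r.end + sign` arithmetic of the removed `Range.Inclusive` case and the second showing the same step in `Long` arithmetic.

```scala
object RangeOverflowDemo {
  // Inclusive-to-exclusive conversion in Int arithmetic, as in the removed
  // code path: wraps around when r.end is Int.MaxValue or Int.MinValue.
  def naiveExclusiveEnd(r: Range): Int = {
    val sign = if (r.step < 0) -1 else 1
    r.end + sign
  }

  // The same conversion in Long arithmetic, which cannot wrap for Int inputs.
  def safeExclusiveEnd(r: Range): Long = {
    val sign = if (r.step < 0) -1L else 1L
    r.end.toLong + sign
  }

  def main(args: Array[String]): Unit = {
    val r = -2 to Int.MinValue by -1
    println(naiveExclusiveEnd(r)) // wrapped value, on the wrong side of start
    println(safeExclusiveEnd(r))  // one below Int.MinValue, as intended
  }
}
```

For the range in the comment, the naive end wraps to `Int.MaxValue`, so an exclusive range built from it with a negative step would be empty instead of covering the original elements.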
[GitHub] spark pull request: [SPARK-5201][CORE] deal with int overflow in t...
Github user advancedxy commented on a diff in the pull request:

    https://github.com/apache/spark/pull/4002#discussion_r22787610

    --- Diff: core/src/main/scala/org/apache/spark/rdd/ParallelCollectionRDD.scala ---
    @@ -127,18 +127,12 @@ private object ParallelCollectionRDD {
           })
         }
         seq match {
    -      case r: Range.Inclusive => {
    -        val sign = if (r.step < 0) {
    -          -1
    -        } else {
    -          1
    -        }
    -        slice(new Range(
    -          r.start, r.end + sign, r.step).asInstanceOf[Seq[T]], numSlices)
    -      }
           case r: Range => {
    -        positions(r.length, numSlices).map({
    -          case (start, end) =>
    +        val sign = r.isInclusive && (r.end == Int.MaxValue || r.end == Int.MinValue)
    --- End diff --

    OK, will change the name. As for the redundant `({`, there is an infix operator `toSeq`, so I prefer to keep the redundant form. The previous code used `({` as well.
[GitHub] spark pull request: [SQL] some comments fix for GROUPING SETS
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4000#issuecomment-69548350 Test PASSed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25408/ Test PASSed.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Refactor LiveLis...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4006#issuecomment-69561477 [Test build #25412 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25412/consoleFull) for PR 4006 at commit [`7cc04c3`](https://github.com/apache/spark/commit/7cc04c385945b490369f9ab0b5831c68ee45a0dc). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Refactor LiveLis...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4006#issuecomment-69566482 [Test build #25412 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25412/consoleFull) for PR 4006 at commit [`7cc04c3`](https://github.com/apache/spark/commit/7cc04c385945b490369f9ab0b5831c68ee45a0dc). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Refactor LiveLis...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4006#issuecomment-69566490 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25412/ Test FAILed.
[GitHub] spark pull request: [SPARK-5196][SQL] Support `comment` in Create ...
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/3999#discussion_r22802272

    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/TableScanSuite.scala ---
    @@ -314,4 +314,20 @@ class TableScanSuite extends DataSourceTest {
           sql("SELECT * FROM oneToTenDef"),
           (1 to 10).map(Row(_)).toSeq)
       }
    +
    +  test("schema field with comment") {
    +    sql(
    +      """
    +       |CREATE TEMPORARY TABLE student(name string comment "the name of a student")
    --- End diff --

    Can we update the schema to have 2 or more columns (just want to make sure the comma separator works well) and to have columns with and without comments?
[GitHub] spark pull request: [SPARK-4962] [CORE] Put TaskScheduler.start ba...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3810#issuecomment-69587238 **[Test build #25417 timed out](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25417/consoleFull)** for PR 3810 at commit [`577614f`](https://github.com/apache/spark/commit/577614f40702d6c2aea0d57bf8b52b744e69def3) after a configured wait of `120m`.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Refactor LiveLis...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4006#issuecomment-69588632 [Test build #25418 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25418/consoleFull) for PR 4006 at commit [`a9dccd3`](https://github.com/apache/spark/commit/a9dccd372328fd7de739ab2529935563b7c2165d). * This patch **fails MiMa tests**. * This patch merges cleanly. * This patch adds the following public classes _(experimental)_: * `abstract class ListenerBus[L <: AnyRef, E](name: String) extends ListenerHelper[L, E] ` * `trait ListenerHelper[L <: AnyRef, E] extends Logging `
[GitHub] spark pull request: [SPARK-3694] RDD and Task serialization debugg...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3518#issuecomment-69591770 [Test build #25420 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25420/consoleFull) for PR 3518 at commit [`a32f0ac`](https://github.com/apache/spark/commit/a32f0ac7b00023a04101de5a0acc985a209e90bd). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-4387][PySpark] Refactoring python profi...
Github user udnay closed the pull request at: https://github.com/apache/spark/pull/3255
[GitHub] spark pull request: [SPARK-5201][CORE] deal with int overflow in t...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/4002#issuecomment-69591751 [Test build #25419 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25419/consoleFull) for PR 4002 at commit [`651c959`](https://github.com/apache/spark/commit/651c959529a59a6f31032ba5ba8dcc459d509453). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5196][SQL] Support `comment` in Create ...
Github user yhuai commented on the pull request: https://github.com/apache/spark/pull/3999#issuecomment-69597975 ok to test
[GitHub] spark pull request: [SPARK-5196][SQL] Support `comment` in Create ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3999#issuecomment-69598398 [Test build #25421 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25421/consoleFull) for PR 3999 at commit [`81b8431`](https://github.com/apache/spark/commit/81b8431382be8106603bc6317c86f635e27eee96). * This patch merges cleanly.
[GitHub] spark pull request: [SPARK-5205][Streaming]:Inconsistent behaviour...
GitHub user uncleGen opened a pull request:

    https://github.com/apache/spark/pull/4008

    [SPARK-5205][Streaming]:Inconsistent behaviour between Streaming job and others, when click kill link in WebUI

    The kill link is used to kill a stage in a job. It works for any kind of Spark job except Spark Streaming. To be specific, we can only kill the stage that runs the Receiver, but not the Receivers themselves. The stage can be killed and cleaned from the UI, but the receivers are still alive and receiving data. I think this does not fit common sense. IMHO, killing the receiver stage should mean killing the receivers and stopping receiving data.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/uncleGen/spark master-clean-150112

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/4008.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #4008

commit 9c77f403084c9971a82ae98d5f28704fa62c5269
Author: uncleGen husty...@gmail.com
Date: 2015-01-12T15:57:57Z

    kill streaming receiver stage through ui

commit cf0342f987282679cc65ef4765658e0f31d929c9
Author: uncleGen husty...@gmail.com
Date: 2015-01-12T16:01:54Z

    fix

commit 5f22ea4cbc15660a2444e566bde0f34ad69e
Author: uncleGen husty...@gmail.com
Date: 2015-01-12T16:17:51Z

    description
[GitHub] spark pull request: [SPARK-4387][PySpark] Refactoring python profi...
Github user udnay commented on the pull request: https://github.com/apache/spark/pull/3255#issuecomment-69591907 Replaced by #3901
[GitHub] spark pull request: [SPARK-3694] RDD and Task serialization debugg...
Github user ilganeli commented on the pull request: https://github.com/apache/spark/pull/3518#issuecomment-69591920 Hi @JoshRosen, #3638 has now been merged and I've resolved the minor merge conflicts and pushed the updates. If you could please review this at your convenience, I'd love to have it merged in as well. Thanks!
[GitHub] spark pull request: [SPARK-5196][SQL] Support `comment` in Create ...
Github user SparkQA commented on the pull request: https://github.com/apache/spark/pull/3999#issuecomment-69600606 [Test build #25421 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/25421/consoleFull) for PR 3999 at commit [`81b8431`](https://github.com/apache/spark/commit/81b8431382be8106603bc6317c86f635e27eee96). * This patch **fails Spark unit tests**. * This patch merges cleanly. * This patch adds no public classes.
[GitHub] spark pull request: [SPARK-5196][SQL] Support `comment` in Create ...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/3999#issuecomment-69600617 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25421/ Test FAILed.
[GitHub] spark pull request: [SPARK-4859][Core][Streaming] Refactor LiveLis...
Github user AmplabJenkins commented on the pull request: https://github.com/apache/spark/pull/4006#issuecomment-69588641 Test FAILed. Refer to this link for build results (access rights to CI server needed): https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/25418/ Test FAILed.