[jira] [Commented] (SPARK-23135) Accumulators don't show up properly in the Stages page anymore
[ https://issues.apache.org/jira/browse/SPARK-23135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16329362#comment-16329362 ]

Marcelo Vanzin commented on SPARK-23135:
----------------------------------------

(By fine I mean the table renders correctly; the accumulator still shows "Some(blah)", which is wrong, though.)

> Accumulators don't show up properly in the Stages page anymore
> ---------------------------------------------------------------
>
> Key: SPARK-23135
> URL: https://issues.apache.org/jira/browse/SPARK-23135
> Project: Spark
> Issue Type: Bug
> Components: Web UI
> Affects Versions: 2.3.0
> Environment:
> Reporter: Burak Yavuz
> Priority: Blocker
> Attachments: webUIAccumulatorRegression.png
>
> Didn't do a lot of digging but may be caused by:
> [https://github.com/apache/spark/commit/1c70da3bfbb4016e394de2c73eb0db7cdd9a6968#diff-0d37752c6ec3d902aeff701771b4e932]
>
> !webUIAccumulatorRegression.png!

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
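(For context on that rendering: an accumulator's name on AccumulatorV2 is an Option[String], so any display code that interpolates the Option directly prints the Some(...) wrapper. A minimal sketch of the distinction, purely illustrative and not the actual Stages-page code:)

{code:scala}
// Hypothetical illustration only -- not the Spark UI implementation.
// AccumulatorV2.name is an Option[String]; rendering it without unwrapping
// produces the "Some(acc1)" strings reported above.
val name: Option[String] = Some("acc1")

val rendered = s"$name"                     // "Some(acc1)" -- the buggy display
val unwrapped = name.getOrElse("<unnamed>") // "acc1" -- what the page should show

println(rendered)
println(unwrapped)
{code}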
[jira] [Commented] (SPARK-23135) Accumulators don't show up properly in the Stages page anymore
[ https://issues.apache.org/jira/browse/SPARK-23135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16329353#comment-16329353 ]

Marcelo Vanzin commented on SPARK-23135:
----------------------------------------

I'll try to take a look at the code, but do you have code to replicate the issue? I tried the following and the page renders fine:

{code}
scala> val acc1 = sc.longAccumulator("acc1")
acc1: org.apache.spark.util.LongAccumulator = LongAccumulator(id: 0, name: Some(acc1), value: 0)

scala> val acc2 = sc.longAccumulator("acc2")
acc2: org.apache.spark.util.LongAccumulator = LongAccumulator(id: 1, name: Some(acc2), value: 0)

scala> sc.parallelize(1 to 10, 10).map { i =>
     |   acc1.add(i)
     |   acc2.add(i * 2)
     |   i
     | }.collect()
{code}

> Accumulators don't show up properly in the Stages page anymore
> ---------------------------------------------------------------
>
> Key: SPARK-23135
> URL: https://issues.apache.org/jira/browse/SPARK-23135
> Project: Spark
> Issue Type: Bug
> Components: Web UI
> Affects Versions: 2.3.0
> Environment:
> Reporter: Burak Yavuz
> Priority: Blocker
> Attachments: webUIAccumulatorRegression.png
>
> Didn't do a lot of digging but may be caused by:
> [https://github.com/apache/spark/commit/1c70da3bfbb4016e394de2c73eb0db7cdd9a6968#diff-0d37752c6ec3d902aeff701771b4e932]
>
> !webUIAccumulatorRegression.png!

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-23020) Re-enable Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher
[ https://issues.apache.org/jira/browse/SPARK-23020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16329306#comment-16329306 ] Apache Spark commented on SPARK-23020: -- User 'vanzin' has created a pull request for this issue: https://github.com/apache/spark/pull/20297 > Re-enable Flaky Test: > org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher > > > Key: SPARK-23020 > URL: https://issues.apache.org/jira/browse/SPARK-23020 > Project: Spark > Issue Type: Bug > Components: Tests >Affects Versions: 2.3.0 >Reporter: Sameer Agarwal >Assignee: Marcelo Vanzin >Priority: Blocker > > https://amplab.cs.berkeley.edu/jenkins/job/spark-branch-2.3-test-maven-hadoop-2.7/42/testReport/junit/org.apache.spark.launcher/SparkLauncherSuite/testInProcessLauncher/history/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-23135) Accumulators don't show up properly in the Stages page anymore
[ https://issues.apache.org/jira/browse/SPARK-23135?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16329293#comment-16329293 ] Burak Yavuz commented on SPARK-23135: - cc [~vanzin] > Accumulators don't show up properly in the Stages page anymore > -- > > Key: SPARK-23135 > URL: https://issues.apache.org/jira/browse/SPARK-23135 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 2.3.0 > Environment: > > >Reporter: Burak Yavuz >Priority: Blocker > Attachments: webUIAccumulatorRegression.png > > > Didn't do a lot of digging but may be caused by: > [https://github.com/apache/spark/commit/1c70da3bfbb4016e394de2c73eb0db7cdd9a6968#diff-0d37752c6ec3d902aeff701771b4e932] > > !webUIAccumulatorRegression.png! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-23135) Accumulators don't show up properly in the Stages page anymore
[ https://issues.apache.org/jira/browse/SPARK-23135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-23135: Environment: was: Didn't do a lot of digging but may be caused by: [https://github.com/apache/spark/commit/1c70da3bfbb4016e394de2c73eb0db7cdd9a6968#diff-0d37752c6ec3d902aeff701771b4e932] > Accumulators don't show up properly in the Stages page anymore > -- > > Key: SPARK-23135 > URL: https://issues.apache.org/jira/browse/SPARK-23135 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 2.3.0 > Environment: > > >Reporter: Burak Yavuz >Priority: Blocker > Attachments: webUIAccumulatorRegression.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-23135) Accumulators don't show up properly in the Stages page anymore
[ https://issues.apache.org/jira/browse/SPARK-23135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-23135: Attachment: webUIAccumulatorRegression.png > Accumulators don't show up properly in the Stages page anymore > -- > > Key: SPARK-23135 > URL: https://issues.apache.org/jira/browse/SPARK-23135 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 2.3.0 > Environment: > > >Reporter: Burak Yavuz >Priority: Blocker > Attachments: webUIAccumulatorRegression.png > > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-23135) Accumulators don't show up properly in the Stages page anymore
[ https://issues.apache.org/jira/browse/SPARK-23135?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Burak Yavuz updated SPARK-23135: Description: Didn't do a lot of digging but may be caused by: [https://github.com/apache/spark/commit/1c70da3bfbb4016e394de2c73eb0db7cdd9a6968#diff-0d37752c6ec3d902aeff701771b4e932] !webUIAccumulatorRegression.png! > Accumulators don't show up properly in the Stages page anymore > -- > > Key: SPARK-23135 > URL: https://issues.apache.org/jira/browse/SPARK-23135 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 2.3.0 > Environment: > > >Reporter: Burak Yavuz >Priority: Blocker > Attachments: webUIAccumulatorRegression.png > > > Didn't do a lot of digging but may be caused by: > [https://github.com/apache/spark/commit/1c70da3bfbb4016e394de2c73eb0db7cdd9a6968#diff-0d37752c6ec3d902aeff701771b4e932] > > !webUIAccumulatorRegression.png! -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-23135) Accumulators don't show up properly in the Stages page anymore
Burak Yavuz created SPARK-23135: --- Summary: Accumulators don't show up properly in the Stages page anymore Key: SPARK-23135 URL: https://issues.apache.org/jira/browse/SPARK-23135 Project: Spark Issue Type: Bug Components: Web UI Affects Versions: 2.3.0 Environment: Didn't do a lot of digging but may be caused by: [https://github.com/apache/spark/commit/1c70da3bfbb4016e394de2c73eb0db7cdd9a6968#diff-0d37752c6ec3d902aeff701771b4e932] Reporter: Burak Yavuz -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-23133) Spark options are not passed to the Executor in Docker context
[ https://issues.apache.org/jira/browse/SPARK-23133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-23133:
------------------------------------

Assignee: (was: Apache Spark)

> Spark options are not passed to the Executor in Docker context
> ---------------------------------------------------------------
>
> Key: SPARK-23133
> URL: https://issues.apache.org/jira/browse/SPARK-23133
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 2.3.0
> Environment: Running Spark on K8s using supplied Docker image.
> Reporter: Andrew Korzhuev
> Priority: Minor
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> Reproduce:
> # Build image with `bin/docker-image-tool.sh`.
> # Submit application to k8s. Set executor options, e.g. `--conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=./jaas.conf"`
> # Visit Spark UI on executor and notice that option is not set.
> Expected behavior: options from spark-submit should be correctly passed to executor.
> Cause:
> `SPARK_EXECUTOR_JAVA_OPTS` is not defined in `entrypoint.sh`
> https://github.com/apache/spark/blob/branch-2.3/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh#L70
> [https://github.com/apache/spark/blob/branch-2.3/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh#L44-L45]

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-23133) Spark options are not passed to the Executor in Docker context
[ https://issues.apache.org/jira/browse/SPARK-23133?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-23133:
------------------------------------

Assignee: Apache Spark

> Spark options are not passed to the Executor in Docker context
> ---------------------------------------------------------------
>
> Key: SPARK-23133
> URL: https://issues.apache.org/jira/browse/SPARK-23133
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 2.3.0
> Environment: Running Spark on K8s using supplied Docker image.
> Reporter: Andrew Korzhuev
> Assignee: Apache Spark
> Priority: Minor
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> Reproduce:
> # Build image with `bin/docker-image-tool.sh`.
> # Submit application to k8s. Set executor options, e.g. `--conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=./jaas.conf"`
> # Visit Spark UI on executor and notice that option is not set.
> Expected behavior: options from spark-submit should be correctly passed to executor.
> Cause:
> `SPARK_EXECUTOR_JAVA_OPTS` is not defined in `entrypoint.sh`
> https://github.com/apache/spark/blob/branch-2.3/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh#L70
> [https://github.com/apache/spark/blob/branch-2.3/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh#L44-L45]

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-23133) Spark options are not passed to the Executor in Docker context
[ https://issues.apache.org/jira/browse/SPARK-23133?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16329189#comment-16329189 ]

Apache Spark commented on SPARK-23133:
--------------------------------------

User 'andrusha' has created a pull request for this issue:
https://github.com/apache/spark/pull/20296

> Spark options are not passed to the Executor in Docker context
> ---------------------------------------------------------------
>
> Key: SPARK-23133
> URL: https://issues.apache.org/jira/browse/SPARK-23133
> Project: Spark
> Issue Type: Bug
> Components: Kubernetes
> Affects Versions: 2.3.0
> Environment: Running Spark on K8s using supplied Docker image.
> Reporter: Andrew Korzhuev
> Priority: Minor
> Original Estimate: 1h
> Remaining Estimate: 1h
>
> Reproduce:
> # Build image with `bin/docker-image-tool.sh`.
> # Submit application to k8s. Set executor options, e.g. `--conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=./jaas.conf"`
> # Visit Spark UI on executor and notice that option is not set.
> Expected behavior: options from spark-submit should be correctly passed to executor.
> Cause:
> `SPARK_EXECUTOR_JAVA_OPTS` is not defined in `entrypoint.sh`
> https://github.com/apache/spark/blob/branch-2.3/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh#L70
> [https://github.com/apache/spark/blob/branch-2.3/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh#L44-L45]

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
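(For reference, a minimal sketch of how the affected option is set on the driver side; the JAAS path is the reporter's example, the app name is illustrative, and whether the value actually reaches the executor JVM is exactly what this issue is about:)

{code:scala}
import org.apache.spark.SparkConf

// Sketch only: the same setting the reporter passes via --conf. On the 2.3
// Kubernetes images this is expected to reach the executor via the
// SPARK_EXECUTOR_JAVA_OPTS environment variable, which entrypoint.sh
// never exports -- hence the option missing from the executor UI.
val conf = new SparkConf()
  .setAppName("k8s-executor-opts-repro")
  .set("spark.executor.extraJavaOptions",
    "-Djava.security.auth.login.config=./jaas.conf")
{code}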
[jira] [Commented] (SPARK-23011) Support alternative function form with group aggregate pandas UDF
[ https://issues.apache.org/jira/browse/SPARK-23011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16329177#comment-16329177 ]

Apache Spark commented on SPARK-23011:
--------------------------------------

User 'icexelloss' has created a pull request for this issue:
https://github.com/apache/spark/pull/20295

> Support alternative function form with group aggregate pandas UDF
> ------------------------------------------------------------------
>
> Key: SPARK-23011
> URL: https://issues.apache.org/jira/browse/SPARK-23011
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 2.3.0
> Reporter: Li Jin
> Priority: Major
>
> The current semantics of groupby apply is that the output schema of groupby apply is the same as the output schema of the UDF. Because grouping column is usually useful to users, users often need to output grouping columns in the UDF. To further explain, consider the following example:
> {code:java}
> import statsmodels.api as sm
> # df has four columns: id, y, x1, x2
> group_column = 'id'
> y_column = 'y'
> x_columns = ['x1', 'x2']
> schema = df.select(group_column, *x_columns).schema
>
> @pandas_udf(schema, PandasUDFType.GROUP_MAP)
> # Input/output are both a pandas.DataFrame
> def ols(pdf):
>     group_key = pdf[group_column].iloc[0]
>     y = pdf[y_column]
>     X = pdf[x_columns]
>     X = sm.add_constant(X)
>     model = sm.OLS(y, X).fit()
>     return pd.DataFrame([[group_key] + [model.params[i] for i in x_columns]], columns=[group_column] + x_columns)
>
> beta = df.groupby(group_column).apply(ols)
> {code}
> Although the UDF (linear regression) has nothing to do with the grouping column, the user needs to deal with grouping column in the UDF. In other words, the UDF is tightly coupled with the grouping column.
>
> With discussion in [https://github.com/apache/spark/pull/20211#discussion_r160524679,] we reached consensus for supporting an alternative function form:
> {code:java}
> def foo(key, pdf):
>     key # this is a grouping key.
>     pdf # this is the Pandas DataFrame
>
> pudf = pandas_udf(f=foo, returnType="id int, v double", functionType=GROUP_MAP)
> df.groupby(group_column).apply(pudf)
> {code}
> {code:java}
> import statsmodels.api as sm
> # df has four columns: id, y, x1, x2
> group_column = 'id'
> y_column = 'y'
> x_columns = ['x1', 'x2']
> schema = df.select(group_column, *x_columns).schema
>
> @pandas_udf(schema, PandasUDFType.GROUP_MAP)
> # Input/output are both a pandas.DataFrame
> def ols(key, pdf):
>     y = pdf[y_column]
>     X = pdf[x_columns]
>     X = sm.add_constant(X)
>     model = sm.OLS(y, X).fit()
>     return pd.DataFrame([key + [model.params[i] for i in x_columns]])
>
> beta = df.groupby(group_column).apply(ols)
> {code}
>
> In summary:
> * Support alternative form f(key, pdf). The current form f(pdf) will still be supported. (Through function inspection)
> * In both cases, the udf output schema will be the final output schema of the spark DataFrame.
> * Key will be passed to user as a python tuple.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-8682) Range Join for Spark SQL
[ https://issues.apache.org/jira/browse/SPARK-8682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16329150#comment-16329150 ]

Ruslan Dautkhanov commented on SPARK-8682:
------------------------------------------

Range joins need some serious optimization in Spark. Even range-joining small datasets in Spark 2.2, which uses a Broadcast Nested Loop Join, is exceptionally slow. For example, a 30m x 30k join producing 30m records (each range-join probe matches exactly one record in this case) takes 6 minutes while using tens of vcores. Folks even came up with interesting approaches using a Python udf, the Python intersect module, and broadcast variables to solve this riddle: [https://stackoverflow.com/a/37955947/470583] - I actually quite liked this approach from [~zero323]. It would be great if something similar were implemented natively in Spark.

> Range Join for Spark SQL
> -------------------------
>
> Key: SPARK-8682
> URL: https://issues.apache.org/jira/browse/SPARK-8682
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Reporter: Herman van Hovell
> Priority: Major
> Attachments: perf_testing.scala
>
> Currently Spark SQL uses a Broadcast Nested Loop join (or a filtered Cartesian Join) when it has to execute the following range query:
> {noformat}
> SELECT A.*,
>        B.*
> FROM   tableA A
>        JOIN tableB B
>         ON A.start <= B.end
>          AND A.end > B.start
> {noformat}
> This is horribly inefficient. The performance of this query can be greatly improved, when one of the tables can be broadcasted, by creating a range index. A range index is basically a sorted map containing the rows of the smaller table, indexed by both the high and low keys. Using this structure the complexity of the query would go from O(N * M) to O(N * 2 * LOG(M)), N = number of records in the larger table, M = number of records in the smaller (indexed) table.
> I have created a pull request for this. According to the [Spark SQL: Relational Data Processing in Spark|http://people.csail.mit.edu/matei/papers/2015/sigmod_spark_sql.pdf] paper similar work (page 11, section 7.2) has already been done by the ADAM project (cannot locate the code though).
> Any comments and/or feedback are greatly appreciated.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
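(To make the range-index idea above concrete, here is a simplified sketch -- my own illustration, not the pull request's code: sort the broadcast side by its low key, binary-search the probe row's upper bound to cut off the candidate prefix, then filter on the other bound. The actual proposal also indexes the high key, which tightens this further.)

{code:scala}
// Simplified range index over the smaller (broadcast) side of the join.
case class Span(start: Long, end: Long, payload: String)

class RangeIndex(rows: Seq[Span]) {
  private val byStart = rows.sortBy(_.start).toArray

  // All spans A with A.start <= end && A.end > start,
  // i.e. the join condition from the query above.
  def probe(start: Long, end: Long): Seq[Span] = {
    // Binary search for the first index whose start exceeds `end`;
    // every entry before it satisfies A.start <= end.
    var lo = 0
    var hi = byStart.length
    while (lo < hi) {
      val mid = (lo + hi) >>> 1
      if (byStart(mid).start <= end) lo = mid + 1 else hi = mid
    }
    byStart.take(lo).filter(_.end > start).toSeq
  }
}

val index = new RangeIndex(Seq(Span(0, 10, "a"), Span(5, 20, "b"), Span(30, 40, "c")))
println(index.probe(8, 12)) // Span(0,10,a) and Span(5,20,b) match
{code}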
[jira] [Updated] (SPARK-23134) WebUI is showing the cache table details even after cache idle timeout
[ https://issues.apache.org/jira/browse/SPARK-23134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shahid K I updated SPARK-23134: --- Description: After cachedExecutorIdleTimeout, WebUI shows the cached partition details in the storage tab. It should be the same scenario as in the case of uncache table, where the storage tab of the web UI shows "RDD not found". (was: After cachedExecutorIdleTimeout, WebUI shows the cached partition details in storage tab. It should be the same scenario as in the case of uncache table.) > WebUI is showing the cache table details even after cache idle timeout > -- > > Key: SPARK-23134 > URL: https://issues.apache.org/jira/browse/SPARK-23134 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 2.1.0, 2.2.0, 2.2.1 > Environment: Run Cache command with below configuration to cache the > RDD blocks > spark.dynamicAllocation.cachedExecutorIdleTimeout=120s > spark.dynamicAllocation.executorIdleTimeout=60s > spark.dynamicAllocation.enabled=true > spark.dynamicAllocation.minExecutors=0 > spark.dynamicAllocation.maxExecutors=8 > > > >Reporter: Shahid K I >Priority: Major > > After cachedExecutorIdleTimeout, WebUI shows the cached partition details in > the storage tab. It should be the same scenario as in the case of uncache > table, where the storage tab of the web UI shows "RDD not found". -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-23134) WebUI is showing the cache table details even after cache idle timeout
[ https://issues.apache.org/jira/browse/SPARK-23134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shahid K I updated SPARK-23134: --- Environment: Run Cache command with below configuration to cache the RDD blocks spark.dynamicAllocation.cachedExecutorIdleTimeout=120s spark.dynamicAllocation.executorIdleTimeout=60s spark.dynamicAllocation.enabled=true spark.dynamicAllocation.minExecutors=0 spark.dynamicAllocation.maxExecutors=8 was: Run Cache command with below configuration to cache the RDD blocks spark.dynamicAllocation.cachedExecutorIdleTimeout=120s spark.dynamicAllocation.executorIdleTimeout=60s spark.dynamicAllocation.enabled=true spark.dynamicAllocation.minExecutors=0 spark.dynamicAllocation.maxExecutors=8 > WebUI is showing the cache table details even after cache idle timeout > -- > > Key: SPARK-23134 > URL: https://issues.apache.org/jira/browse/SPARK-23134 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 2.1.0, 2.2.0, 2.2.1 > Environment: Run Cache command with below configuration to cache the > RDD blocks > > spark.dynamicAllocation.cachedExecutorIdleTimeout=120s > spark.dynamicAllocation.executorIdleTimeout=60s > spark.dynamicAllocation.enabled=true > spark.dynamicAllocation.minExecutors=0 > spark.dynamicAllocation.maxExecutors=8 > > > >Reporter: Shahid K I >Priority: Major > > After cachedExecutorIdleTimeout, WebUI shows the cached partition details in > storage tab. It should be the same scenario as in the case of uncache table. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-23134) WebUI is showing the cache table details even after cache idle timeout
[ https://issues.apache.org/jira/browse/SPARK-23134?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Shahid K I updated SPARK-23134: --- Environment: Run Cache command with below configuration to cache the RDD blocks spark.dynamicAllocation.cachedExecutorIdleTimeout=120s spark.dynamicAllocation.executorIdleTimeout=60s spark.dynamicAllocation.enabled=true spark.dynamicAllocation.minExecutors=0 spark.dynamicAllocation.maxExecutors=8 was: Run Cache command with below configuration to cache the RDD blocks spark.dynamicAllocation.cachedExecutorIdleTimeout=120s spark.dynamicAllocation.executorIdleTimeout=60s spark.dynamicAllocation.enabled=true spark.dynamicAllocation.minExecutors=0 spark.dynamicAllocation.maxExecutors=8 > WebUI is showing the cache table details even after cache idle timeout > -- > > Key: SPARK-23134 > URL: https://issues.apache.org/jira/browse/SPARK-23134 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 2.1.0, 2.2.0, 2.2.1 > Environment: Run Cache command with below configuration to cache the > RDD blocks > spark.dynamicAllocation.cachedExecutorIdleTimeout=120s > spark.dynamicAllocation.executorIdleTimeout=60s > spark.dynamicAllocation.enabled=true > spark.dynamicAllocation.minExecutors=0 > spark.dynamicAllocation.maxExecutors=8 > > > >Reporter: Shahid K I >Priority: Major > > After cachedExecutorIdleTimeout, WebUI shows the cached partition details in > storage tab. It should be the same scenario as in the case of uncache table. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-23134) WebUI is showing the cache table details even after cache idle timeout
Shahid K I created SPARK-23134:
-----------------------------------

Summary: WebUI is showing the cache table details even after cache idle timeout
Key: SPARK-23134
URL: https://issues.apache.org/jira/browse/SPARK-23134
Project: Spark
Issue Type: Bug
Components: Web UI
Affects Versions: 2.2.1, 2.2.0, 2.1.0
Environment: Run Cache command with below configuration to cache the RDD blocks

spark.dynamicAllocation.cachedExecutorIdleTimeout=120s
spark.dynamicAllocation.executorIdleTimeout=60s
spark.dynamicAllocation.enabled=true
spark.dynamicAllocation.minExecutors=0
spark.dynamicAllocation.maxExecutors=8

Reporter: Shahid K I

After cachedExecutorIdleTimeout, the WebUI shows the cached partition details in the Storage tab. It should be the same scenario as in the case of an uncached table.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
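(A minimal sketch of the scenario described in the environment above -- my illustration, not the reporter's job. It assumes an external shuffle service is available, which dynamic allocation requires; the app name and data sizes are illustrative:)

{code:scala}
import org.apache.spark.{SparkConf, SparkContext}

// Sketch of the report: cache some blocks, let the executors idle past
// cachedExecutorIdleTimeout, then check the Web UI Storage tab.
val conf = new SparkConf()
  .setAppName("cached-executor-idle-timeout-repro")
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.dynamicAllocation.minExecutors", "0")
  .set("spark.dynamicAllocation.maxExecutors", "8")
  .set("spark.dynamicAllocation.executorIdleTimeout", "60s")
  .set("spark.dynamicAllocation.cachedExecutorIdleTimeout", "120s")

val sc = new SparkContext(conf)
sc.parallelize(1 to 1000000, 8).cache().count()
// After ~120s of inactivity the executors holding the cached blocks are
// released; the reported bug is that the Storage tab still lists the RDD
// instead of showing "RDD not found".
{code}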
[jira] [Updated] (SPARK-23131) Stackoverflow using ML and Kryo serializer
[ https://issues.apache.org/jira/browse/SPARK-23131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Peigen updated SPARK-23131:
---------------------------
    Description:
When trying to use GeneralizedLinearRegression model and set SparkConf to use KryoSerializer(JavaSerializer is fine)
It causes StackOverflowException
{quote}
Exception in thread "dispatcher-event-loop-34" java.lang.StackOverflowError
at java.util.HashMap.hash(HashMap.java:338)
at java.util.HashMap.get(HashMap.java:556)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:61)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
{quote}
This is very likely to be [https://github.com/EsotericSoftware/kryo/issues/341]
Upgrade Kryo to 4.0+ probably could fix this
Wish for upgrade Kryo version for spark

  was:
When trying to use GeneralizedLinearRegression model and set SparkConf to use KryoSerializer(JavaSerializer is fine)
It causes StackOverflowException
{quote}
Exception in thread "dispatcher-event-loop-34" java.lang.StackOverflowError
at java.util.HashMap.hash(HashMap.java:338)
at java.util.HashMap.get(HashMap.java:556)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:61)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
{quote}
This is very likely to be https://github.com/EsotericSoftware/kryo/issues/341
Upgrade Kryo to 4.0+ probably could fix this

    Issue Type: Wish  (was: Bug)

> Stackoverflow using ML and Kryo serializer
> -------------------------------------------
>
> Key: SPARK-23131
> URL: https://issues.apache.org/jira/browse/SPARK-23131
> Project: Spark
> Issue Type: Wish
> Components: ML
> Affects Versions: 2.2.0
> Reporter: Peigen
> Priority: Minor
>
> When trying to use GeneralizedLinearRegression model and set SparkConf to use KryoSerializer(JavaSerializer is fine)
>
> It causes StackOverflowException
> {quote}
> Exception in thread "dispatcher-event-loop-34" java.lang.StackOverflowError
> at java.util.HashMap.hash(HashMap.java:338)
> at java.util.HashMap.get(HashMap.java:556)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:61)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Ge
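(For context, a hedged sketch of the driver-side setup the report describes; the app name is illustrative, and only the serializer switch is shown, since the overflow comes from Kryo 3.x's Generics handling rather than from the model itself:)

{code:scala}
import org.apache.spark.SparkConf
import org.apache.spark.serializer.KryoSerializer

// Same job, but with Kryo instead of the default JavaSerializer --
// the setting under which the StackOverflowError above appears.
val conf = new SparkConf()
  .setAppName("glr-kryo-overflow")
  .set("spark.serializer", classOf[KryoSerializer].getName)
{code}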
[jira] [Commented] (SPARK-23131) Stackoverflow using ML and Kryo serializer
[ https://issues.apache.org/jira/browse/SPARK-23131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16329100#comment-16329100 ]

Peigen commented on SPARK-23131:
--------------------------------

I realize this happens when I try to serialize the model using ObjectOutputStream, like:

{quote}
def modelToString(model: Model): (String, String) = {
  val os: ByteArrayOutputStream = new ByteArrayOutputStream()
  val zos = new GZIPOutputStream(os)
  val oo: ObjectOutputStream = new ObjectOutputStream(zos)
  oo.writeObject(model)
  oo.close()
  zos.close()
  os.close()
  (model.id, DatatypeConverter.printBase64Binary(os.toByteArray))
}
{quote}

Not a standard way to save a model, but fitting our production requirement. So this might just be a wish to upgrade kryo, I will adapt the ticket :)

> Stackoverflow using ML and Kryo serializer
> -------------------------------------------
>
> Key: SPARK-23131
> URL: https://issues.apache.org/jira/browse/SPARK-23131
> Project: Spark
> Issue Type: Bug
> Components: ML
> Affects Versions: 2.2.0
> Reporter: Peigen
> Priority: Minor
>
> When trying to use GeneralizedLinearRegression model and set SparkConf to use KryoSerializer(JavaSerializer is fine)
> It causes StackOverflowException
> {quote}
> Exception in thread "dispatcher-event-loop-34" java.lang.StackOverflowError
> at java.util.HashMap.hash(HashMap.java:338)
> at java.util.HashMap.get(HashMap.java:556)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:61)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62)
> {quote}
> This is very likely to be https://github.com/EsotericSoftware/kryo/issues/341
> Upgrade Kryo to 4.0+ probably could fix this

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-23133) Spark options are not passed to the Executor in Docker context
Andrew Korzhuev created SPARK-23133:
---------------------------------------

Summary: Spark options are not passed to the Executor in Docker context
Key: SPARK-23133
URL: https://issues.apache.org/jira/browse/SPARK-23133
Project: Spark
Issue Type: Bug
Components: Kubernetes
Affects Versions: 2.3.0
Environment: Running Spark on K8s using supplied Docker image.
Reporter: Andrew Korzhuev

Reproduce:
# Build image with `bin/docker-image-tool.sh`.
# Submit application to k8s. Set executor options, e.g. `--conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=./jaas.conf"`
# Visit Spark UI on executor and notice that option is not set.

Expected behavior: options from spark-submit should be correctly passed to executor.

Cause:
`SPARK_EXECUTOR_JAVA_OPTS` is not defined in `entrypoint.sh`
https://github.com/apache/spark/blob/branch-2.3/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh#L70
[https://github.com/apache/spark/blob/branch-2.3/resource-managers/kubernetes/docker/src/main/dockerfiles/spark/entrypoint.sh#L44-L45]

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-23020) Re-enable Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher
[ https://issues.apache.org/jira/browse/SPARK-23020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23020: Assignee: Apache Spark (was: Marcelo Vanzin) > Re-enable Flaky Test: > org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher > > > Key: SPARK-23020 > URL: https://issues.apache.org/jira/browse/SPARK-23020 > Project: Spark > Issue Type: Bug > Components: Tests >Affects Versions: 2.3.0 >Reporter: Sameer Agarwal >Assignee: Apache Spark >Priority: Blocker > > https://amplab.cs.berkeley.edu/jenkins/job/spark-branch-2.3-test-maven-hadoop-2.7/42/testReport/junit/org.apache.spark.launcher/SparkLauncherSuite/testInProcessLauncher/history/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-23020) Re-enable Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher
[ https://issues.apache.org/jira/browse/SPARK-23020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23020: Assignee: Marcelo Vanzin (was: Apache Spark) > Re-enable Flaky Test: > org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher > > > Key: SPARK-23020 > URL: https://issues.apache.org/jira/browse/SPARK-23020 > Project: Spark > Issue Type: Bug > Components: Tests >Affects Versions: 2.3.0 >Reporter: Sameer Agarwal >Assignee: Marcelo Vanzin >Priority: Blocker > > https://amplab.cs.berkeley.edu/jenkins/job/spark-branch-2.3-test-maven-hadoop-2.7/42/testReport/junit/org.apache.spark.launcher/SparkLauncherSuite/testInProcessLauncher/history/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-23020) Re-enable Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher
[ https://issues.apache.org/jira/browse/SPARK-23020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16329065#comment-16329065 ] Marcelo Vanzin commented on SPARK-23020: Bummer. I'll try to take another look later today. > Re-enable Flaky Test: > org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher > > > Key: SPARK-23020 > URL: https://issues.apache.org/jira/browse/SPARK-23020 > Project: Spark > Issue Type: Bug > Components: Tests >Affects Versions: 2.3.0 >Reporter: Sameer Agarwal >Assignee: Marcelo Vanzin >Priority: Blocker > > https://amplab.cs.berkeley.edu/jenkins/job/spark-branch-2.3-test-maven-hadoop-2.7/42/testReport/junit/org.apache.spark.launcher/SparkLauncherSuite/testInProcessLauncher/history/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-22980) Using pandas_udf when inputs are not Pandas's Series or DataFrame
[ https://issues.apache.org/jira/browse/SPARK-22980?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328936#comment-16328936 ]

Li Jin commented on SPARK-22980:
--------------------------------

I agree with [~cloud_fan]. I think it's enough to document that each arg to the user function is a pandas Series.

> Using pandas_udf when inputs are not Pandas's Series or DataFrame
> ------------------------------------------------------------------
>
> Key: SPARK-22980
> URL: https://issues.apache.org/jira/browse/SPARK-22980
> Project: Spark
> Issue Type: Sub-task
> Components: PySpark
> Affects Versions: 2.3.0
> Reporter: Xiao Li
> Priority: Major
> Fix For: 2.3.0
>
> {noformat}
> from pyspark.sql.functions import pandas_udf
> from pyspark.sql.functions import col, lit
> from pyspark.sql.types import LongType
> df = spark.range(3)
> f = pandas_udf(lambda x, y: len(x) + y, LongType())
> df.select(f(lit('text'), col('id'))).show()
> {noformat}
> {noformat}
> from pyspark.sql.functions import udf
> from pyspark.sql.functions import col, lit
> from pyspark.sql.types import LongType
> df = spark.range(3)
> f = udf(lambda x, y: len(x) + y, LongType())
> df.select(f(lit('text'), col('id'))).show()
> {noformat}
> The results of pandas_udf are different from udf.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-21697) NPE & ExceptionInInitializerError trying to load UDF from HDFS
[ https://issues.apache.org/jira/browse/SPARK-21697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328928#comment-16328928 ]

Steve Loughran commented on SPARK-21697:
----------------------------------------

No, it's Spark's ability to have hdfs:// URLs on the classpath. The classpath is being scanned for commons-logging properties, which is forcing in HDFS, which is then NPEing as the logging code is being called before commons-logging is fully set up. Kind of a recursive class init problem triggered by a scan for commons-logging.properties.

> NPE & ExceptionInInitializerError trying to load UDF from HDFS
> ---------------------------------------------------------------
>
> Key: SPARK-21697
> URL: https://issues.apache.org/jira/browse/SPARK-21697
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.1.1
> Environment: Spark Client mode, Hadoop 2.6.0
> Reporter: Steve Loughran
> Priority: Minor
>
> Reported on [the PR|https://github.com/apache/spark/pull/17342#issuecomment-321438157] for SPARK-12868: trying to load a UDF off HDFS is triggering an {{ExceptionInInitializerError}}, caused by an NPE which should only happen if the commons-logging {{LOG}} log is null.
> Hypothesis: the commons logging scan for {{commons-logging.properties}} is happening in the classpath with the HDFS JAR; this is triggering a D/L of the JAR, which needs to force in commons-logging, and, as that's not inited yet, NPEs

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-22884) ML test for StructuredStreaming: spark.ml.clustering
[ https://issues.apache.org/jira/browse/SPARK-22884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328898#comment-16328898 ] Sandor Murakozi commented on SPARK-22884: - Is there anybody working on this? If not I'm happy to pick it up. > ML test for StructuredStreaming: spark.ml.clustering > > > Key: SPARK-22884 > URL: https://issues.apache.org/jira/browse/SPARK-22884 > Project: Spark > Issue Type: Test > Components: ML, Tests >Affects Versions: 2.3.0 >Reporter: Joseph K. Bradley >Priority: Major > > Task for adding Structured Streaming tests for all Models/Transformers in a > sub-module in spark.ml > For an example, see LinearRegressionSuite.scala in > https://github.com/apache/spark/pull/19843 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-22886) ML test for StructuredStreaming: spark.ml.recommendation
[ https://issues.apache.org/jira/browse/SPARK-22886?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328874#comment-16328874 ] Gabor Somogyi commented on SPARK-22886: --- I would like to work on this. Please notify me if somebody already started. > ML test for StructuredStreaming: spark.ml.recommendation > > > Key: SPARK-22886 > URL: https://issues.apache.org/jira/browse/SPARK-22886 > Project: Spark > Issue Type: Test > Components: ML, Tests >Affects Versions: 2.3.0 >Reporter: Joseph K. Bradley >Priority: Major > > Task for adding Structured Streaming tests for all Models/Transformers in a > sub-module in spark.ml > For an example, see LinearRegressionSuite.scala in > https://github.com/apache/spark/pull/19843 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-23131) Stackoverflow using ML and Kryo serializer
[ https://issues.apache.org/jira/browse/SPARK-23131?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328868#comment-16328868 ] Sean Owen commented on SPARK-23131: --- This requires updating Twitter Chill too, really, to 0.9.2. Have you tried it? I am running tests now and looks pretty good. Although that would mean a major version update to Kryo, I don't think we guarantee any interoperability at runtime across minor releases of Spark, so any change in its formats (and they changed) could be acceptable for Spark 2.4.0. Certainly for 3.0. Kryo 3.0.3 is almost 3 years old now so yeah may be time. > Stackoverflow using ML and Kryo serializer > -- > > Key: SPARK-23131 > URL: https://issues.apache.org/jira/browse/SPARK-23131 > Project: Spark > Issue Type: Bug > Components: ML >Affects Versions: 2.2.0 >Reporter: Peigen >Priority: Minor > > When trying to use GeneralizedLinearRegression model and set SparkConf to use > KryoSerializer(JavaSerializer is fine) > It causes StackOverflowException > {quote} > Exception in thread "dispatcher-event-loop-34" java.lang.StackOverflowError > at java.util.HashMap.hash(HashMap.java:338) > at java.util.HashMap.get(HashMap.java:556) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:61) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > {quote} > This is very likely to be https://github.com/EsotericSoftware/kryo/issues/341 > Upgrade Kryo to 4.0+ probably could fix this -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-23076) When we call cache() on RDD which depends on ShuffleRowRDD, we will get an error result
[ https://issues.apache.org/jira/browse/SPARK-23076?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328801#comment-16328801 ]

Sean Owen commented on SPARK-23076:
-----------------------------------

You're relying on behavior that this class doesn't provide. It doesn't provide it in order to optimize its internal execution, so there's a cost to changing that. I would agree this can't be considered a bug.

> When we call cache() on RDD which depends on ShuffleRowRDD, we will get an error result
> -----------------------------------------------------------------------------------------
>
> Key: SPARK-23076
> URL: https://issues.apache.org/jira/browse/SPARK-23076
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.1.0
> Reporter: zhoukang
> Priority: Major
> Attachments: shufflerowrdd-cache.png
>
> For the query below:
> {code:java}
> select * from csv_demo limit 3;
> {code}
> The correct result should be:
> 0: jdbc:hive2://10.108.230.228:1/> select * from csv_demo limit 3;
> +--------+-----+
> |     _c0|  _c1|
> +--------+-----+
> |     Joe|   20|
> |     Tom|   30|
> | Hyukjin|   25|
> +--------+-----+
> However, when we call cache on the MapPartitionsRDD below:
> !shufflerowrdd-cache.png!
> Then the result will be wrong:
> 0: jdbc:hive2://xxx/> select * from csv_demo limit 3;
> +--------+-----+
> |     _c0|  _c1|
> +--------+-----+
> | Hyukjin|   25|
> | Hyukjin|   25|
> | Hyukjin|   25|
> +--------+-----+
> The reason why this happens is that the UnsafeRows generated by ShuffleRowRDD#compute use the same underlying byte buffer.
> I print some logs below to explain this, after modifying UnsafeRow.toString():
> {code:java}
> // This is for debugging
> @Override
> public String toString() {
>   StringBuilder build = new StringBuilder("[");
>   for (int i = 0; i < sizeInBytes; i += 8) {
>     if (i != 0) build.append(',');
>     build.append(java.lang.Long.toHexString(Platform.getLong(baseObject, baseOffset + i)));
>   }
>   build.append(","+baseObject+","+baseOffset+']'); // Print baseObject and baseOffset here
>   return build.toString();
> }
> {code}
> {code:java}
> 2018-01-12,22:08:47,441 INFO org.apache.spark.sql.execution.ShuffledRowRDD: Read value: [0,180003,22,656f4a,3032,[B@6225ec90,16]
> 2018-01-12,22:08:47,445 INFO org.apache.spark.sql.execution.ShuffledRowRDD: Read value: [0,180003,22,6d6f54,3033,[B@6225ec90,16]
> 2018-01-12,22:08:47,448 INFO org.apache.spark.sql.execution.ShuffledRowRDD: Read value: [0,180007,22,6e696a6b757948,3532,[B@6225ec90,16]
> {code}
> We can fix this by adding a config, and copying the UnsafeRow read by the ShuffleRowRDD iterator when the config is true, like below:
> {code:java}
> reader.read().asInstanceOf[Iterator[Product2[Int, InternalRow]]].map(x => {
>   if (SparkEnv.get.conf.get(StaticSQLConf.UNSAFEROWRDD_CACHE_ENABLE)
>       && x._2.isInstanceOf[UnsafeRow]) {
>     (x._2).asInstanceOf[UnsafeRow].copy()
>   } else {
>     x._2
>   }
> })
> }
> {code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
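(The failure mode above is the classic object-reuse hazard; here is a small self-contained analogy -- deliberately not Spark internals: a producer that recycles one mutable object must have its output copied before buffering, which is what the proposed copy() call does for UnsafeRow.)

{code:scala}
// Analogy for the UnsafeRow buffer reuse above, using a plain StringBuilder.
val reused = new StringBuilder

def produce(values: Seq[String]): Iterator[StringBuilder] =
  values.iterator.map { v => reused.clear(); reused.append(v); reused }

// Buffering without copying: every element is the same object, so the list
// ends up holding three references to the last value -- like the three
// "Hyukjin" rows in the report.
val wrong = produce(Seq("Joe", "Tom", "Hyukjin")).toList
println(wrong.map(_.toString)) // List(Hyukjin, Hyukjin, Hyukjin)

// Copying each element during iteration preserves the values, which is what
// UnsafeRow.copy() achieves for cached shuffle rows.
val right = produce(Seq("Joe", "Tom", "Hyukjin"))
  .map(sb => new StringBuilder(sb.toString)).toList
println(right.map(_.toString)) // List(Joe, Tom, Hyukjin)
{code}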
[jira] [Resolved] (SPARK-23076) When we call cache() on RDD which depends on ShuffleRowRDD, we will get an error result
[ https://issues.apache.org/jira/browse/SPARK-23076?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-23076.
-------------------------------
Resolution: Not A Problem

> When we call cache() on RDD which depends on ShuffleRowRDD, we will get an error result
> -----------------------------------------------------------------------------------------
>
> Key: SPARK-23076
> URL: https://issues.apache.org/jira/browse/SPARK-23076
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.1.0
> Reporter: zhoukang
> Priority: Major
> Attachments: shufflerowrdd-cache.png
>
> For the query below:
> {code:java}
> select * from csv_demo limit 3;
> {code}
> The correct result should be:
> 0: jdbc:hive2://10.108.230.228:1/> select * from csv_demo limit 3;
> +--------+-----+
> |     _c0|  _c1|
> +--------+-----+
> |     Joe|   20|
> |     Tom|   30|
> | Hyukjin|   25|
> +--------+-----+
> However, when we call cache on the MapPartitionsRDD below:
> !shufflerowrdd-cache.png!
> Then the result will be wrong:
> 0: jdbc:hive2://xxx/> select * from csv_demo limit 3;
> +--------+-----+
> |     _c0|  _c1|
> +--------+-----+
> | Hyukjin|   25|
> | Hyukjin|   25|
> | Hyukjin|   25|
> +--------+-----+
> The reason why this happens is that the UnsafeRows generated by ShuffleRowRDD#compute use the same underlying byte buffer.
> I print some logs below to explain this, after modifying UnsafeRow.toString():
> {code:java}
> // This is for debugging
> @Override
> public String toString() {
>   StringBuilder build = new StringBuilder("[");
>   for (int i = 0; i < sizeInBytes; i += 8) {
>     if (i != 0) build.append(',');
>     build.append(java.lang.Long.toHexString(Platform.getLong(baseObject, baseOffset + i)));
>   }
>   build.append(","+baseObject+","+baseOffset+']'); // Print baseObject and baseOffset here
>   return build.toString();
> }
> {code}
> {code:java}
> 2018-01-12,22:08:47,441 INFO org.apache.spark.sql.execution.ShuffledRowRDD: Read value: [0,180003,22,656f4a,3032,[B@6225ec90,16]
> 2018-01-12,22:08:47,445 INFO org.apache.spark.sql.execution.ShuffledRowRDD: Read value: [0,180003,22,6d6f54,3033,[B@6225ec90,16]
> 2018-01-12,22:08:47,448 INFO org.apache.spark.sql.execution.ShuffledRowRDD: Read value: [0,180007,22,6e696a6b757948,3532,[B@6225ec90,16]
> {code}
> We can fix this by adding a config, and copying the UnsafeRow read by the ShuffleRowRDD iterator when the config is true, like below:
> {code:java}
> reader.read().asInstanceOf[Iterator[Product2[Int, InternalRow]]].map(x => {
>   if (SparkEnv.get.conf.get(StaticSQLConf.UNSAFEROWRDD_CACHE_ENABLE)
>       && x._2.isInstanceOf[UnsafeRow]) {
>     (x._2).asInstanceOf[UnsafeRow].copy()
>   } else {
>     x._2
>   }
> })
> }
> {code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-21783) Turn on ORC filter push-down by default
[ https://issues.apache.org/jira/browse/SPARK-21783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan reassigned SPARK-21783:
-----------------------------------

Assignee: Dongjoon Hyun

> Turn on ORC filter push-down by default
> -----------------------------------------
>
> Key: SPARK-21783
> URL: https://issues.apache.org/jira/browse/SPARK-21783
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Dongjoon Hyun
> Assignee: Dongjoon Hyun
> Priority: Minor
> Fix For: 2.3.0
>
> Like Parquet (SPARK-9207), it would be great to turn on the ORC option, too.
> This option was turned off by default from the beginning, in SPARK-2883:
> https://github.com/apache/spark/commit/aa31e431fc09f0477f1c2351c6275769a31aca90#diff-41ef65b9ef5b518f77e2a03559893f4dR149

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-21783) Turn on ORC filter push-down by default
[ https://issues.apache.org/jira/browse/SPARK-21783?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wenchen Fan resolved SPARK-21783.
---------------------------------
Resolution: Fixed
Fix Version/s: 2.3.0

Issue resolved by pull request 20265
[https://github.com/apache/spark/pull/20265]

> Turn on ORC filter push-down by default
> -----------------------------------------
>
> Key: SPARK-21783
> URL: https://issues.apache.org/jira/browse/SPARK-21783
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Dongjoon Hyun
> Assignee: Dongjoon Hyun
> Priority: Minor
> Fix For: 2.3.0
>
> Like Parquet (SPARK-9207), it would be great to turn on the ORC option, too.
> This option was turned off by default from the beginning, in SPARK-2883:
> https://github.com/apache/spark/commit/aa31e431fc09f0477f1c2351c6275769a31aca90#diff-41ef65b9ef5b518f77e2a03559893f4dR149

--
This message was sent by Atlassian JIRA (v7.6.3#76005)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
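(With this merged, a hedged usage sketch in spark-shell style, so the `spark` session and its implicits are assumed; the path and predicate are illustrative:)

{code:scala}
// With SPARK-21783 the flag below defaults to true in 2.3.0; on earlier
// releases it had to be enabled explicitly, as shown here.
import spark.implicits._
spark.conf.set("spark.sql.orc.filterPushdown", "true")

val df = spark.read.orc("/data/events.orc").filter($"status" === "ok")
df.explain() // the ORC scan should now report the pushed filter
{code}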
[jira] [Commented] (SPARK-23125) Offset commit failed when spark-streaming batch time is more than kafkaParams session timeout.
[ https://issues.apache.org/jira/browse/SPARK-23125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328787#comment-16328787 ]

Sean Owen commented on SPARK-23125:
-----------------------------------

The error message you cite, which is from the version in use, suggests a config change. Does that not work? This doesn't sound like a bug fix that Spark needs to pick up, but a configuration issue. It sounds like it's only an issue when changing from defaults. Maybe a doc comment somewhere?

> Offset commit failed when spark-streaming batch time is more than kafkaParams session timeout.
> ------------------------------------------------------------------------------------------------
>
> Key: SPARK-23125
> URL: https://issues.apache.org/jira/browse/SPARK-23125
> Project: Spark
> Issue Type: Bug
> Components: DStreams
> Affects Versions: 2.2.0
> Reporter: zhaoshijie
> Priority: Major
>
> I find that DirectKafkaInputDStream (kafka010) offset commits fail when the batch time is more than the kafkaParams session.timeout.ms. Log as follows:
> {code:java}
> 2018-01-16 05:40:00,002 ERROR org.apache.kafka.clients.consumer.internals.ConsumerCoordinator: Offset commit failed.
> org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured session.timeout.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records.
> at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:600)
> at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:541)
> at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:679)
> at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:658)
> at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:167)
> at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133)
> at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107)
> at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:426)
> at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:278)
> at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360)
> at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224)
> at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192)
> at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitPendingRequests(ConsumerNetworkClient.java:260)
> at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:222)
> at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.ensurePartitionAssignment(ConsumerCoordinator.java:366)
> at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:978)
> at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:938)
> at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.paranoidPoll(DirectKafkaInputDStream.scala:161)
> at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.latestOffsets(DirectKafkaInputDStream.scala:180)
> at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.compute(DirectKafkaInputDStream.scala:207)
> at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342)
> at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342)
> at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
> at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341)
> at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341)
> at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416)
> at org.apache.spark.stream
[jira] [Commented] (SPARK-23125) Offset commit failed when spark-streaming batch time is more than kafkaParams session timeout.
[ https://issues.apache.org/jira/browse/SPARK-23125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328782#comment-16328782 ] zhaoshijie commented on SPARK-23125: Spark 2.2 uses Kafka 0.10.0.1, and I don't think any config available in 0.10.0.1 can solve this problem. In a unit test, setting the Kafka params max.poll.interval.ms (only available in Kafka 0.10.1.0 and above) and session.timeout.ms made the offset commit succeed, but Kafka 0.10.0.1 does not have that param. > Offset commit failed when spark-streaming batch time is more than kafkaParams > session timeout. > -- > > Key: SPARK-23125 > URL: https://issues.apache.org/jira/browse/SPARK-23125 > Project: Spark > Issue Type: Bug > Components: DStreams >Affects Versions: 2.2.0 >Reporter: zhaoshijie >Priority: Major > > I find DirectKafkaInputDStream(kafka010) Offset commit failed when batch time > is more than kafkaParams session.timeout.ms .log as fellow: > {code:java} > 2018-01-16 05:40:00,002 ERROR > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator: Offset > commit failed. > org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be > completed since the group has already rebalanced and assigned the partitions > to another member. This means that the time between subsequent calls to > poll() was longer than the configured session.timeout.ms, which typically > implies that the poll loop is spending too much time message processing. You > can address this either by increasing the session timeout or by reducing the > maximum size of batches returned in poll() with max.poll.records. > at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:600) > at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:541) > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:679) > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:658) > at > org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:167) > at > org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133) > at > org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:426) > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:278) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitPendingRequests(ConsumerNetworkClient.java:260) > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:222) > at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.ensurePartitionAssignment(ConsumerCoordinator.java:366) > at > org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:978) > at > 
org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:938) > at > org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.paranoidPoll(DirectKafkaInputDStream.scala:161) > at > org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.latestOffsets(DirectKafkaInputDStream.scala:180) > at > org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.compute(DirectKafkaInputDStream.scala:207) > at > org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342) > at > org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342) > at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) > at > org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341) > at > org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341) > at > org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416) > at > org
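[editor's note] If the missing max.poll.interval.ms really is the blocker on 0.10.0.1, one hypothetical route (untested, and only viable if the 0-10 integration tolerates a newer client jar) is to pin a newer kafka-clients in the build:
{code}
// build.sbt sketch: the version pairing here is an assumption, not a verified combination
libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-streaming-kafka-0-10" % "2.2.0",
  // A 0.10.1+ client adds max.poll.interval.ms while still speaking the 0.10 protocol
  "org.apache.kafka" % "kafka-clients" % "0.10.2.1"
)
{code}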
[jira] [Commented] (SPARK-21697) NPE & ExceptionInInitializerError trying to load UDF from HDFS
[ https://issues.apache.org/jira/browse/SPARK-21697?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328777#comment-16328777 ] Sean Owen commented on SPARK-21697: --- Isn't this an HDFS problem? What could Spark do about it? > NPE & ExceptionInInitializerError trying to load UDF from HDFS > -- > > Key: SPARK-21697 > URL: https://issues.apache.org/jira/browse/SPARK-21697 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.1 > Environment: Spark Client mode, Hadoop 2.6.0 >Reporter: Steve Loughran >Priority: Minor > > Reported on [the > PR|https://github.com/apache/spark/pull/17342#issuecomment-321438157] for > SPARK-12868: trying to load a UDF off HDFS is triggering an > {{ExceptionInInitializerError}}, caused by an NPE which should only happen if > the commons-logging {{LOG}} logger is null. > Hypothesis: the commons-logging scan for {{commons-logging.properties}} is > happening in the classpath with the HDFS JAR; this is triggering a D/L of the > JAR, which needs to force in commons-logging, and, as that's not initialized yet, > it NPEs -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
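[editor's note] If the hypothesis above holds, a hedged mitigation sketch is to force commons-logging to initialize from the local classpath before anything can trigger the remote JAR download; LogFactory.getLog is the ordinary commons-logging entry point, and the object name is purely illustrative:
{code}
import org.apache.commons.logging.{Log, LogFactory}

object EagerLoggingInit {
  // Touching LogFactory here runs the commons-logging.properties scan against the
  // local classpath, before any classloader can route the scan through the HDFS JAR.
  val log: Log = LogFactory.getLog(getClass)
}
{code}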
[jira] [Resolved] (SPARK-23126) I used the Project operator and modified the source. After compiling successfully, and testing the jars, I got the exception. Maybe the phenomenon is related with implicits
[ https://issues.apache.org/jira/browse/SPARK-23126?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-23126. --- Resolution: Invalid Fix Version/s: (was: 2.2.0) Target Version/s: (was: 2.2.0) This is a question about Scala and your code; it doesn't belong here. > I used the Project operator and modified the source. After compiling > successfully, and testing the jars, I got the exception. Maybe the phenomenon > is related with implicits > > > Key: SPARK-23126 > URL: https://issues.apache.org/jira/browse/SPARK-23126 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.2.0 > Environment: Centos >Reporter: xuetao >Priority: Major > Labels: security > Original Estimate: 96h > Remaining Estimate: 96h > > I used the Project operator and modified the source. After compiling > successfully, and testing the jars, I got the exception as follows: > org.apache.spark.sql.AnalysisException: Try to map > struct to Tuple3, but failed as the number of fields > does not line up.; -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
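[editor's note] For anyone hitting the same AnalysisException, a minimal sketch of the arity rule involved (column names are illustrative): a tuple encoder needs exactly as many columns as the tuple has fields, so select down to the matching columns first.
{code}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local[*]").appName("tuple-arity").getOrCreate()
import spark.implicits._

// Four columns...
val df = spark.range(3).selectExpr(
  "id", "id * 2 AS doubled", "CAST(id AS STRING) AS label", "id % 2 AS parity")

// ...so this fails at analysis time: "Try to map struct<...> to Tuple3,
// but failed as the number of fields does not line up."
// df.as[(Long, Long, String)]

// Selecting exactly three columns lines the fields up with Tuple3:
val ds = df.select($"id", $"doubled", $"label").as[(Long, Long, String)]
ds.show()
{code}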
[jira] [Resolved] (SPARK-23123) Unable to run Spark Job with Hadoop NameNode Federation using ViewFS
[ https://issues.apache.org/jira/browse/SPARK-23123?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-23123. --- Resolution: Not A Problem > Unable to run Spark Job with Hadoop NameNode Federation using ViewFS > > > Key: SPARK-23123 > URL: https://issues.apache.org/jira/browse/SPARK-23123 > Project: Spark > Issue Type: Bug > Components: Spark Submit >Affects Versions: 1.6.3 >Reporter: Nihar Nayak >Priority: Major > Labels: Hadoop, Spark > > Added following to core-site.xml in order to make use of ViewFS in a NameNode > federated cluster. > {noformat} > <property> > <name>fs.defaultFS</name> > <value>viewfs:///</value> > </property> > <property> > <name>fs.viewfs.mounttable.default.link./apps</name> > <value>hdfs://nameservice1/apps</value> > </property> > <property> > <name>fs.viewfs.mounttable.default.link./app-logs</name> > <value>hdfs://nameservice2/app-logs</value> > </property> > <property> > <name>fs.viewfs.mounttable.default.link./tmp</name> > <value>hdfs://nameservice2/tmp</value> > </property> > <property> > <name>fs.viewfs.mounttable.default.link./user</name> > <value>hdfs://nameservice2/user</value> > </property> > <property> > <name>fs.viewfs.mounttable.default.link./ns1/user</name> > <value>hdfs://nameservice1/user</value> > </property> > <property> > <name>fs.viewfs.mounttable.default.link./ns2/user</name> > <value>hdfs://nameservice2/user</value> > </property> > {noformat} > Got the following error: > {noformat} > spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client > --num-executors 3 --driver-memory 512m --executor-memory 512m > --executor-cores 1 ${SPARK_HOME}/lib/spark-examples*.jar 10 > 18/01/17 02:14:45 INFO spark.SparkContext: Added JAR > file:/home/nayak/hdp26_c4000_stg/spark2/lib/spark-examples_2.11-2.1.1.2.6.2.0-205.jar > at spark://x:35633/jars/spark-examples_2.11-2.1.1.2.6.2.0-205.jar with > timestamp 1516155285534 > 18/01/17 02:14:46 INFO client.ConfiguredRMFailoverProxyProvider: Failing over > to rm2 > 18/01/17 02:14:46 INFO yarn.Client: Requesting a new application from cluster > with 26 NodeManagers > 18/01/17 02:14:46 INFO yarn.Client: Verifying our application has not > requested more than the maximum memory capability of the cluster (13800 MB > per container) > 18/01/17 02:14:46 INFO yarn.Client: Will allocate AM container, with 896 MB > memory including 384 MB overhead > 18/01/17 02:14:46 INFO yarn.Client: Setting up container launch context for > our AM > 18/01/17 02:14:46 INFO yarn.Client: Setting up the launch environment for our > AM container > 18/01/17 02:14:46 INFO yarn.Client: Preparing resources for our AM container > 18/01/17 02:14:46 INFO security.HDFSCredentialProvider: getting token for > namenode: viewfs:/user/nayak > 18/01/17 02:14:46 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token > 22488202 for nayak on ha-hdfs:nameservice1 > 18/01/17 02:14:46 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 50 > for nayak on ha-hdfs:nameservice2 > 18/01/17 02:14:47 INFO hive.metastore: Trying to connect to metastore with > URI thrift://:9083 > 18/01/17 02:14:47 INFO hive.metastore: Connected to metastore. > 18/01/17 02:14:49 INFO security.HiveCredentialProvider: Get Token from hive > metastore: Kind: HIVE_DELEGATION_TOKEN, Service: , Ident: 00 29 6e 61 79 61 > 6b 6e 69 68 61 72 72 61 30 31 40 53 54 47 32 30 30 30 2e 48 41 44 4f 4f 50 2e > 52 41 4b 55 54 45 4e 2e 43 4f 4d 04 68 69 76 65 00 8a 01 61 01 e5 be 03 8a 01 > 61 25 f2 42 03 8d 02 21 bb 8e 02 b7 > 18/01/17 02:14:49 WARN yarn.Client: Neither spark.yarn.jars nor > spark.yarn.archive is set, falling back to uploading libraries under > SPARK_HOME. 
> 18/01/17 02:14:50 INFO yarn.Client: Uploading resource > file:/tmp/spark-7498ee81-d22b-426e-9466-3a08f7c827b1/__spark_libs__6643608006679813597.zip > -> > viewfs:/user/nayak/.sparkStaging/application_1515035441414_275503/__spark_libs__6643608006679813597.zip > 18/01/17 02:14:55 INFO yarn.Client: Uploading resource > file:/tmp/spark-7498ee81-d22b-426e-9466-3a08f7c827b1/__spark_conf__405432153902988742.zip > -> > viewfs:/user/nayak/.sparkStaging/application_1515035441414_275503/__spark_conf__.zip > 18/01/17 02:14:55 INFO spark.SecurityManager: Changing view acls to: nayak > 18/01/17 02:14:55 INFO spark.SecurityManager: Changing modify acls to: > nayak > 18/01/17 02:14:55 INFO spark.SecurityManager: Changing view acls groups to: > 18/01/17 02:14:55 INFO spark.SecurityManager: Changing modify acls groups to: > 18/01/17 02:14:55 INFO spark.SecurityManager: SecurityManager: authentication > disabled; ui acls disabled; users with view permissions: Set(nayak); > groups with view permissions: Set(); users with modify permissions: > Set(nayak); groups with modify permissions: Set() > 18/01/17 02:14:55 INFO yarn.Client: Submitting application > application_1515035441414_275503 to ResourceManager > 18/01/17 02:14:55 INFO impl.YarnClientImpl: Submitted application > application_1515035441414_275503 > 18/01/17 02:14:55 I
[jira] [Resolved] (SPARK-23125) Offset commit failed when spark-streaming batch time is more than kafkaParams session timeout.
[ https://issues.apache.org/jira/browse/SPARK-23125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-23125. --- Resolution: Not A Problem Probably, but this is a Kafka config issue. If you're not using matched Kafka versions, that's the issue too. If you are, and the config is different across versions, I presume you just need to set a different config. > Offset commit failed when spark-streaming batch time is more than kafkaParams > session timeout. > -- > > Key: SPARK-23125 > URL: https://issues.apache.org/jira/browse/SPARK-23125 > Project: Spark > Issue Type: Bug > Components: DStreams >Affects Versions: 2.2.0 >Reporter: zhaoshijie >Priority: Major > > I find DirectKafkaInputDStream(kafka010) Offset commit failed when batch time > is more than kafkaParams session.timeout.ms .log as fellow: > {code:java} > 2018-01-16 05:40:00,002 ERROR > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator: Offset > commit failed. > org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be > completed since the group has already rebalanced and assigned the partitions > to another member. This means that the time between subsequent calls to > poll() was longer than the configured session.timeout.ms, which typically > implies that the poll loop is spending too much time message processing. You > can address this either by increasing the session timeout or by reducing the > maximum size of batches returned in poll() with max.poll.records. > at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:600) > at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:541) > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:679) > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:658) > at > org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:167) > at > org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133) > at > org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:426) > at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:278) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192) > at > org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitPendingRequests(ConsumerNetworkClient.java:260) > at > org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:222) > at > org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.ensurePartitionAssignment(ConsumerCoordinator.java:366) > at > org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:978) > at > org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:938) > at > 
org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.paranoidPoll(DirectKafkaInputDStream.scala:161) > at > org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.latestOffsets(DirectKafkaInputDStream.scala:180) > at > org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.compute(DirectKafkaInputDStream.scala:207) > at > org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342) > at > org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342) > at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) > at > org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341) > at > org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341) > at > org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416) > at > org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:336) > at > org.apache.spark.
[jira] [Resolved] (SPARK-15401) Spark Thrift server creates empty directories in tmp directory on the driver
[ https://issues.apache.org/jira/browse/SPARK-15401?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-15401. - Resolution: Duplicate > Spark Thrift server creates empty directories in tmp directory on the driver > > > Key: SPARK-15401 > URL: https://issues.apache.org/jira/browse/SPARK-15401 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.6.1 >Reporter: Christophe Préaud >Priority: Minor > > Each connection to the Spark thrift server (e.g. using beeline) creates two > empty directories in the tmp directory on the driver which are never removed: > cd > ls -ltd *_resources | wc -l && /opt/spark/bin/beeline -u > jdbc:hive2://dc1-kdp-prod-hadoop-00.prod.dc1.kelkoo.net:1 -n kookel -e > '!quit' && ls -ltd *_resources | wc -l > 9080 > Connecting to jdbc:hive2://dc1-kdp-prod-hadoop-00.prod.dc1.kelkoo.net:1 > Connected to: Spark SQL (version 1.6.1) > Driver: Spark Project Core (version 1.6.1) > Transaction isolation: TRANSACTION_REPEATABLE_READ > Closing: 0: jdbc:hive2://dc1-kdp-prod-hadoop-00.prod.dc1.kelkoo.net:1 > Beeline version 1.6.1 by Apache Hive > 9082 > Those directories accumulates over time and are not removed: > ls -ld *_resources | wc -l > 9064 > And they are indeed empty: > find *_resources -type f | wc -l > 0 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-15401) Spark Thrift server creates empty directories in tmp directory on the driver
[ https://issues.apache.org/jira/browse/SPARK-15401?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328738#comment-16328738 ] Marco Gaido commented on SPARK-15401: - this should have been fixed in SPARK-22793. > Spark Thrift server creates empty directories in tmp directory on the driver > > > Key: SPARK-15401 > URL: https://issues.apache.org/jira/browse/SPARK-15401 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.6.1 >Reporter: Christophe Préaud >Priority: Minor > > Each connection to the Spark thrift server (e.g. using beeline) creates two > empty directories in the tmp directory on the driver which are never removed: > cd > ls -ltd *_resources | wc -l && /opt/spark/bin/beeline -u > jdbc:hive2://dc1-kdp-prod-hadoop-00.prod.dc1.kelkoo.net:1 -n kookel -e > '!quit' && ls -ltd *_resources | wc -l > 9080 > Connecting to jdbc:hive2://dc1-kdp-prod-hadoop-00.prod.dc1.kelkoo.net:1 > Connected to: Spark SQL (version 1.6.1) > Driver: Spark Project Core (version 1.6.1) > Transaction isolation: TRANSACTION_REPEATABLE_READ > Closing: 0: jdbc:hive2://dc1-kdp-prod-hadoop-00.prod.dc1.kelkoo.net:1 > Beeline version 1.6.1 by Apache Hive > 9082 > Those directories accumulates over time and are not removed: > ls -ld *_resources | wc -l > 9064 > And they are indeed empty: > find *_resources -type f | wc -l > 0 -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-23130) Spark Thrift does not clean-up temporary files (/tmp/*_resources and /tmp/hive/*.pipeout)
[ https://issues.apache.org/jira/browse/SPARK-23130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Marco Gaido resolved SPARK-23130. - Resolution: Duplicate > Spark Thrift does not clean-up temporary files (/tmp/*_resources and > /tmp/hive/*.pipeout) > - > > Key: SPARK-23130 > URL: https://issues.apache.org/jira/browse/SPARK-23130 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.6.3, 2.1.0, 2.2.0 > Environment: * Hadoop distributions: HDP 2.5 - 2.6.3.0 > * OS: Seen on SLES12, RHEL 7.3 & RHEL 7.4 >Reporter: Sean Roberts >Priority: Major > Labels: thrift > > Spark Thrift is not cleaning up /tmp for files & directories named like: > /tmp/hive/*.pipeout > /tmp/*_resources > There are such a large number that /tmp quickly runs out of inodes *causing > the partition to be unusable and many services to crash*. This is even true > when the only jobs submitted are routine service checks. > Used `strace` to show that Spark Thrift is responsible: > {code:java} > strace.out.118864:04:53:49 > open("/tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout", > O_RDWR|O_CREAT|O_EXCL, 0666) = 134 > strace.out.118864:04:53:49 > mkdir("/tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources", 0777) = 0 > {code} > *Those files were left behind, even days later.* > > Example files: > {code:java} > # stat > /tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout > File: > ‘/tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout’ > Size: 0 Blocks: 0 IO Block: 4096 regular empty file > Device: fe09h/65033d Inode: 678 Links: 1 > Access: (0644/-rw-r--r--) Uid: ( 1000/hive) Gid: ( 1002/ hadoop) > Access: 2017-12-19 04:53:49.126777260 -0600 > Modify: 2017-12-19 04:53:49.126777260 -0600 > Change: 2017-12-19 04:53:49.126777260 -0600 > Birth: - > # stat /tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources > File: ‘/tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources’ > Size: 4096 Blocks: 8 IO Block: 4096 directory > Device: fe09h/65033d Inode: 668 Links: 2 > Access: (0700/drwx--) Uid: ( 1000/hive) Gid: ( 1002/ hadoop) > Access: 2017-12-19 04:57:38.458937635 -0600 > Modify: 2017-12-19 04:53:49.062777216 -0600 > Change: 2017-12-19 04:53:49.066777218 -0600 > Birth: - > {code} > Showing the large number: > {code:java} > # find /tmp/ -name '*_resources' | wc -l > 68340 > # find /tmp/hive -name "*.pipeout" | wc -l > 51837 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-23130) Spark Thrift does not clean-up temporary files (/tmp/*_resources and /tmp/hive/*.pipeout)
[ https://issues.apache.org/jira/browse/SPARK-23130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328734#comment-16328734 ] Marco Gaido commented on SPARK-23130: - The "_resources" directory leak should have been fixed by SPARK-22793. As far as HIVE-6091 is concerned, I think there is already an ongoing effort to make Spark not dependent on a specific Hive version. Therefore I am closing this as there is nothing to do. Please feel free to reopen it if I missed something. > Spark Thrift does not clean-up temporary files (/tmp/*_resources and > /tmp/hive/*.pipeout) > - > > Key: SPARK-23130 > URL: https://issues.apache.org/jira/browse/SPARK-23130 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.6.3, 2.1.0, 2.2.0 > Environment: * Hadoop distributions: HDP 2.5 - 2.6.3.0 > * OS: Seen on SLES12, RHEL 7.3 & RHEL 7.4 >Reporter: Sean Roberts >Priority: Major > Labels: thrift > > Spark Thrift is not cleaning up /tmp for files & directories named like: > /tmp/hive/*.pipeout > /tmp/*_resources > There are such a large number that /tmp quickly runs out of inodes *causing > the partition to be unusable and many services to crash*. This is even true > when the only jobs submitted are routine service checks. > Used `strace` to show that Spark Thrift is responsible: > {code:java} > strace.out.118864:04:53:49 > open("/tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout", > O_RDWR|O_CREAT|O_EXCL, 0666) = 134 > strace.out.118864:04:53:49 > mkdir("/tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources", 0777) = 0 > {code} > *Those files were left behind, even days later.* > > Example files: > {code:java} > # stat > /tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout > File: > ‘/tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout’ > Size: 0 Blocks: 0 IO Block: 4096 regular empty file > Device: fe09h/65033d Inode: 678 Links: 1 > Access: (0644/-rw-r--r--) Uid: ( 1000/hive) Gid: ( 1002/ hadoop) > Access: 2017-12-19 04:53:49.126777260 -0600 > Modify: 2017-12-19 04:53:49.126777260 -0600 > Change: 2017-12-19 04:53:49.126777260 -0600 > Birth: - > # stat /tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources > File: ‘/tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources’ > Size: 4096 Blocks: 8 IO Block: 4096 directory > Device: fe09h/65033d Inode: 668 Links: 2 > Access: (0700/drwx--) Uid: ( 1000/hive) Gid: ( 1002/ hadoop) > Access: 2017-12-19 04:57:38.458937635 -0600 > Modify: 2017-12-19 04:53:49.062777216 -0600 > Change: 2017-12-19 04:53:49.066777218 -0600 > Birth: - > {code} > Showing the large number: > {code:java} > # find /tmp/ -name '*_resources' | wc -l > 68340 > # find /tmp/hive -name "*.pipeout" | wc -l > 51837 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
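[editor's note] Until a build with the fix is deployed, an operational stopgap is a periodic sweep of the empty leftovers; this is only a sketch, and it assumes nothing else on the host creates names matching these patterns:
{code}
import java.io.File

// Delete empty *_resources session directories under /tmp older than one day.
val cutoff = System.currentTimeMillis() - 24L * 60 * 60 * 1000
for {
  entry <- Option(new File("/tmp").listFiles()).getOrElse(Array.empty[File])
  if entry.getName.endsWith("_resources")
  if entry.isDirectory && Option(entry.list()).exists(_.isEmpty) // only empty dirs
  if entry.lastModified() < cutoff
} entry.delete()
{code}
The same sweep, pointed at /tmp/hive with a ".pipeout" suffix check, would cover the other leak.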
[jira] [Assigned] (SPARK-23132) Run ml.image doctests in tests
[ https://issues.apache.org/jira/browse/SPARK-23132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23132: Assignee: Apache Spark > Run ml.image doctests in tests > -- > > Key: SPARK-23132 > URL: https://issues.apache.org/jira/browse/SPARK-23132 > Project: Spark > Issue Type: Test > Components: ML, PySpark >Affects Versions: 2.3.0 >Reporter: Hyukjin Kwon >Assignee: Apache Spark >Priority: Minor > > Seems currently we don't actually run the doctests in {{ml.image.py}}. It'd > be better to run it to show the running examples. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-23132) Run ml.image doctests in tests
[ https://issues.apache.org/jira/browse/SPARK-23132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23132: Assignee: (was: Apache Spark) > Run ml.image doctests in tests > -- > > Key: SPARK-23132 > URL: https://issues.apache.org/jira/browse/SPARK-23132 > Project: Spark > Issue Type: Test > Components: ML, PySpark >Affects Versions: 2.3.0 >Reporter: Hyukjin Kwon >Priority: Minor > > Seems currently we don't actually run the doctests in {{ml.image.py}}. It'd > be better to run it to show the running examples. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-23132) Run ml.image doctests in tests
[ https://issues.apache.org/jira/browse/SPARK-23132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328714#comment-16328714 ] Apache Spark commented on SPARK-23132: -- User 'HyukjinKwon' has created a pull request for this issue: https://github.com/apache/spark/pull/20294 > Run ml.image doctests in tests > -- > > Key: SPARK-23132 > URL: https://issues.apache.org/jira/browse/SPARK-23132 > Project: Spark > Issue Type: Test > Components: ML, PySpark >Affects Versions: 2.3.0 >Reporter: Hyukjin Kwon >Priority: Minor > > Seems currently we don't actually run the doctests in {{ml.image.py}}. It'd > be better to run it to show the running examples. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-23132) Run ml.image doctests in tests
[ https://issues.apache.org/jira/browse/SPARK-23132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-23132: - Environment: (was: Seems currently we don't actually run the doctests in \{{ml.image.py}}. It'd be better to run it to show the running examples.) > Run ml.image doctests in tests > -- > > Key: SPARK-23132 > URL: https://issues.apache.org/jira/browse/SPARK-23132 > Project: Spark > Issue Type: Test > Components: ML, PySpark >Affects Versions: 2.3.0 >Reporter: Hyukjin Kwon >Priority: Minor > -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-23132) Run ml.image doctests in tests
Hyukjin Kwon created SPARK-23132: Summary: Run ml.image doctests in tests Key: SPARK-23132 URL: https://issues.apache.org/jira/browse/SPARK-23132 Project: Spark Issue Type: Test Components: ML, PySpark Affects Versions: 2.3.0 Environment: Seems currently we don't actually run the doctests in \{{ml.image.py}}. It'd be better to run it to show the running examples. Reporter: Hyukjin Kwon -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-23132) Run ml.image doctests in tests
[ https://issues.apache.org/jira/browse/SPARK-23132?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hyukjin Kwon updated SPARK-23132: - Description: Seems currently we don't actually run the doctests in {{ml.image.py}}. It'd be better to run it to show the running examples. > Run ml.image doctests in tests > -- > > Key: SPARK-23132 > URL: https://issues.apache.org/jira/browse/SPARK-23132 > Project: Spark > Issue Type: Test > Components: ML, PySpark >Affects Versions: 2.3.0 >Reporter: Hyukjin Kwon >Priority: Minor > > Seems currently we don't actually run the doctests in {{ml.image.py}}. It'd > be better to run it to show the running examples. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-23125) Offset commit failed when spark-streaming batch time is more than kafkaParams session timeout.
[ https://issues.apache.org/jira/browse/SPARK-23125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhaoshijie updated SPARK-23125: --- Description: I find DirectKafkaInputDStream(kafka010) Offset commit failed when batch time is more than kafkaParams session.timeout.ms .log as fellow: {code:java} 2018-01-16 05:40:00,002 ERROR org.apache.kafka.clients.consumer.internals.ConsumerCoordinator: Offset commit failed. org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured session.timeout.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records. at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:600) at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:541) at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:679) at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:658) at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:167) at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133) at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107) at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:426) at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:278) at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360) at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224) at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192) at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitPendingRequests(ConsumerNetworkClient.java:260) at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:222) at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.ensurePartitionAssignment(ConsumerCoordinator.java:366) at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:978) at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:938) at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.paranoidPoll(DirectKafkaInputDStream.scala:161) at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.latestOffsets(DirectKafkaInputDStream.scala:180) at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.compute(DirectKafkaInputDStream.scala:207) at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342) at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) at 
org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341) at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341) at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416) at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:336) at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:334) at scala.Option.orElse(Option.scala:289) at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:331) at org.apache.spark.streaming.dstream.TransformedDStream$$anonfun$6.apply(TransformedDStream.scala:42) at org.apache.spark.streaming.dstream.TransformedDStream$$anonfun$6.apply(TransformedDStream.scala:42) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.immutable.List.foreach(List.scala:381) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.immu
[jira] [Updated] (SPARK-23131) Stackoverflow using ML and Kryo serializer
[ https://issues.apache.org/jira/browse/SPARK-23131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peigen updated SPARK-23131: --- Priority: Minor (was: Critical) > Stackoverflow using ML and Kryo serializer > -- > > Key: SPARK-23131 > URL: https://issues.apache.org/jira/browse/SPARK-23131 > Project: Spark > Issue Type: Bug > Components: ML >Affects Versions: 2.2.0 >Reporter: Peigen >Priority: Minor > > When trying to use GeneralizedLinearRegression model and set SparkConf to use > KryoSerializer(JavaSerializer is fine) > It causes StackOverflowException > {quote} > Exception in thread "dispatcher-event-loop-34" java.lang.StackOverflowError > at java.util.HashMap.hash(HashMap.java:338) > at java.util.HashMap.get(HashMap.java:556) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:61) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > {quote} > This is very likely to be https://github.com/EsotericSoftware/kryo/issues/341 > Upgrade Kryo to 4.0+ probably could fix this -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-23131) Stackoverflow using ML and Kryo serializer
[ https://issues.apache.org/jira/browse/SPARK-23131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peigen updated SPARK-23131: --- Description: When trying to use GeneralizedLinearRegression model and set SparkConf to use KryoSerializer(JavaSerializer is fine) It causes StackOverflowException {quote} Exception in thread "dispatcher-event-loop-34" java.lang.StackOverflowError at java.util.HashMap.hash(HashMap.java:338) at java.util.HashMap.get(HashMap.java:556) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:61) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) {quote} This is very likely to be https://github.com/EsotericSoftware/kryo/issues/341 Upgrade Kryo to 4.0+ probably could fix this was: When trying to use GeneralizedLinearRegression model and set SparkConf to use KryoSerializer(JavaSerializer don't have this problem) It causes StackOverflowException {quote} Exception in thread "dispatcher-event-loop-34" java.lang.StackOverflowError at java.util.HashMap.hash(HashMap.java:338) at java.util.HashMap.get(HashMap.java:556) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:61) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) {quote} This is very likely to be https://github.com/EsotericSoftware/kryo/issues/341 Upgrade Kryo to 4.0+ probably could fix this > Stackoverflow using ML and Kryo serializer > -- > > Key: SPARK-23131 > URL: https://issues.apache.org/jira/browse/SPARK-23131 > Project: Spark > Issue Type: Bug > Components: ML >Affects Versions: 2.2.0 >Reporter: Peigen >Priority: Critical > > When trying to use GeneralizedLinearRegression model and set SparkConf to use > KryoSerializer(JavaSerializer is fine) > It causes StackOverflowException > {quote} > 
Exception in thread "dispatcher-event-loop-34" java.lang.StackOverflowError > at java.util.HashMap.hash(HashMap.java:338) > at java.util.HashMap.get(HashMap.java:556) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:61) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericso
[jira] [Updated] (SPARK-23131) Stackoverflow using ML and Kryo serializer
[ https://issues.apache.org/jira/browse/SPARK-23131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Peigen updated SPARK-23131: --- Environment: (was: When trying to use GeneralizedLinearRegression model and set SparkConf to use KryoSerializer(JavaSerializer don't have this problem) It causes StackOverflowException {quote} Exception in thread "dispatcher-event-loop-34" java.lang.StackOverflowError at java.util.HashMap.hash(HashMap.java:338) at java.util.HashMap.get(HashMap.java:556) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:61) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) {quote} This is very likely to be https://github.com/EsotericSoftware/kryo/issues/341 Upgrade Kryo to 4.0+ probably could fix this) Description: When trying to use GeneralizedLinearRegression model and set SparkConf to use KryoSerializer(JavaSerializer don't have this problem) It causes StackOverflowException {quote} Exception in thread "dispatcher-event-loop-34" java.lang.StackOverflowError at java.util.HashMap.hash(HashMap.java:338) at java.util.HashMap.get(HashMap.java:556) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:61) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) {quote} This is very likely to be https://github.com/EsotericSoftware/kryo/issues/341 Upgrade Kryo to 4.0+ probably could fix this > Stackoverflow using ML and Kryo serializer > -- > > Key: SPARK-23131 > URL: https://issues.apache.org/jira/browse/SPARK-23131 > Project: Spark > Issue Type: Bug > Components: ML >Affects Versions: 2.2.0 >Reporter: Peigen >Priority: Critical > > When trying to use GeneralizedLinearRegression model and set SparkConf to use > KryoSerializer(JavaSerializer don't have this problem) > It 
causes StackOverflowException > {quote} > Exception in thread "dispatcher-event-loop-34" java.lang.StackOverflowError > at java.util.HashMap.hash(HashMap.java:338) > at java.util.HashMap.get(HashMap.java:556) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:61) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) > at com.esotericsoftware.kryo.Generics.getConcre
[jira] [Created] (SPARK-23131) Stackoverflow using ML and Kryo serializer
Peigen created SPARK-23131: -- Summary: Stackoverflow using ML and Kryo serializer Key: SPARK-23131 URL: https://issues.apache.org/jira/browse/SPARK-23131 Project: Spark Issue Type: Bug Components: ML Affects Versions: 2.2.0 Environment: When trying to use GeneralizedLinearRegression model and set SparkConf to use KryoSerializer(JavaSerializer don't have this problem) It causes StackOverflowException {quote} Exception in thread "dispatcher-event-loop-34" java.lang.StackOverflowError at java.util.HashMap.hash(HashMap.java:338) at java.util.HashMap.get(HashMap.java:556) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:61) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) at com.esotericsoftware.kryo.Generics.getConcreteClass(Generics.java:62) {quote} This is very likely to be https://github.com/EsotericSoftware/kryo/issues/341 Upgrade Kryo to 4.0+ probably could fix this Reporter: Peigen -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
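[editor's note] For anyone reproducing this, a sketch of the serializer configuration the report describes; the stack-size setting is a hypothetical mitigation, not a verified fix:
{code}
import org.apache.spark.SparkConf
import org.apache.spark.sql.SparkSession

val conf = new SparkConf()
  .setAppName("glr-kryo-sketch")
  // The reported trigger: Kryo in place of the default JavaSerializer.
  .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
  // Hypothetical mitigations until Kryo is upgraded: switch back to
  // JavaSerializer, or give executor JVMs a larger thread stack, e.g.:
  // .set("spark.executor.extraJavaOptions", "-Xss16m")

val spark = SparkSession.builder().config(conf).getOrCreate()
// Fitting a GeneralizedLinearRegression model under this conf is where the
// deep Generics recursion is reported to overflow.
{code}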
[jira] [Commented] (SPARK-23123) Unable to run Spark Job with Hadoop NameNode Federation using ViewFS
[ https://issues.apache.org/jira/browse/SPARK-23123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328651#comment-16328651 ] Steve Loughran commented on SPARK-23123: I've never looked at ViewFS internals before, so treat my commentary here with caution: # Something (probably the YARN NodeManager/Resource Localizer) is trying to D/L the JAR from a viewfs URL # It can't init viewfs as it's not finding the conf entry for the mount table, which is *probably* {{fs.viewfs.mounttable.default}} (i.e. it will be that unless overridden) # Yet the spark-submit client can see it, which is why it manages to delete the staging dir. # Which would imply that the NM isn't getting the core-site.xml values configuring viewfs. Like Saisai says, I wouldn't blame Spark here; I don't think it's a Spark process. I'd try to work out which node this failed on and see what the NM logs say. If it's failing for this job submit, it's likely to be failing for other things too. Then try restarting it to see if the problem "goes away"; if it doesn't, that would indicate that the settings in its local /etc/hadoop/conf/core-site.xml don't have the binding info. If you are confident that the entry is in that file, and you've restarted the NM, and it's still failing in its logs, then file a YARN bug attaching the log. But you'd need to provide that evidence that it wasn't a local config problem before anyone would look at it. > Unable to run Spark Job with Hadoop NameNode Federation using ViewFS > > > Key: SPARK-23123 > URL: https://issues.apache.org/jira/browse/SPARK-23123 > Project: Spark > Issue Type: Bug > Components: Spark Submit >Affects Versions: 1.6.3 >Reporter: Nihar Nayak >Priority: Major > Labels: Hadoop, Spark > > Added following to core-site.xml in order to make use of ViewFS in a NameNode > federated cluster. > {noformat} > <property> > <name>fs.defaultFS</name> > <value>viewfs:///</value> > </property> > <property> > <name>fs.viewfs.mounttable.default.link./apps</name> > <value>hdfs://nameservice1/apps</value> > </property> > <property> > <name>fs.viewfs.mounttable.default.link./app-logs</name> > <value>hdfs://nameservice2/app-logs</value> > </property> > <property> > <name>fs.viewfs.mounttable.default.link./tmp</name> > <value>hdfs://nameservice2/tmp</value> > </property> > <property> > <name>fs.viewfs.mounttable.default.link./user</name> > <value>hdfs://nameservice2/user</value> > </property> > <property> > <name>fs.viewfs.mounttable.default.link./ns1/user</name> > <value>hdfs://nameservice1/user</value> > </property> > <property> > <name>fs.viewfs.mounttable.default.link./ns2/user</name> > <value>hdfs://nameservice2/user</value> > </property> > {noformat} > Got the following error: 
> {noformat} > spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client > --num-executors 3 --driver-memory 512m --executor-memory 512m > --executor-cores 1 ${SPARK_HOME}/lib/spark-examples*.jar 10 > 18/01/17 02:14:45 INFO spark.SparkContext: Added JAR > file:/home/nayak/hdp26_c4000_stg/spark2/lib/spark-examples_2.11-2.1.1.2.6.2.0-205.jar > at spark://x:35633/jars/spark-examples_2.11-2.1.1.2.6.2.0-205.jar with > timestamp 1516155285534 > 18/01/17 02:14:46 INFO client.ConfiguredRMFailoverProxyProvider: Failing over > to rm2 > 18/01/17 02:14:46 INFO yarn.Client: Requesting a new application from cluster > with 26 NodeManagers > 18/01/17 02:14:46 INFO yarn.Client: Verifying our application has not > requested more than the maximum memory capability of the cluster (13800 MB > per container) > 18/01/17 02:14:46 INFO yarn.Client: Will allocate AM container, with 896 MB > memory including 384 MB overhead > 18/01/17 02:14:46 INFO yarn.Client: Setting up container launch context for > our AM > 18/01/17 02:14:46 INFO yarn.Client: Setting up the launch environment for our > AM container > 18/01/17 02:14:46 INFO yarn.Client: Preparing resources for our AM container > 18/01/17 02:14:46 INFO security.HDFSCredentialProvider: getting token for > namenode: viewfs:/user/nayak > 18/01/17 02:14:46 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token > 22488202 for nayak on ha-hdfs:nameservice1 > 18/01/17 02:14:46 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 50 > for nayak on ha-hdfs:nameservice2 > 18/01/17 02:14:47 INFO hive.metastore: Trying to connect to metastore with > URI thrift://:9083 > 18/01/17 02:14:47 INFO hive.metastore: Connected to metastore. > 18/01/17 02:14:49 INFO security.HiveCredentialProvider: Get Token from hive > metastore: Kind: HIVE_DELEGATION_TOKEN, Service: , Ident: 00 29 6e 61 79 61 > 6b 6e 69 68 61 72 72 61 30 31 40 53 54 47 32 30 30 30 2e 48 41 44 4f 4f 50 2e > 52 41 4b 55 54 45 4e 2e 43 4f 4d 04 68 69 76 65 00 8a 01 61 01 e5 be 03 8a 01 > 61 25 f2 42 03 8d 02 21 bb 8e 02 b7 > 18/01/17 02:14:49 WARN yarn.Client: Neither spark.yarn.jars nor > spark.yarn.archive is set, falling back to uploading libraries under > SPARK_HOME. > 18/01/17 02:14:50 INFO yarn.Client: Uploading resource > file:/tmp/spark-7498ee81-d22b-42
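[editor's note] To test the "NM never sees the mount table" theory, a small spark-shell sketch run with the suspect node's configuration; the key name is taken from the core-site.xml quoted above:
{code}
import java.net.URI
import org.apache.hadoop.fs.{FileSystem, Path}

// null here means this JVM never loaded the viewfs mount-table entries.
println(sc.hadoopConfiguration.get("fs.viewfs.mounttable.default.link./user"))

// Forcing a viewfs resolution fails fast if the mount table is missing.
val fs = FileSystem.get(new URI("viewfs:///"), sc.hadoopConfiguration)
println(fs.makeQualified(new Path("/user")))
{code}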
[jira] [Assigned] (SPARK-23127) Update FeatureHasher user guide for catCols parameter
[ https://issues.apache.org/jira/browse/SPARK-23127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23127: Assignee: Apache Spark > Update FeatureHasher user guide for catCols parameter > - > > Key: SPARK-23127 > URL: https://issues.apache.org/jira/browse/SPARK-23127 > Project: Spark > Issue Type: Documentation > Components: Documentation, ML >Affects Versions: 2.3.0 >Reporter: Nick Pentreath >Assignee: Apache Spark >Priority: Major > > SPARK-22801 added the {{categoricalCols}} parameter and updated the Scala and > Python doc, but did not update the user guide entry discussing feature > handling. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-23127) Update FeatureHasher user guide for catCols parameter
[ https://issues.apache.org/jira/browse/SPARK-23127?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328584#comment-16328584 ] Apache Spark commented on SPARK-23127: -- User 'MLnick' has created a pull request for this issue: https://github.com/apache/spark/pull/20293 > Update FeatureHasher user guide for catCols parameter > - > > Key: SPARK-23127 > URL: https://issues.apache.org/jira/browse/SPARK-23127 > Project: Spark > Issue Type: Documentation > Components: Documentation, ML >Affects Versions: 2.3.0 >Reporter: Nick Pentreath >Priority: Major > > SPARK-22801 added the {{categoricalCols}} parameter and updated the Scala and > Python doc, but did not update the user guide entry discussing feature > handling. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
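[editor's note] For the guide entry itself, a short sketch of the parameter being documented; the column names are illustrative:
{code}
import org.apache.spark.ml.feature.FeatureHasher

val hasher = new FeatureHasher()
  .setInputCols("real", "bool", "stringNum", "string")
  // Numeric columns hash as feature -> value by default; listing one in
  // categoricalCols hashes each distinct (column, value) pair as its own feature.
  .setCategoricalCols(Array("real"))
  .setOutputCol("features")
{code}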
[jira] [Assigned] (SPARK-23127) Update FeatureHasher user guide for catCols parameter
[ https://issues.apache.org/jira/browse/SPARK-23127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23127: Assignee: (was: Apache Spark) > Update FeatureHasher user guide for catCols parameter > - > > Key: SPARK-23127 > URL: https://issues.apache.org/jira/browse/SPARK-23127 > Project: Spark > Issue Type: Documentation > Components: Documentation, ML >Affects Versions: 2.3.0 >Reporter: Nick Pentreath >Priority: Major > > SPARK-22801 added the {{categoricalCols}} parameter and updated the Scala and > Python doc, but did not update the user guide entry discussing feature > handling. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-23125) Offset commit failed when spark-streaming batch time is more than kafkaParams session timeout.
[ https://issues.apache.org/jira/browse/SPARK-23125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhaoshijie updated SPARK-23125: --- Description: I find DirectKafkaInputDStream(kafka010) Offset commit failed when batch time is more than kafkaParams session.timeout.ms .log as fellow: {code:java} 2018-01-16 05:40:00,002 ERROR org.apache.kafka.clients.consumer.internals.ConsumerCoordinator: Offset commit failed. org.apache.kafka.clients.consumer.CommitFailedException: Commit cannot be completed since the group has already rebalanced and assigned the partitions to another member. This means that the time between subsequent calls to poll() was longer than the configured session.timeout.ms, which typically implies that the poll loop is spending too much time message processing. You can address this either by increasing the session timeout or by reducing the maximum size of batches returned in poll() with max.poll.records. at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:600) at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator$OffsetCommitResponseHandler.handle(ConsumerCoordinator.java:541) at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:679) at org.apache.kafka.clients.consumer.internals.AbstractCoordinator$CoordinatorResponseHandler.onSuccess(AbstractCoordinator.java:658) at org.apache.kafka.clients.consumer.internals.RequestFuture$1.onSuccess(RequestFuture.java:167) at org.apache.kafka.clients.consumer.internals.RequestFuture.fireSuccess(RequestFuture.java:133) at org.apache.kafka.clients.consumer.internals.RequestFuture.complete(RequestFuture.java:107) at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient$RequestFutureCompletionHandler.onComplete(ConsumerNetworkClient.java:426) at org.apache.kafka.clients.NetworkClient.poll(NetworkClient.java:278) at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.clientPoll(ConsumerNetworkClient.java:360) at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:224) at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.poll(ConsumerNetworkClient.java:192) at org.apache.kafka.clients.consumer.internals.ConsumerNetworkClient.awaitPendingRequests(ConsumerNetworkClient.java:260) at org.apache.kafka.clients.consumer.internals.AbstractCoordinator.ensureActiveGroup(AbstractCoordinator.java:222) at org.apache.kafka.clients.consumer.internals.ConsumerCoordinator.ensurePartitionAssignment(ConsumerCoordinator.java:366) at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:978) at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:938) at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.paranoidPoll(DirectKafkaInputDStream.scala:161) at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.latestOffsets(DirectKafkaInputDStream.scala:180) at org.apache.spark.streaming.kafka010.DirectKafkaInputDStream.compute(DirectKafkaInputDStream.scala:207) at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342) at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:342) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) at 
org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341) at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:341) at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:416) at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:336) at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:334) at scala.Option.orElse(Option.scala:289) at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:331) at org.apache.spark.streaming.dstream.TransformedDStream$$anonfun$6.apply(TransformedDStream.scala:42) at org.apache.spark.streaming.dstream.TransformedDStream$$anonfun$6.apply(TransformedDStream.scala:42) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) at scala.collection.immutable.List.foreach(List.scala:381) at scala.collection.TraversableLike$class.map(TraversableLike.scala:234) at scala.collection.immu
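For illustration, the remedies named in the Kafka error above (raising session.timeout.ms, or shrinking max.poll.records) are applied through kafkaParams. A minimal kafka010 direct-stream sketch follows; the broker address, group id, topic name, and timeout values are illustrative assumptions, not values taken from this report.
{code}
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.kafka010._

val conf = new SparkConf().setAppName("kafka010-long-batch-sketch")
// A 5-minute batch interval, i.e. longer than the consumer's default
// session.timeout.ms -- the situation described in this report.
val ssc = new StreamingContext(conf, Seconds(300))

val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "broker:9092",        // illustrative
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "example-group",               // illustrative
  "enable.auto.commit" -> (false: java.lang.Boolean),
  // Keep group membership alive across a long batch; the value must stay
  // below the broker's group.max.session.timeout.ms.
  "session.timeout.ms" -> "360000",
  "heartbeat.interval.ms" -> "120000"
)

val stream = KafkaUtils.createDirectStream[String, String](
  ssc,
  LocationStrategies.PreferConsistent,
  ConsumerStrategies.Subscribe[String, String](Seq("some-topic"), kafkaParams))

stream.foreachRDD { rdd =>
  val offsetRanges = rdd.asInstanceOf[HasOffsetRanges].offsetRanges
  // ... process rdd ...
  // This async commit is what fails with CommitFailedException once the
  // group has already been rebalanced away.
  stream.asInstanceOf[CanCommitOffsets].commitAsync(offsetRanges)
}
{code}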
[jira] [Updated] (SPARK-23130) Spark Thrift does not clean-up temporary files (/tmp/*_resources and /tmp/hive/*.pipeout)
[ https://issues.apache.org/jira/browse/SPARK-23130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Roberts updated SPARK-23130: - Labels: thrift (was: ) > Spark Thrift does not clean-up temporary files (/tmp/*_resources and > /tmp/hive/*.pipeout) > - > > Key: SPARK-23130 > URL: https://issues.apache.org/jira/browse/SPARK-23130 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.6.3, 2.1.0, 2.2.0 > Environment: * Hadoop distributions: HDP 2.5 - 2.6.3.0 > * OS: Seen on SLES12, RHEL 7.3 & RHEL 7.4 >Reporter: Sean Roberts >Priority: Major > Labels: thrift > > Spark Thrift is not cleaning up /tmp for files & directories named like: > /tmp/hive/*.pipeout > /tmp/*_resources > There are such a large number that /tmp quickly runs out of inodes *causing > the partition to be unusable and many services to crash*. This is even true > when the only jobs submitted are routine service checks. > Used `strace` to show that Spark Thrift is responsible: > {code:java} > strace.out.118864:04:53:49 > open("/tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout", > O_RDWR|O_CREAT|O_EXCL, 0666) = 134 > strace.out.118864:04:53:49 > mkdir("/tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources", 0777) = 0 > {code} > *Those files were left behind, even days later.* > > Example files: > {code:java} > # stat > /tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout > File: > ‘/tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout’ > Size: 0 Blocks: 0 IO Block: 4096 regular empty file > Device: fe09h/65033d Inode: 678 Links: 1 > Access: (0644/-rw-r--r--) Uid: ( 1000/hive) Gid: ( 1002/ hadoop) > Access: 2017-12-19 04:53:49.126777260 -0600 > Modify: 2017-12-19 04:53:49.126777260 -0600 > Change: 2017-12-19 04:53:49.126777260 -0600 > Birth: - > # stat /tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources > File: ‘/tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources’ > Size: 4096 Blocks: 8 IO Block: 4096 directory > Device: fe09h/65033d Inode: 668 Links: 2 > Access: (0700/drwx--) Uid: ( 1000/hive) Gid: ( 1002/ hadoop) > Access: 2017-12-19 04:57:38.458937635 -0600 > Modify: 2017-12-19 04:53:49.062777216 -0600 > Change: 2017-12-19 04:53:49.066777218 -0600 > Birth: - > {code} > Showing the large number: > {code:java} > # find /tmp/ -name '*_resources' | wc -l > 68340 > # find /tmp/hive -name "*.pipeout" | wc -l > 51837 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-23130) Spark Thrift does not clean-up temporary files (/tmp/*_resources and /tmp/hive/*.pipeout)
[ https://issues.apache.org/jira/browse/SPARK-23130?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328568#comment-16328568 ] Sean Roberts commented on SPARK-23130: -- * SPARK-15401: Similar report for the "_resources" files * HIVE-6091: Possibly a fix for the "pipeout" files > Spark Thrift does not clean-up temporary files (/tmp/*_resources and > /tmp/hive/*.pipeout) > - > > Key: SPARK-23130 > URL: https://issues.apache.org/jira/browse/SPARK-23130 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.6.3, 2.1.0, 2.2.0 > Environment: * Hadoop distributions: HDP 2.5 - 2.6.3.0 > * OS: Seen on SLES12, RHEL 7.3 & RHEL 7.4 >Reporter: Sean Roberts >Priority: Major > > Spark Thrift is not cleaning up /tmp for files & directories named like: > /tmp/hive/*.pipeout > /tmp/*_resources > There are such a large number that /tmp quickly runs out of inodes *causing > the partition to be unusable and many services to crash*. This is even true > when the only jobs submitted are routine service checks. > Used `strace` to show that Spark Thrift is responsible: > {code:java} > strace.out.118864:04:53:49 > open("/tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout", > O_RDWR|O_CREAT|O_EXCL, 0666) = 134 > strace.out.118864:04:53:49 > mkdir("/tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources", 0777) = 0 > {code} > *Those files were left behind, even days later.* > ** > > Example files: > {code:java} > # stat > /tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout > File: > ‘/tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout’ > Size: 0 Blocks: 0 IO Block: 4096 regular empty file > Device: fe09h/65033d Inode: 678 Links: 1 > Access: (0644/-rw-r--r--) Uid: ( 1000/hive) Gid: ( 1002/ hadoop) > Access: 2017-12-19 04:53:49.126777260 -0600 > Modify: 2017-12-19 04:53:49.126777260 -0600 > Change: 2017-12-19 04:53:49.126777260 -0600 > Birth: - > # stat /tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources > File: ‘/tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources’ > Size: 4096 Blocks: 8 IO Block: 4096 directory > Device: fe09h/65033d Inode: 668 Links: 2 > Access: (0700/drwx--) Uid: ( 1000/hive) Gid: ( 1002/ hadoop) > Access: 2017-12-19 04:57:38.458937635 -0600 > Modify: 2017-12-19 04:53:49.062777216 -0600 > Change: 2017-12-19 04:53:49.066777218 -0600 > Birth: - > {code} > Showing the large number: > {code:java} > # find /tmp/ -name '*_resources' | wc -l > 68340 > # find /tmp/hive -name "*.pipeout" | wc -l > 51837 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-23130) Spark Thrift does not clean-up temporary files (/tmp/*_resources and /tmp/hive/*.pipeout)
[ https://issues.apache.org/jira/browse/SPARK-23130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Roberts updated SPARK-23130: - Description: Spark Thrift is not cleaning up /tmp for files & directories named like: /tmp/hive/*.pipeout /tmp/*_resources There are such a large number that /tmp quickly runs out of inodes *causing the partition to be unusable and many services to crash*. This is even true when the only jobs submitted are routine service checks. Used `strace` to show that Spark Thrift is responsible: {code:java} strace.out.118864:04:53:49 open("/tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout", O_RDWR|O_CREAT|O_EXCL, 0666) = 134 strace.out.118864:04:53:49 mkdir("/tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources", 0777) = 0 {code} *Those files were left behind, even days later.* Example files: {code:java} # stat /tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout File: ‘/tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout’ Size: 0 Blocks: 0 IO Block: 4096 regular empty file Device: fe09h/65033d Inode: 678 Links: 1 Access: (0644/-rw-r--r--) Uid: ( 1000/hive) Gid: ( 1002/ hadoop) Access: 2017-12-19 04:53:49.126777260 -0600 Modify: 2017-12-19 04:53:49.126777260 -0600 Change: 2017-12-19 04:53:49.126777260 -0600 Birth: - # stat /tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources File: ‘/tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources’ Size: 4096 Blocks: 8 IO Block: 4096 directory Device: fe09h/65033d Inode: 668 Links: 2 Access: (0700/drwx--) Uid: ( 1000/hive) Gid: ( 1002/ hadoop) Access: 2017-12-19 04:57:38.458937635 -0600 Modify: 2017-12-19 04:53:49.062777216 -0600 Change: 2017-12-19 04:53:49.066777218 -0600 Birth: - {code} Showing the large number: {code:java} # find /tmp/ -name '*_resources' | wc -l 68340 # find /tmp/hive -name "*.pipeout" | wc -l 51837 {code} was: Spark Thrift is not cleaning up /tmp for files & directories named like: /tmp/hive/*.pipeout /tmp/*_resources There are such a large number that /tmp quickly runs out of inodes *causing the partition to be unusable and many services to crash*. This is even true when the only jobs submitted are routine service checks. 
Used `strace` to show that Spark Thrift is responsible: {code:java} strace.out.118864:04:53:49 open("/tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout", O_RDWR|O_CREAT|O_EXCL, 0666) = 134 strace.out.118864:04:53:49 mkdir("/tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources", 0777) = 0 {code} *Those files were left behind, even days later.* ** Example files: {code:java} # stat /tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout File: ‘/tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout’ Size: 0 Blocks: 0 IO Block: 4096 regular empty file Device: fe09h/65033d Inode: 678 Links: 1 Access: (0644/-rw-r--r--) Uid: ( 1000/hive) Gid: ( 1002/ hadoop) Access: 2017-12-19 04:53:49.126777260 -0600 Modify: 2017-12-19 04:53:49.126777260 -0600 Change: 2017-12-19 04:53:49.126777260 -0600 Birth: - # stat /tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources File: ‘/tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources’ Size: 4096 Blocks: 8 IO Block: 4096 directory Device: fe09h/65033d Inode: 668 Links: 2 Access: (0700/drwx--) Uid: ( 1000/hive) Gid: ( 1002/ hadoop) Access: 2017-12-19 04:57:38.458937635 -0600 Modify: 2017-12-19 04:53:49.062777216 -0600 Change: 2017-12-19 04:53:49.066777218 -0600 Birth: - {code} Showing the large number: {code:java} # find /tmp/ -name '*_resources' | wc -l 68340 # find /tmp/hive -name "*.pipeout" | wc -l 51837 {code} > Spark Thrift does not clean-up temporary files (/tmp/*_resources and > /tmp/hive/*.pipeout) > - > > Key: SPARK-23130 > URL: https://issues.apache.org/jira/browse/SPARK-23130 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.6.3, 2.1.0, 2.2.0 > Environment: * Hadoop distributions: HDP 2.5 - 2.6.3.0 > * OS: Seen on SLES12, RHEL 7.3 & RHEL 7.4 >Reporter: Sean Roberts >Priority: Major > > Spark Thrift is not cleaning up /tmp for files & directories named like: > /tmp/hive/*.pipeout > /tmp/*_resources > There are such a large number that /tmp quickly runs out of inodes *causing > the partition to be unusable and many services to crash*. This is even true > when the only jobs submitted are routine service checks. > Used `strace` to show that Spark Thrift is responsible: > {code:java} > strace.out.118864:04:53:49 > open("/tmp/hive/55ad7fc1-f79a-4ad8-8e02-
[jira] [Updated] (SPARK-23130) Spark Thrift does not clean-up temporary files (/tmp/*_resources and /tmp/hive/*.pipeout)
[ https://issues.apache.org/jira/browse/SPARK-23130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Roberts updated SPARK-23130: - Environment: * Spark versions: 1.6.3, 2.1.0, 2.2.0 * Hadoop distributions: HDP 2.5 - 2.6.3.0 * OS: Seen on SLES12, RHEL 7.3 & RHEL 7.4 was: * OS: Seen on SLES12, RHEL 7.3 & RHEL 7.4 * Spark versions: 1.6.3, 2.1.0, 2.2.0 * Hadoop distributions: HDP 2.5 - 2.6.3.0 > Spark Thrift does not clean-up temporary files (/tmp/*_resources and > /tmp/hive/*.pipeout) > - > > Key: SPARK-23130 > URL: https://issues.apache.org/jira/browse/SPARK-23130 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.6.3, 2.1.0, 2.2.0 > Environment: * Spark versions: 1.6.3, 2.1.0, 2.2.0 > * Hadoop distributions: HDP 2.5 - 2.6.3.0 > * OS: Seen on SLES12, RHEL 7.3 & RHEL 7.4 >Reporter: Sean Roberts >Priority: Major > > Spark Thrift is not cleaning up /tmp for files & directories named like: > /tmp/hive/*.pipeout > /tmp/*_resources > There are such a large number that /tmp quickly runs out of inodes *causing > the partition to be unusable and many services to crash*. This is even true > when the only jobs submitted are routine service checks. > Used `strace` to show that Spark Thrift is responsible: > {code:java} > strace.out.118864:04:53:49 > open("/tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout", > O_RDWR|O_CREAT|O_EXCL, 0666) = 134 > strace.out.118864:04:53:49 > mkdir("/tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources", 0777) = 0 > {code} > *Those files were left behind, even days later.* > ** > > Example files: > {code:java} > # stat > /tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout > File: > ‘/tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout’ > Size: 0 Blocks: 0 IO Block: 4096 regular empty file > Device: fe09h/65033d Inode: 678 Links: 1 > Access: (0644/-rw-r--r--) Uid: ( 1000/hive) Gid: ( 1002/ hadoop) > Access: 2017-12-19 04:53:49.126777260 -0600 > Modify: 2017-12-19 04:53:49.126777260 -0600 > Change: 2017-12-19 04:53:49.126777260 -0600 > Birth: - > # stat /tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources > File: ‘/tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources’ > Size: 4096 Blocks: 8 IO Block: 4096 directory > Device: fe09h/65033d Inode: 668 Links: 2 > Access: (0700/drwx--) Uid: ( 1000/hive) Gid: ( 1002/ hadoop) > Access: 2017-12-19 04:57:38.458937635 -0600 > Modify: 2017-12-19 04:53:49.062777216 -0600 > Change: 2017-12-19 04:53:49.066777218 -0600 > Birth: - > {code} > Showing the large number: > {code:java} > # find /tmp/ -name '*_resources' | wc -l > 68340 > # find /tmp/hive -name "*.pipeout" | wc -l > 51837 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-23130) Spark Thrift does not clean-up temporary files (/tmp/*_resources and /tmp/hive/*.pipeout)
[ https://issues.apache.org/jira/browse/SPARK-23130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Roberts updated SPARK-23130: - Environment: * Hadoop distributions: HDP 2.5 - 2.6.3.0 * OS: Seen on SLES12, RHEL 7.3 & RHEL 7.4 was: * Spark versions: 1.6.3, 2.1.0, 2.2.0 * Hadoop distributions: HDP 2.5 - 2.6.3.0 * OS: Seen on SLES12, RHEL 7.3 & RHEL 7.4 > Spark Thrift does not clean-up temporary files (/tmp/*_resources and > /tmp/hive/*.pipeout) > - > > Key: SPARK-23130 > URL: https://issues.apache.org/jira/browse/SPARK-23130 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.6.3, 2.1.0, 2.2.0 > Environment: * Hadoop distributions: HDP 2.5 - 2.6.3.0 > * OS: Seen on SLES12, RHEL 7.3 & RHEL 7.4 >Reporter: Sean Roberts >Priority: Major > > Spark Thrift is not cleaning up /tmp for files & directories named like: > /tmp/hive/*.pipeout > /tmp/*_resources > There are such a large number that /tmp quickly runs out of inodes *causing > the partition to be unusable and many services to crash*. This is even true > when the only jobs submitted are routine service checks. > Used `strace` to show that Spark Thrift is responsible: > {code:java} > strace.out.118864:04:53:49 > open("/tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout", > O_RDWR|O_CREAT|O_EXCL, 0666) = 134 > strace.out.118864:04:53:49 > mkdir("/tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources", 0777) = 0 > {code} > *Those files were left behind, even days later.* > ** > > Example files: > {code:java} > # stat > /tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout > File: > ‘/tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout’ > Size: 0 Blocks: 0 IO Block: 4096 regular empty file > Device: fe09h/65033d Inode: 678 Links: 1 > Access: (0644/-rw-r--r--) Uid: ( 1000/hive) Gid: ( 1002/ hadoop) > Access: 2017-12-19 04:53:49.126777260 -0600 > Modify: 2017-12-19 04:53:49.126777260 -0600 > Change: 2017-12-19 04:53:49.126777260 -0600 > Birth: - > # stat /tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources > File: ‘/tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources’ > Size: 4096 Blocks: 8 IO Block: 4096 directory > Device: fe09h/65033d Inode: 668 Links: 2 > Access: (0700/drwx--) Uid: ( 1000/hive) Gid: ( 1002/ hadoop) > Access: 2017-12-19 04:57:38.458937635 -0600 > Modify: 2017-12-19 04:53:49.062777216 -0600 > Change: 2017-12-19 04:53:49.066777218 -0600 > Birth: - > {code} > Showing the large number: > {code:java} > # find /tmp/ -name '*_resources' | wc -l > 68340 > # find /tmp/hive -name "*.pipeout" | wc -l > 51837 > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-23130) Spark Thrift does not clean-up temporary files (/tmp/*_resources and /tmp/hive/*.pipeout)
Sean Roberts created SPARK-23130: Summary: Spark Thrift does not clean-up temporary files (/tmp/*_resources and /tmp/hive/*.pipeout) Key: SPARK-23130 URL: https://issues.apache.org/jira/browse/SPARK-23130 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.2.0, 2.1.0, 1.6.3 Environment: * OS: Seen on SLES12, RHEL 7.3 & RHEL 7.4 * Spark versions: 1.6.3, 2.1.0, 2.2.0 * Hadoop distributions: HDP 2.5 - 2.6.3.0 Reporter: Sean Roberts Spark Thrift is not cleaning up /tmp for files & directories named like: /tmp/hive/*.pipeout /tmp/*_resources There are such a large number that /tmp quickly runs out of inodes *causing the partition to be unusable and many services to crash*. This is even true when the only jobs submitted are routine service checks. Used `strace` to show that Spark Thrift is responsible: {code:java} strace.out.118864:04:53:49 open("/tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout", O_RDWR|O_CREAT|O_EXCL, 0666) = 134 strace.out.118864:04:53:49 mkdir("/tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources", 0777) = 0 {code} *Those files were left behind, even days later.* ** Example files: {code:java} # stat /tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout File: ‘/tmp/hive/55ad7fc1-f79a-4ad8-8e02-26bbeaa86bbc7288010135864174970.pipeout’ Size: 0 Blocks: 0 IO Block: 4096 regular empty file Device: fe09h/65033d Inode: 678 Links: 1 Access: (0644/-rw-r--r--) Uid: ( 1000/hive) Gid: ( 1002/ hadoop) Access: 2017-12-19 04:53:49.126777260 -0600 Modify: 2017-12-19 04:53:49.126777260 -0600 Change: 2017-12-19 04:53:49.126777260 -0600 Birth: - # stat /tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources File: ‘/tmp/b6dfbf9e-2f7c-4c25-95a1-73c44318ecf4_resources’ Size: 4096 Blocks: 8 IO Block: 4096 directory Device: fe09h/65033d Inode: 668 Links: 2 Access: (0700/drwx--) Uid: ( 1000/hive) Gid: ( 1002/ hadoop) Access: 2017-12-19 04:57:38.458937635 -0600 Modify: 2017-12-19 04:53:49.062777216 -0600 Change: 2017-12-19 04:53:49.066777218 -0600 Birth: - {code} Showing the large number: {code:java} # find /tmp/ -name '*_resources' | wc -l 68340 # find /tmp/hive -name "*.pipeout" | wc -l 51837 {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-23129) Lazy init DiskMapIterator#deserializeStream to reduce memory usage when ExternalAppendOnlyMap spill too much times
[ https://issues.apache.org/jira/browse/SPARK-23129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23129: Assignee: Apache Spark > Lazy init DiskMapIterator#deserializeStream to reduce memory usage when > ExternalAppendOnlyMap spill too much times > --- > > Key: SPARK-23129 > URL: https://issues.apache.org/jira/browse/SPARK-23129 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 2.1.0 >Reporter: zhoukang >Assignee: Apache Spark >Priority: Major > > Currently, the deserializeStream in ExternalAppendOnlyMap#DiskMapIterator is > initialized when the DiskMapIterator instance is created. This causes memory > overhead when ExternalAppendOnlyMap spills too many times. > We can avoid this by initializing deserializeStream lazily, the first time it > is used. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-23129) Lazy init DiskMapIterator#deserializeStream to reduce memory usage when ExternalAppendOnlyMap spill too much times
[ https://issues.apache.org/jira/browse/SPARK-23129?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328543#comment-16328543 ] Apache Spark commented on SPARK-23129: -- User 'caneGuy' has created a pull request for this issue: https://github.com/apache/spark/pull/20292 > Lazy init DiskMapIterator#deserializeStream to reduce memory usage when > ExternalAppendOnlyMap spill too much times > --- > > Key: SPARK-23129 > URL: https://issues.apache.org/jira/browse/SPARK-23129 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 2.1.0 >Reporter: zhoukang >Priority: Major > > Currently, the deserializeStream in ExternalAppendOnlyMap#DiskMapIterator is > initialized when the DiskMapIterator instance is created. This causes memory > overhead when ExternalAppendOnlyMap spills too many times. > We can avoid this by initializing deserializeStream lazily, the first time it > is used. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-23129) Lazy init DiskMapIterator#deserializeStream to reduce memory usage when ExternalAppendOnlyMap spill too much times
[ https://issues.apache.org/jira/browse/SPARK-23129?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23129: Assignee: (was: Apache Spark) > Lazy init DiskMapIterator#deserializeStream to reduce memory usage when > ExternalAppendOnlyMap spill too much times > --- > > Key: SPARK-23129 > URL: https://issues.apache.org/jira/browse/SPARK-23129 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 2.1.0 >Reporter: zhoukang >Priority: Major > > Currently, the deserializeStream in ExternalAppendOnlyMap#DiskMapIterator is > initialized when the DiskMapIterator instance is created. This causes memory > overhead when ExternalAppendOnlyMap spills too many times. > We can avoid this by initializing deserializeStream lazily, the first time it > is used. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-23129) Lazy init DiskMapIterator#deserializeStream to reduce memory usage when ExternalAppendOnlyMap spill too much times
zhoukang created SPARK-23129: Summary: Lazy init DiskMapIterator#deserializeStream to reduce memory usage when ExternalAppendOnlyMap spill too much times Key: SPARK-23129 URL: https://issues.apache.org/jira/browse/SPARK-23129 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 2.1.0 Reporter: zhoukang Currently, the deserializeStream in ExternalAppendOnlyMap#DiskMapIterator is initialized when the DiskMapIterator instance is created. This causes memory overhead when ExternalAppendOnlyMap spills too many times. We can avoid this by initializing deserializeStream lazily, the first time it is used. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
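A minimal sketch of the lazy-initialization pattern proposed above follows; the names are made up for illustration and this is not the actual ExternalAppendOnlyMap code.
{code}
import java.io._

// Each spill file gets an iterator, but the stream (and its buffers) is
// only opened when the iterator is actually consumed, not at construction.
class SpillFileIterator[T](file: File) extends Iterator[T] {
  // `lazy val` defers opening the stream until the first hasNext/next call.
  private lazy val in = new ObjectInputStream(
    new BufferedInputStream(new FileInputStream(file)))

  private var finished = false
  private var nextItem: Option[T] = None

  override def hasNext: Boolean = {
    if (!finished && nextItem.isEmpty) {
      try nextItem = Some(in.readObject().asInstanceOf[T])
      catch { case _: EOFException => finished = true; in.close() }
    }
    nextItem.isDefined
  }

  override def next(): T = {
    if (!hasNext) throw new NoSuchElementException("end of spill file")
    val item = nextItem.get
    nextItem = None
    item
  }
}
{code}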
[jira] [Created] (SPARK-23128) Introduce QueryStage to improve adaptive execution in Spark SQL
Carson Wang created SPARK-23128: --- Summary: Introduce QueryStage to improve adaptive execution in Spark SQL Key: SPARK-23128 URL: https://issues.apache.org/jira/browse/SPARK-23128 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 2.2.1 Reporter: Carson Wang SPARK-9850 proposed the basic idea of adaptive execution in Spark. In DAGScheduler, a new API is added to support submitting a single map stage. The current implementation of adaptive execution in Spark SQL supports changing the reducer number at runtime. An ExchangeCoordinator is used to determine the number of post-shuffle partitions for a stage that needs to fetch shuffle data from one or multiple stages. The current implementation adds the ExchangeCoordinator while we are adding Exchanges. However, there are some limitations. First, it may cause additional shuffles that can decrease performance; we can see this in the EnsureRequirements rule when it adds the ExchangeCoordinator. Secondly, it is not a good idea to add ExchangeCoordinators while we are adding Exchanges, because we don't have a global picture of all the shuffle dependencies of a post-shuffle stage. For example, for a three-table join in a single stage, the same ExchangeCoordinator should be used in all three Exchanges, but currently two separate ExchangeCoordinators will be added. Thirdly, with the current framework it is not easy to flexibly implement other adaptive execution features, such as changing the execution plan or handling skewed joins at runtime. We'd like to introduce QueryStage and a new way of doing adaptive execution in Spark SQL that addresses these limitations. The idea is described at https://docs.google.com/document/d/1mpVjvQZRAkD-Ggy6-hcjXtBPiQoVbZGe3dLnAKgtJ4k/edit?usp=sharing -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
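For context, the coordinator-based implementation discussed above is switched on through configuration alone; a minimal sketch follows, assuming the spark.sql.adaptive.* keys of the 2.x line (the target size value is illustrative).
{code}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("adaptive-execution-sketch")
  // Enables the ExchangeCoordinator-based adaptive execution: reducer
  // counts are decided at runtime from map output statistics.
  .config("spark.sql.adaptive.enabled", "true")
  // Target input size of a post-shuffle partition; the coordinator packs
  // adjacent shuffle partitions until roughly this many bytes is reached.
  .config("spark.sql.adaptive.shuffle.targetPostShuffleInputSize", 64 * 1024 * 1024)
  .getOrCreate()
{code}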
[jira] [Updated] (SPARK-23127) Update FeatureHasher user guide for catCols parameter
[ https://issues.apache.org/jira/browse/SPARK-23127?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Nick Pentreath updated SPARK-23127: --- Description: SPARK-22801 added the {{categoricalCols}} parameter and updated the Scala and Python doc, but did not update the user guide entry discussing feature handling. > Update FeatureHasher user guide for catCols parameter > - > > Key: SPARK-23127 > URL: https://issues.apache.org/jira/browse/SPARK-23127 > Project: Spark > Issue Type: Documentation > Components: Documentation, ML >Affects Versions: 2.3.0 >Reporter: Nick Pentreath >Priority: Major > > SPARK-22801 added the {{categoricalCols}} parameter and updated the Scala and > Python doc, but did not update the user guide entry discussing feature > handling. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
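For reference, a brief illustration of the behaviour the missing user-guide entry should describe, based on the Scala API added by SPARK-22801 (data and column names are made up; assumes a spark-shell style {{spark}} session):
{code}
import org.apache.spark.ml.feature.FeatureHasher

val df = spark.createDataFrame(Seq(
  (2.0, true, "1", "foo"),
  (3.0, false, "2", "bar")
)).toDF("real", "bool", "stringNum", "string")

// Numeric columns are hashed as real-valued features by default;
// categoricalCols forces the listed numeric columns to be treated as
// categorical (one hashed feature per distinct value) instead.
val hasher = new FeatureHasher()
  .setInputCols("real", "bool", "stringNum", "string")
  .setCategoricalCols(Array("real"))
  .setOutputCol("features")

hasher.transform(df).select("features").show(false)
{code}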
[jira] [Created] (SPARK-23127) Update FeatureHasher user guide for catCols parameter
Nick Pentreath created SPARK-23127: -- Summary: Update FeatureHasher user guide for catCols parameter Key: SPARK-23127 URL: https://issues.apache.org/jira/browse/SPARK-23127 Project: Spark Issue Type: Documentation Components: Documentation, ML Affects Versions: 2.3.0 Reporter: Nick Pentreath -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-23126) I used the Project operator and modified the source. After compiling successfully, and testing the jars, I got the exception. Maybe the phenomenon is related with implic
xuetao created SPARK-23126: -- Summary: I used the Project operator and modified the source. After compiling successfully, and testing the jars, I got the exception. Maybe the phenomenon is related with implicits Key: SPARK-23126 URL: https://issues.apache.org/jira/browse/SPARK-23126 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.2.0 Environment: Centos Reporter: xuetao Fix For: 2.2.0 I used the Project operator and modified the source. After compiling successfully and testing the jars, I got the following exception: org.apache.spark.sql.AnalysisException: Try to map struct to Tuple3, but failed as the number of fields does not line up.; -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
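The reporter's source modification is not shown, but the same AnalysisException can be reproduced in stock Spark by asking for an encoder with more fields than the projected row type, for example:
{code}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().master("local").appName("tuple3-mismatch").getOrCreate()
import spark.implicits._

// The row type is struct<x:int,y:string> (two fields) but the requested
// encoder is a Tuple3 (three fields), so analysis fails with:
//   org.apache.spark.sql.AnalysisException: Try to map struct<x:int,y:string>
//   to Tuple3, but failed as the number of fields does not line up.;
val ds = Seq((1, "a"), (2, "b")).toDF("x", "y").as[(Int, String, Double)]
{code}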
[jira] [Updated] (SPARK-23020) Re-enable Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher
[ https://issues.apache.org/jira/browse/SPARK-23020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sameer Agarwal updated SPARK-23020: --- Summary: Re-enable Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher (was: Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher) > Re-enable Flaky Test: > org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher > > > Key: SPARK-23020 > URL: https://issues.apache.org/jira/browse/SPARK-23020 > Project: Spark > Issue Type: Bug > Components: Tests >Affects Versions: 2.3.0 >Reporter: Sameer Agarwal >Assignee: Marcelo Vanzin >Priority: Blocker > > https://amplab.cs.berkeley.edu/jenkins/job/spark-branch-2.3-test-maven-hadoop-2.7/42/testReport/junit/org.apache.spark.launcher/SparkLauncherSuite/testInProcessLauncher/history/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-23020) Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher
[ https://issues.apache.org/jira/browse/SPARK-23020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328455#comment-16328455 ] Apache Spark commented on SPARK-23020: -- User 'sameeragarwal' has created a pull request for this issue: https://github.com/apache/spark/pull/20291 > Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher > -- > > Key: SPARK-23020 > URL: https://issues.apache.org/jira/browse/SPARK-23020 > Project: Spark > Issue Type: Bug > Components: Tests >Affects Versions: 2.3.0 >Reporter: Sameer Agarwal >Assignee: Marcelo Vanzin >Priority: Blocker > > https://amplab.cs.berkeley.edu/jenkins/job/spark-branch-2.3-test-maven-hadoop-2.7/42/testReport/junit/org.apache.spark.launcher/SparkLauncherSuite/testInProcessLauncher/history/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-23020) Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher
[ https://issues.apache.org/jira/browse/SPARK-23020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23020: Assignee: Marcelo Vanzin (was: Apache Spark) > Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher > -- > > Key: SPARK-23020 > URL: https://issues.apache.org/jira/browse/SPARK-23020 > Project: Spark > Issue Type: Bug > Components: Tests >Affects Versions: 2.3.0 >Reporter: Sameer Agarwal >Assignee: Marcelo Vanzin >Priority: Blocker > > https://amplab.cs.berkeley.edu/jenkins/job/spark-branch-2.3-test-maven-hadoop-2.7/42/testReport/junit/org.apache.spark.launcher/SparkLauncherSuite/testInProcessLauncher/history/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-23020) Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher
[ https://issues.apache.org/jira/browse/SPARK-23020?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-23020: Assignee: Apache Spark (was: Marcelo Vanzin) > Flaky Test: org.apache.spark.launcher.SparkLauncherSuite.testInProcessLauncher > -- > > Key: SPARK-23020 > URL: https://issues.apache.org/jira/browse/SPARK-23020 > Project: Spark > Issue Type: Bug > Components: Tests >Affects Versions: 2.3.0 >Reporter: Sameer Agarwal >Assignee: Apache Spark >Priority: Blocker > > https://amplab.cs.berkeley.edu/jenkins/job/spark-branch-2.3-test-maven-hadoop-2.7/42/testReport/junit/org.apache.spark.launcher/SparkLauncherSuite/testInProcessLauncher/history/ -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-23125) Offset commit failed when spark-streaming batch time is more than kafkaParams session timeout.
[ https://issues.apache.org/jira/browse/SPARK-23125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] zhaoshijie updated SPARK-23125: --- Description: I find that the DirectKafkaInputDStream (kafka010) offset commit fails when the batch time is more than > Offset commit failed when spark-streaming batch time is more than kafkaParams > session timeout. > -- > > Key: SPARK-23125 > URL: https://issues.apache.org/jira/browse/SPARK-23125 > Project: Spark > Issue Type: Bug > Components: DStreams >Affects Versions: 2.2.0 >Reporter: zhaoshijie >Priority: Major > > I find that the DirectKafkaInputDStream (kafka010) offset commit fails when the batch > time is more than -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-23118) SparkR 2.3 QA: Programming guide, migration guide, vignettes updates
[ https://issues.apache.org/jira/browse/SPARK-23118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328430#comment-16328430 ] Felix Cheung commented on SPARK-23118: -- did this and opened SPARK-21616 > SparkR 2.3 QA: Programming guide, migration guide, vignettes updates > > > Key: SPARK-23118 > URL: https://issues.apache.org/jira/browse/SPARK-23118 > Project: Spark > Issue Type: Sub-task > Components: Documentation, SparkR >Reporter: Joseph K. Bradley >Priority: Critical > > Before the release, we need to update the SparkR Programming Guide, its > migration guide, and the R vignettes. Updates will include: > * Add migration guide subsection. > ** Use the results of the QA audit JIRAs. > * Check phrasing, especially in main sections (for outdated items such as > "In this release, ...") > * Update R vignettes > Note: This task is for large changes to the guides. New features are handled > in SPARK-23116. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-23125) Offset commit failed when spark-streaming batch time is more than kafkaParams session timeout.
zhaoshijie created SPARK-23125: -- Summary: Offset commit failed when spark-streaming batch time is more than kafkaParams session timeout. Key: SPARK-23125 URL: https://issues.apache.org/jira/browse/SPARK-23125 Project: Spark Issue Type: Bug Components: DStreams Affects Versions: 2.2.0 Reporter: zhaoshijie -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-23062) EXCEPT documentation should make it clear that it's EXCEPT DISTINCT
[ https://issues.apache.org/jira/browse/SPARK-23062?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiao Li resolved SPARK-23062. - Resolution: Fixed Assignee: Henry Robinson Fix Version/s: 2.3.0 > EXCEPT documentation should make it clear that it's EXCEPT DISTINCT > --- > > Key: SPARK-23062 > URL: https://issues.apache.org/jira/browse/SPARK-23062 > Project: Spark > Issue Type: Improvement > Components: Documentation, SQL >Affects Versions: 2.0.0 >Reporter: Henry Robinson >Assignee: Henry Robinson >Priority: Trivial > Fix For: 2.3.0 > > > Between 1.6 and 2.0, the default behaviour for {{EXCEPT}} changed from > {{EXCEPT ALL}} to {{EXCEPT DISTINCT}}. This is reasonable: postgres defaults > to the same, and if you try and explicitly use {{EXCEPT ALL}} in 2.0 you get > a good error message. However, the change was confusing to some users, so > it's worth being explicit in the documentation. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
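A quick illustration of the semantics the documentation change calls out, with made-up data in a spark-shell session:
{code}
import spark.implicits._

Seq(1, 1, 2, 3).toDF("n").createOrReplaceTempView("l")
Seq(1, 3).toDF("n").createOrReplaceTempView("r")

// Since 2.0, EXCEPT means EXCEPT DISTINCT, as in Postgres: duplicates on
// the left side are collapsed before the difference is taken.
spark.sql("SELECT n FROM l EXCEPT SELECT n FROM r").show()
// +---+
// |  n|
// +---+
// |  2|
// +---+

// An explicit EXCEPT ALL is rejected with a clear error message in 2.x
// rather than being silently reinterpreted.
{code}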
[jira] [Commented] (SPARK-23115) SparkR 2.3 QA: New R APIs and API docs
[ https://issues.apache.org/jira/browse/SPARK-23115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328422#comment-16328422 ] Felix Cheung commented on SPARK-23115: -- did this, and opened this https://issues.apache.org/jira/browse/SPARK-23069 > SparkR 2.3 QA: New R APIs and API docs > -- > > Key: SPARK-23115 > URL: https://issues.apache.org/jira/browse/SPARK-23115 > Project: Spark > Issue Type: Sub-task > Components: Documentation, SparkR >Reporter: Joseph K. Bradley >Priority: Blocker > > Audit new public R APIs. Take note of: > * Correctness and uniformity of API > * Documentation: Missing? Bad links or formatting? > ** Check both the generated docs linked from the user guide and the R command > line docs `?read.df`. These are generated using roxygen. > As you find issues, please create JIRAs and link them to this issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-23115) SparkR 2.3 QA: New R APIs and API docs
[ https://issues.apache.org/jira/browse/SPARK-23115?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328422#comment-16328422 ] Felix Cheung edited comment on SPARK-23115 at 1/17/18 8:01 AM: --- did this, and opened SPARK-23069 was (Author: felixcheung): did this, and opened this https://issues.apache.org/jira/browse/SPARK-23069 > SparkR 2.3 QA: New R APIs and API docs > -- > > Key: SPARK-23115 > URL: https://issues.apache.org/jira/browse/SPARK-23115 > Project: Spark > Issue Type: Sub-task > Components: Documentation, SparkR >Reporter: Joseph K. Bradley >Priority: Blocker > > Audit new public R APIs. Take note of: > * Correctness and uniformity of API > * Documentation: Missing? Bad links or formatting? > ** Check both the generated docs linked from the user guide and the R command > line docs `?read.df`. These are generated using roxygen. > As you find issues, please create JIRAs and link them to this issue. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-23114) Spark R 2.3 QA umbrella
[ https://issues.apache.org/jira/browse/SPARK-23114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328419#comment-16328419 ] Felix Cheung commented on SPARK-23114: -- sure, [~josephkb] > Spark R 2.3 QA umbrella > --- > > Key: SPARK-23114 > URL: https://issues.apache.org/jira/browse/SPARK-23114 > Project: Spark > Issue Type: Umbrella > Components: Documentation, SparkR >Reporter: Joseph K. Bradley >Priority: Critical > > This JIRA lists tasks for the next Spark release's QA period for SparkR. > The list below gives an overview of what is involved, and the corresponding > JIRA issues are linked below that. > h2. API > * Audit new public APIs (from the generated html doc) > ** relative to Spark Scala/Java APIs > ** relative to popular R libraries > h2. Documentation and example code > * For new algorithms, create JIRAs for updating the user guide sections & > examples > * Update Programming Guide > * Update website -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org