[jira] [Created] (SPARK-40093) is kubernetes jar required if not using that executor?
t oo created SPARK-40093:

Summary: is kubernetes jar required if not using that executor?
Key: SPARK-40093
URL: https://issues.apache.org/jira/browse/SPARK-40093
Project: Spark
Issue Type: Question
Components: Deploy
Affects Versions: 3.3.0
Reporter: t oo

My Docker image with PySpark is very big. Can I remove the files below if I don't use ML?

14M  /usr/local/lib/python3.10/site-packages/pyspark/jars/breeze_2.12-1.2.jar
5.9M /usr/local/lib/python3.10/site-packages/pyspark/jars/spark-mllib_2.12-3.3.0.jar

--
This message was sent by Atlassian Jira (v8.20.10#820010)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-40093) is kubernetes jar required if not using that executor?
[ https://issues.apache.org/jira/browse/SPARK-40093?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

t oo updated SPARK-40093:
-------------------------
Description:
My Docker image with PySpark is very big. Can I remove the files below if I don't use the Kubernetes executor?

11M total
4.0M /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-core-5.12.2.jar
840K /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-client-5.12.2.jar
760K /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-admissionregistration-5.12.2.jar
704K /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-apiextensions-5.12.2.jar
640K /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-autoscaling-5.12.2.jar
528K /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-extensions-5.12.2.jar
516K /usr/local/lib/python3.10/site-packages/pyspark/jars/spark-kubernetes_2.12-3.3.0.jar
456K /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-networking-5.12.2.jar
436K /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-apps-5.12.2.jar
364K /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-storageclass-5.12.2.jar
336K /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-policy-5.12.2.jar
264K /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-flowcontrol-5.12.2.jar
244K /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-batch-5.12.2.jar
192K /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-discovery-5.12.2.jar
176K /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-rbac-5.12.2.jar
160K /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-node-5.12.2.jar
144K /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-certificates-5.12.2.jar
104K /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-events-5.12.2.jar
80K  /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-metrics-5.12.2.jar
68K  /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-scheduling-5.12.2.jar
48K  /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-coordination-5.12.2.jar
20K  /usr/local/lib/python3.10/site-packages/pyspark/jars/kubernetes-model-common-5.12.2.jar

was:
My Docker image with PySpark is very big. Can I remove the files below if I don't use ML?
14M  /usr/local/lib/python3.10/site-packages/pyspark/jars/breeze_2.12-1.2.jar
5.9M /usr/local/lib/python3.10/site-packages/pyspark/jars/spark-mllib_2.12-3.3.0.jar

> is kubernetes jar required if not using that executor?
> ------------------------------------------------------
>
> Key: SPARK-40093
> URL: https://issues.apache.org/jira/browse/SPARK-40093
> Project: Spark
> Issue Type: Question
> Components: Deploy
> Affects Versions: 3.3.0
> Reporter: t oo
> Priority: Major
>
> My Docker image with PySpark is very big.
> Can I remove the files below if I don't use the Kubernetes executor?
[jira] [Created] (SPARK-40092) is breeze required if not using ML?
t oo created SPARK-40092:

Summary: is breeze required if not using ML?
Key: SPARK-40092
URL: https://issues.apache.org/jira/browse/SPARK-40092
Project: Spark
Issue Type: Question
Components: Deploy
Affects Versions: 3.3.0
Reporter: t oo

My Docker image with PySpark is very big. Can I remove the file below if I don't use Spark Streaming?

35M /usr/local/lib/python3.10/site-packages/pyspark/jars/rocksdbjni-6.20.3.jar
[jira] [Updated] (SPARK-40092) is breeze required if not using ML?
[ https://issues.apache.org/jira/browse/SPARK-40092?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

t oo updated SPARK-40092:
-------------------------
Description:
My Docker image with PySpark is very big. Can I remove the files below if I don't use ML?

14M  /usr/local/lib/python3.10/site-packages/pyspark/jars/breeze_2.12-1.2.jar
5.9M /usr/local/lib/python3.10/site-packages/pyspark/jars/spark-mllib_2.12-3.3.0.jar

was:
My Docker image with PySpark is very big. Can I remove the file below if I don't use Spark Streaming?
35M /usr/local/lib/python3.10/site-packages/pyspark/jars/rocksdbjni-6.20.3.jar

> is breeze required if not using ML?
> -----------------------------------
>
> Key: SPARK-40092
> URL: https://issues.apache.org/jira/browse/SPARK-40092
> Project: Spark
> Issue Type: Question
> Components: Deploy
> Affects Versions: 3.3.0
> Reporter: t oo
> Priority: Major
>
> My Docker image with PySpark is very big.
> Can I remove the files below if I don't use ML?
> 14M  /usr/local/lib/python3.10/site-packages/pyspark/jars/breeze_2.12-1.2.jar
> 5.9M /usr/local/lib/python3.10/site-packages/pyspark/jars/spark-mllib_2.12-3.3.0.jar
[jira] [Created] (SPARK-40091) is rocksdbjni required if not using streaming?
t oo created SPARK-40091:

Summary: is rocksdbjni required if not using streaming?
Key: SPARK-40091
URL: https://issues.apache.org/jira/browse/SPARK-40091
Project: Spark
Issue Type: Question
Components: Deploy
Affects Versions: 3.3.0
Reporter: t oo

My Docker image with PySpark is very big. Can I remove the file below if I don't use Spark Streaming?

35M /usr/local/lib/python3.10/site-packages/pyspark/jars/rocksdbjni-6.20.3.jar
[jira] [Created] (SPARK-38883) smaller pyspark install if not using streaming?
t oo created SPARK-38883:

Summary: smaller pyspark install if not using streaming?
Key: SPARK-38883
URL: https://issues.apache.org/jira/browse/SPARK-38883
Project: Spark
Issue Type: Improvement
Components: PySpark
Affects Versions: 3.2.1
Reporter: t oo

h3. Describe the feature

I am trying to include PySpark in my Docker image, but the size is around 300 MB. The largest jar is rocksdbjni-6.20.3.jar at 35 MB. Is it safe to remove this jar if I have no need for Spark Streaming? Is there any advice on getting the install smaller? Perhaps a map of which jars are needed for batch vs. SQL vs. streaming?

h3. Use Case

A smaller Python package means I can pack more concurrent pods onto my EKS workers.

--
This message was sent by Atlassian Jira (v8.20.1#820001)
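The feature request above is essentially asking for a feature-to-jar map. A minimal sketch of what a map-driven prune could look like, assuming hypothetical prefix groupings (the mapping below is illustrative only, not an official Spark compatibility list; removing jars this way is at your own risk):

```python
from pathlib import Path

# Hypothetical mapping from an unused Spark feature to jar-name prefixes.
# These groupings are an assumption for illustration, based only on the jar
# names mentioned in the issues above.
FEATURE_JARS = {
    "streaming": ("rocksdbjni",),
    "ml": ("breeze", "spark-mllib"),
    "kubernetes": ("kubernetes-", "spark-kubernetes"),
}

def prunable_jars(jar_names, unused_features):
    """Return jar file names that only the listed (unused) features appear to need."""
    prefixes = tuple(
        p for feature in unused_features for p in FEATURE_JARS[feature]
    )
    return sorted(name for name in jar_names if name.startswith(prefixes))

def prune(jars_dir, unused_features, dry_run=True):
    """Yield candidate jars under jars_dir; delete them when dry_run=False."""
    names = [p.name for p in Path(jars_dir).glob("*.jar")]
    for name in prunable_jars(names, unused_features):
        if not dry_run:
            (Path(jars_dir) / name).unlink()
        yield name
```

In a Dockerfile this could run as a build step against the `pyspark/jars` directory, with `dry_run=True` first to review the list before anything is deleted.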
[jira] [Updated] (SPARK-37420) Oracle JDBC - java.lang.ArithmeticException: Decimal precision 49 exceeds max precision 38
[ https://issues.apache.org/jira/browse/SPARK-37420?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

t oo updated SPARK-37420:
-------------------------
Description:
Reading Oracle JDBC is not working as expected; I thought a simple df.show() should work.

{code:java}
/usr/local/bin/pyspark --driver-class-path "/home/user/extra_jar_spark/*" --jars "/home/user/extra_jar_spark/*"

jdbc2DF = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:oracle:thin:@redact") \
    .option("driver", "oracle.jdbc.OracleDriver") \
    .option("dbtable", "s.t") \
    .option("user", "redact") \
    .option("password", "redact") \
    .option("fetchsize", 1) \
    .load()

jdbc2DF.printSchema()
root
 |-- ID: decimal(38,10) (nullable = true)
 |-- OBJECT_VERSION_NUMBER: decimal(9,0) (nullable = true)
 |-- START_DATE: timestamp (nullable = true)
 |-- END_DATE: timestamp (nullable = true)
 |-- CREATED_BY: decimal(15,0) (nullable = true)
 |-- CREATION_DATE: timestamp (nullable = true)
 |-- LAST_UPDATED_BY: decimal(15,0) (nullable = true)
 |-- LAST_UPDATE_DATE: timestamp (nullable = true)
 |-- LAST_UPDATE_LOGIN: decimal(15,0) (nullable = true)
 |-- CONTINGENCY: string (nullable = true)
 |-- CONTINGENCY_ID: decimal(38,10) (nullable = true)

jdbc2DF.show()
21/11/20 23:42:00 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 2)
java.lang.ArithmeticException: Decimal precision 49 exceeds max precision 38
    at org.apache.spark.sql.errors.QueryExecutionErrors$.decimalPrecisionExceedsMaxPrecisionError(QueryExecutionErrors.scala:847)
    at org.apache.spark.sql.types.Decimal.set(Decimal.scala:123)
    at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:572)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$4(JdbcUtils.scala:418)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.nullSafeConvert(JdbcUtils.scala:546)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$3(JdbcUtils.scala:418)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$3$adapted(JdbcUtils.scala:416)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:367)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:349)
    at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
    at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
    at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:759)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:349)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:898)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:898)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:131)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
21/11/20 23:42:00 WARN TaskSetManager: Lost task 0.0 in stage 2.0 (TID 2) (localhost executor driver): java.lang.ArithmeticException: Decimal precision 49 exceeds max precision 38
    at org.apache.spark.sql.errors.QueryExecutionErrors$.decimalPrecisionExceedsMaxPrecisionError(QueryExecutionErrors.scala:847)
    at org.apache.spark.sql.types.Decimal.set(Decimal.scala:123)
    at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:572)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$4(JdbcUtils.scala:418)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.nullSafeConvert(JdbcUtils.scala:546)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$3(JdbcUtils.scala:418)
    at
[jira] [Created] (SPARK-37420) Oracle JDBC - java.lang.ArithmeticException: Decimal precision 49 exceeds max precision 38
t oo created SPARK-37420:

Summary: Oracle JDBC - java.lang.ArithmeticException: Decimal precision 49 exceeds max precision 38
Key: SPARK-37420
URL: https://issues.apache.org/jira/browse/SPARK-37420
Project: Spark
Issue Type: Bug
Components: SQL
Affects Versions: 3.2.0
Reporter: t oo

Reading Oracle JDBC is not working as expected; I thought a simple df.show() should work.

{code:java}
/usr/local/bin/pyspark --driver-class-path "/home/user/extra_jar_spark/*" --jars "/home/user/extra_jar_spark/*"

jdbc2DF = spark.read \
    .format("jdbc") \
    .option("url", "jdbc:oracle:thin:@redact") \
    .option("driver", "oracle.jdbc.OracleDriver") \
    .option("dbtable", "s.t") \
    .option("user", "redact") \
    .option("password", "redact") \
    .option("fetchsize", 1) \
    .load()

jdbc2DF.printSchema()
root
 |-- ID: decimal(38,10) (nullable = true)
 |-- OBJECT_VERSION_NUMBER: decimal(9,0) (nullable = true)
 |-- START_DATE: timestamp (nullable = true)
 |-- END_DATE: timestamp (nullable = true)
 |-- CREATED_BY: decimal(15,0) (nullable = true)
 |-- CREATION_DATE: timestamp (nullable = true)
 |-- LAST_UPDATED_BY: decimal(15,0) (nullable = true)
 |-- LAST_UPDATE_DATE: timestamp (nullable = true)
 |-- LAST_UPDATE_LOGIN: decimal(15,0) (nullable = true)
 |-- CONTINGENCY: string (nullable = true)
 |-- CONTINGENCY_ID: decimal(38,10) (nullable = true)

jdbc2DF.show()
21/11/20 23:42:00 ERROR Executor: Exception in task 0.0 in stage 2.0 (TID 2)
java.lang.ArithmeticException: Decimal precision 49 exceeds max precision 38
    at org.apache.spark.sql.errors.QueryExecutionErrors$.decimalPrecisionExceedsMaxPrecisionError(QueryExecutionErrors.scala:847)
    at org.apache.spark.sql.types.Decimal.set(Decimal.scala:123)
    at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:572)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$4(JdbcUtils.scala:418)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.nullSafeConvert(JdbcUtils.scala:546)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$3(JdbcUtils.scala:418)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$3$adapted(JdbcUtils.scala:416)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:367)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$$anon$1.getNext(JdbcUtils.scala:349)
    at org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:73)
    at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
    at org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:31)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anon$1.hasNext(WholeStageCodegenExec.scala:759)
    at org.apache.spark.sql.execution.SparkPlan.$anonfun$getByteArrayRdd$1(SparkPlan.scala:349)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2(RDD.scala:898)
    at org.apache.spark.rdd.RDD.$anonfun$mapPartitionsInternal$2$adapted(RDD.scala:898)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:52)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:373)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:337)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
    at org.apache.spark.scheduler.Task.run(Task.scala:131)
    at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
    at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
21/11/20 23:42:00 WARN TaskSetManager: Lost task 0.0 in stage 2.0 (TID 2) (localhost executor driver): java.lang.ArithmeticException: Decimal precision 49 exceeds max precision 38
    at org.apache.spark.sql.errors.QueryExecutionErrors$.decimalPrecisionExceedsMaxPrecisionError(QueryExecutionErrors.scala:847)
    at org.apache.spark.sql.types.Decimal.set(Decimal.scala:123)
    at org.apache.spark.sql.types.Decimal$.apply(Decimal.scala:572)
    at org.apache.spark.sql.execution.datasources.jdbc.JdbcUtils$.$anonfun$makeGetter$4(JdbcUtils.scala:418)
    at
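Spark's JDBC data source documents a `customSchema` option that overrides the inferred column types, which is one commonly suggested way around Oracle NUMBER columns whose values overflow the inferred decimal(38,10). A hedged sketch (the helper name is mine, and whether this resolves the reporter's exact case is untested; the column names come from the printed schema above):

```python
def jdbc_read_options(url, table, user, password,
                      custom_schema=None, fetchsize=1):
    """Build the option map for spark.read.format("jdbc").

    custom_schema, if given, is a DDL-style string passed to Spark's
    documented `customSchema` JDBC option, e.g.
    "ID DECIMAL(38,10), CONTINGENCY_ID DECIMAL(38,10)".
    """
    opts = {
        "url": url,
        "driver": "oracle.jdbc.OracleDriver",
        "dbtable": table,
        "user": user,
        "password": password,
        "fetchsize": str(fetchsize),
    }
    if custom_schema:
        opts["customSchema"] = custom_schema
    return opts

# Applying it requires a live SparkSession and an Oracle connection:
# df = spark.read.format("jdbc").options(
#     **jdbc_read_options("jdbc:oracle:thin:@redact", "s.t", "redact", "redact",
#                         custom_schema="ID DECIMAL(38,10)")).load()
```

An alternative with the same intent is to push the cast down to Oracle by making `dbtable` a subquery, e.g. `(SELECT CAST(ID AS NUMBER(38,10)) AS ID, ... FROM s.t)`.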
[jira] [Comment Edited] (SPARK-35974) Spark submit REST cluster/standalone mode - launching an s3a jar with STS
[ https://issues.apache.org/jira/browse/SPARK-35974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390198#comment-17390198 ]

t oo edited comment on SPARK-35974 at 7/29/21, 11:38 PM:
--------------------------------------------------------
Same issue on Spark 3.1.2:

{code:java}
{
  "action" : "SubmissionStatusResponse",
  "driverState" : "ERROR",
  "message" : "Exception from the cluster:\njava.nio.file.AccessDeniedException: s3a://redact/ingestion-0.5.2-SNAPSHOT.jar: getFileStatus on s3a://redact/ingestion-0.5.2-SNAPSHOT.jar: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: hidden; S3 Extended Request ID: hideit), S3 Extended Request ID: hideit\n\torg.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)\n\torg.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101)\n\torg.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542)\n\torg.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)\n\torg.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463)\n\torg.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030)\n\torg.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:799)\n\torg.apache.spark.util.Utils$.doFetchFile(Utils.scala:776)\n\torg.apache.spark.util.Utils$.fetchFile(Utils.scala:541)\n\torg.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:162)\n\torg.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:180)\n\torg.apache.spark.deploy.worker.DriverRunner$$anon$2.run(DriverRunner.scala:99)",
  "serverSparkVersion" : "3.1.2",
  "submissionId" : "driver-20210729233253-0001",
  "success" : true,
  "workerHostPort" : "10.redact:17537",
  "workerId" : "worker-20210729232355-10.redact-17537"
}
{code}

was (Author: toopt4): the same message, without {code} formatting.

> Spark submit REST cluster/standalone mode - launching an s3a jar with STS
> -------------------------------------------------------------------------
>
> Key: SPARK-35974
> URL: https://issues.apache.org/jira/browse/SPARK-35974
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.1.2
> Reporter: t oo
> Priority: Major
>
> {code:java}
> /var/lib/spark-2.4.8-bin-hadoop2.7/bin/spark-submit --master spark://myhost:6066 \
>   --conf spark.hadoop.fs.s3a.access.key='redact1' \
>   --conf spark.executorEnv.AWS_ACCESS_KEY_ID='redact1' \
>   --conf spark.driverEnv.AWS_ACCESS_KEY_ID='redact1' \
>   --conf spark.hadoop.fs.s3a.secret.key='redact2' \
>   --conf spark.executorEnv.AWS_SECRET_ACCESS_KEY='redact2' \
>   --conf spark.driverEnv.AWS_SECRET_ACCESS_KEY='redact2' \
>   --conf spark.hadoop.fs.s3a.session.token='redact3' \
>   --conf spark.executorEnv.AWS_SESSION_TOKEN='redact3' \
>   --conf spark.driverEnv.AWS_SESSION_TOKEN='redact3' \
>   --conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider \
>   --conf spark.driver.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' \
>   --conf spark.executor.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' \
>   --total-executor-cores 4 --executor-cores 2 --executor-memory 2g
[jira] [Updated] (SPARK-35974) Spark submit REST cluster/standalone mode - launching an s3a jar with STS
[ https://issues.apache.org/jira/browse/SPARK-35974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

t oo updated SPARK-35974:
-------------------------
Affects Version/s: (was: 2.4.8) 3.1.2

> Spark submit REST cluster/standalone mode - launching an s3a jar with STS
> -------------------------------------------------------------------------
>
> Key: SPARK-35974
> URL: https://issues.apache.org/jira/browse/SPARK-35974
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.1.2
> Reporter: t oo
> Priority: Major
>
> {code:java}
> /var/lib/spark-2.4.8-bin-hadoop2.7/bin/spark-submit --master spark://myhost:6066 \
>   --conf spark.hadoop.fs.s3a.access.key='redact1' \
>   --conf spark.executorEnv.AWS_ACCESS_KEY_ID='redact1' \
>   --conf spark.driverEnv.AWS_ACCESS_KEY_ID='redact1' \
>   --conf spark.hadoop.fs.s3a.secret.key='redact2' \
>   --conf spark.executorEnv.AWS_SECRET_ACCESS_KEY='redact2' \
>   --conf spark.driverEnv.AWS_SECRET_ACCESS_KEY='redact2' \
>   --conf spark.hadoop.fs.s3a.session.token='redact3' \
>   --conf spark.executorEnv.AWS_SESSION_TOKEN='redact3' \
>   --conf spark.driverEnv.AWS_SESSION_TOKEN='redact3' \
>   --conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider \
>   --conf spark.driver.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' \
>   --conf spark.executor.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' \
>   --total-executor-cores 4 --executor-cores 2 --executor-memory 2g --driver-memory 1g \
>   --name lin1 --deploy-mode cluster --conf spark.eventLog.enabled=false \
>   --class com.yotpo.metorikku.Metorikku \
>   s3a://mybuc/metorikku_2.11.jar -c s3a://mybuc/spark_ingestion_job.yaml
> {code}
>
> Running the above command gives the stack trace below:
>
> {code:java}
> Exception from the cluster:
> java.nio.file.AccessDeniedException: s3a://mybuc/metorikku_2.11.jar: getFileStatus on s3a://mybuc/metorikku_2.11.jar: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: xx; S3 Extended Request ID: /1qj/yy=), S3 Extended Request ID: /1qj/yy=
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101)
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542)
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
> org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463)
> org.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030)
> org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:747)
> org.apache.spark.util.Utils$.doFetchFile(Utils.scala:723)
> org.apache.spark.util.Utils$.fetchFile(Utils.scala:509)
> org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:155)
> org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:173)
> org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:92)
> {code}
>
> All the EC2s in the Spark cluster only have access to S3 via STS tokens. The jar itself reads CSVs from S3 using the tokens, and everything works if either (1) I change the command line to point to local jars on the EC2, or (2) I use port 7077/client mode instead of cluster mode. But it seems the jar itself can't be launched off S3, as if the tokens are not being picked up properly.

--
This message was sent by Atlassian Jira (v8.3.4#803005)
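The report notes the submission works when the application jar is local. A hedged workaround sketch (the function name and structure are mine, not from the issue): fetch the jar to the submit host first, for example with the AWS CLI or boto3 using the same STS credentials, then build a cluster-mode spark-submit command that points at the local copy, so the worker's DriverRunner never has to resolve an s3a:// path:

```python
def build_submit_cmd(master, app_jar_local, main_class, sts_creds, extra_args=()):
    """Build a spark-submit argv for standalone cluster mode with STS credentials.

    sts_creds is a dict with AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and
    AWS_SESSION_TOKEN; app_jar_local must be a path already present on the
    submit host (downloaded beforehand, e.g. `aws s3 cp s3://... /tmp/`).
    """
    return [
        "spark-submit",
        "--master", master,
        "--deploy-mode", "cluster",
        # Temporary (session) credentials require the STS-aware provider.
        "--conf",
        "spark.hadoop.fs.s3a.aws.credentials.provider="
        "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider",
        "--conf", f"spark.hadoop.fs.s3a.access.key={sts_creds['AWS_ACCESS_KEY_ID']}",
        "--conf", f"spark.hadoop.fs.s3a.secret.key={sts_creds['AWS_SECRET_ACCESS_KEY']}",
        "--conf", f"spark.hadoop.fs.s3a.session.token={sts_creds['AWS_SESSION_TOKEN']}",
        "--class", main_class,
        app_jar_local,          # local path, so no s3a fetch in DriverRunner
        *extra_args,
    ]
```

This only sidesteps the jar download step; the job itself can still read from S3 through the `spark.hadoop.fs.s3a.*` settings.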
[jira] [Commented] (SPARK-35974) Spark submit REST cluster/standalone mode - launching an s3a jar with STS
[ https://issues.apache.org/jira/browse/SPARK-35974?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17390198#comment-17390198 ]

t oo commented on SPARK-35974:
------------------------------
Same issue on Spark 3.1.2:

{
  "action" : "SubmissionStatusResponse",
  "driverState" : "ERROR",
  "message" : "Exception from the cluster:\njava.nio.file.AccessDeniedException: s3a://redact/ingestion-0.5.2-SNAPSHOT.jar: getFileStatus on s3a://redact/ingestion-0.5.2-SNAPSHOT.jar: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: hidden; S3 Extended Request ID: hideit), S3 Extended Request ID: hideit\n\torg.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)\n\torg.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101)\n\torg.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542)\n\torg.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)\n\torg.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463)\n\torg.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030)\n\torg.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:799)\n\torg.apache.spark.util.Utils$.doFetchFile(Utils.scala:776)\n\torg.apache.spark.util.Utils$.fetchFile(Utils.scala:541)\n\torg.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:162)\n\torg.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:180)\n\torg.apache.spark.deploy.worker.DriverRunner$$anon$2.run(DriverRunner.scala:99)",
  "serverSparkVersion" : "3.1.2",
  "submissionId" : "driver-20210729233253-0001",
  "success" : true,
  "workerHostPort" : "10.redact:17537",
  "workerId" : "worker-20210729232355-10.redact-17537"
}

> Spark submit REST cluster/standalone mode - launching an s3a jar with STS
> -------------------------------------------------------------------------
>
> Key: SPARK-35974
> URL: https://issues.apache.org/jira/browse/SPARK-35974
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 3.1.2
> Reporter: t oo
> Priority: Major
>
> {code:java}
> /var/lib/spark-2.4.8-bin-hadoop2.7/bin/spark-submit --master spark://myhost:6066 \
>   --conf spark.hadoop.fs.s3a.access.key='redact1' \
>   --conf spark.executorEnv.AWS_ACCESS_KEY_ID='redact1' \
>   --conf spark.driverEnv.AWS_ACCESS_KEY_ID='redact1' \
>   --conf spark.hadoop.fs.s3a.secret.key='redact2' \
>   --conf spark.executorEnv.AWS_SECRET_ACCESS_KEY='redact2' \
>   --conf spark.driverEnv.AWS_SECRET_ACCESS_KEY='redact2' \
>   --conf spark.hadoop.fs.s3a.session.token='redact3' \
>   --conf spark.executorEnv.AWS_SESSION_TOKEN='redact3' \
>   --conf spark.driverEnv.AWS_SESSION_TOKEN='redact3' \
>   --conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider \
>   --conf spark.driver.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' \
>   --conf spark.executor.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' \
>   --total-executor-cores 4 --executor-cores 2 --executor-memory 2g --driver-memory 1g \
>   --name lin1 --deploy-mode cluster --conf spark.eventLog.enabled=false \
>   --class com.yotpo.metorikku.Metorikku \
>   s3a://mybuc/metorikku_2.11.jar -c s3a://mybuc/spark_ingestion_job.yaml
> {code}
>
> Running the above command gives the stack trace below:
>
> {code:java}
> Exception from the cluster:
> java.nio.file.AccessDeniedException: s3a://mybuc/metorikku_2.11.jar: getFileStatus on s3a://mybuc/metorikku_2.11.jar: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: xx; S3 Extended Request ID: /1qj/yy=), S3 Extended Request ID: /1qj/yy=
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158)
> org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101)
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542)
> org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117)
> org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463)
> org.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030)
> org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:747)
> org.apache.spark.util.Utils$.doFetchFile(Utils.scala:723)
> org.apache.spark.util.Utils$.fetchFile(Utils.scala:509)
> org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:155)
> org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:173)
> org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:92){code}
> All the EC2s in the Spark cluster only have access to S3 via STS tokens. The jar itself reads CSVs from S3 using the tokens, and everything works if
[jira] [Reopened] (SPARK-35974) Spark submit REST cluster/standalone mode - launching an s3a jar with STS
[ https://issues.apache.org/jira/browse/SPARK-35974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo reopened SPARK-35974: -- > Spark submit REST cluster/standalone mode - launching an s3a jar with STS > - > > Key: SPARK-35974 > URL: https://issues.apache.org/jira/browse/SPARK-35974 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.4.8 >Reporter: t oo >Priority: Major > > {code:java} > /var/lib/spark-2.4.8-bin-hadoop2.7/bin/spark-submit --master > spark://myhost:6066 --conf spark.hadoop.fs.s3a.access.key='redact1' --conf > spark.executorEnv.AWS_ACCESS_KEY_ID='redact1' --conf > spark.driverEnv.AWS_ACCESS_KEY_ID='redact1' --conf > spark.hadoop.fs.s3a.secret.key='redact2' --conf > spark.executorEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf > spark.driverEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf > spark.hadoop.fs.s3a.session.token='redact3' --conf > spark.executorEnv.AWS_SESSION_TOKEN='redact3' --conf > spark.driverEnv.AWS_SESSION_TOKEN='redact3' --conf > spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider > --conf spark.driver.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 > -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' --conf > spark.executor.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 > -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' > --total-executor-cores 4 --executor-cores 2 --executor-memory 2g > --driver-memory 1g --name lin1 --deploy-mode cluster --conf > spark.eventLog.enabled=false --class com.yotpo.metorikku.Metorikku > s3a://mybuc/metorikku_2.11.jar -c s3a://mybuc/spark_ingestion_job.yaml > {code} > running the above command give below stack trace: > > {code:java} > Exception from the cluster:\njava.nio.file.AccessDeniedException: > s3a://mybuc/metorikku_2.11.jar: getFileStatus on > s3a://mybuc/metorikku_2.11.jar: > com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon > S3; Status Code: 403; Error 
Code: 403 Forbidden; Request ID: xx; S3 Extended > Request ID: /1qj/yy=), S3 Extended Request ID: /1qj/yy=\n\ > org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158) > org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101) > org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542) > org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117) > org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463) > org.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030) > org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:747) > org.apache.spark.util.Utils$.doFetchFile(Utils.scala:723) > org.apache.spark.util.Utils$.fetchFile(Utils.scala:509) > org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:155) > org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:173) > org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:92){code} > all the ec2s in the spark cluster only have access to s3 via STS tokens. The > jar itself reads csvs from s3 using the tokens, and everything works if > either 1. i change the commandline to point to local jars on the ec2 OR 2. > use port 7077/client mode instead of cluster mode. But it seems the jar > itself can't be launched off s3, as if the tokens are not being picked up > properly. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-36147) [SQL] - log level should be warning if files not found in BasicWriteStatsTracker
t oo created SPARK-36147: Summary: [SQL] - log level should be warning if files not found in BasicWriteStatsTracker Key: SPARK-36147 URL: https://issues.apache.org/jira/browse/SPARK-36147 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 3.1.2 Reporter: t oo This log should be at least WARN, not INFO (in [org/apache/spark/sql/execution/datasources/BasicWriteStatsTracker.scala|https://github.com/apache/spark/pull/2#diff-2a68131bcf611ecf51956d2ba63e8a9ee6d1d20a7a54eeeb4195a4ba1d365134]): "Expected $numSubmittedFiles files, but only saw $numFiles." In my case (using s3) just yesterday, the file was never created and the job didn't fail, so I ended up with silently missing data.
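The change the reporter asks for can be sketched with Python's standard logging as an analogy: a file-count mismatch emitted at WARNING survives the log filters that typically suppress INFO chatter in production. This is an illustrative sketch only; the function and logger names below are hypothetical, not Spark's actual code.

```python
import logging

# Hypothetical sketch of the requested behavior: log the file-count
# mismatch at WARNING rather than INFO. Names are illustrative.
logger = logging.getLogger("BasicWriteStatsTracker")

def check_written_files(num_submitted: int, num_seen: int) -> None:
    """Warn when fewer output files were observed than were submitted."""
    if num_seen < num_submitted:
        logger.warning("Expected %d files, but only saw %d.",
                       num_submitted, num_seen)

# With the default WARNING threshold the mismatch stays visible even
# when INFO-level messages are filtered out.
logging.basicConfig(level=logging.WARNING)
check_written_files(num_submitted=3, num_seen=2)
```

The point of the level change is operational: an INFO line is easy to miss, while a WARNING is far more likely to trip alerting on silently missing data.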
[jira] [Updated] (SPARK-35974) Spark submit REST cluster/standalone mode - launching an s3a jar with STS
[ https://issues.apache.org/jira/browse/SPARK-35974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated SPARK-35974: - Affects Version/s: (was: 2.4.6) 2.4.8 Description: {code:java} /var/lib/spark-2.4.8-bin-hadoop2.7/bin/spark-submit --master spark://myhost:6066 --conf spark.hadoop.fs.s3a.access.key='redact1' --conf spark.executorEnv.AWS_ACCESS_KEY_ID='redact1' --conf spark.driverEnv.AWS_ACCESS_KEY_ID='redact1' --conf spark.hadoop.fs.s3a.secret.key='redact2' --conf spark.executorEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf spark.driverEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf spark.hadoop.fs.s3a.session.token='redact3' --conf spark.executorEnv.AWS_SESSION_TOKEN='redact3' --conf spark.driverEnv.AWS_SESSION_TOKEN='redact3' --conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider --conf spark.driver.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' --conf spark.executor.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' --total-executor-cores 4 --executor-cores 2 --executor-memory 2g --driver-memory 1g --name lin1 --deploy-mode cluster --conf spark.eventLog.enabled=false --class com.yotpo.metorikku.Metorikku s3a://mybuc/metorikku_2.11.jar -c s3a://mybuc/spark_ingestion_job.yaml {code} running the above command give below stack trace: {code:java} Exception from the cluster:\njava.nio.file.AccessDeniedException: s3a://mybuc/metorikku_2.11.jar: getFileStatus on s3a://mybuc/metorikku_2.11.jar: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: xx; S3 Extended Request ID: /1qj/yy=), S3 Extended Request ID: /1qj/yy=\n\ org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158) org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101) 
org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542) org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117) org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463) org.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030) org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:747) org.apache.spark.util.Utils$.doFetchFile(Utils.scala:723) org.apache.spark.util.Utils$.fetchFile(Utils.scala:509) org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:155) org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:173) org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:92){code} all the ec2s in the spark cluster only have access to s3 via STS tokens. The jar itself reads csvs from s3 using the tokens, and everything works if either 1. i change the commandline to point to local jars on the ec2 OR 2. use port 7077/client mode instead of cluster mode. But it seems the jar itself can't be launched off s3, as if the tokens are not being picked up properly. 
was: {code:java} /var/lib/spark-2.3.4-bin-hadoop2.7/bin/spark-submit --master spark://myhost:6066 --conf spark.hadoop.fs.s3a.access.key='redact1' --conf spark.executorEnv.AWS_ACCESS_KEY_ID='redact1' --conf spark.driverEnv.AWS_ACCESS_KEY_ID='redact1' --conf spark.hadoop.fs.s3a.secret.key='redact2' --conf spark.executorEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf spark.driverEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf spark.hadoop.fs.s3a.session.token='redact3' --conf spark.executorEnv.AWS_SESSION_TOKEN='redact3' --conf spark.driverEnv.AWS_SESSION_TOKEN='redact3' --conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider --conf spark.driver.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' --conf spark.executor.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' --total-executor-cores 4 --executor-cores 2 --executor-memory 2g --driver-memory 1g --name lin1 --deploy-mode cluster --conf spark.eventLog.enabled=false --class com.yotpo.metorikku.Metorikku s3a://mybuc/metorikku_2.11.jar -c s3a://mybuc/spark_ingestion_job.yaml {code} running the above command give below stack trace: {code:java} Exception from the cluster:\njava.nio.file.AccessDeniedException: s3a://mybuc/metorikku_2.11.jar: getFileStatus on s3a://mybuc/metorikku_2.11.jar: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: xx; S3 Extended Request ID: /1qj/yy=), S3 Extended Request ID: /1qj/yy=\n\ org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158) org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101)
[jira] [Reopened] (SPARK-35974) Spark submit REST cluster/standalone mode - launching an s3a jar with STS
[ https://issues.apache.org/jira/browse/SPARK-35974?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo reopened SPARK-35974: -- v2.4.8 is less than 2 months old > Spark submit REST cluster/standalone mode - launching an s3a jar with STS > - > > Key: SPARK-35974 > URL: https://issues.apache.org/jira/browse/SPARK-35974 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.4.8 >Reporter: t oo >Priority: Major > > {code:java} > /var/lib/spark-2.4.8-bin-hadoop2.7/bin/spark-submit --master > spark://myhost:6066 --conf spark.hadoop.fs.s3a.access.key='redact1' --conf > spark.executorEnv.AWS_ACCESS_KEY_ID='redact1' --conf > spark.driverEnv.AWS_ACCESS_KEY_ID='redact1' --conf > spark.hadoop.fs.s3a.secret.key='redact2' --conf > spark.executorEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf > spark.driverEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf > spark.hadoop.fs.s3a.session.token='redact3' --conf > spark.executorEnv.AWS_SESSION_TOKEN='redact3' --conf > spark.driverEnv.AWS_SESSION_TOKEN='redact3' --conf > spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider > --conf spark.driver.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 > -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' --conf > spark.executor.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 > -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' > --total-executor-cores 4 --executor-cores 2 --executor-memory 2g > --driver-memory 1g --name lin1 --deploy-mode cluster --conf > spark.eventLog.enabled=false --class com.yotpo.metorikku.Metorikku > s3a://mybuc/metorikku_2.11.jar -c s3a://mybuc/spark_ingestion_job.yaml > {code} > running the above command give below stack trace: > > {code:java} > Exception from the cluster:\njava.nio.file.AccessDeniedException: > s3a://mybuc/metorikku_2.11.jar: getFileStatus on > s3a://mybuc/metorikku_2.11.jar: > com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon 
> S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: xx; S3 Extended > Request ID: /1qj/yy=), S3 Extended Request ID: /1qj/yy=\n\ > org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158) > org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101) > org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542) > org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117) > org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463) > org.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030) > org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:747) > org.apache.spark.util.Utils$.doFetchFile(Utils.scala:723) > org.apache.spark.util.Utils$.fetchFile(Utils.scala:509) > org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:155) > org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:173) > org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:92){code} > all the ec2s in the spark cluster only have access to s3 via STS tokens. The > jar itself reads csvs from s3 using the tokens, and everything works if > either 1. i change the commandline to point to local jars on the ec2 OR 2. > use port 7077/client mode instead of cluster mode. But it seems the jar > itself can't be launched off s3, as if the tokens are not being picked up > properly. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35974) Spark submit REST cluster/standalone mode - launching an s3a jar with STS
t oo created SPARK-35974: Summary: Spark submit REST cluster/standalone mode - launching an s3a jar with STS Key: SPARK-35974 URL: https://issues.apache.org/jira/browse/SPARK-35974 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 2.4.6 Reporter: t oo {code:java} /var/lib/spark-2.3.4-bin-hadoop2.7/bin/spark-submit --master spark://myhost:6066 --conf spark.hadoop.fs.s3a.access.key='redact1' --conf spark.executorEnv.AWS_ACCESS_KEY_ID='redact1' --conf spark.driverEnv.AWS_ACCESS_KEY_ID='redact1' --conf spark.hadoop.fs.s3a.secret.key='redact2' --conf spark.executorEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf spark.driverEnv.AWS_SECRET_ACCESS_KEY='redact2' --conf spark.hadoop.fs.s3a.session.token='redact3' --conf spark.executorEnv.AWS_SESSION_TOKEN='redact3' --conf spark.driverEnv.AWS_SESSION_TOKEN='redact3' --conf spark.hadoop.fs.s3a.aws.credentials.provider=org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider --conf spark.driver.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' --conf spark.executor.extraJavaOptions='-DAWS_ACCESS_KEY_ID=redact1 -DAWS_SECRET_ACCESS_KEY=redact2 -DAWS_SESSION_TOKEN=redact3' --total-executor-cores 4 --executor-cores 2 --executor-memory 2g --driver-memory 1g --name lin1 --deploy-mode cluster --conf spark.eventLog.enabled=false --class com.yotpo.metorikku.Metorikku s3a://mybuc/metorikku_2.11.jar -c s3a://mybuc/spark_ingestion_job.yaml {code} running the above command give below stack trace: {code:java} Exception from the cluster:\njava.nio.file.AccessDeniedException: s3a://mybuc/metorikku_2.11.jar: getFileStatus on s3a://mybuc/metorikku_2.11.jar: com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: xx; S3 Extended Request ID: /1qj/yy=), S3 Extended Request ID: /1qj/yy=\n\ org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:158) 
org.apache.hadoop.fs.s3a.S3AUtils.translateException(S3AUtils.java:101) org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1542) org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:117) org.apache.hadoop.fs.FileSystem.isFile(FileSystem.java:1463) org.apache.hadoop.fs.s3a.S3AFileSystem.isFile(S3AFileSystem.java:2030) org.apache.spark.util.Utils$.fetchHcfsFile(Utils.scala:747) org.apache.spark.util.Utils$.doFetchFile(Utils.scala:723) org.apache.spark.util.Utils$.fetchFile(Utils.scala:509) org.apache.spark.deploy.worker.DriverRunner.downloadUserJar(DriverRunner.scala:155) org.apache.spark.deploy.worker.DriverRunner.prepareAndRunDriver(DriverRunner.scala:173) org.apache.spark.deploy.worker.DriverRunner$$anon$1.run(DriverRunner.scala:92){code} all the ec2s in the spark cluster only have access to s3 via STS tokens. The jar itself reads csvs from s3 using the tokens, and everything works if either 1. i change the commandline to point to local jars on the ec2 OR 2. use port 7077/client mode instead of cluster mode. But it seems the jar itself can't be launched off s3, as if the tokens are not being picked up properly. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-35249) to_timestamp can't parse 6 digit microsecond SSSSSS
t oo created SPARK-35249: Summary: to_timestamp can't parse 6 digit microsecond SSSSSS Key: SPARK-35249 URL: https://issues.apache.org/jira/browse/SPARK-35249 Project: Spark Issue Type: Wish Components: SQL Affects Versions: 2.4.6 Reporter: t oo spark-sql> select x, to_timestamp(x,"MMM dd hh:mm:ss.SSSSSS") from (select 'Apr 13 2021 12:00:00.001000AM' x); Apr 13 2021 12:00:00.001000AM NULL Why doesn't the to_timestamp work?
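As a sanity check that the value itself is parseable with a six-digit fractional-second field, here is a Python analogue using `datetime.strptime`. This only illustrates the input format; it uses Python's format codes (`%f`, `%p`), not Spark's pattern letters, and says nothing about which Spark versions accept `SSSSSS`.

```python
from datetime import datetime

# Python analogue of the reported input: %f consumes the 6-digit
# microsecond field and %p the trailing AM/PM marker.
ts = datetime.strptime("Apr 13 2021 12:00:00.001000AM",
                       "%b %d %Y %I:%M:%S.%f%p")
# 12:00:00 AM is midnight; .001000 is 1000 microseconds.
assert ts == datetime(2021, 4, 13, 0, 0, 0, 1000)
```

So the string is well-formed; the question in the report is whether Spark's pattern parser handles the equivalent fraction-of-second field.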
[jira] [Commented] (SPARK-32924) Web UI sort on duration is wrong
[ https://issues.apache.org/jira/browse/SPARK-32924?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17236762#comment-17236762 ] t oo commented on SPARK-32924: -- [~pralabhkumar] yes > Web UI sort on duration is wrong > > > Key: SPARK-32924 > URL: https://issues.apache.org/jira/browse/SPARK-32924 > Project: Spark > Issue Type: Bug > Components: Web UI >Affects Versions: 2.4.6 >Reporter: t oo >Priority: Major > Attachments: ui_sort.png > > > See attachment, 9 s(econds) is showing as larger than 8.1min
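The symptom in this report ("9 s" sorting above "8.1 min") is the classic result of comparing formatted duration strings as text instead of as numbers. A minimal sketch of the distinction, with a hypothetical normalizer (not the Web UI's actual code):

```python
# Sorting formatted duration strings lexicographically compares the
# leading characters ("8" < "9"), so "8.1 min" sorts before "9 s".
def to_seconds(duration: str) -> float:
    """Normalize a UI-style duration string ("9 s", "8.1 min") to seconds."""
    value, unit = duration.split()
    factors = {"ms": 0.001, "s": 1, "min": 60, "h": 3600}
    return float(value) * factors[unit]

durations = ["9 s", "8.1 min", "120 ms"]
# Text sort puts "9 s" last, as in the report's screenshot.
assert sorted(durations)[-1] == "9 s"
# Numeric sort via a normalized key gives the intended order.
assert sorted(durations, key=to_seconds) == ["120 ms", "9 s", "8.1 min"]
```

The usual fix for a table like this is to sort on the underlying numeric value (e.g. milliseconds) and render the human-readable string separately.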
[jira] [Commented] (SPARK-33085) "Master removed our application" error leads to FAILED driver status instead of KILLED driver status
[ https://issues.apache.org/jira/browse/SPARK-33085?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17213815#comment-17213815 ] t oo commented on SPARK-33085: -- # start long running spark job in cluster mode # terminate all spark workers # check status of driver > "Master removed our application" error leads to FAILED driver status instead > of KILLED driver status > > > Key: SPARK-33085 > URL: https://issues.apache.org/jira/browse/SPARK-33085 > Project: Spark > Issue Type: Bug > Components: Scheduler, Spark Core >Affects Versions: 2.4.6 >Reporter: t oo >Priority: Major > > > driver-20200930160855-0316 exited with status FAILED > > I am using Spark Standalone scheduler with spot ec2 workers. I confirmed that > myip.87 EC2 instance was terminated at 2020-09-30 16:16 > > *I would expect the overall driver status to be KILLED but instead it was > FAILED*, my goal is to interpret FAILED status as 'don't rerun as > non-transient error faced' but KILLED/ERROR status as 'yes, rerun as > transient error faced'. But it looks like FAILED status is being set in below > case of transient error: > > Below are driver logs > {code:java} > 2020-09-30 16:12:41,183 [main] INFO > com.yotpo.metorikku.output.writers.file.FileOutputWriter - Writing file to > s3a://redacted2020-09-30 16:12:41,183 [main] INFO > com.yotpo.metorikku.output.writers.file.FileOutputWriter - Writing file to > s3a://redacted20-09-30 16:16:40,366 [dispatcher-event-loop-15] ERROR > org.apache.spark.scheduler.TaskSchedulerImpl - Lost executor 0 on myip.87: > Remote RPC client disassociated. Likely due to containers exceeding > thresholds, or network issues. Check driver logs for WARN messages.2020-09-30 > 16:16:40,372 [dispatcher-event-loop-15] WARN > org.apache.spark.scheduler.TaskSetManager - Lost task 0.0 in stage 6.0 (TID > 6, myip.87, executor 0): ExecutorLostFailure (executor 0 exited caused by one > of the running tasks) Reason: Remote RPC client disassociated. 
Likely due to > containers exceeding thresholds, or network issues. Check driver logs for > WARN messages.2020-09-30 16:16:40,376 [dispatcher-event-loop-13] WARN > org.apache.spark.storage.BlockManagerMasterEndpoint - No more replicas > available for rdd_3_0 !2020-09-30 16:16:40,398 [dispatcher-event-loop-2] INFO > org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor > app-20200930160902-0895/0 removed: Worker shutting down2020-09-30 > 16:16:40,399 [dispatcher-event-loop-2] INFO > org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted > executor ID app-20200930160902-0895/1 on hostPort myip.87:11647 with 2 > core(s), 5.0 GB RAM2020-09-30 16:16:40,401 [dispatcher-event-loop-5] INFO > org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor > app-20200930160902-0895/1 removed: java.lang.IllegalStateException: Shutdown > hooks cannot be modified during shutdown.2020-09-30 16:16:40,402 > [dispatcher-event-loop-5] INFO > org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted > executor ID app-20200930160902-0895/2 on hostPort myip.87:11647 with 2 > core(s), 5.0 GB RAM2020-09-30 16:16:40,403 [dispatcher-event-loop-11] INFO > org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor > app-20200930160902-0895/2 removed: java.lang.IllegalStateException: Shutdown > hooks cannot be modified during shutdown.2020-09-30 16:16:40,404 > [dispatcher-event-loop-11] INFO > org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted > executor ID app-20200930160902-0895/3 on hostPort myip.87:11647 with 2 > core(s), 5.0 GB RAM2020-09-30 16:16:40,405 [dispatcher-event-loop-1] INFO > org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor > app-20200930160902-0895/3 removed: java.lang.IllegalStateException: Shutdown > hooks cannot be modified during shutdown.2020-09-30 16:16:40,406 > [dispatcher-event-loop-1] INFO > 
org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted > executor ID app-20200930160902-0895/4 on hostPort myip.87:11647 with 2 > core(s), 5.0 GB RAM2020-09-30 16:16:40,407 [dispatcher-event-loop-12] INFO > org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor > app-20200930160902-0895/4 removed: java.lang.IllegalStateException: Shutdown > hooks cannot be modified during shutdown.2020-09-30 16:16:40,408 > [dispatcher-event-loop-12] INFO > org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted > executor ID app-20200930160902-0895/5 on hostPort myip.87:11647 with 2 > core(s), 5.0 GB RAM2020-09-30 16:16:40,409 [dispatcher-event-loop-4] INFO > org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend -
[jira] [Commented] (SPARK-27733) Upgrade to Avro 1.10.0
[ https://issues.apache.org/jira/browse/SPARK-27733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17211955#comment-17211955 ] t oo commented on SPARK-27733: -- hive 1 is gone, I think > Upgrade to Avro 1.10.0 > -- > > Key: SPARK-27733 > URL: https://issues.apache.org/jira/browse/SPARK-27733 > Project: Spark > Issue Type: Improvement > Components: Build, SQL >Affects Versions: 3.1.0 >Reporter: Ismaël Mejía >Priority: Minor > > Avro 1.9.2 was released with many nice features including reduced size (1MB > less), and removed dependencies, no paranamer, no shaded guava, security > updates, so probably a worthwhile upgrade. > Avro 1.10.0 was released and this is still not done. > There is at the moment (2020/08) still a blocker because of Hive related > transitive dependencies bringing older versions of Avro, so we could say that > this is somehow still blocked until HIVE-21737 is solved.
[jira] [Created] (SPARK-33085) "Master removed our application" error leads to FAILED driver status instead of KILLED driver status
t oo created SPARK-33085: Summary: "Master removed our application" error leads to FAILED driver status instead of KILLED driver status Key: SPARK-33085 URL: https://issues.apache.org/jira/browse/SPARK-33085 Project: Spark Issue Type: Bug Components: Scheduler, Spark Core Affects Versions: 2.4.6 Reporter: t oo driver-20200930160855-0316 exited with status FAILED I am using Spark Standalone scheduler with spot ec2 workers. I confirmed that myip.87 EC2 instance was terminated at 2020-09-30 16:16 *I would expect the overall driver status to be KILLED but instead it was FAILED*, my goal is to interpret FAILED status as 'don't rerun as non-transient error faced' but KILLED/ERROR status as 'yes, rerun as transient error faced'. But it looks like FAILED status is being set in below case of transient error: Below are driver logs {code:java} 2020-09-30 16:12:41,183 [main] INFO com.yotpo.metorikku.output.writers.file.FileOutputWriter - Writing file to s3a://redacted2020-09-30 16:12:41,183 [main] INFO com.yotpo.metorikku.output.writers.file.FileOutputWriter - Writing file to s3a://redacted20-09-30 16:16:40,366 [dispatcher-event-loop-15] ERROR org.apache.spark.scheduler.TaskSchedulerImpl - Lost executor 0 on myip.87: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. Check driver logs for WARN messages.2020-09-30 16:16:40,372 [dispatcher-event-loop-15] WARN org.apache.spark.scheduler.TaskSetManager - Lost task 0.0 in stage 6.0 (TID 6, myip.87, executor 0): ExecutorLostFailure (executor 0 exited caused by one of the running tasks) Reason: Remote RPC client disassociated. Likely due to containers exceeding thresholds, or network issues. 
Check driver logs for WARN messages.2020-09-30 16:16:40,376 [dispatcher-event-loop-13] WARN org.apache.spark.storage.BlockManagerMasterEndpoint - No more replicas available for rdd_3_0 !2020-09-30 16:16:40,398 [dispatcher-event-loop-2] INFO org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor app-20200930160902-0895/0 removed: Worker shutting down2020-09-30 16:16:40,399 [dispatcher-event-loop-2] INFO org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted executor ID app-20200930160902-0895/1 on hostPort myip.87:11647 with 2 core(s), 5.0 GB RAM2020-09-30 16:16:40,401 [dispatcher-event-loop-5] INFO org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor app-20200930160902-0895/1 removed: java.lang.IllegalStateException: Shutdown hooks cannot be modified during shutdown.2020-09-30 16:16:40,402 [dispatcher-event-loop-5] INFO org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted executor ID app-20200930160902-0895/2 on hostPort myip.87:11647 with 2 core(s), 5.0 GB RAM2020-09-30 16:16:40,403 [dispatcher-event-loop-11] INFO org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor app-20200930160902-0895/2 removed: java.lang.IllegalStateException: Shutdown hooks cannot be modified during shutdown.2020-09-30 16:16:40,404 [dispatcher-event-loop-11] INFO org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted executor ID app-20200930160902-0895/3 on hostPort myip.87:11647 with 2 core(s), 5.0 GB RAM2020-09-30 16:16:40,405 [dispatcher-event-loop-1] INFO org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor app-20200930160902-0895/3 removed: java.lang.IllegalStateException: Shutdown hooks cannot be modified during shutdown.2020-09-30 16:16:40,406 [dispatcher-event-loop-1] INFO org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted executor ID app-20200930160902-0895/4 on hostPort myip.87:11647 with 2 core(s), 5.0 GB RAM2020-09-30 
16:16:40,407 [dispatcher-event-loop-12] INFO org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor app-20200930160902-0895/4 removed: java.lang.IllegalStateException: Shutdown hooks cannot be modified during shutdown.2020-09-30 16:16:40,408 [dispatcher-event-loop-12] INFO org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted executor ID app-20200930160902-0895/5 on hostPort myip.87:11647 with 2 core(s), 5.0 GB RAM2020-09-30 16:16:40,409 [dispatcher-event-loop-4] INFO org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor app-20200930160902-0895/5 removed: java.lang.IllegalStateException: Shutdown hooks cannot be modified during shutdown.2020-09-30 16:16:40,410 [dispatcher-event-loop-5] INFO org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Granted executor ID app-20200930160902-0895/6 on hostPort myip.87:11647 with 2 core(s), 5.0 GB RAM2020-09-30 16:16:40,420 [dispatcher-event-loop-9] INFO org.apache.spark.scheduler.cluster.StandaloneSchedulerBackend - Executor app-20200930160902-0895/6 removed:
[jira] [Created] (SPARK-33053) Document driver states (difference between FAILED/ERROR/KILLED)
t oo created SPARK-33053: Summary: Document driver states (difference between FAILED/ERROR/KILLED) Key: SPARK-33053 URL: https://issues.apache.org/jira/browse/SPARK-33053 Project: Spark Issue Type: Documentation Components: docs, Documentation Affects Versions: 2.4.6 Reporter: t oo Looking at the Spark website, I could not find any documentation on the difference between these driver states: ERROR, FAILED, KILLED
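The related report SPARK-33085 above states the policy the reporter wants to build on top of these states: treat KILLED (and ERROR) as transient and rerun, treat FAILED as non-transient and stop. A sketch of that decision function, noting that the transient/non-transient split is the reporter's intended interpretation, not documented Spark semantics:

```python
# Rerun policy described in SPARK-33085: KILLED/ERROR -> transient,
# rerun the driver; FAILED -> non-transient, do not rerun. The state
# names come from the reports; the policy is the reporter's goal.
TRANSIENT_STATES = {"KILLED", "ERROR"}

def should_rerun(driver_state: str) -> bool:
    """Return True when a driver state is treated as transient."""
    return driver_state in TRANSIENT_STATES

assert should_rerun("KILLED") and should_rerun("ERROR")
assert not should_rerun("FAILED")
```

This is exactly why the documentation request matters: without defined semantics for the three states, any such automation rests on guesswork.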
[jira] [Created] (SPARK-32924) Web UI sort on duration is wrong
t oo created SPARK-32924: Summary: Web UI sort on duration is wrong Key: SPARK-32924 URL: https://issues.apache.org/jira/browse/SPARK-32924 Project: Spark Issue Type: Bug Components: Web UI Affects Versions: 2.4.6 Reporter: t oo Attachments: ui_sort.png See attachment, 9 s(econds) is showing as larger than 8.1min
[jira] [Updated] (SPARK-32924) Web UI sort on duration is wrong
[ https://issues.apache.org/jira/browse/SPARK-32924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated SPARK-32924: - Attachment: ui_sort.png
[jira] [Created] (SPARK-32373) Spark Standalone - RetryingBlockFetcher tries to get block from worker even 10mins after it was de-registered from spark cluster
t oo created SPARK-32373: Summary: Spark Standalone - RetryingBlockFetcher tries to get block from worker even 10mins after it was de-registered from spark cluster Key: SPARK-32373 URL: https://issues.apache.org/jira/browse/SPARK-32373 Project: Spark Issue Type: Bug Components: Block Manager, Scheduler, Shuffle, Spark Core Affects Versions: 2.4.6 Reporter: t oo

Using Spark standalone in 2.4.6 with spot EC2 instances: the .242 IP instance was terminated at 12:00:11pm. Before then it had appeared in the Spark UI as ALIVE for a few hours; it then appeared in the Spark UI as DEAD until 12:16pm, after which it disappeared from the Spark UI completely. An app that started at 11:24am had the error below. As you can see in the app log below (from another worker), it is still trying to get a shuffle block from the .242 IP at 12:10pm, 10 minutes after the worker was removed from the Spark cluster. I would expect that within 2 minutes of the worker being removed from the cluster it would stop retrying.

{code:java}
2020-07-20 12:10:02,702 [Block Fetch Retry-9-3] ERROR org.apache.spark.network.shuffle.RetryingBlockFetcher - Exception while beginning fetch of 1 outstanding blocks (after 3 retries)
java.io.IOException: Connecting to /redact.242:7337 timed out (12 ms)
 at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:243)
 at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
 at org.apache.spark.network.shuffle.ExternalShuffleClient.lambda$fetchBlocks$0(ExternalShuffleClient.java:100)
 at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:141)
 at org.apache.spark.network.shuffle.RetryingBlockFetcher.lambda$initiateRetry$0(RetryingBlockFetcher.java:169)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
 at java.lang.Thread.run(Thread.java:748)
2020-07-20 12:07:57,700 [Block Fetch Retry-9-2] ERROR org.apache.spark.network.shuffle.RetryingBlockFetcher - Exception while beginning fetch of 1 outstanding blocks (after 2 retries)
java.io.IOException: Connecting to /redact.242:7337 timed out (12 ms)
 at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:243)
 at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
 at org.apache.spark.network.shuffle.ExternalShuffleClient.lambda$fetchBlocks$0(ExternalShuffleClient.java:100)
 at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:141)
 at org.apache.spark.network.shuffle.RetryingBlockFetcher.lambda$initiateRetry$0(RetryingBlockFetcher.java:169)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
 at java.lang.Thread.run(Thread.java:748)
2020-07-20 12:05:52,697 [Block Fetch Retry-9-1] ERROR org.apache.spark.network.shuffle.RetryingBlockFetcher - Exception while beginning fetch of 1 outstanding blocks (after 1 retries)
java.io.IOException: Connecting to /redact.242:7337 timed out (12 ms)
 at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:243)
 at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
 at org.apache.spark.network.shuffle.ExternalShuffleClient.lambda$fetchBlocks$0(ExternalShuffleClient.java:100)
 at org.apache.spark.network.shuffle.RetryingBlockFetcher.fetchAllOutstanding(RetryingBlockFetcher.java:141)
 at org.apache.spark.network.shuffle.RetryingBlockFetcher.lambda$initiateRetry$0(RetryingBlockFetcher.java:169)
 at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at
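The behavior the report asks for can be sketched as a retry loop that consults the master's record of deregistered workers and gives up early, instead of blindly exhausting a fixed retry count. The Python below is purely illustrative and is not Spark's actual `RetryingBlockFetcher`; the `fetch` callable, the `deregistered_at` map, and the 120-second cutoff are assumptions taken from the report's "within 2 minutes" expectation.

```python
import time

# Hypothetical sketch (NOT Spark's RetryingBlockFetcher): abort fetch retries
# once the target worker has been deregistered from the cluster for longer
# than a cutoff, rather than retrying a dead peer for another 10 minutes.

DEREGISTER_CUTOFF_SECS = 120  # "stop retrying within 2 minutes", per the report


def fetch_with_retry(fetch, host, deregistered_at, max_retries=3,
                     retry_wait_secs=5, now=time.time):
    """Try fetch(host) up to max_retries times, but give up immediately if
    the master deregistered `host` more than DEREGISTER_CUTOFF_SECS ago.

    deregistered_at: dict host -> epoch seconds when the master removed it;
    a host that is still registered is absent from the dict.
    """
    for attempt in range(1, max_retries + 1):
        gone_since = deregistered_at.get(host)
        if gone_since is not None and now() - gone_since > DEREGISTER_CUTOFF_SECS:
            # Worker left the cluster long ago: fail fast instead of retrying.
            raise IOError(f"{host} deregistered {now() - gone_since:.0f}s ago; aborting fetch")
        try:
            return fetch(host)
        except IOError:
            if attempt == max_retries:
                raise
            time.sleep(retry_wait_secs)
```

A caller would populate `deregistered_at` from the same event that logs "Removing worker ... because we got no heartbeat", so retries against a long-gone host fail in milliseconds rather than minutes.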
[jira] [Updated] (SPARK-32197) 'Spark driver' stays running even though 'spark application' has FAILED
[ https://issues.apache.org/jira/browse/SPARK-32197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated SPARK-32197: - Description: App failed in 6 minutes; driver has been stuck for > 8 hours. I would expect the driver to fail if the app fails. Thread dump from jstack (on the driver pid) attached (j1.out). Last part of stdout driver log attached (full log is 23MB; stderr log just has the launch command). Last part of app logs attached. Can see that the "org.apache.spark.util.ShutdownHookManager - Shutdown hook called" line never appears in the driver log after "org.apache.spark.SparkContext - Successfully stopped SparkContext". Using Spark 2.4.6 in standalone mode; spark-submit to the REST API (port 6066) in cluster mode was used. Other drivers/apps have worked fine with this setup, just this one getting stuck. My cluster has 1 EC2 dedicated as Spark master and 1 spot EC2 dedicated as Spark worker. They can auto heal/spot terminate at any time. From checking AWS logs: the worker was terminated at 01:53:38. I think you can replicate this by tearing down the worker machine while an app is running. You might have to try several times. Similar to https://issues.apache.org/jira/browse/SPARK-24617 I raised before.
[jira] [Updated] (SPARK-32197) 'Spark driver' stays running even though 'spark application' has FAILED
[ https://issues.apache.org/jira/browse/SPARK-32197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated SPARK-32197: - Description: App failed in 6 minutes; driver has been stuck for > 8 hours. I would expect the driver to fail if the app fails. Thread dump from jstack (on the driver pid) attached (j1.out). Last part of stdout driver log attached (full log is 23MB; stderr log just has the launch command). Last part of app logs attached. Can see that the "org.apache.spark.util.ShutdownHookManager - Shutdown hook called" line never appears in the driver log after "org.apache.spark.SparkContext - Successfully stopped SparkContext". Using Spark 2.4.6 in standalone mode; spark-submit to the REST API (port 6066) in cluster mode was used. Other drivers/apps have worked fine with this setup, just this one getting stuck. My cluster has 1 EC2 dedicated as Spark master and 1 spot EC2 dedicated as Spark worker. They can auto heal/spot terminate at any time. From checking AWS logs: the worker was terminated at 01:53:38. I think you can replicate this by tearing down the worker machine while an app is running. You might have to try several times. Similar to https://issues.apache.org/jira/browse/SPARK-24617 I raised before.
[jira] [Updated] (SPARK-32197) 'Spark driver' stays running even though 'spark application' has FAILED
[ https://issues.apache.org/jira/browse/SPARK-32197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated SPARK-32197: - Attachment: app_executors.png failed_stages.png failed1.png stuckdriver.png failedapp.png
[jira] [Created] (SPARK-32197) 'Spark driver' stays running even though 'spark application' has FAILED
t oo created SPARK-32197: Summary: 'Spark driver' stays running even though 'spark application' has FAILED Key: SPARK-32197 URL: https://issues.apache.org/jira/browse/SPARK-32197 Project: Spark Issue Type: Bug Components: Scheduler, Spark Core Affects Versions: 2.4.6 Reporter: t oo Attachments: applog.txt, driverlog.txt, j1.out App failed in 6 minutes; driver has been stuck for > 8 hours. I would expect the driver to fail if the app fails. Thread dump from jstack (on the driver pid) attached (j1.out). Last part of stdout driver log attached (full log is 23MB; stderr log just has the launch command). Last part of app logs attached. Using Spark 2.4.6 in standalone mode; spark-submit to the REST API (port 6066) in cluster mode was used. Other drivers/apps have worked fine with this setup, just this one getting stuck. My cluster has 1 EC2 dedicated as Spark master and 1 spot EC2 dedicated as Spark worker. They can auto heal/spot terminate at any time. From checking AWS logs: the worker was terminated at 01:53:38. I think you can replicate this by tearing down the worker machine while an app is running. You might have to try several times. Similar to https://issues.apache.org/jira/browse/SPARK-24617 I raised before.
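One mitigation consistent with the symptom described (SparkContext successfully stopped, but the "Shutdown hook called" line never appears and the driver process hangs) is a watchdog that force-exits the driver if shutdown hooks do not run within a grace period after the context stops. The Python sketch below is purely illustrative: `start_exit_watchdog`, `on_context_stopped`, and `shutdown_hook_ran` are invented names, and a real driver would wire these to `SparkContext.stop()` and an actual shutdown hook.

```python
import os
import threading

# Hypothetical watchdog sketch (not a Spark API): after the context stops,
# arm a timer that force-exits the process unless a real shutdown hook runs
# first and cancels it. Prevents a driver from hanging for hours after its
# application has already FAILED.


def start_exit_watchdog(grace_secs=60.0, force_exit=lambda: os._exit(1)):
    """Return (on_context_stopped, shutdown_hook_ran) callables.

    on_context_stopped(): call when the context has stopped; arms the timer.
    shutdown_hook_ran(): call from a shutdown hook; cancels the force-exit.
    """
    timer_box = {}

    def on_context_stopped():
        t = threading.Timer(grace_secs, force_exit)
        t.daemon = True  # never keep the process alive on our account
        t.start()
        timer_box["t"] = t

    def shutdown_hook_ran():
        t = timer_box.get("t")
        if t is not None:
            t.cancel()

    return on_context_stopped, shutdown_hook_ran
```

The design point is that the watchdog only fires on the pathological path; a healthy shutdown cancels the timer and exits normally.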
[jira] [Updated] (SPARK-32197) 'Spark driver' stays running even though 'spark application' has FAILED
[ https://issues.apache.org/jira/browse/SPARK-32197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated SPARK-32197: - Attachment: applog.txt j1.out driverlog.txt
[jira] [Created] (SPARK-32040) Idle cores not being allocated
t oo created SPARK-32040: Summary: Idle cores not being allocated Key: SPARK-32040 URL: https://issues.apache.org/jira/browse/SPARK-32040 Project: Spark Issue Type: Bug Components: Scheduler Affects Versions: 2.4.5 Reporter: t oo

Background: I have a cluster (2.4.5) using standalone mode, orchestrated by Nomad jobs running on EC2. We deploy a Scala web server as a long-running jar via `spark-submit` in client mode. Sometimes we get into a state where an application ends up with 0 cores because our in-house autoscaler scales down and kills workers without checking whether any of the cores on the worker were allocated to existing applications. These applications then end up with 0 cores, even though there are healthy workers in the cluster. Only if I submit a new application or register a new worker in the cluster will the master finally reallocate cores to the application. This is problematic, because the long-running 0-core application is stuck.

Could this be related to the fact that `schedule()` is only triggered by new workers / new applications, as commented here? [https://github.com/apache/spark/blob/v2.4.5/core/src/main/scala/org/apache/spark/deploy/master/Master.scala#L721-L724]

If that is the case, should the master be calling `schedule()` when removing workers after calling `timeOutWorkers()`? [https://github.com/apache/spark/blob/v2.4.5/core/src/main/scala/org/apache/spark/deploy/master/Master.scala#L417]

The downscaling causes me to see this in my logs, so I am fairly certain `timeOutWorkers()` is being called:

```
20/06/08 11:40:56 INFO Master: Application app-20200608114056-0006 requested to set total executors to 1.
20/06/08 11:40:56 INFO Master: Launching executor app-20200608114056-0006/0 on worker worker-20200608113523--7077
20/06/08 11:41:44 WARN Master: Removing worker-20200608113523--7077 because we got no heartbeat in 60 seconds
20/06/08 11:41:44 INFO Master: Removing worker worker-20200608113523--7077 on :7077
20/06/08 11:41:44 INFO Master: Telling app of lost executor: 0
20/06/08 11:41:44 INFO Master: Telling app of lost worker: worker-20200608113523-10.158.242.213-7077
```
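The scheduling gap described in the report can be modeled with a toy master. This is an illustrative Python sketch, not Spark's `Master.scala`: `ToyMaster` and its single-core allocation rule are made up, and it deliberately reduces worker loss to "the app loses all its cores". The point it demonstrates is that if `schedule()` only runs on new-worker/new-app events, an app stripped of its cores by a worker timeout stays at 0 cores forever; calling `schedule()` at the end of the timeout path (the report's proposed fix) repairs it.

```python
# Toy model of the reported scheduling gap (NOT Spark's Master). schedule()
# is triggered by register_worker/submit_app, mirroring the real trigger
# points; fix_applied=True adds the proposed schedule() call after a worker
# times out.


class ToyMaster:
    def __init__(self, fix_applied=False):
        self.workers = {}            # worker name -> free cores
        self.apps = {}               # app name -> allocated cores
        self.fix_applied = fix_applied

    def schedule(self):
        # Give one core to any app that currently has none.
        for app, cores in self.apps.items():
            if cores == 0:
                for w, free in self.workers.items():
                    if free > 0:
                        self.workers[w] -= 1
                        self.apps[app] += 1
                        break

    def register_worker(self, name, cores):
        self.workers[name] = cores
        self.schedule()              # scheduling IS triggered here...

    def submit_app(self, name):
        self.apps[name] = 0
        self.schedule()              # ...and here, but nowhere else

    def time_out_worker(self, name):
        self.workers.pop(name, None)
        for app in self.apps:        # simplification: executors are lost
            self.apps[app] = 0
        if self.fix_applied:
            self.schedule()          # proposed fix: reschedule after removal
```

Without the fix, the app sits at 0 cores even though a healthy worker has free capacity, until some unrelated event happens to trigger `schedule()`; with it, cores are reallocated immediately.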
[jira] [Commented] (SPARK-29037) [Core] Spark gives duplicate result when an application was killed and rerun
[ https://issues.apache.org/jira/browse/SPARK-29037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17095434#comment-17095434 ] t oo commented on SPARK-29037: -- With spark 2.3.4 and hadoop 2.8.5 I am facing this doing a simple Overwrite mode (not dynamic overwrite) and "spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version" = 2. Scenario: 1. successful spark run = single data; 2. successful spark run = single data; 3. aborted spark run during write stage; 4. successful spark run = duplicate data; 5. successful spark run = triple data... etc. > [Core] Spark gives duplicate result when an application was killed and rerun > > > Key: SPARK-29037 > URL: https://issues.apache.org/jira/browse/SPARK-29037 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.1.0, 2.3.3 >Reporter: feiwang >Priority: Major > Attachments: screenshot-1.png > > > For InsertIntoHadoopFsRelation operations. > Case A: > Application appA insert overwrite table table_a with static partition > overwrite. > But it was killed when committing tasks, because one task hangs. > And part of its committed task output is kept under > /path/table_a/_temporary/0/. > Then we rerun appA. It will reuse the staging dir /path/table_a/_temporary/0/. > It executes successfully. > But it also commits the data left behind by the killed application to the destination dir. > Case B: > Application appA insert overwrite table table_a. > Application appB insert overwrite table table_a, too. > They execute concurrently, and they may both use /path/table_a/_temporary/0/ > as workPath. > And their result may be corrupted.
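Why reusing the shared staging directory duplicates data can be shown with a toy commit protocol. The sketch below is an illustrative Python model, not Hadoop's actual `FileOutputCommitter`: `run_job` and the file names are invented. It captures the mechanism from Case A: job commit moves *everything* under `_temporary/0` into the destination, so part files left behind by a killed attempt get committed by the next successful run.

```python
import os
import shutil

# Toy model of a FileOutputCommitter-style job commit (NOT Hadoop's code):
# tasks write into a staging dir shared across job attempts, and job commit
# promotes every staged file into the destination. A killed run leaves its
# staged files behind; the rerun reuses the same staging dir and commits
# them too, producing duplicate data.


def run_job(dest, files, crash_before_commit=False):
    staging = os.path.join(dest, "_temporary", "0")
    os.makedirs(staging, exist_ok=True)     # REUSES leftovers from prior runs
    for name, data in files:
        with open(os.path.join(staging, name), "w") as f:
            f.write(data)
    if crash_before_commit:
        return                              # killed: staged files remain
    for name in os.listdir(staging):        # commit: promote ALL staged files
        shutil.move(os.path.join(staging, name), os.path.join(dest, name))
    shutil.rmtree(os.path.join(dest, "_temporary"))
```

Running an aborted attempt followed by a successful rerun leaves both attempts' part files in the destination, which is exactly the "single data, then duplicate data" progression the comment describes; a per-attempt (unique) staging directory avoids it.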
[jira] [Commented] (SPARK-24617) Spark driver not requesting another executor once original executor exits due to 'lost worker'
[ https://issues.apache.org/jira/browse/SPARK-24617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17048627#comment-17048627 ] t oo commented on SPARK-24617: -- Same problem in spark 2.3.4. > Spark driver not requesting another executor once original executor exits due > to 'lost worker' > -- > > Key: SPARK-24617 > URL: https://issues.apache.org/jira/browse/SPARK-24617 > Project: Spark > Issue Type: Bug > Components: Scheduler >Affects Versions: 2.1.1 >Reporter: t oo >Priority: Major > Labels: bulk-closed > > I am running Spark v2.1.1 in 'standalone' mode (no yarn/mesos) across EC2s. I > have 1 master EC2 that acts as the driver (since spark-submit is called on > this host), spark.master is set up, and deploy mode is client (so spark-submit only > returns a return code to the putty window once it finishes processing). I have > 1 worker EC2 that is registered with the Spark master. When I run spark-submit > on the master, I can see in the Web UI that executors start on the worker > and I can verify successful completion. However, if while spark-submit is > running the worker EC2 gets terminated and a new EC2 worker becomes > alive 3 minutes later and registers with the master, I have noticed in the Web UI > that it shows 'cannot find address' in the executor status but the driver > keeps waiting forever (2 days later I kill it), or in some cases the driver > allocates tasks to the new worker only 5 hours later and then completes! Is > there some setting I am missing that would explain this behavior?
[jira] [Commented] (SPARK-30561) start spark applications without a 30second startup penalty
[ https://issues.apache.org/jira/browse/SPARK-30561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045015#comment-17045015 ] t oo commented on SPARK-30561: -- [~rxin] looks like you committed val SLEEP_TIME_SECS = 5 in BlockManager, you don't happen to recall why?
[jira] [Commented] (SPARK-30561) start spark applications without a 30second startup penalty
[ https://issues.apache.org/jira/browse/SPARK-30561?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17045013#comment-17045013 ] t oo commented on SPARK-30561: -- [~pwendell] looks like you added the sleep(5000), you don't happen to recall why?
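The fixed `Thread.sleep(5000)`-style polling loops this ticket lists are the usual source of such startup penalties. One common replacement, sketched below as an assumption rather than anything Spark actually adopted, is polling with a short initial wait and exponential backoff: a condition that becomes true in tens of milliseconds is observed almost immediately, while a slow one still backs off to a long interval. `wait_until` and its parameters are invented names.

```python
import time

# Illustrative backoff-polling sketch (an assumption, not a Spark patch):
# replaces a fixed "sleep 5s then re-check" loop. The first check happens
# immediately; subsequent waits grow geometrically up to max_wait, and the
# final wait is clamped so the overall timeout is respected.


def wait_until(ready, first_wait=0.05, factor=2.0, max_wait=5.0, timeout=30.0,
               sleep=time.sleep, now=time.monotonic):
    """Poll ready() with exponential backoff; True once ready, False on timeout."""
    deadline = now() + timeout
    wait = first_wait
    while now() < deadline:
        if ready():
            return True
        sleep(min(wait, max(0.0, deadline - now())))
        wait = min(wait * factor, max_wait)
    return ready()
```

With a 50 ms initial wait, a driver that registers 300 ms after submission is seen after roughly 0.35 s of polling instead of waiting out a full 5-second sleep.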
[jira] [Commented] (SPARK-30610) spark worker graceful shutdown
[ https://issues.apache.org/jira/browse/SPARK-30610?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17044974#comment-17044974 ] t oo commented on SPARK-30610: -- From [~holdenkarau]: In its present state it could help, you'd call decom instead of stop. But we'd probably want to see the last step to fully consider https://issues.apache.org/jira/browse/SPARK-30610 solved; right now it won't schedule any new jobs but won't exit & shut down automatically (you can use a timer and sort of approximate it, but it's not perfect).
[jira] [Created] (SPARK-30610) spark worker graceful shutdown
t oo created SPARK-30610: Summary: spark worker graceful shutdown Key: SPARK-30610 URL: https://issues.apache.org/jira/browse/SPARK-30610 Project: Spark Issue Type: Improvement Components: Scheduler Affects Versions: 2.4.4 Reporter: t oo
This is not about Spark Streaming; it is about regular batch jobs submitted with spark-submit that may read a large CSV (100+ GB) and write it out as Parquet. In an autoscaling cluster it would be nice to be able to scale down (i.e. terminate) EC2 instances without slowing down active Spark applications. For example:
1. Start a Spark cluster with 8 EC2 instances.
2. Submit 6 Spark apps.
3. One app completes, so 5 apps are still running.
4. The cluster can now scale down by 1 instance (to save money), but we don't want the apps running on the soon-to-be-terminated instance to have to restart their CSV reads, RDD processing steps, etc. from the beginning on other instances' executors. Instead we want a 'graceful shutdown' command so that the 8th instance accepts no new work (i.e. no new executors are started on it) but finishes the executors already launched on it, then exits the worker process; the instance can then be terminated.
I thought stop-slave.sh could do this, but it looks like it just kills the pid.
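For readers landing here later: Spark 3.1 added a decommissioning mode (the "decom" mentioned in the comments on this issue) that covers much of this request. A minimal spark-defaults.conf sketch follows; the property names are from memory of the 3.1+ configuration docs and should be verified against your Spark version before use.

```properties
# Let workers/executors drain gracefully instead of dying hard on shutdown.
spark.decommission.enabled                        true
# Migrate cached RDD blocks and shuffle data off the draining node.
spark.storage.decommission.enabled                true
spark.storage.decommission.rddBlocks.enabled      true
spark.storage.decommission.shuffleBlocks.enabled  true
```

On standalone, decommissioning is triggered by sending the worker process a signal (SIGPWR on Linux) rather than the hard kill that stop-slave.sh performs.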
[jira] [Created] (SPARK-30561) start spark applications without a 30second startup penalty
t oo created SPARK-30561: Summary: start spark applications without a 30second startup penalty Key: SPARK-30561 URL: https://issues.apache.org/jira/browse/SPARK-30561 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 2.4.4 Reporter: t oo
See https://stackoverflow.com/questions/57610138/how-to-start-spark-applications-without-a-30second-startup-penalty (using Spark standalone). There are several sleeps that could be removed, found with:
grep -i 'sleep(' -R * | grep -v 'src/test/' | grep -E '^core' | grep -ivE 'mesos|yarn|python|HistoryServer|spark/ui/'
core/src/main/scala/org/apache/spark/util/Clock.scala: Thread.sleep(sleepTime)
core/src/main/scala/org/apache/spark/SparkContext.scala: * sc.parallelize(1 to 1, 2).map { i => Thread.sleep(10); i }.count()
core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala: private def delay(secs: Duration = 5.seconds) = Thread.sleep(secs.toMillis)
core/src/main/scala/org/apache/spark/deploy/FaultToleranceTest.scala: Thread.sleep(1000)
core/src/main/scala/org/apache/spark/deploy/Client.scala: Thread.sleep(5000)
core/src/main/scala/org/apache/spark/deploy/master/ui/MasterPage.scala: Thread.sleep(100)
core/src/main/scala/org/apache/spark/deploy/StandaloneResourceUtils.scala: Thread.sleep(duration)
core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala: def sleep(seconds: Int): Unit = (0 until seconds).takeWhile { _ =>
core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala: Thread.sleep(1000)
core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala: sleeper.sleep(waitSeconds)
core/src/main/scala/org/apache/spark/deploy/worker/DriverRunner.scala: def sleep(seconds: Int): Unit
core/src/main/scala/org/apache/spark/deploy/rest/RestSubmissionClient.scala: Thread.sleep(REPORT_DRIVER_STATUS_INTERVAL)
core/src/main/scala/org/apache/spark/scheduler/AsyncEventQueue.scala: Thread.sleep(10)
core/src/main/scala/org/apache/spark/storage/BlockManager.scala: Thread.sleep(SLEEP_TIME_SECS * 1000L)
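Most of the Thread.sleep calls listed above are polling loops, so the startup penalty comes from the poll constant rather than from any real work. The pattern can be sketched in Python (names are mine, this is not Spark's code): with a short, configurable interval, an already-satisfied condition is detected almost immediately instead of paying a fixed multi-second delay.

```python
import time

def wait_until(condition, timeout=30.0, poll=0.05):
    """Poll `condition` until it returns True or `timeout` elapses.

    A short, configurable poll interval keeps the wait proportional to
    the event being waited on; a hardcoded Thread.sleep(5000)-style
    constant makes every caller pay the full constant instead.
    """
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if condition():
            return True
        time.sleep(poll)
    return condition()  # one last check at the deadline

start = time.monotonic()
ok = wait_until(lambda: True)       # condition already true
elapsed = time.monotonic() - start  # detected on the first poll
```

With a 50 ms poll, `elapsed` stays far under the 5-second constants grepped above.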
[jira] [Updated] (SPARK-30560) allow driver to consume a fractional core
[ https://issues.apache.org/jira/browse/SPARK-30560?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated SPARK-30560: - Description: see https://stackoverflow.com/questions/56781927/apache-spark-standalone-scheduler-why-does-driver-need-a-whole-core-in-cluste ; this is to make it possible for a driver to use 0.2 cores rather than a whole core. Standard CPUs, no GPUs. (The previous description was identical minus the "Standard CPUs, no GPUs" line.)
[jira] [Created] (SPARK-30560) allow driver to consume a fractional core
t oo created SPARK-30560: Summary: allow driver to consume a fractional core Key: SPARK-30560 URL: https://issues.apache.org/jira/browse/SPARK-30560 Project: Spark Issue Type: Improvement Components: Scheduler Affects Versions: 2.4.4 Reporter: t oo See https://stackoverflow.com/questions/56781927/apache-spark-standalone-scheduler-why-does-driver-need-a-whole-core-in-cluste ; this is to make it possible for a driver to use 0.2 cores rather than a whole core.
[jira] [Commented] (SPARK-27750) Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service
[ https://issues.apache.org/jira/browse/SPARK-27750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17018544#comment-17018544 ] t oo commented on SPARK-27750: -- Did some digging; let me know if I'm on the right track.
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/master/Master.scala#L785 ---> why the precedence comment? I want the opposite. In private def schedule(): Unit, add:
  rawFreeCores = shuffledAliveWorkers.map(_.coresFree).sum
  cores_reserved_for_apps = *get from config somehow* ie 8
  forDriversFreeCores = math.max(rawFreeCores - cores_reserved_for_apps, 0)
then wrap the inner steps in:
  if (forDriversFreeCores >= driver.desc.cores) { canLaunchDriver }
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/master/Master.scala#L796 (just a note about driver/exec requests)
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/master/Master.scala#L746-L775
> Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service
> Key: SPARK-27750
> URL: https://issues.apache.org/jira/browse/SPARK-27750
> Project: Spark
> Issue Type: New Feature
> Components: Scheduler
> Affects Versions: 3.0.0
> Reporter: t oo
> Priority: Minor
>
> If I submit 1000 spark-submit drivers then they consume all the cores on my cluster (essentially acting like a Denial of Service) and no Spark 'application' gets to run, since the cores are all consumed by the 'drivers'. This feature is about having the ability to prioritize applications over drivers so that at least some 'applications' can start running. I guess it would be like: if (driver.state = 'submitted' and (exists some app.state = 'submitted')) then set app.state = 'running'; if all apps have app.state = 'running' then set driver.state = 'submitted'.
> Secondary to this, why must a driver consume a minimum of 1 entire core?
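The schedule() change proposed in the comment above reduces to a small pure function. A sketch in Python (cores_reserved_for_apps is the hypothetical config from this thread, not an existing Spark setting):

```python
def can_launch_driver(worker_free_cores, driver_cores, cores_reserved_for_apps):
    """Gate a driver launch on the cores left over after reserving
    some for applications, per the proposal in this thread."""
    # Total cores currently free across alive workers.
    raw_free_cores = sum(worker_free_cores)
    # Cores drivers may use after the application reservation.
    for_drivers = max(raw_free_cores - cores_reserved_for_apps, 0)
    return for_drivers >= driver_cores

# 8 free cores, all 8 reserved for apps: a new driver must wait.
blocked = can_launch_driver([4, 4], driver_cores=1, cores_reserved_for_apps=8)
# Only 4 reserved: a 1-core driver fits in the remaining 4.
allowed = can_launch_driver([4, 4], driver_cores=1, cores_reserved_for_apps=4)
```

Setting the reservation to 0 recovers the current behavior, where drivers can starve applications.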
[jira] [Commented] (SPARK-27750) Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service
[ https://issues.apache.org/jira/browse/SPARK-27750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17017448#comment-17017448 ] t oo commented on SPARK-27750: -- Yes, I hit this. 'Spark standalone' is the RM; it has no pools. One way to do it would be a new config, cores_reserved_for_apps=8 (changeable); those 8 cores then could not be consumed by drivers.
[jira] [Comment Edited] (SPARK-27750) Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service
[ https://issues.apache.org/jira/browse/SPARK-27750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989249#comment-16989249 ] t oo edited comment on SPARK-27750 at 1/16/20 6:33 AM: --- WDYT [~Ngone51] [~squito] [~vanzin] [~mgaido] [~jiangxb1987] [~jiangxb] [~zsxwing] [~jlaskowski] [~cloud_fan] [~srowen] [~dongjoon] [~hyukjin.kwon] was (Author: toopt4): WDYT [~Ngone51] [~squito] [~vanzin] [~mgaido] [~jiangxb1987] [~jiangxb] [~zsxwing] [~jlaskowski] [~cloud_fan]
[jira] [Comment Edited] (SPARK-27750) Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service
[ https://issues.apache.org/jira/browse/SPARK-27750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989249#comment-16989249 ] t oo edited comment on SPARK-27750 at 1/15/20 11:36 PM: WDYT [~Ngone51] [~squito] [~vanzin] [~mgaido] [~jiangxb1987] [~jiangxb] [~zsxwing] [~jlaskowski] [~cloud_fan] was (Author: toopt4): WDYT [~Ngone51] [~squito] [~vanzin] [~mgaido] [~jiangxb1987] [~jiangxb] [~zsxwing] [~jlaskowski]
[jira] [Comment Edited] (SPARK-27750) Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service
[ https://issues.apache.org/jira/browse/SPARK-27750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989249#comment-16989249 ] t oo edited comment on SPARK-27750 at 1/11/20 12:59 PM: WDYT [~Ngone51] [~squito] [~vanzin] [~mgaido] [~jiangxb1987] [~jiangxb] [~zsxwing] [~jlaskowski] was (Author: toopt4): bump
[jira] [Reopened] (SPARK-27429) [SQL] to_timestamp function with additional argument flag that will allow exception if value could not be cast
[ https://issues.apache.org/jira/browse/SPARK-27429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo reopened SPARK-27429: -- Presto has both behaviors (https://prestosql.io/docs/current/functions/conditional.html): by default these functions fail on bad data, and to get null on bad data you wrap the call in TRY(). I ingest user data as strings first, then want to conform it to the proper casted data types with SQL, but right now I don't get errors back!
> [SQL] to_timestamp function with additional argument flag that will allow exception if value could not be cast
> Key: SPARK-27429
> URL: https://issues.apache.org/jira/browse/SPARK-27429
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Affects Versions: 2.4.1
> Reporter: t oo
> Priority: Major
>
> If I am running SQL on a CSV-based dataframe and my query has to_timestamp(input_col, '-MM-dd HH:mm:ss'), and the values in input_col are not really timestamps (e.g. 'ABC'), then I would like the to_timestamp function to throw an exception rather than happily (silently) return the values as null.
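The fail-vs-TRY() split requested here can be illustrated outside Spark with a plain-Python stand-in (this is not Spark's to_timestamp; also, in Spark 3.x enabling spark.sql.ansi.enabled reportedly makes invalid casts raise, which is worth checking for this use case):

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"  # Python stand-in for a 'yyyy-MM-dd HH:mm:ss' pattern

def to_timestamp_strict(value, fmt=FMT):
    # Requested behavior: malformed input raises instead of becoming null.
    return datetime.strptime(value, fmt)

def try_to_timestamp(value, fmt=FMT):
    # Presto-style TRY(): malformed input yields None (SQL NULL) instead.
    try:
        return to_timestamp_strict(value, fmt)
    except ValueError:
        return None
```

With both available, silent nulls become an explicit opt-in (`try_to_timestamp`) rather than the only behavior.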
[jira] [Commented] (SPARK-27491) SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty response! therefore Airflow won't integrate with Spark 2.3.x
[ https://issues.apache.org/jira/browse/SPARK-27491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17003393#comment-17003393 ] t oo commented on SPARK-27491: -- [~skonto] curl works curl http://spark-master.corp.com:6066/v1/submissions/status/ but spark-submit --status does not > SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty > response! therefore Airflow won't integrate with Spark 2.3.x > -- > > Key: SPARK-27491 > URL: https://issues.apache.org/jira/browse/SPARK-27491 > Project: Spark > Issue Type: Bug > Components: Java API, Scheduler, Spark Core, Spark Shell, Spark > Submit >Affects Versions: 2.3.3, 2.4.4 >Reporter: t oo >Priority: Major > > This issue must have been introduced after Spark 2.1.1 as it is working in > that version. This issue is affecting me in Spark 2.3.3/2.3.0. I am using > spark standalone mode if that makes a difference. > See below spark 2.3.3 returns empty response while 2.1.1 returns a response. > > Spark 2.1.1: > [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home1/bin/spark-class > org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status > driver-20190417130324-0009 > + export SPARK_HOME=/home/ec2here/spark_home1 > + SPARK_HOME=/home/ec2here/spark_home1 > + '[' -z /home/ec2here/spark_home1 ']' > + . /home/ec2here/spark_home1/bin/load-spark-env.sh > ++ '[' -z /home/ec2here/spark_home1 ']' > ++ '[' -z '' ']' > ++ export SPARK_ENV_LOADED=1 > ++ SPARK_ENV_LOADED=1 > ++ parent_dir=/home/ec2here/spark_home1 > ++ user_conf_dir=/home/ec2here/spark_home1/conf > ++ '[' -f /home/ec2here/spark_home1/conf/spark-env.sh ']' > ++ set -a > ++ . 
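Since the :6066 REST endpoint responds correctly (the curl in the comment above works), one workaround is to query it directly and parse the JSON instead of going through spark-submit --status. A sketch using the response shape from the trace quoted in this issue; the live-request part is commented out and the master hostname in it is the example one from this thread:

```python
import json
from urllib.request import urlopen  # for the live call sketched below

# Response body as returned by the standalone master's REST endpoint
# (copied from the spark-class trace quoted in this issue).
body = """{
  "action" : "SubmissionStatusResponse",
  "driverState" : "FAILED",
  "serverSparkVersion" : "2.3.3",
  "submissionId" : "driver-20190417130324-0009",
  "success" : true,
  "workerHostPort" : "x.y.211.40:11819",
  "workerId" : "worker-20190417115840-x.y.211.40-11819"
}"""

status = json.loads(body)

# Against a live master, the same parse applies:
#   with urlopen("http://spark-master.corp.com:6066/v1/submissions/status/"
#                "driver-20190417130324-0009") as resp:
#       status = json.loads(resp.read())
```

This is essentially what Airflow needs from --status: the driverState field.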
/home/ec2here/spark_home1/conf/spark-env.sh > +++ export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64 > +++ JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64 > ulimit -n 1048576 > ++ set +a > ++ '[' -z '' ']' > ++ ASSEMBLY_DIR2=/home/ec2here/spark_home1/assembly/target/scala-2.11 > ++ ASSEMBLY_DIR1=/home/ec2here/spark_home1/assembly/target/scala-2.10 > ++ [[ -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ]] > ++ '[' -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ']' > ++ export SPARK_SCALA_VERSION=2.10 > ++ SPARK_SCALA_VERSION=2.10 > + '[' -n /usr/lib/jvm/jre-1.8.0-openjdk.x86_64 ']' > + RUNNER=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java > + '[' -d /home/ec2here/spark_home1/jars ']' > + SPARK_JARS_DIR=/home/ec2here/spark_home1/jars > + '[' '!' -d /home/ec2here/spark_home1/jars ']' > + LAUNCH_CLASSPATH='/home/ec2here/spark_home1/jars/*' > + '[' -n '' ']' > + [[ -n '' ]] > + CMD=() > + IFS= > + read -d '' -r ARG > ++ build_command org.apache.spark.deploy.SparkSubmit --master > spark://domainhere:6066 --status driver-20190417130324-0009 > ++ /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -Xmx128m -cp > '/home/ec2here/spark_home1/jars/*' org.apache.spark.launcher.Main > org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status > driver-20190417130324-0009 > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > ++ printf '%d\0' 0 > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > + COUNT=10 > + LAST=9 > + LAUNCHER_EXIT_CODE=0 > + [[ 0 =~ ^[0-9]+$ ]] > + '[' 0 '!=' 0 ']' > + CMD=("${CMD[@]:0:$LAST}") > + exec 
/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -cp > '/home/ec2here/spark_home1/conf/:/home/ec2here/spark_home1/jars/*' -Xmx2048m > org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status > driver-20190417130324-0009 > Using Spark's default log4j profile: > org/apache/spark/log4j-defaults.properties > 19/04/17 14:03:27 INFO RestSubmissionClient: Submitting a request for the > status of submission driver-20190417130324-0009 in spark://domainhere:6066. > 19/04/17 14:03:28 INFO RestSubmissionClient: Server responded with > SubmissionStatusResponse: > { > "action" : "SubmissionStatusResponse", > "driverState" : "FAILED", > "serverSparkVersion" : "2.3.3", > "submissionId" : "driver-20190417130324-0009", > "success" : true, > "workerHostPort" : "x.y.211.40:11819", > "workerId" : "worker-20190417115840-x.y.211.40-11819" > } > [ec2here@ip-x-y-160-225 ~]$ > > Spark 2.3.3: > [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home/bin/spark-class > org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status >
[jira] [Updated] (SPARK-27491) SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty response! therefore Airflow won't integrate with Spark 2.3.x
[ https://issues.apache.org/jira/browse/SPARK-27491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated SPARK-27491: - Affects Version/s: 2.4.4
> SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty response! therefore Airflow won't integrate with Spark 2.3.x
> Key: SPARK-27491
> URL: https://issues.apache.org/jira/browse/SPARK-27491
> Project: Spark
> Issue Type: Bug
> Components: Java API, Scheduler, Spark Core, Spark Shell, Spark Submit
> Affects Versions: 2.3.3, 2.4.4
> Reporter: t oo
> Priority: Major
[jira] [Updated] (SPARK-30251) faster way to read csv.gz?
[ https://issues.apache.org/jira/browse/SPARK-30251?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated SPARK-30251: - Description: Some data providers deliver files as csv.gz (e.g. 1gb compressed that is 25gb uncompressed; 5gb compressed that is 130gb uncompressed; 0.1gb compressed that is 2.5gb uncompressed). When I tell my boss that the famous big data tool Spark takes 16hrs to convert the 1gb compressed file to parquet, there is a look of shock. This is batch data we receive daily (80gb compressed, 2tb uncompressed every day, spread across ~300 files). I know gz is not splittable, so it ends up loaded on a single worker, but we don't have the space/patience to do a pre-conversion to bz2 or uncompressed. Can Spark have a better codec? I saw posts mentioning that even plain Python is faster than Spark.
https://stackoverflow.com/questions/40492967/dealing-with-a-large-gzipped-file-in-spark
https://github.com/nielsbasjes/splittablegzip
(The previous description was identical except it said "currently loaded on single worker".)
[jira] [Created] (SPARK-30251) faster way to read csv.gz?
t oo created SPARK-30251: Summary: faster way to read csv.gz? Key: SPARK-30251 URL: https://issues.apache.org/jira/browse/SPARK-30251 Project: Spark Issue Type: New Feature Components: Spark Core Affects Versions: 2.4.4 Reporter: t oo Some data providers deliver files as csv.gz (e.g. 1gb compressed that is 25gb uncompressed; 5gb compressed that is 130gb uncompressed; 0.1gb compressed that is 2.5gb uncompressed). When I tell my boss that the famous big data tool Spark takes 16hrs to convert the 1gb compressed file to parquet, there is a look of shock. This is batch data we receive daily (80gb compressed, 2tb uncompressed every day, spread across ~300 files). I know gz is not splittable, so it is currently loaded on a single worker, but we don't have the space/patience to do a pre-conversion to bz2 or uncompressed. Can Spark have a better codec? I saw posts mentioning that even plain Python is faster than Spark.
https://stackoverflow.com/questions/40492967/dealing-with-a-large-gzipped-file-in-spark
https://github.com/nielsbasjes/splittablegzip
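The "gz is not splittable" point can be demonstrated with nothing but the standard library: a gzip stream is only decodable from byte 0, so a worker handed the second half of the file has nothing it can decode. (As I understand the linked splittablegzip codec, it works around this by having every split re-read the stream from the start and discard data before its offset, trading repeated decompression CPU for parallelism; check that repo's README for the exact hookup.)

```python
import gzip
import zlib

payload = b"col1,col2\n" * 100_000   # stand-in for a CSV
blob = gzip.compress(payload)

# Decoding from byte 0 works fine.
roundtrip_ok = gzip.decompress(blob) == payload

# A reader dropped into the middle of the stream cannot resynchronize:
# DEFLATE blocks are not self-describing, and the gzip header exists
# only at the start of the stream.
try:
    zlib.decompress(blob[len(blob) // 2:], wbits=31)  # wbits=31: gzip framing
    mid_ok = True
except zlib.error:
    mid_ok = False
```

bz2 and uncompressed text avoid this because readers can find a block or line boundary from an arbitrary offset.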
[jira] [Commented] (SPARK-26346) Upgrade parquet to 1.11.0
[ https://issues.apache.org/jira/browse/SPARK-26346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16991165#comment-16991165 ] t oo commented on SPARK-26346: -- Released: https://www.apache.org/dist/parquet/apache-parquet-1.11.0/ and http://mail-archives.apache.org/mod_mbox/parquet-dev/201912.mbox/browser
> Upgrade parquet to 1.11.0
> Key: SPARK-26346
> URL: https://issues.apache.org/jira/browse/SPARK-26346
> Project: Spark
> Issue Type: Sub-task
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Yuming Wang
> Priority: Major
[jira] [Commented] (SPARK-30162) Filter is not being pushed down for Parquet files
[ https://issues.apache.org/jira/browse/SPARK-30162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990725#comment-16990725 ] t oo commented on SPARK-30162: -- did u try on scala sparkshell? > Filter is not being pushed down for Parquet files > - > > Key: SPARK-30162 > URL: https://issues.apache.org/jira/browse/SPARK-30162 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 3.0.0 > Environment: pyspark 3.0 preview > Ubuntu/Centos > pyarrow 0.14.1 >Reporter: Nasir Ali >Priority: Major > > Filters are not pushed down in Spark 3.0 preview. Also the output of > "explain" method is different. It is hard to debug in 3.0 whether filters > were pushed down or not. Below code could reproduce the bug: > > {code:java} > // code placeholder > df = spark.createDataFrame([("usr1",17.00, "2018-03-10T15:27:18+00:00"), > ("usr1",13.00, "2018-03-11T12:27:18+00:00"), > ("usr1",25.00, "2018-03-12T11:27:18+00:00"), > ("usr1",20.00, "2018-03-13T15:27:18+00:00"), > ("usr1",17.00, "2018-03-14T12:27:18+00:00"), > ("usr2",99.00, "2018-03-15T11:27:18+00:00"), > ("usr2",156.00, "2018-03-22T11:27:18+00:00"), > ("usr2",17.00, "2018-03-31T11:27:18+00:00"), > ("usr2",25.00, "2018-03-15T11:27:18+00:00"), > ("usr2",25.00, "2018-03-16T11:27:18+00:00") > ], >["user","id", "ts"]) > df = df.withColumn('ts', df.ts.cast('timestamp')) > df.write.partitionBy("user").parquet("/home/cnali/data/")df2 = > spark.read.load("/home/cnali/data/")df2.filter("user=='usr2'").explain(True) > {code} > {code:java} > // Spark 2.4 output > == Parsed Logical Plan == > 'Filter ('user = usr2) > +- Relation[id#38,ts#39,user#40] parquet== Analyzed Logical Plan == > id: double, ts: timestamp, user: string > Filter (user#40 = usr2) > +- Relation[id#38,ts#39,user#40] parquet== Optimized Logical Plan == > Filter (isnotnull(user#40) && (user#40 = usr2)) > +- Relation[id#38,ts#39,user#40] parquet== Physical Plan == > *(1) FileScan parquet [id#38,ts#39,user#40] Batched: true, Format: 
Parquet, > Location: InMemoryFileIndex[file:/home/cnali/data], PartitionCount: 1, > PartitionFilters: [isnotnull(user#40), (user#40 = usr2)], PushedFilters: [], > ReadSchema: struct{code} > {code:java} > // Spark 3.0.0-preview output > == Parsed Logical Plan == > 'Filter ('user = usr2) > +- RelationV2[id#0, ts#1, user#2] parquet file:/home/cnali/data== Analyzed > Logical Plan == > id: double, ts: timestamp, user: string > Filter (user#2 = usr2) > +- RelationV2[id#0, ts#1, user#2] parquet file:/home/cnali/data== Optimized > Logical Plan == > Filter (isnotnull(user#2) AND (user#2 = usr2)) > +- RelationV2[id#0, ts#1, user#2] parquet file:/home/cnali/data== Physical > Plan == > *(1) Project [id#0, ts#1, user#2] > +- *(1) Filter (isnotnull(user#2) AND (user#2 = usr2)) >+- *(1) ColumnarToRow > +- BatchScan[id#0, ts#1, user#2] ParquetScan Location: > InMemoryFileIndex[file:/home/cnali/data], ReadSchema: > struct > {code} > I have tested it on much larger dataset. Spark 3.0 tries to load whole data > and then apply filter. Whereas Spark 2.4 push down the filter. Above output > shows that Spark 2.4 applied partition filter but not the Spark 3.0 preview. > > Minor: in Spark 3.0 "explain()" output is truncated (maybe fixed length?) and > it's hard to debug. spark.sql.orc.cache.stripe.details.size=1 doesn't > work. > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
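The issue above notes that the truncated `explain()` output makes it hard to tell whether filters were pushed down. One way around that is to capture the plan text and inspect the `PartitionFilters`/`PushedFilters` lists programmatically. A minimal plain-Python sketch (the helper name is mine, and the plan fragment is taken from the Spark 2.4 output quoted above):

```python
import re

def pushdown_summary(plan_text: str) -> dict:
    """Extract the PartitionFilters/PushedFilters lists from a captured
    explain() plan. Empty brackets mean nothing was pushed down."""
    out = {}
    for key in ("PartitionFilters", "PushedFilters"):
        m = re.search(key + r":\s*\[(.*?)\]", plan_text)
        out[key] = m.group(1).strip() if m else None
    return out

# Plan fragment from the Spark 2.4 output quoted in the issue
plan_24 = ("FileScan parquet [id#38,ts#39,user#40] Batched: true, "
           "PartitionFilters: [isnotnull(user#40), (user#40 = usr2)], "
           "PushedFilters: [], ReadSchema: struct")
summary = pushdown_summary(plan_24)
assert summary["PartitionFilters"] != ""   # partition filter was applied
assert summary["PushedFilters"] == ""      # no data filters were pushed
```

In PySpark the plan text can be obtained by redirecting stdout around `df.explain(True)`, or via the underlying `df._jdf.queryExecution()` (an internal, unsupported accessor).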
[jira] [Commented] (SPARK-27623) Provider org.apache.spark.sql.avro.AvroFileFormat could not be instantiated
[ https://issues.apache.org/jira/browse/SPARK-27623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990723#comment-16990723 ] t oo commented on SPARK-27623: -- bump > Provider org.apache.spark.sql.avro.AvroFileFormat could not be instantiated > --- > > Key: SPARK-27623 > URL: https://issues.apache.org/jira/browse/SPARK-27623 > Project: Spark > Issue Type: Bug > Components: PySpark >Affects Versions: 2.4.2 >Reporter: Alexandru Barbulescu >Priority: Major > > After updating to spark 2.4.2 when using the > {code:java} > spark.read.format().options().load() > {code} > > chain of methods, regardless of what parameter is passed to "format" we get > the following error related to avro: > > {code:java} > - .options(**load_options) > - File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/readwriter.py", line > 172, in load > - File "/opt/spark/python/lib/py4j-0.10.7-src.zip/py4j/java_gateway.py", line > 1257, in __call__ > - File "/opt/spark/python/lib/pyspark.zip/pyspark/sql/utils.py", line 63, in > deco > - File "/opt/spark/python/lib/py4j-0.10.7-src.zip/py4j/protocol.py", line > 328, in get_return_value > - py4j.protocol.Py4JJavaError: An error occurred while calling o69.load. 
> - : java.util.ServiceConfigurationError: > org.apache.spark.sql.sources.DataSourceRegister: Provider > org.apache.spark.sql.avro.AvroFileFormat could not be instantiated > - at java.util.ServiceLoader.fail(ServiceLoader.java:232) > - at java.util.ServiceLoader.access$100(ServiceLoader.java:185) > - at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:384) > - at java.util.ServiceLoader$LazyIterator.next(ServiceLoader.java:404) > - at java.util.ServiceLoader$1.next(ServiceLoader.java:480) > - at > scala.collection.convert.Wrappers$JIteratorWrapper.next(Wrappers.scala:44) > - at scala.collection.Iterator.foreach(Iterator.scala:941) > - at scala.collection.Iterator.foreach$(Iterator.scala:941) > - at scala.collection.AbstractIterator.foreach(Iterator.scala:1429) > - at scala.collection.IterableLike.foreach(IterableLike.scala:74) > - at scala.collection.IterableLike.foreach$(IterableLike.scala:73) > - at scala.collection.AbstractIterable.foreach(Iterable.scala:56) > - at scala.collection.TraversableLike.filterImpl(TraversableLike.scala:250) > - at scala.collection.TraversableLike.filterImpl$(TraversableLike.scala:248) > - at scala.collection.AbstractTraversable.filterImpl(Traversable.scala:108) > - at scala.collection.TraversableLike.filter(TraversableLike.scala:262) > - at scala.collection.TraversableLike.filter$(TraversableLike.scala:262) > - at scala.collection.AbstractTraversable.filter(Traversable.scala:108) > - at > org.apache.spark.sql.execution.datasources.DataSource$.lookupDataSource(DataSource.scala:630) > - at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:194) > - at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:167) > - at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) > - at > sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) > - at > sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) > - at 
java.lang.reflect.Method.invoke(Method.java:498) > - at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244) > - at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357) > - at py4j.Gateway.invoke(Gateway.java:282) > - at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132) > - at py4j.commands.CallCommand.execute(CallCommand.java:79) > - at py4j.GatewayConnection.run(GatewayConnection.java:238) > - at java.lang.Thread.run(Thread.java:748) > - Caused by: java.lang.NoClassDefFoundError: > org/apache/spark/sql/execution/datasources/FileFormat$class > - at org.apache.spark.sql.avro.AvroFileFormat.(AvroFileFormat.scala:44) > - at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) > - at > sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62) > - at > sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45) > - at java.lang.reflect.Constructor.newInstance(Constructor.java:423) > - at java.lang.Class.newInstance(Class.java:442) > - at java.util.ServiceLoader$LazyIterator.nextService(ServiceLoader.java:380) > - ... 29 more > - Caused by: java.lang.ClassNotFoundException: > org.apache.spark.sql.execution.datasources.FileFormat$class > - at java.net.URLClassLoader.findClass(URLClassLoader.java:382) > - at java.lang.ClassLoader.loadClass(ClassLoader.java:424) > - at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:349) > - at java.lang.ClassLoader.loadClass(ClassLoader.java:357) > - ... 36 more > {code} > > The code we run looks like
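The root cause in the trace above, `NoClassDefFoundError: org/apache/spark/sql/execution/datasources/FileFormat$class`, is characteristic of a Scala version mismatch: `...$class` trait-implementation classes exist only in Scala 2.11 bytecode, and the Spark 2.4.2 convenience binaries were built against Scala 2.12. Loading a `spark-avro` artifact with the wrong Scala suffix therefore fails at instantiation. A hypothetical fix (the job file name is a placeholder) is to match both the Scala suffix and the Spark version:

```shell
# The _2.12 suffix must match the Scala version of the Spark 2.4.2 build;
# a _2.11 spark-avro jar against a 2.12 Spark triggers the error above.
spark-submit \
  --packages org.apache.spark:spark-avro_2.12:2.4.2 \
  my_job.py
```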
[jira] [Commented] (SPARK-26346) Upgrade parquet to 1.11.0
[ https://issues.apache.org/jira/browse/SPARK-26346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16990722#comment-16990722 ] t oo commented on SPARK-26346: -- bump > Upgrade parquet to 1.11.0 > - > > Key: SPARK-26346 > URL: https://issues.apache.org/jira/browse/SPARK-26346 > Project: Spark > Issue Type: Sub-task > Components: SQL >Affects Versions: 3.0.0 >Reporter: Yuming Wang >Priority: Major > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-26091) Upgrade to 2.3.4 for Hive Metastore Client 2.3
[ https://issues.apache.org/jira/browse/SPARK-26091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989616#comment-16989616 ] t oo commented on SPARK-26091: -- but I don't use 2.3.3 > Upgrade to 2.3.4 for Hive Metastore Client 2.3 > -- > > Key: SPARK-26091 > URL: https://issues.apache.org/jira/browse/SPARK-26091 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Minor > Fix For: 3.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-26091) Upgrade to 2.3.4 for Hive Metastore Client 2.3
[ https://issues.apache.org/jira/browse/SPARK-26091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989461#comment-16989461 ] t oo commented on SPARK-26091: -- I just want to access Hive Metastore 2.3.4 > Upgrade to 2.3.4 for Hive Metastore Client 2.3 > -- > > Key: SPARK-26091 > URL: https://issues.apache.org/jira/browse/SPARK-26091 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Minor > Fix For: 3.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-26091) Upgrade to 2.3.4 for Hive Metastore Client 2.3
[ https://issues.apache.org/jira/browse/SPARK-26091?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989256#comment-16989256 ] t oo commented on SPARK-26091: -- can this go in spark 2.4.5 ? > Upgrade to 2.3.4 for Hive Metastore Client 2.3 > -- > > Key: SPARK-26091 > URL: https://issues.apache.org/jira/browse/SPARK-26091 > Project: Spark > Issue Type: Improvement > Components: SQL >Affects Versions: 3.0.0 >Reporter: Dongjoon Hyun >Assignee: Dongjoon Hyun >Priority: Minor > Fix For: 3.0.0 > > -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-22860) Spark workers log ssl passwords passed to the executors
[ https://issues.apache.org/jira/browse/SPARK-22860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989255#comment-16989255 ] t oo commented on SPARK-22860: -- [~kabhwan] can this go in 2.4.5? > Spark workers log ssl passwords passed to the executors > --- > > Key: SPARK-22860 > URL: https://issues.apache.org/jira/browse/SPARK-22860 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.1.1 >Reporter: Felix K. >Assignee: Jungtaek Lim >Priority: Major > Fix For: 3.0.0 > > > The workers log the spark.ssl.keyStorePassword and > spark.ssl.trustStorePassword passed by cli to the executor processes. The > ExecutorRunner should escape passwords to not appear in the worker's log > files in INFO level. In this example, you can see my 'SuperSecretPassword' in > a worker log: > {code} > 17/12/08 08:04:12 INFO ExecutorRunner: Launch command: > "/global/myapp/oem/jdk/bin/java" "-cp" > "/global/myapp/application/myapp_software/thing_loader_lib/core-repository-model-zzz-1.2.3-SNAPSHOT.jar > [...] 
> :/global/myapp/application/spark-2.1.1-bin-hadoop2.7/jars/*" "-Xmx16384M" > "-Dspark.authenticate.enableSaslEncryption=true" > "-Dspark.ssl.keyStorePassword=SuperSecretPassword" > "-Dspark.ssl.keyStore=/global/myapp/application/config/ssl/keystore.jks" > "-Dspark.ssl.trustStore=/global/myapp/application/config/ssl/truststore.jks" > "-Dspark.ssl.enabled=true" "-Dspark.driver.port=39927" > "-Dspark.ssl.protocol=TLS" > "-Dspark.ssl.trustStorePassword=SuperSecretPassword" > "-Dspark.authenticate=true" "-Dmyapp_IMPORT_DATE=2017-10-30" > "-Dmyapp.config.directory=/global/myapp/application/config" > "-Dsolr.httpclient.builder.factory=com.company.myapp.loader.auth.LoaderConfigSparkSolrBasicAuthConfigurer" > > "-Djavax.net.ssl.trustStore=/global/myapp/application/config/ssl/truststore.jks" > "-XX:+UseG1GC" "-XX:+UseStringDeduplication" > "-Dthings.loader.export.zzz_files=false" > "-Dlog4j.configuration=file:/global/myapp/application/config/spark-executor-log4j.properties" > "-XX:+HeapDumpOnOutOfMemoryError" "-XX:+UseStringDeduplication" > "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" > "spark://CoarseGrainedScheduler@192.168.0.1:39927" "--executor-id" "2" > "--hostname" "192.168.0.1" "--cores" "4" "--app-id" "app-20171208080412-" > "--worker-url" "spark://Worker@192.168.0.1:59530" > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
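The fix SPARK-22860 asks for — masking secret values before the ExecutorRunner logs the launch command — can be sketched with a simple substitution. This is an illustrative model, not Spark's actual implementation (later Spark versions address the general problem with a configurable `spark.redaction.regex`); the property names come from the log excerpt above:

```python
import re

# Mask the value of any -D...password=... system property in a launch
# command line before it reaches the worker log.
SECRET_PROP = re.compile(r'(-D[\w.]*[Pp]assword=)[^"\s]+')

def redact(command_line: str) -> str:
    return SECRET_PROP.sub(r'\1*****(redacted)', command_line)

line = ('"-Dspark.ssl.keyStorePassword=SuperSecretPassword" '
        '"-Dspark.ssl.trustStorePassword=SuperSecretPassword" '
        '"-Dspark.ssl.enabled=true"')
safe = redact(line)
assert "SuperSecretPassword" not in safe
```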
[jira] [Commented] (SPARK-23534) Spark run on Hadoop 3.0.0
[ https://issues.apache.org/jira/browse/SPARK-23534?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989254#comment-16989254 ] t oo commented on SPARK-23534: -- close? > Spark run on Hadoop 3.0.0 > - > > Key: SPARK-23534 > URL: https://issues.apache.org/jira/browse/SPARK-23534 > Project: Spark > Issue Type: Improvement > Components: Build >Affects Versions: 2.3.0 >Reporter: Saisai Shao >Priority: Major > > Major Hadoop vendors already/will step in Hadoop 3.0. So we should also make > sure Spark can run with Hadoop 3.0. This Jira tracks the work to make Spark > run on Hadoop 3.0. > The work includes: > # Add a Hadoop 3.0.0 new profile to make Spark build-able with Hadoop 3.0. > # Test to see if there's dependency issues with Hadoop 3.0. > # Investigating the feasibility to use shaded client jars (HADOOP-11804). -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-24590) Make Jenkins tests passed with hadoop 3 profile
[ https://issues.apache.org/jira/browse/SPARK-24590?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989253#comment-16989253 ] t oo commented on SPARK-24590: -- close? > Make Jenkins tests passed with hadoop 3 profile > --- > > Key: SPARK-24590 > URL: https://issues.apache.org/jira/browse/SPARK-24590 > Project: Spark > Issue Type: Sub-task > Components: Build >Affects Versions: 2.4.0 >Reporter: Hyukjin Kwon >Priority: Major > > Currently, some tests are being failed with hadoop-3 profile. > Given PR builder > (https://github.com/apache/spark/pull/21441#issuecomment-397818337), it > reported: > {code} > org.apache.spark.sql.hive.HiveSparkSubmitSuite.SPARK-8020: set sql conf in > spark conf > org.apache.spark.sql.hive.HiveSparkSubmitSuite.SPARK-9757 Persist Parquet > relation with decimal column > org.apache.spark.sql.hive.HiveSparkSubmitSuite.ConnectionURL > org.apache.spark.sql.hive.StatisticsSuite.SPARK-22745 - read Hive's > statistics for partition > org.apache.spark.sql.hive.StatisticsSuite.alter table rename after analyze > table > org.apache.spark.sql.hive.StatisticsSuite.alter table SET TBLPROPERTIES after > analyze table > org.apache.spark.sql.hive.StatisticsSuite.alter table UNSET TBLPROPERTIES > after analyze table > org.apache.spark.sql.hive.client.HiveClientSuites.(It is not a test it is a > sbt.testing.SuiteSelector) > org.apache.spark.sql.hive.client.VersionsSuite.success sanity check > org.apache.spark.sql.hive.client.VersionsSuite.hadoop configuration preserved > 75 ms > org.apache.spark.sql.hive.client.VersionsSuite.*: * (roughly) > org.apache.spark.sql.hive.execution.HiveCatalogedDDLSuite.basic DDL using > locale tr - caseSensitive true > org.apache.spark.sql.hive.execution.HiveDDLSuite.create Hive-serde table and > view with unicode columns and comment > org.apache.spark.sql.hive.execution.Hive_2_1_DDLSuite.SPARK-21617: ALTER > TABLE for non-compatible DataSource tables > 
org.apache.spark.sql.hive.execution.Hive_2_1_DDLSuite.SPARK-21617: ALTER > TABLE for Hive-compatible DataSource tables > org.apache.spark.sql.hive.execution.Hive_2_1_DDLSuite.SPARK-21617: ALTER > TABLE for Hive tables > org.apache.spark.sql.hive.execution.Hive_2_1_DDLSuite.SPARK-21617: ALTER > TABLE with incompatible schema on Hive-compatible table > org.apache.spark.sql.hive.execution.Hive_2_1_DDLSuite.(It is not a test it is > a sbt.testing.SuiteSelector) > org.apache.spark.sql.hive.execution.SQLQuerySuite.SPARK-18355 Read data from > a hive table with a new column - orc > org.apache.spark.sql.hive.execution.SQLQuerySuite.SPARK-18355 Read data from > a hive table with a new column - parquet > org.apache.spark.sql.hive.orc.HiveOrcSourceSuite.SPARK-19459/SPARK-18220: > read char/varchar column written by Hive > {code} -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-5159) Thrift server does not respect hive.server2.enable.doAs=true
[ https://issues.apache.org/jira/browse/SPARK-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989251#comment-16989251 ] t oo edited comment on SPARK-5159 at 12/5/19 11:31 PM: --- [~yumwang] does removal of hive fork solve this one? was (Author: toopt4): [~yumwang] does removal of hive fork soove this one? > Thrift server does not respect hive.server2.enable.doAs=true > > > Key: SPARK-5159 > URL: https://issues.apache.org/jira/browse/SPARK-5159 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.2.0 >Reporter: Andrew Ray >Priority: Major > Attachments: spark_thrift_server_log.txt > > > I'm currently testing the spark sql thrift server on a kerberos secured > cluster in YARN mode. Currently any user can access any table regardless of > HDFS permissions as all data is read as the hive user. In HiveServer2 the > property hive.server2.enable.doAs=true causes all access to be done as the > submitting user. We should do the same. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-5159) Thrift server does not respect hive.server2.enable.doAs=true
[ https://issues.apache.org/jira/browse/SPARK-5159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989251#comment-16989251 ] t oo commented on SPARK-5159: - [~yumwang] does removal of hive fork solve this one? > Thrift server does not respect hive.server2.enable.doAs=true > > > Key: SPARK-5159 > URL: https://issues.apache.org/jira/browse/SPARK-5159 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.2.0 >Reporter: Andrew Ray >Priority: Major > Attachments: spark_thrift_server_log.txt > > > I'm currently testing the spark sql thrift server on a kerberos secured > cluster in YARN mode. Currently any user can access any table regardless of > HDFS permissions as all data is read as the hive user. In HiveServer2 the > property hive.server2.enable.doAs=true causes all access to be done as the > submitting user. We should do the same. -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-27750) Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service
[ https://issues.apache.org/jira/browse/SPARK-27750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989249#comment-16989249 ] t oo commented on SPARK-27750: -- bump > Standalone scheduler - ability to prioritize applications over drivers, many > drivers act like Denial of Service > --- > > Key: SPARK-27750 > URL: https://issues.apache.org/jira/browse/SPARK-27750 > Project: Spark > Issue Type: New Feature > Components: Scheduler >Affects Versions: 3.0.0 >Reporter: t oo >Priority: Minor > > If I submit 1000 spark submit drivers then they consume all the cores on my > cluster (essentially it acts like a Denial of Service) and no spark > 'application' gets to run since the cores are all consumed by the 'drivers'. > This feature is about having the ability to prioritize applications over > drivers so that at least some 'applications' can start running. I guess it > would be like: If (driver.state = 'submitted' and (exists some app.state = > 'submitted')) then set app.state = 'running' > if all apps have app.state = 'running' then set driver.state = 'submitted' > > Secondary to this, why must a driver consume a minimum of 1 entire core? -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-27821) Spark WebUI - show numbers of drivers/apps in waiting/submitted/killed/running state
[ https://issues.apache.org/jira/browse/SPARK-27821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16989248#comment-16989248 ] t oo commented on SPARK-27821: -- duration of running drivers missing too > Spark WebUI - show numbers of drivers/apps in > waiting/submitted/killed/running state > > > Key: SPARK-27821 > URL: https://issues.apache.org/jira/browse/SPARK-27821 > Project: Spark > Issue Type: Improvement > Components: Web UI >Affects Versions: 3.0.0 >Reporter: t oo >Priority: Minor > Attachments: webui.png > > > The webui shows total number of apps/drivers in running/completed state. This > improvement is to show total number in following more fine-grained states: > waiting/submitted/killed/running/completed -- This message was sent by Atlassian Jira (v8.3.4#803005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-25088) Rest Server default & doc updates
[ https://issues.apache.org/jira/browse/SPARK-25088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16929103#comment-16929103 ] t oo commented on SPARK-25088: -- they are on different ports; now I have to maintain a patched fork > Rest Server default & doc updates > - > > Key: SPARK-25088 > URL: https://issues.apache.org/jira/browse/SPARK-25088 > Project: Spark > Issue Type: Improvement > Components: Deploy, Spark Core >Affects Versions: 2.1.3, 2.2.2, 2.3.1, 2.4.0 >Reporter: Imran Rashid >Assignee: Imran Rashid >Priority: Major > Labels: release-notes > Fix For: 2.4.0 > > > The rest server could use some updates on defaults & docs, both in standalone > and mesos. -- This message was sent by Atlassian Jira (v8.3.2#803003) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-28376) Support to write sorted parquet files in each row group
[ https://issues.apache.org/jira/browse/SPARK-28376?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884883#comment-16884883 ] t oo commented on SPARK-28376: -- [~rdblue] can you add some color here? > Support to write sorted parquet files in each row group > --- > > Key: SPARK-28376 > URL: https://issues.apache.org/jira/browse/SPARK-28376 > Project: Spark > Issue Type: New Feature > Components: Input/Output, Spark Core >Affects Versions: 2.4.3 >Reporter: t oo >Priority: Major > > this is for the ability to write parquet with sorted values in each > row group > > see > [https://stackoverflow.com/questions/52159938/cant-write-ordered-data-to-parquet-in-spark] > [https://www.slideshare.net/RyanBlue3/parquet-performance-tuning-the-missing-guide] > (slides 26-27) > -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-28376) Write sorted parquet files
t oo created SPARK-28376: Summary: Write sorted parquet files Key: SPARK-28376 URL: https://issues.apache.org/jira/browse/SPARK-28376 Project: Spark Issue Type: New Feature Components: Input/Output, Spark Core Affects Versions: 2.4.3 Reporter: t oo this is for the ability to write parquet with sorted values in each row group see [https://stackoverflow.com/questions/52159938/cant-write-ordered-data-to-parquet-in-spark] [https://www.slideshare.net/RyanBlue3/parquet-performance-tuning-the-missing-guide] (slides 26-27) -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-15420) Repartition and sort before Parquet writes
[ https://issues.apache.org/jira/browse/SPARK-15420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16884220#comment-16884220 ] t oo commented on SPARK-15420: -- PR was abandoned :( > Repartition and sort before Parquet writes > -- > > Key: SPARK-15420 > URL: https://issues.apache.org/jira/browse/SPARK-15420 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 1.6.1 >Reporter: Ryan Blue >Priority: Major > > Parquet requires buffering data in memory before writing a group of rows > organized by column. This causes significant memory pressure when writing > partitioned output because each open file must buffer rows. > Currently, Spark will sort data and spill if necessary in the > {{WriterContainer}} to avoid keeping many files open at once. But, this isn't > a full solution for a few reasons: > * The final sort is always performed, even if incoming data is already sorted > correctly. For example, a global sort will cause two sorts to happen, even if > the global sort correctly prepares the data. > * To prevent a large number of output small output files, users must manually > add a repartition step. That step is also ignored by the sort within the > writer. > * Hive does not currently support {{DataFrameWriter#sortBy}} > The sort in {{WriterContainer}} makes sense to prevent problems, but should > detect if the incoming data is already sorted. The {{DataFrameWriter}} should > also expose the ability to repartition data before the write stage, and the > query planner should expose an option to automatically insert repartition > operations. -- This message was sent by Atlassian JIRA (v7.6.14#76016) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
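The manual workaround SPARK-15420 describes — an explicit repartition by the partition key followed by a sort within each partition, so each Parquet row group is written in order — can be modeled in plain Python. The DataFrame calls named in the comment below are the real Spark APIs (`repartition`, `sortWithinPartitions`); everything else here is a toy model of what they do:

```python
from collections import defaultdict

# In Spark this would be roughly:
#   df.repartition("user").sortWithinPartitions("ts") \
#     .write.partitionBy("user").parquet(path)
rows = [("usr2", "2018-03-22"), ("usr1", "2018-03-10"),
        ("usr2", "2018-03-15"), ("usr1", "2018-03-14")]

partitions = defaultdict(list)
for user, ts in rows:                 # "repartition" by key: one bucket per user
    partitions[user].append((user, ts))
for part in partitions.values():      # "sortWithinPartitions" by ts
    part.sort(key=lambda r: r[1])

# Each partition (future row group) is now written in sorted order
assert partitions["usr2"] == [("usr2", "2018-03-15"), ("usr2", "2018-03-22")]
```

The point of the Jira is that when the incoming data is already sorted like this, the writer's own sort is redundant work.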
[jira] [Updated] (SPARK-27750) Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service
[ https://issues.apache.org/jira/browse/SPARK-27750?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated SPARK-27750: - Description: If I submit 1000 spark submit drivers then they consume all the cores on my cluster (essentially it acts like a Denial of Service) and no spark 'application' gets to run since the cores are all consumed by the 'drivers'. This feature is about having the ability to prioritize applications over drivers so that at least some 'applications' can start running. I guess it would be like: If (driver.state = 'submitted' and (exists some app.state = 'submitted')) then set app.state = 'running' if all apps have app.state = 'running' then set driver.state = 'submitted' Secondary to this, why must a driver consume a minimum of 1 entire core? was: If I submit 1000 spark submit drivers then they consume all the cores on my cluster (essentially it acts like a Denial of Service) and no spark 'application' gets to run since the cores are all consumed by the 'drivers'. This feature is about having the ability to prioritize applications over drivers so that at least some 'applications' can start running. I guess it would be like: If (driver.state = 'submitted' and (exists some app.state = 'submitted')) then set app.state = 'running' if all apps have app.state = 'running' then set driver.state = 'submitted' > Standalone scheduler - ability to prioritize applications over drivers, many > drivers act like Denial of Service > --- > > Key: SPARK-27750 > URL: https://issues.apache.org/jira/browse/SPARK-27750 > Project: Spark > Issue Type: New Feature > Components: Scheduler >Affects Versions: 2.3.3, 2.4.3 >Reporter: t oo >Priority: Minor > > If I submit 1000 spark submit drivers then they consume all the cores on my > cluster (essentially it acts like a Denial of Service) and no spark > 'application' gets to run since the cores are all consumed by the 'drivers'. 
> This feature is about having the ability to prioritize applications over > drivers so that at least some 'applications' can start running. I guess it > would be like: If (driver.state = 'submitted' and (exists some app.state = > 'submitted')) then set app.state = 'running' > if all apps have app.state = 'running' then set driver.state = 'submitted' > > Secondary to this, why must a driver consume a minimum of 1 entire core? -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Issue Comment Deleted] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated SPARK-20202: - Comment: was deleted (was: gentle ping) > Remove references to org.spark-project.hive > --- > > Key: SPARK-20202 > URL: https://issues.apache.org/jira/browse/SPARK-20202 > Project: Spark > Issue Type: Bug > Components: Build, SQL >Affects Versions: 1.6.4, 2.0.3, 2.1.1 >Reporter: Owen O'Malley >Priority: Major > > Spark can't continue to depend on their fork of Hive and must move to > standard Hive versions. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Issue Comment Deleted] (SPARK-20202) Remove references to org.spark-project.hive
[ https://issues.apache.org/jira/browse/SPARK-20202?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated SPARK-20202: - Comment: was deleted (was: bump) > Remove references to org.spark-project.hive > --- > > Key: SPARK-20202 > URL: https://issues.apache.org/jira/browse/SPARK-20202 > Project: Spark > Issue Type: Bug > Components: Build, SQL >Affects Versions: 1.6.4, 2.0.3, 2.1.1 >Reporter: Owen O'Malley >Priority: Major > > Spark can't continue to depend on their fork of Hive and must move to > standard Hive versions. -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-27822) Spark WebUi - for running applications have a drivername column
[ https://issues.apache.org/jira/browse/SPARK-27822?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated SPARK-27822: - Attachment: spark-ui-waiting-2.png > Spark WebUi - for running applications have a drivername column > --- > > Key: SPARK-27822 > URL: https://issues.apache.org/jira/browse/SPARK-27822 > Project: Spark > Issue Type: Improvement > Components: Web UI >Affects Versions: 2.3.3 >Reporter: t oo >Priority: Major > Attachments: spark-ui-waiting-2.png > > > Since every app has one driver the list of running apps could have an extra > column of driver name to show association between app and driver names -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-27822) Spark WebUi - for running applications have a drivername column
[ https://issues.apache.org/jira/browse/SPARK-27822?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16847329#comment-16847329 ] t oo commented on SPARK-27822: -- [~hyukjin.kwon] attached > Spark WebUi - for running applications have a drivername column > --- > > Key: SPARK-27822 > URL: https://issues.apache.org/jira/browse/SPARK-27822 > Project: Spark > Issue Type: Improvement > Components: Web UI >Affects Versions: 2.3.3 >Reporter: t oo >Priority: Major > Attachments: spark-ui-waiting-2.png > > > Since every app has one driver the list of running apps could have an extra > column of driver name to show association between app and driver names -- This message was sent by Atlassian JIRA (v7.6.3#76005) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-27822) Spark WebUi - for running applications have a drivername column
t oo created SPARK-27822: Summary: Spark WebUi - for running applications have a drivername column Key: SPARK-27822 URL: https://issues.apache.org/jira/browse/SPARK-27822 Project: Spark Issue Type: Improvement Components: Web UI Affects Versions: 2.3.3 Reporter: t oo Since every app has one driver the list of running apps could have an extra column of driver name to show association between app and driver names
[jira] [Updated] (SPARK-27821) Spark WebUI - show numbers of drivers/apps in waiting/submitted/killed/running state
[ https://issues.apache.org/jira/browse/SPARK-27821?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated SPARK-27821: - Attachment: webui.png > Spark WebUI - show numbers of drivers/apps in > waiting/submitted/killed/running state > > > Key: SPARK-27821 > URL: https://issues.apache.org/jira/browse/SPARK-27821 > Project: Spark > Issue Type: Improvement > Components: Web UI >Affects Versions: 2.3.3 >Reporter: t oo >Priority: Minor > Attachments: webui.png > > > The webui shows total number of apps/drivers in running/completed state. This > improvement is to show total number in following more fine-grained states: > waiting/submitted/killed/running/completed
[jira] [Created] (SPARK-27821) Spark WebUI - show numbers of drivers/apps in waiting/submitted/killed/running state
t oo created SPARK-27821: Summary: Spark WebUI - show numbers of drivers/apps in waiting/submitted/killed/running state Key: SPARK-27821 URL: https://issues.apache.org/jira/browse/SPARK-27821 Project: Spark Issue Type: Improvement Components: Web UI Affects Versions: 2.3.3 Reporter: t oo The webui shows total number of apps/drivers in running/completed state. This improvement is to show total number in following more fine-grained states: waiting/submitted/killed/running/completed
[jira] [Created] (SPARK-27750) Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service
t oo created SPARK-27750: Summary: Standalone scheduler - ability to prioritize applications over drivers, many drivers act like Denial of Service Key: SPARK-27750 URL: https://issues.apache.org/jira/browse/SPARK-27750 Project: Spark Issue Type: New Feature Components: Scheduler Affects Versions: 2.4.3, 2.3.3 Reporter: t oo If I submit 1000 spark submit drivers then they consume all the cores on my cluster (essentially it acts like a Denial of Service) and no spark 'application' gets to run since the cores are all consumed by the 'drivers'. This feature is about having the ability to prioritize applications over drivers so that at least some 'applications' can start running. I guess it would be like: If (driver.state = 'submitted' and (exists some app.state = 'submitted')) then set app.state = 'running' if all apps have app.state = 'running' then set driver.state = 'submitted'
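The policy sketched in the report above can be made concrete. The sketch below is a hypothetical illustration of the requested behaviour, not Spark's actual Master scheduling code: applications get first pick of free cores on each scheduling pass, and submitted drivers are held back while any application is still waiting, so a flood of drivers cannot starve apps.

```python
# Hypothetical sketch of the prioritization policy proposed in SPARK-27750,
# not the real standalone Master scheduler. All names are illustrative.

def schedule(free_cores, waiting_drivers, waiting_apps, cores_per_driver=1):
    """Return (drivers_to_launch, apps_to_launch) for one scheduling pass."""
    launched_apps = []
    # Applications get first pick of the free cores.
    for app in list(waiting_apps):
        if app["cores_wanted"] <= free_cores:
            free_cores -= app["cores_wanted"]
            launched_apps.append(app)
            waiting_apps.remove(app)
    launched_drivers = []
    # Drivers only consume whatever is left once no app is waiting.
    if not waiting_apps:
        for drv in list(waiting_drivers):
            if cores_per_driver <= free_cores:
                free_cores -= cores_per_driver
                launched_drivers.append(drv)
                waiting_drivers.remove(drv)
    return launched_drivers, launched_apps
```

With 1000 drivers queued and one app waiting, each pass launches the app first; drivers are only admitted from the leftover cores, which is the inversion of priority the report asks for.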
[jira] [Commented] (SPARK-27491) SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty response! therefore Airflow won't integrate with Spark 2.3.x
[ https://issues.apache.org/jira/browse/SPARK-27491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16825804#comment-16825804 ] t oo commented on SPARK-27491: -- my current workaround to get airflow to integrate with spark2.3 cluster is to use 2 versions of spark on the airflow machines (spark2.3 to submit and spark2.1 to poll status) :) had to overwrite airflow's spark_submit_operator to point to 2 paths of spark > SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty > response! therefore Airflow won't integrate with Spark 2.3.x > -- > > Key: SPARK-27491 > URL: https://issues.apache.org/jira/browse/SPARK-27491 > Project: Spark > Issue Type: Bug > Components: Java API, Scheduler, Spark Core, Spark Shell, Spark > Submit >Affects Versions: 2.3.3 >Reporter: t oo >Priority: Major > > This issue must have been introduced after Spark 2.1.1 as it is working in > that version. This issue is affecting me in Spark 2.3.3/2.3.0. I am using > spark standalone mode if that makes a difference. > See below spark 2.3.3 returns empty response while 2.1.1 returns a response. > > Spark 2.1.1: > [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home1/bin/spark-class > org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status > driver-20190417130324-0009 > + export SPARK_HOME=/home/ec2here/spark_home1 > + SPARK_HOME=/home/ec2here/spark_home1 > + '[' -z /home/ec2here/spark_home1 ']' > + . /home/ec2here/spark_home1/bin/load-spark-env.sh > ++ '[' -z /home/ec2here/spark_home1 ']' > ++ '[' -z '' ']' > ++ export SPARK_ENV_LOADED=1 > ++ SPARK_ENV_LOADED=1 > ++ parent_dir=/home/ec2here/spark_home1 > ++ user_conf_dir=/home/ec2here/spark_home1/conf > ++ '[' -f /home/ec2here/spark_home1/conf/spark-env.sh ']' > ++ set -a > ++ . 
/home/ec2here/spark_home1/conf/spark-env.sh > +++ export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64 > +++ JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64 > ulimit -n 1048576 > ++ set +a > ++ '[' -z '' ']' > ++ ASSEMBLY_DIR2=/home/ec2here/spark_home1/assembly/target/scala-2.11 > ++ ASSEMBLY_DIR1=/home/ec2here/spark_home1/assembly/target/scala-2.10 > ++ [[ -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ]] > ++ '[' -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ']' > ++ export SPARK_SCALA_VERSION=2.10 > ++ SPARK_SCALA_VERSION=2.10 > + '[' -n /usr/lib/jvm/jre-1.8.0-openjdk.x86_64 ']' > + RUNNER=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java > + '[' -d /home/ec2here/spark_home1/jars ']' > + SPARK_JARS_DIR=/home/ec2here/spark_home1/jars > + '[' '!' -d /home/ec2here/spark_home1/jars ']' > + LAUNCH_CLASSPATH='/home/ec2here/spark_home1/jars/*' > + '[' -n '' ']' > + [[ -n '' ]] > + CMD=() > + IFS= > + read -d '' -r ARG > ++ build_command org.apache.spark.deploy.SparkSubmit --master > spark://domainhere:6066 --status driver-20190417130324-0009 > ++ /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -Xmx128m -cp > '/home/ec2here/spark_home1/jars/*' org.apache.spark.launcher.Main > org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status > driver-20190417130324-0009 > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > ++ printf '%d\0' 0 > + CMD+=("$ARG") > + IFS= > + read -d '' -r ARG > + COUNT=10 > + LAST=9 > + LAUNCHER_EXIT_CODE=0 > + [[ 0 =~ ^[0-9]+$ ]] > + '[' 0 '!=' 0 ']' > + CMD=("${CMD[@]:0:$LAST}") > + exec 
/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -cp > '/home/ec2here/spark_home1/conf/:/home/ec2here/spark_home1/jars/*' -Xmx2048m > org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status > driver-20190417130324-0009 > Using Spark's default log4j profile: > org/apache/spark/log4j-defaults.properties > 19/04/17 14:03:27 INFO RestSubmissionClient: Submitting a request for the > status of submission driver-20190417130324-0009 in spark://domainhere:6066. > 19/04/17 14:03:28 INFO RestSubmissionClient: Server responded with > SubmissionStatusResponse: > { > "action" : "SubmissionStatusResponse", > "driverState" : "FAILED", > "serverSparkVersion" : "2.3.3", > "submissionId" : "driver-20190417130324-0009", > "success" : true, > "workerHostPort" : "x.y.211.40:11819", > "workerId" : "worker-20190417115840-x.y.211.40-11819" > } > [ec2here@ip-x-y-160-225 ~]$ > > Spark 2.3.3: > [ec2here@ip-x-y-160-225 ~]$ bash -x
[jira] [Commented] (SPARK-27491) SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty response! therefore Airflow won't integrate with Spark 2.3.x
[ https://issues.apache.org/jira/browse/SPARK-27491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820542#comment-16820542 ] t oo commented on SPARK-27491: -- cc: [~skonto] [~mpmolek] [~gschiavon] [~scrapco...@gmail.com]
[jira] [Updated] (SPARK-27491) SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty response! therefore Airflow won't integrate with Spark 2.3.x
[ https://issues.apache.org/jira/browse/SPARK-27491?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated SPARK-27491: - Summary: SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty response! therefore Airflow won't integrate with Spark 2.3.x (was: SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty response! therefore Airflow won't integrate with Spark)
[jira] [Commented] (SPARK-27491) SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty response! therefore Airflow won't integrate with Spark 2.3.x
[ https://issues.apache.org/jira/browse/SPARK-27491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16820211#comment-16820211 ] t oo commented on SPARK-27491: -- cc: [~ash] [~bolke]
[jira] [Created] (SPARK-27491) SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty response! therefore Airflow won't integrate with Spark
t oo created SPARK-27491: Summary: SPARK REST API - "org.apache.spark.deploy.SparkSubmit --status" returns empty response! therefore Airflow won't integrate with Spark Key: SPARK-27491 URL: https://issues.apache.org/jira/browse/SPARK-27491 Project: Spark Issue Type: Bug Components: Java API, Scheduler, Spark Core, Spark Shell, Spark Submit Affects Versions: 2.3.3 Reporter: t oo This issue must have been introduced after Spark 2.1.1 as it is working in that version. This issue is affecting me in Spark 2.3.3/2.3.0. I am using spark standalone mode if that makes a difference. See below spark 2.3.3 returns empty response while 2.1.1 returns a response. Spark 2.1.1: [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home1/bin/spark-class org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009 + export SPARK_HOME=/home/ec2here/spark_home1 + SPARK_HOME=/home/ec2here/spark_home1 + '[' -z /home/ec2here/spark_home1 ']' + . /home/ec2here/spark_home1/bin/load-spark-env.sh ++ '[' -z /home/ec2here/spark_home1 ']' ++ '[' -z '' ']' ++ export SPARK_ENV_LOADED=1 ++ SPARK_ENV_LOADED=1 ++ parent_dir=/home/ec2here/spark_home1 ++ user_conf_dir=/home/ec2here/spark_home1/conf ++ '[' -f /home/ec2here/spark_home1/conf/spark-env.sh ']' ++ set -a ++ . 
/home/ec2here/spark_home1/conf/spark-env.sh +++ export JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64 +++ JAVA_HOME=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64 ulimit -n 1048576 ++ set +a ++ '[' -z '' ']' ++ ASSEMBLY_DIR2=/home/ec2here/spark_home1/assembly/target/scala-2.11 ++ ASSEMBLY_DIR1=/home/ec2here/spark_home1/assembly/target/scala-2.10 ++ [[ -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ]] ++ '[' -d /home/ec2here/spark_home1/assembly/target/scala-2.11 ']' ++ export SPARK_SCALA_VERSION=2.10 ++ SPARK_SCALA_VERSION=2.10 + '[' -n /usr/lib/jvm/jre-1.8.0-openjdk.x86_64 ']' + RUNNER=/usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java + '[' -d /home/ec2here/spark_home1/jars ']' + SPARK_JARS_DIR=/home/ec2here/spark_home1/jars + '[' '!' -d /home/ec2here/spark_home1/jars ']' + LAUNCH_CLASSPATH='/home/ec2here/spark_home1/jars/*' + '[' -n '' ']' + [[ -n '' ]] + CMD=() + IFS= + read -d '' -r ARG ++ build_command org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009 ++ /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -Xmx128m -cp '/home/ec2here/spark_home1/jars/*' org.apache.spark.launcher.Main org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009 + CMD+=("$ARG") + IFS= + read -d '' -r ARG + CMD+=("$ARG") + IFS= + read -d '' -r ARG + CMD+=("$ARG") + IFS= + read -d '' -r ARG + CMD+=("$ARG") + IFS= + read -d '' -r ARG + CMD+=("$ARG") + IFS= + read -d '' -r ARG + CMD+=("$ARG") + IFS= + read -d '' -r ARG + CMD+=("$ARG") + IFS= + read -d '' -r ARG + CMD+=("$ARG") + IFS= + read -d '' -r ARG + CMD+=("$ARG") + IFS= + read -d '' -r ARG ++ printf '%d\0' 0 + CMD+=("$ARG") + IFS= + read -d '' -r ARG + COUNT=10 + LAST=9 + LAUNCHER_EXIT_CODE=0 + [[ 0 =~ ^[0-9]+$ ]] + '[' 0 '!=' 0 ']' + CMD=("${CMD[@]:0:$LAST}") + exec /usr/lib/jvm/jre-1.8.0-openjdk.x86_64/bin/java -cp '/home/ec2here/spark_home1/conf/:/home/ec2here/spark_home1/jars/*' -Xmx2048m org.apache.spark.deploy.SparkSubmit --master 
spark://domainhere:6066 --status driver-20190417130324-0009 Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties 19/04/17 14:03:27 INFO RestSubmissionClient: Submitting a request for the status of submission driver-20190417130324-0009 in spark://domainhere:6066. 19/04/17 14:03:28 INFO RestSubmissionClient: Server responded with SubmissionStatusResponse: { "action" : "SubmissionStatusResponse", "driverState" : "FAILED", "serverSparkVersion" : "2.3.3", "submissionId" : "driver-20190417130324-0009", "success" : true, "workerHostPort" : "x.y.211.40:11819", "workerId" : "worker-20190417115840-x.y.211.40-11819" } [ec2here@ip-x-y-160-225 ~]$ Spark 2.3.3: [ec2here@ip-x-y-160-225 ~]$ bash -x /home/ec2here/spark_home/bin/spark-class org.apache.spark.deploy.SparkSubmit --master spark://domainhere:6066 --status driver-20190417130324-0009 + '[' -z '' ']' ++ dirname /home/ec2here/spark_home/bin/spark-class + source /home/ec2here/spark_home/bin/find-spark-home dirname /home/ec2here/spark_home/bin/spark-class +++ cd /home/ec2here/spark_home/bin +++ pwd ++ FIND_SPARK_HOME_PYTHON_SCRIPT=/home/ec2here/spark_home/bin/find_spark_home.py ++ '[' '!' -z '' ']' ++ '[' '!' -f /home/ec2here/spark_home/bin/find_spark_home.py ']' dirname /home/ec2here/spark_home/bin/spark-class +++ cd /home/ec2here/spark_home/bin/.. +++ pwd ++ export SPARK_HOME=/home/ec2here/spark_home ++ SPARK_HOME=/home/ec2here/spark_home + . /home/ec2here/spark_home/bin/load-spark-env.sh ++ '[' -z /home/ec2here/spark_home ']' ++
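Besides the two-Spark-versions workaround mentioned in the comments, the status the 2.1.1 trace prints comes back as JSON from the standalone master's REST port (6066 in the trace), so it can also be fetched and parsed directly, bypassing `SparkSubmit --status` entirely. The sketch below is a hedged illustration under that assumption; `driver_status` and `parse_status` are hypothetical helper names, and the URL layout mirrors the request the RestSubmissionClient log lines above show.

```python
# Hypothetical workaround sketch for SPARK-27491: query the standalone
# master's REST submission endpoint directly instead of relying on
# "SparkSubmit --status" printing the response.
import json
from urllib.request import urlopen  # only needed when querying a live master


def parse_status(body):
    """Extract the fields a poller (e.g. Airflow) cares about from a
    SubmissionStatusResponse JSON document."""
    payload = json.loads(body)
    return payload.get("driverState"), payload.get("success", False)


def driver_status(master_rest_url, submission_id, timeout=10):
    """Fetch the status of one driver submission, assuming the master's REST
    port (6066 in the trace above) is reachable from this machine."""
    url = f"{master_rest_url}/v1/submissions/status/{submission_id}"
    with urlopen(url, timeout=timeout) as resp:
        return parse_status(resp.read().decode("utf-8"))
```

Feeding `parse_status` the SubmissionStatusResponse body from the 2.1.1 trace yields `("FAILED", True)`, i.e. the poll succeeded and the driver itself failed.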
[jira] [Commented] (SPARK-25088) Rest Server default & doc updates
[ https://issues.apache.org/jira/browse/SPARK-25088?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16819764#comment-16819764 ] t oo commented on SPARK-25088: -- why block rest if auth is on? for example i want to be able to use unauthed rest AND authed standard submission > Rest Server default & doc updates > - > > Key: SPARK-25088 > URL: https://issues.apache.org/jira/browse/SPARK-25088 > Project: Spark > Issue Type: Improvement > Components: Deploy, Spark Core >Affects Versions: 2.1.3, 2.2.2, 2.3.1, 2.4.0 >Reporter: Imran Rashid >Assignee: Imran Rashid >Priority: Major > Labels: release-notes > Fix For: 2.4.0 > > > The rest server could use some updates on defaults & docs, both in standalone > and mesos.
[jira] [Created] (SPARK-27429) [SQL] to_timestamp function with additional argument flag that will allow exception if value could not be cast
t oo created SPARK-27429: Summary: [SQL] to_timestamp function with additional argument flag that will allow exception if value could not be cast Key: SPARK-27429 URL: https://issues.apache.org/jira/browse/SPARK-27429 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 2.4.1 Reporter: t oo If I am running a SQL on a csv based dataframe and my query has to_timestamp(input_col,'-MM-dd HH:mm:ss'), if the values in input_col are not really timestamp like 'ABC' then I would like to_timestamp function to throw an exception rather than happily (silently) return the values as null.
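The lenient-vs-strict distinction the request describes can be modelled in plain Python. The `to_timestamp` below is a hypothetical stand-in, not Spark's function: by default it mirrors Spark's current behaviour of silently yielding NULL/None for an unparsable value like 'ABC', and the proposed flag makes it raise instead.

```python
# Illustrative plain-Python model of the flag requested in SPARK-27429.
# This is a hypothetical helper, not Spark's to_timestamp; the strptime
# format string stands in for Spark's datetime pattern.
from datetime import datetime


def to_timestamp(value, fmt="%Y-%m-%d %H:%M:%S", strict=False):
    """Lenient mode returns None on a bad value (Spark's current behaviour);
    strict mode surfaces the failure as an exception."""
    try:
        return datetime.strptime(value, fmt)
    except (ValueError, TypeError):
        if strict:
            raise ValueError(f"cannot cast {value!r} with format {fmt!r}")
        return None
```

With `strict=False`, bad rows silently become None and the error goes unnoticed downstream; with `strict=True`, the first bad row fails fast, which is the behaviour the report asks for.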
[jira] [Commented] (SPARK-27284) Spark Standalone aggregated logs in 1 file per appid (ala yarn -logs -applicationId)
[ https://issues.apache.org/jira/browse/SPARK-27284?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16806462#comment-16806462 ] t oo commented on SPARK-27284: -- [~srowen] in yarn mode if you want all logs for a single spark app you can see them in a single command (yarn -logs -applicationId). But in spark standalone the logs are separated under work/ ie spread across many files. This Jira is about a single command to get all executor+driver logs in 1 file > Spark Standalone aggregated logs in 1 file per appid (ala yarn -logs > -applicationId) > - > > Key: SPARK-27284 > URL: https://issues.apache.org/jira/browse/SPARK-27284 > Project: Spark > Issue Type: New Feature > Components: Scheduler >Affects Versions: 2.3.3, 2.4.0 >Reporter: t oo >Priority: Minor > > Feature: Spark Standalone aggregated logs in 1 file per appid (ala yarn > -logs -applicationId) > > This would be 1 single file per appid with contents of ALL the executors logs > [https://stackoverflow.com/questions/46004528/how-can-i-see-the-aggregated-logs-for-a-spark-standalone-cluster]
[jira] [Created] (SPARK-27284) Spark Standalone aggregated logs in 1 file per appid (ala yarn -logs -applicationId)
t oo created SPARK-27284: Summary: Spark Standalone aggregated logs in 1 file per appid (ala yarn -logs -applicationId) Key: SPARK-27284 URL: https://issues.apache.org/jira/browse/SPARK-27284 Project: Spark Issue Type: New Feature Components: Scheduler Affects Versions: 2.4.0, 2.3.3 Reporter: t oo Feature: Spark Standalone aggregated logs in 1 file per appid (ala yarn -logs -applicationId) This would be 1 single file per appid with contents of ALL the executors logs [https://stackoverflow.com/questions/46004528/how-can-i-see-the-aggregated-logs-for-a-spark-standalone-cluster]
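Until such a feature exists, the per-worker aggregation can be approximated user-side. The sketch below is a hypothetical helper (not part of Spark): it walks one worker's work/ directory, where standalone executors write work/&lt;app-id&gt;/&lt;executor-id&gt;/stdout and stderr, and concatenates every matching log into a single file. In a real cluster you would run it on, or fetch from, each worker and merge the results.

```python
# Hypothetical user-side stand-in for "yarn logs -applicationId" on one
# standalone worker, per SPARK-27284: collect every executor's stdout/stderr
# for a given app id under the worker's work/ directory into one file.
import os


def aggregate_app_logs(work_dir, app_id, out_path):
    """Concatenate all stdout/stderr files for app_id found under work_dir
    into out_path, prefixing each with its source path."""
    with open(out_path, "w") as out:
        for root, _dirs, files in sorted(os.walk(work_dir)):
            if app_id not in root:
                continue  # expects work/<app-id>/<executor-id>/{stdout,stderr}
            for name in sorted(files):
                if name in ("stdout", "stderr"):
                    path = os.path.join(root, name)
                    out.write(f"==> {path} <==\n")
                    with open(path) as f:
                        out.write(f.read())
    return out_path
```

The `==> path <==` headers keep the per-executor provenance visible inside the single aggregated file, similar to multi-file `tail` output.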
[jira] [Commented] (SPARK-26998) spark.ssl.keyStorePassword in plaintext on 'ps -ef' output of executor processes in Standalone mode
[ https://issues.apache.org/jira/browse/SPARK-26998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16784791#comment-16784791 ] t oo commented on SPARK-26998: -- [~gsomogyi] please take it forward. [~kabhwan] the truststore password being shown is not much of a problem, since the truststore is often distributed to users anyway. But the keystore password still being shown is the big no-no. > spark.ssl.keyStorePassword in plaintext on 'ps -ef' output of executor > processes in Standalone mode > --- > > Key: SPARK-26998 > URL: https://issues.apache.org/jira/browse/SPARK-26998 > Project: Spark > Issue Type: Bug > Components: Scheduler, Security, Spark Core >Affects Versions: 2.3.3, 2.4.0 >Reporter: t oo >Priority: Major > Labels: SECURITY, Security, secur, security, security-issue > > Run spark standalone mode, then start a spark-submit requiring at least 1 > executor. Do a 'ps -ef' on linux (i.e. a PuTTY terminal) and you will be able to > see the spark.ssl.keyStorePassword value in plaintext! > > spark.ssl.keyStorePassword and spark.ssl.keyPassword don't need to be passed > to CoarseGrainedExecutorBackend. Only spark.ssl.trustStorePassword is used. > > Can be resolved if the below PR is merged: > [[Github] Pull Request #21514 > (tooptoop4)|https://github.com/apache/spark/pull/21514]
[jira] [Commented] (SPARK-26998) spark.ssl.keyStorePassword in plaintext on 'ps -ef' output of executor processes in Standalone mode
[ https://issues.apache.org/jira/browse/SPARK-26998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782330#comment-16782330 ] t oo commented on SPARK-26998: -- [https://github.com/apache/spark/pull/23820] is only about hiding password from log file, SPARK-26998 is about hiding passwords from showing in 'ps -ef' process list > spark.ssl.keyStorePassword in plaintext on 'ps -ef' output of executor > processes in Standalone mode
[jira] [Updated] (SPARK-26998) spark.ssl.keyStorePassword in plaintext on 'ps -ef' output of executor processes in Standalone mode
[ https://issues.apache.org/jira/browse/SPARK-26998?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] t oo updated SPARK-26998: - Labels: SECURITY Security secur security security-issue (was: ) > spark.ssl.keyStorePassword in plaintext on 'ps -ef' output of executor > processes in Standalone mode
[jira] [Created] (SPARK-26998) spark.ssl.keyStorePassword in plaintext on 'ps -ef' output of executor processes in Standalone mode
t oo created SPARK-26998: Summary: spark.ssl.keyStorePassword in plaintext on 'ps -ef' output of executor processes in Standalone mode Key: SPARK-26998 URL: https://issues.apache.org/jira/browse/SPARK-26998 Project: Spark Issue Type: Bug Components: Scheduler, Security, Spark Core Affects Versions: 2.4.0, 2.3.3 Reporter: t oo Run spark standalone mode, then start a spark-submit requiring at least 1 executor. Do a 'ps -ef' on linux (ie putty terminal) and you will be able to see spark.ssl.keyStorePassword value in plaintext! spark.ssl.keyStorePassword and spark.ssl.keyPassword don't need to be passed to CoarseGrainedExecutorBackend. Only spark.ssl.trustStorePassword is used. Can be resolved if below PR is merged: [[Github] Pull Request #21514 (tooptoop4)|https://github.com/apache/spark/pull/21514]
[jira] [Commented] (SPARK-25757) Upgrade netty-all from 4.1.17.Final to 4.1.30.Final
[ https://issues.apache.org/jira/browse/SPARK-25757?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16775873#comment-16775873 ] t oo commented on SPARK-25757: -- [~lipzhu] want to upgrade netty to 4.1.33? > Upgrade netty-all from 4.1.17.Final to 4.1.30.Final > --- > > Key: SPARK-25757 > URL: https://issues.apache.org/jira/browse/SPARK-25757 > Project: Spark > Issue Type: Dependency upgrade > Components: Build >Affects Versions: 3.0.0 >Reporter: Zhu, Lipeng >Assignee: Zhu, Lipeng >Priority: Minor > Fix For: 3.0.0 > > > Upgrade netty from 4.1.17.Final to 4.1.30.Final to fix some netty version > bugs.
[jira] [Commented] (SPARK-16859) History Server storage information is missing
[ https://issues.apache.org/jira/browse/SPARK-16859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16770679#comment-16770679 ] t oo commented on SPARK-16859: -- I know how you feel; I submitted a PR to fix a security flaw (SPARK-22860) but the reviewers were more concerned about cosmetic aspects and did not merge it > History Server storage information is missing > - > > Key: SPARK-16859 > URL: https://issues.apache.org/jira/browse/SPARK-16859 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 1.6.2, 2.0.0 >Reporter: Andrei Ivanov >Priority: Major > Labels: historyserver, newbie > > It looks like the job history storage tab in history server is broken for > completed jobs since *1.6.2*. > More specifically it's broken since > [SPARK-13845|https://issues.apache.org/jira/browse/SPARK-13845]. > I've fixed it for my installation by effectively reverting the above patch > ([see|https://github.com/EinsamHauer/spark/commit/3af62ea09af8bb350c8c8a9117149c09b8feba08]). > IMHO, the most straightforward fix would be to implement > _SparkListenerBlockUpdated_ serialization to JSON in _JsonProtocol_ making > sure it works from _ReplayListenerBus_. > The downside will be that it will still work incorrectly with pre-patch job > histories. But then, it doesn't work since *1.6.2* anyhow. > PS: I'd really love to have this fixed eventually. But I'm pretty new to > Apache Spark and missing hands-on Scala experience. So I'd prefer that it be > fixed by someone experienced with roadmap vision. If nobody volunteers I'll > try to patch it myself.
[jira] [Commented] (SPARK-16859) History Server storage information is missing
[ https://issues.apache.org/jira/browse/SPARK-16859?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16770482#comment-16770482 ] t oo commented on SPARK-16859: -- please don't leave us [~Hauer] > History Server storage information is missing
[jira] [Commented] (SPARK-17556) Executor side broadcast for broadcast joins
[ https://issues.apache.org/jira/browse/SPARK-17556?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16770481#comment-16770481 ] t oo commented on SPARK-17556: -- please don't leave us [~scwf] > Executor side broadcast for broadcast joins > --- > > Key: SPARK-17556 > URL: https://issues.apache.org/jira/browse/SPARK-17556 > Project: Spark > Issue Type: New Feature > Components: Spark Core, SQL >Reporter: Reynold Xin >Priority: Major > Attachments: executor broadcast.pdf, executor-side-broadcast.pdf > > > Currently in Spark SQL, in order to perform a broadcast join, the driver must > collect the result of an RDD and then broadcast it. This introduces some > extra latency. It might be possible to broadcast directly from executors.
[jira] [Commented] (SPARK-22860) Spark workers log ssl passwords passed to the executors
[ https://issues.apache.org/jira/browse/SPARK-22860?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16770373#comment-16770373 ] t oo commented on SPARK-22860: -- Not just in worker log, but in 'ps -ef' process list > Spark workers log ssl passwords passed to the executors > --- > > Key: SPARK-22860 > URL: https://issues.apache.org/jira/browse/SPARK-22860 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.1.1 >Reporter: Felix K. >Priority: Major > > The workers log the spark.ssl.keyStorePassword and > spark.ssl.trustStorePassword passed by cli to the executor processes. The > ExecutorRunner should escape passwords to not appear in the worker's log > files in INFO level. In this example, you can see my 'SuperSecretPassword' in > a worker log: > {code} > 17/12/08 08:04:12 INFO ExecutorRunner: Launch command: > "/global/myapp/oem/jdk/bin/java" "-cp" > "/global/myapp/application/myapp_software/thing_loader_lib/core-repository-model-zzz-1.2.3-SNAPSHOT.jar > [...] 
> :/global/myapp/application/spark-2.1.1-bin-hadoop2.7/jars/*" "-Xmx16384M" > "-Dspark.authenticate.enableSaslEncryption=true" > "-Dspark.ssl.keyStorePassword=SuperSecretPassword" > "-Dspark.ssl.keyStore=/global/myapp/application/config/ssl/keystore.jks" > "-Dspark.ssl.trustStore=/global/myapp/application/config/ssl/truststore.jks" > "-Dspark.ssl.enabled=true" "-Dspark.driver.port=39927" > "-Dspark.ssl.protocol=TLS" > "-Dspark.ssl.trustStorePassword=SuperSecretPassword" > "-Dspark.authenticate=true" "-Dmyapp_IMPORT_DATE=2017-10-30" > "-Dmyapp.config.directory=/global/myapp/application/config" > "-Dsolr.httpclient.builder.factory=com.company.myapp.loader.auth.LoaderConfigSparkSolrBasicAuthConfigurer" > "-Djavax.net.ssl.trustStore=/global/myapp/application/config/ssl/truststore.jks" > "-XX:+UseG1GC" "-XX:+UseStringDeduplication" > "-Dthings.loader.export.zzz_files=false" > "-Dlog4j.configuration=file:/global/myapp/application/config/spark-executor-log4j.properties" > "-XX:+HeapDumpOnOutOfMemoryError" "-XX:+UseStringDeduplication" > "org.apache.spark.executor.CoarseGrainedExecutorBackend" "--driver-url" > "spark://CoarseGrainedScheduler@192.168.0.1:39927" "--executor-id" "2" > "--hostname" "192.168.0.1" "--cores" "4" "--app-id" "app-20171208080412-" > "--worker-url" "spark://Worker@192.168.0.1:59530" > {code}
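A fix along the lines the reporter suggests is to redact password-bearing -D options before a launch command is logged. The sketch below illustrates the idea in Python; the actual change would live in Spark's Scala ExecutorRunner, and redact_command is a hypothetical name. The option names are taken from the log excerpt above.

```python
import re

# -D options whose values should never be logged; these are the ones that
# appear in the launch command excerpt above.
SENSITIVE_KEYS = (
    "spark.ssl.keyStorePassword",
    "spark.ssl.trustStorePassword",
    "spark.ssl.keyPassword",
)


def redact_command(command: str) -> str:
    """Replace the value of any sensitive -Dkey=value option with ********."""
    for key in SENSITIVE_KEYS:
        # Match -Dkey= followed by the value, up to the next quote or whitespace.
        command = re.sub(
            rf'(-D{re.escape(key)}=)[^"\s]+',
            r"\1********",
            command,
        )
    return command
```

Note this only fixes the log file; as the comment above points out, the same values stay visible in 'ps -ef' unless they are removed from the command line entirely.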
[jira] [Commented] (SPARK-20286) dynamicAllocation.executorIdleTimeout is ignored after unpersist
[ https://issues.apache.org/jira/browse/SPARK-20286?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16770344#comment-16770344 ] t oo commented on SPARK-20286: -- gentle ping > dynamicAllocation.executorIdleTimeout is ignored after unpersist > > > Key: SPARK-20286 > URL: https://issues.apache.org/jira/browse/SPARK-20286 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.0.1 >Reporter: Miguel Pérez >Priority: Major > > With dynamic allocation enabled, it seems that executors with cached data > which are unpersisted are still being killed using the > {{dynamicAllocation.cachedExecutorIdleTimeout}} configuration, instead of > {{dynamicAllocation.executorIdleTimeout}}. Assuming the default configuration > ({{dynamicAllocation.cachedExecutorIdleTimeout = Infinity}}), an executor > with unpersisted data won't be released until the job ends. > *How to reproduce* > - Set different values for {{dynamicAllocation.executorIdleTimeout}} and > {{dynamicAllocation.cachedExecutorIdleTimeout}} > - Load a file into an RDD and persist it > - Execute an action on the RDD (like a count) so some executors are activated. > - When the action has finished, unpersist the RDD > - The application UI correctly removes the persisted data from the *Storage* > tab, but if you look in the *Executors* tab, you will find that the executors > remain *active* until {{dynamicAllocation.cachedExecutorIdleTimeout}} is > reached.
[jira] [Commented] (SPARK-25594) OOM in long running applications even with UI disabled
[ https://issues.apache.org/jira/browse/SPARK-25594?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16770230#comment-16770230 ] t oo commented on SPARK-25594: -- please merge > OOM in long running applications even with UI disabled > -- > > Key: SPARK-25594 > URL: https://issues.apache.org/jira/browse/SPARK-25594 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.3.0, 2.4.0 >Reporter: Mridul Muralidharan >Assignee: Mridul Muralidharan >Priority: Major > > Typically for long running applications with a large number of tasks it is > common to disable the UI to minimize overhead at the driver. > Earlier, with the spark ui disabled, only stage/job information was kept as part > of JobProgressListener. > As part of history server scalability fixes, particularly SPARK-20643, > in spite of disabling the UI, task information continues to be maintained in > memory. > In our long running tests against spark thrift server, this eventually > results in OOM.
[jira] [Commented] (SPARK-15046) When running hive-thriftserver with yarn on a secure cluster the workers fail with java.lang.NumberFormatException
[ https://issues.apache.org/jira/browse/SPARK-15046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769890#comment-16769890 ] t oo commented on SPARK-15046: -- gentle ping > When running hive-thriftserver with yarn on a secure cluster the workers fail > with java.lang.NumberFormatException > -- > > Key: SPARK-15046 > URL: https://issues.apache.org/jira/browse/SPARK-15046 > Project: Spark > Issue Type: Bug > Components: YARN >Affects Versions: 2.0.0 >Reporter: Trystan Leftwich >Assignee: Marcelo Vanzin >Priority: Blocker > Fix For: 2.0.0 > > > When running hive-thriftserver with yarn on a secure cluster > (spark.yarn.principal and spark.yarn.keytab are set) the workers fail with > the following error. > {code} > 16/04/30 22:40:50 ERROR yarn.ApplicationMaster: Uncaught exception: > java.lang.NumberFormatException: For input string: "86400079ms" > at > java.lang.NumberFormatException.forInputString(NumberFormatException.java:65) > at java.lang.Long.parseLong(Long.java:441) > at java.lang.Long.parseLong(Long.java:483) > at > scala.collection.immutable.StringLike$class.toLong(StringLike.scala:276) > at scala.collection.immutable.StringOps.toLong(StringOps.scala:29) > at > org.apache.spark.SparkConf$$anonfun$getLong$2.apply(SparkConf.scala:380) > at > org.apache.spark.SparkConf$$anonfun$getLong$2.apply(SparkConf.scala:380) > at scala.Option.map(Option.scala:146) > at org.apache.spark.SparkConf.getLong(SparkConf.scala:380) > at > org.apache.spark.deploy.SparkHadoopUtil.getTimeFromNowToRenewal(SparkHadoopUtil.scala:289) > at > org.apache.spark.deploy.yarn.AMDelegationTokenRenewer.org$apache$spark$deploy$yarn$AMDelegationTokenRenewer$$scheduleRenewal$1(AMDelegationTokenRenewer.scala:89) > at > org.apache.spark.deploy.yarn.AMDelegationTokenRenewer.scheduleLoginFromKeytab(AMDelegationTokenRenewer.scala:121) > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$3.apply(ApplicationMaster.scala:243) > at > 
org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$run$3.apply(ApplicationMaster.scala:243) > at scala.Option.foreach(Option.scala:257) > at > org.apache.spark.deploy.yarn.ApplicationMaster.run(ApplicationMaster.scala:243) > at > org.apache.spark.deploy.yarn.ApplicationMaster$$anonfun$main$1.apply$mcV$sp(ApplicationMaster.scala:723) > at > org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:67) > at > org.apache.spark.deploy.SparkHadoopUtil$$anon$1.run(SparkHadoopUtil.scala:66) > at java.security.AccessController.doPrivileged(Native Method) > at javax.security.auth.Subject.doAs(Subject.java:415) > at > org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628) > at > org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:66) > at > org.apache.spark.deploy.yarn.ApplicationMaster$.main(ApplicationMaster.scala:721) > at > org.apache.spark.deploy.yarn.ExecutorLauncher$.main(ApplicationMaster.scala:748) > at > org.apache.spark.deploy.yarn.ExecutorLauncher.main(ApplicationMaster.scala) > {code}
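The root cause visible in the trace above is that a renewal-interval value already carrying a time-unit suffix ("86400079ms") reached a plain Long.parseLong via SparkConf.getLong. A unit-aware parser avoids the crash. The sketch below illustrates the idea in Python, not Spark's actual time-string utilities; the function name and the set of accepted suffixes are assumptions.

```python
# Millisecond multipliers for common time-unit suffixes (illustrative set).
TIME_SUFFIXES = {
    "ms": 1,
    "s": 1_000,
    "m": 60_000,
    "min": 60_000,
    "h": 3_600_000,
    "d": 86_400_000,
}


def parse_time_ms(value: str) -> int:
    """Parse '86400079ms', '24h', or a bare number (treated as ms) into ms."""
    value = value.strip()
    # Check longer suffixes first so 'min' is not mistaken for a bare 'n'.
    for suffix in sorted(TIME_SUFFIXES, key=len, reverse=True):
        if value.endswith(suffix):
            return int(value[: -len(suffix)]) * TIME_SUFFIXES[suffix]
    # No recognised suffix: a bare number, where a plain integer parse
    # (the equivalent of Long.parseLong) would have crashed on '86400079ms'.
    return int(value)
```

A plain int() parse of "86400079ms" raises, which is the Python analogue of the NumberFormatException above.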
[jira] [Commented] (SPARK-22374) STS ran into OOM in a secure cluster
[ https://issues.apache.org/jira/browse/SPARK-22374?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16769885#comment-16769885 ] t oo commented on SPARK-22374: -- gentle ping > STS ran into OOM in a secure cluster > > > Key: SPARK-22374 > URL: https://issues.apache.org/jira/browse/SPARK-22374 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.2.0 >Reporter: Dongjoon Hyun >Priority: Major > Attachments: 1.png, 2.png, 3.png > > > In a secure cluster, FileSystem.CACHE grows indefinitely. > *ENVIRONMENT* > 1. `spark.yarn.principal` and `spark.yarn.keytab` is used. > 2. Spark Thrift Server run with `doAs` false. > {code} > > hive.server2.enable.doAs > false > > {code} > With 6GB (-Xmx6144m) options, `HiveConf` consumes 4GB inside FileSystem.CACHE. > {code} > 20,030 instances of "org.apache.hadoop.hive.conf.HiveConf", loaded by > "sun.misc.Launcher$AppClassLoader @ 0x64001c160" occupy 4,418,101,352 > (73.42%) bytes. These instances are referenced from one instance of > "java.util.HashMap$Node[]", loaded by "" > {code} > Please see the attached images.