[jira] [Commented] (SPARK-38058) Writing a spark dataframe to Azure Sql Server is causing duplicate records intermittently
[ https://issues.apache.org/jira/browse/SPARK-38058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17486263#comment-17486263 ]

john commented on SPARK-38058:
------------------------------

Since I am working in a production environment, I cannot disclose any documents here. This may be a bug in Spark. It happens about three times out of five: in two runs out of five all the records are inserted correctly, and in the other runs duplicates are inserted. We have tried all the suggested workarounds and none of them work.

--
This message was sent by Atlassian Jira (v8.20.1#820001)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-38058) Writing a spark dataframe to Azure Sql Server is causing duplicate records intermittently
[ https://issues.apache.org/jira/browse/SPARK-38058?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17484318#comment-17484318 ]

john commented on SPARK-38058:
------------------------------

It seems this is not specific to SQL Server; it is a problem with Spark itself. https://issues.apache.org/jira/browse/SPARK-16741 suggests disabling spark.speculation, but in the latest Spark versions it is disabled by default, and I have tried that as well; the duplicate rows were still there in SQL Server when using JDBC from Spark. I have tried with a small amount of data, around 10K rows, and it works fine with no duplicates; when I load millions of rows, the duplicates appear. Because of this issue, we are using an intermediate staging table to receive all data, including duplicates, and we insert into the landing zone with a DISTINCT clause.

--
This message was sent by Atlassian Jira (v8.20.1#820001)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
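A minimal sketch of the staging-plus-distinct workaround described in the comment above, with hypothetical names ({{spark}} for the session, {{dataToLoad}} for the DataFrame, {{jdbcUrl}}, {{stagingTable}}, {{landingTable}}, and {{jdbcConnectionProperties}} as a {{java.util.Properties}}); the deduplication is expressed here with Spark's {{dropDuplicates}} rather than a server-side SQL DISTINCT, which is one way to realize the same idea:

{code:scala}
// Write the transformed data to an intermediate staging table first.
// Task retries may still leave duplicate rows here; that is tolerated.
dataToLoad.write
  .mode("overwrite")
  .jdbc(jdbcUrl, stagingTable, jdbcConnectionProperties)

// Read the staging table back, drop exact duplicates, and load the
// deduplicated result into the landing table.
spark.read
  .jdbc(jdbcUrl, stagingTable, jdbcConnectionProperties)
  .dropDuplicates()
  .write
  .mode("overwrite")
  .jdbc(jdbcUrl, landingTable, jdbcConnectionProperties)
{code}

Note that the second write is itself a distributed JDBC write and so is subject to the same retry behaviour; the reporter's actual workaround performs the distinct insert on the database side, which avoids a second distributed write.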
[jira] [Created] (SPARK-38058) Writing a spark dataframe to Azure Sql Server is causing duplicate records intermittently
john created SPARK-38058:
-----------------------------

             Summary: Writing a spark dataframe to Azure Sql Server is causing duplicate records intermittently
                 Key: SPARK-38058
                 URL: https://issues.apache.org/jira/browse/SPARK-38058
             Project: Spark
          Issue Type: Bug
          Components: PySpark, Spark Core
    Affects Versions: 3.1.0
            Reporter: john


We are using the JDBC option to insert transformed data in a Spark DataFrame into a table in Azure SQL Server. Below is the code snippet we are using for this insert. However, we have noticed on a few occasions that some records are duplicated in the destination table. This happens for large tables: e.g., if a DataFrame has 600K records, we end up with around 620K records in the table after the insert. We would still like to understand why this is happening.

{{DataToLoad.write.jdbc(url = jdbcUrl, table = targetTable, mode = "overwrite", properties = jdbcConnectionProperties)}}

The only explanation we can think of is that, since the inserts happen in a distributed fashion, if one of the executors fails partway through, its tasks are retried and could insert duplicate records. This could be totally off the mark, but we mention it in case it is the issue.

--
This message was sent by Atlassian Jira (v8.20.1#820001)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
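A small sanity-check sketch for the symptom described in the report (600K rows written, around 620K found), assuming the same hypothetical names as the snippet above ({{spark}}, {{dataToLoad}}, {{jdbcUrl}}, {{targetTable}}, and {{jdbcConnectionProperties}} as a {{java.util.Properties}}):

{code:scala}
// Compare the number of rows we intended to write with the number
// that actually landed in the target table.
val expected = dataToLoad.count()
val actual = spark.read
  .jdbc(jdbcUrl, targetTable, jdbcConnectionProperties)
  .count()

if (actual != expected) {
  println(s"Row count mismatch: wrote $expected rows, table now holds " +
    s"$actual (${actual - expected} extra)")
}
{code}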
[jira] [Updated] (SPARK-34513) Kubernetes Spark Driver Pod Name Length Limitation
[ https://issues.apache.org/jira/browse/SPARK-34513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John updated SPARK-34513:
-------------------------
    Description: 
Hi,

We are using Spark in Airflow with the k8s master. Airflow attaches a unique id to our spark-driver pod using the k8s subdomain convention ('.'), which creates rather long pod names.

We noticed an issue when the total pod name (pod name + Airflow-attached uuid) exceeds 63 characters. Usually pod names can be up to 253 characters long, but Spark seems to have an issue with driver pod names longer than 63 characters. In our case the driver pod name is exactly 65 characters long, and Spark omits the last 2 characters in its error message, so I assume Spark is internally losing those two characters. Reducing our driver pod name to just 63 characters fixed the issue.

Here you can see the actual pod name (row 1) and the pod name from the Spark error log (row 2):
{code:java}
ab-aa--cc-dd.3s092032c69f4639adff835a826e0120
ab-aa--cc-dd.3s092032c69f4639adff835a826e01{code}
{code:java}
[2021-02-20 00:30:06,289] {pod_launcher.py:136} INFO - Exception in thread "main" org.apache.spark.SparkException: No pod was found named Some(ab-aa--cc-dd.3s092032c69f4639adff835a826e01) in the cluster in the namespace airflow-ns (this was supposed to be the driver pod.).{code}

  was:
Hi,

We are using Spark in Airflow with the k8s master. Airflow attaches a unique id to our spark-driver pod using the k8s subdomain convention ('.'), which creates rather long pod names.

We noticed an issue when the total pod name (pod name + Airflow-attached uuid) exceeds 63 characters. Usually pod names can be up to 253 characters long, but Spark seems to have an issue with driver pod names longer than 63 characters. In our case the driver pod name is exactly 65 characters long, and Spark omits the last 2 characters in its error message, so I assume Spark is internally losing those two characters. Reducing our driver pod name to just 63 characters fixed the issue.

Here you can see the actual pod name (row 1) and the pod name from the Spark error log (row 2):
ab-aa--cc-dd.3s092032c69f4639adff835a826e0120
ab-aa--cc-dd.3s092032c69f4639adff835a826e01
[2021-02-20 00:30:06,289] \{pod_launcher.py:136} INFO - Exception in thread "main" org.apache.spark.SparkException: No pod was found named Some(ab-aa--cc-dd.3s092032c69f4639adff835a826e01) in the cluster in the namespace airflow-ns (this was supposed to be the driver pod.).

--
This message was sent by Atlassian Jira (v8.3.4#803005)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-34513) Kubernetes Spark Driver Pod Name Length Limitation
John created SPARK-34513:
-----------------------------

             Summary: Kubernetes Spark Driver Pod Name Length Limitation
                 Key: SPARK-34513
                 URL: https://issues.apache.org/jira/browse/SPARK-34513
             Project: Spark
          Issue Type: Bug
          Components: Kubernetes
    Affects Versions: 3.0.1, 3.0.0
            Reporter: John


Hi,

We are using Spark in Airflow with the k8s master. Airflow attaches a unique id to our spark-driver pod using the k8s subdomain convention ('.'), which creates rather long pod names.

We noticed an issue when the total pod name (pod name + Airflow-attached uuid) exceeds 63 characters. Usually pod names can be up to 253 characters long, but Spark seems to have an issue with driver pod names longer than 63 characters. In our case the driver pod name is exactly 65 characters long, and Spark omits the last 2 characters in its error message, so I assume Spark is internally losing those two characters. Reducing our driver pod name to just 63 characters fixed the issue.

Here you can see the actual pod name (row 1) and the pod name from the Spark error log (row 2):
ab-aa--cc-dd.3s092032c69f4639adff835a826e0120
ab-aa--cc-dd.3s092032c69f4639adff835a826e01
[2021-02-20 00:30:06,289] \{pod_launcher.py:136} INFO - Exception in thread "main" org.apache.spark.SparkException: No pod was found named Some(ab-aa--cc-dd.3s092032c69f4639adff835a826e01) in the cluster in the namespace airflow-ns (this was supposed to be the driver pod.).

--
This message was sent by Atlassian Jira (v8.3.4#803005)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
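Since the report pins the failure to the 63-character boundary (the Kubernetes DNS label limit, as opposed to the 253-character limit on full pod names), one defensive option is to validate generated driver pod names before submission. This is purely an illustration with a hypothetical helper, not Spark or Airflow code:

{code:scala}
// Kubernetes allows pod names up to 253 characters, but a single DNS
// label (and hence a hostname) is capped at 63; per this report, Spark
// mishandles driver pod names beyond that label limit.
val MaxDriverPodNameLength = 63

// Hypothetical pre-submission guard for names generated by Airflow.
def checkDriverPodName(name: String): String = {
  require(name.length <= MaxDriverPodNameLength,
    s"driver pod name '$name' is ${name.length} chars; names longer " +
      s"than $MaxDriverPodNameLength trigger this issue")
  name
}
{code}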
[jira] [Commented] (SPARK-26000) Missing block when reading HDFS Data from Cloudera Manager
[ https://issues.apache.org/jira/browse/SPARK-26000?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16682163#comment-16682163 ]

john commented on SPARK-26000:
------------------------------

I have Cloudera Manager in environment A, which has the HDFS component, and Spark in environment B. I am doing a very simple read and write to/from HDFS. Writing to the Cloudera Manager HDFS works as expected; when reading back I am getting the issues below:

"java.lang.reflect.InvocationTargetException"
Caused by: "org.apache.spark.sql.AnalysisException: Unable to infer schema for Parquet. It must be specified manually.;"
Caused by: "java.net.SocketTimeoutException: 6 millis timeout while waiting for channel to be ready for read. ch : java.nio.channels.SocketChannel[connected local=/SparkNode_IP_PORT_NoO remote=/NameNode:50010:"

Sample code (Java; note the write goes through a Dataset's write(), not the session):
{code:java}
// writing
df.write().mode("append").format("parquet").save(path_to_file);

// reading
spark.read().parquet(path_to_file);
{code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
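The first nested failure above ("Unable to infer schema for Parquet. It must be specified manually.") can be sidestepped by supplying the schema explicitly on read; a sketch with a hypothetical schema and path. This does not address the underlying SocketTimeoutException, which points at connectivity from the Spark nodes in environment B to the datanodes of the cluster in environment A:

{code:scala}
import org.apache.spark.sql.types._

// Hypothetical path and schema; the schema must match what was written.
val pathToFile = "hdfs://namenode:8020/path/to/table"
val schema = StructType(Seq(
  StructField("id", LongType, nullable = false),
  StructField("value", StringType, nullable = true)
))

// An explicit schema avoids the footer-based schema inference that
// fails with "Unable to infer schema for Parquet".
val df = spark.read.schema(schema).parquet(pathToFile)
df.show()
{code}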
[jira] [Created] (SPARK-26000) Missing block when reading HDFS Data from Cloudera Manager
john created SPARK-26000:
-----------------------------

             Summary: Missing block when reading HDFS Data from Cloudera Manager
                 Key: SPARK-26000
                 URL: https://issues.apache.org/jira/browse/SPARK-26000
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.2.2
            Reporter: john


I am able to write to the Cloudera Manager HDFS through open-source Spark, which runs separately, but I am not able to read the Cloudera Manager HDFS data back. I am getting missing block location and SocketTimeout errors.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-26000) Missing block when reading HDFS Data from Cloudera Manager
[ https://issues.apache.org/jira/browse/SPARK-26000?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

john updated SPARK-26000:
-------------------------
    Description: 
I am able to write to the Cloudera Manager HDFS through open-source Spark, which runs separately, but I am not able to read the Cloudera Manager HDFS data back. I am getting missing block location and SocketTimeout errors.

spark.read().textFile(path_to_file)

  was:
I am able to write to the Cloudera Manager HDFS through open-source Spark, which runs separately, but I am not able to read the Cloudera Manager HDFS data back. I am getting missing block location and SocketTimeout errors.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-23982) NoSuchMethodException: There is no startCredentialUpdater method in the object YarnSparkHadoopUtil
[ https://issues.apache.org/jira/browse/SPARK-23982?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16438895#comment-16438895 ]

John commented on SPARK-23982:
------------------------------

See spark-core_2.11-2.3.0 and spark-yarn_2.11-2.3.0, org.apache.spark.executor.CoarseGrainedExecutorBackend:

{code:scala}
if (driverConf.contains("spark.yarn.credentials.file")) {
  logInfo("Will periodically update credentials from: " +
    driverConf.get("spark.yarn.credentials.file"))
  Utils.classForName("org.apache.spark.deploy.yarn.YarnSparkHadoopUtil")
    .getMethod("startCredentialUpdater", classOf[SparkConf])
    .invoke(null, driverConf)
}
{code}

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
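A small sketch of why this surfaces only at runtime: the lookup is reflective, so a missing method is not a compile error but a {{NoSuchMethodException}} when the executor starts. This repeats the same lookup against whatever yarn module is on the classpath (assuming the spark-core and spark-yarn jars are present):

{code:scala}
import org.apache.spark.SparkConf

// The same reflective lookup CoarseGrainedExecutorBackend performs; if
// the yarn module on the classpath does not define
// startCredentialUpdater(SparkConf), getMethod throws.
val clazz = Class.forName("org.apache.spark.deploy.yarn.YarnSparkHadoopUtil")
try {
  val m = clazz.getMethod("startCredentialUpdater", classOf[SparkConf])
  println(s"found: $m")
} catch {
  case e: NoSuchMethodException =>
    println(s"missing on ${clazz.getName}: $e")
}
{code}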
[jira] [Created] (SPARK-23982) NoSuchMethodException: There is no startCredentialUpdater method in the object YarnSparkHadoopUtil
John created SPARK-23982:
-----------------------------

             Summary: NoSuchMethodException: There is no startCredentialUpdater method in the object YarnSparkHadoopUtil
                 Key: SPARK-23982
                 URL: https://issues.apache.org/jira/browse/SPARK-23982
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.3.0
            Reporter: John


At line 219 of the CoarseGrainedExecutorBackend class:

{code:scala}
Utils.classForName("org.apache.spark.deploy.yarn.YarnSparkHadoopUtil")
  .getMethod("startCredentialUpdater", classOf[SparkConf])
  .invoke(null, driverConf)
{code}

However, there is no startCredentialUpdater method in the YarnSparkHadoopUtil object.

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Comment Edited] (SPARK-17885) Spark Streaming deletes checkpointed RDD then tries to load it after restart
[ https://issues.apache.org/jira/browse/SPARK-17885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189483#comment-16189483 ]

Vishal John edited comment on SPARK-17885 at 10/3/17 11:27 AM:
---------------------------------------------------------------

I can see that the checkpointed folder was explicitly deleted -
INFO dstream.DStreamCheckpointData: Deleted checkpoint file 'hdfs://nameservice1/user/my-user/checkpoints/my-application/8c683e77-33b9-42ee-80f7-167abb39c241/rdd-401

I was looking at the source code of the `cleanup` method in `DStreamCheckpointData`. I am curious to know which setting is causing this behaviour. My StreamingContext batch duration is 30 seconds and I haven't provided any other time intervals. Do I need to provide any other intervals, such as a checkpoint interval?

-
UPDATE: I was able to get around this problem by setting "spark.streaming.stopGracefullyOnShutdown" to "true"

was (Author: vishaljohn):
I can see that the checkpointed folder was explicitly deleted -
INFO dstream.DStreamCheckpointData: Deleted checkpoint file 'hdfs://nameservice1/user/my-user/checkpoints/my-application/8c683e77-33b9-42ee-80f7-167abb39c241/rdd-401

I was looking at the source code of the `cleanup` method in `DStreamCheckpointData`. I am curious to know which setting is causing this behaviour. My StreamingContext batch duration is 30 seconds and I haven't provided any other time intervals. Do I need to provide any other intervals, such as a checkpoint interval?

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
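On the interval question above: besides the graceful-shutdown setting the update reports as the fix, a DStream's checkpoint interval can be set explicitly instead of relying on the default derived from the batch interval. A minimal sketch, with a hypothetical socket source standing in for the real input:

{code:scala}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val conf = new SparkConf()
  .setAppName("checkpoint-intervals")
  // The setting the update above reports as the workaround.
  .set("spark.streaming.stopGracefullyOnShutdown", "true")

val ssc = new StreamingContext(conf, Seconds(30)) // 30-second batches
ssc.checkpoint("hdfs://nameservice1/user/my-user/checkpoints/my-application")

// Hypothetical input; any DStream works the same way.
val lines = ssc.socketTextStream("localhost", 9999)
lines.checkpoint(Seconds(120)) // checkpoint this stream's RDDs every 120s
{code}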
[jira] [Commented] (SPARK-17885) Spark Streaming deletes checkpointed RDD then tries to load it after restart
[ https://issues.apache.org/jira/browse/SPARK-17885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189483#comment-16189483 ]

Vishal John commented on SPARK-17885:
-------------------------------------

I can see that the checkpointed folder was explicitly deleted -
INFO dstream.DStreamCheckpointData: Deleted checkpoint file 'hdfs://nameservice1/user/my-user/checkpoints/my-application/8c683e77-33b9-42ee-80f7-167abb39c241/rdd-401

I was looking at the source code of the `cleanup` method in `DStreamCheckpointData`. I am curious to know which setting is causing this behaviour. My StreamingContext batch duration is 30 seconds and I haven't provided any other time intervals. Do I need to provide any other intervals, such as a checkpoint interval?

> Spark Streaming deletes checkpointed RDD then tries to load it after restart
> ----------------------------------------------------------------------------
>
>                 Key: SPARK-17885
>                 URL: https://issues.apache.org/jira/browse/SPARK-17885
>             Project: Spark
>          Issue Type: Bug
>          Components: DStreams
>    Affects Versions: 1.5.1
>            Reporter: Cosmin Ciobanu
>
> The issue is that the Spark driver checkpoints an RDD, deletes it, the job restarts, and the new driver tries to load the deleted checkpoint RDD.
> The application is run in YARN, which attempts to restart the application a number of times (100 in our case), all of which fail due to missing the deleted RDD.
> Here is a Splunk log which shows the inconsistency in checkpoint behaviour:
> *2016-10-09 02:48:43,533* [streaming-job-executor-0] INFO org.apache.spark.rdd.ReliableRDDCheckpointData - Done checkpointing RDD 73847 to hdfs://proc-job/checkpoint/cadf8dcf-ebc2-4366-a2e1-0939976c6ce1/*rdd-73847*, new parent is RDD 73872
> host = ip-10-1-1-13.ec2.internal
> *2016-10-09 02:53:14,696* [JobGenerator] INFO org.apache.spark.streaming.dstream.DStreamCheckpointData - Deleted checkpoint file 'hdfs://proc-job/checkpoint/cadf8dcf-ebc2-4366-a2e1-0939976c6ce1/*rdd-73847*' for time 147598131 ms
> host = ip-10-1-1-13.ec2.internal
> *Job restarts here, notice driver host change from ip-10-1-1-13.ec2.internal to ip-10-1-1-25.ec2.internal.*
> *2016-10-09 02:53:30,175* [Driver] INFO org.apache.spark.streaming.dstream.DStreamCheckpointData - Restoring checkpointed RDD for time 147598131 ms from file 'hdfs://proc-job/checkpoint/cadf8dcf-ebc2-4366-a2e1-0939976c6ce1/*rdd-73847*'
> host = ip-10-1-1-25.ec2.internal
> *2016-10-09 02:53:30,491* [Driver] ERROR org.apache.spark.deploy.yarn.ApplicationMaster - User class threw exception: java.lang.IllegalArgumentException: requirement failed: Checkpoint directory does not exist: hdfs://proc-job/checkpoint/cadf8dcf-ebc2-4366-a2e1-0939976c6ce1/*rdd-73847*
> java.lang.IllegalArgumentException: requirement failed: Checkpoint directory does not exist: hdfs://proc-job/checkpoint/cadf8dcf-ebc2-4366-a2e1-0939976c6ce1/*rdd-73847*
> host = ip-10-1-1-25.ec2.internal
> Spark streaming is configured with a microbatch interval of 30 seconds, checkpoint interval of 120 seconds, and cleaner.ttl of 28800 (8 hours), but as far as I can tell, this TTL only affects metadata cleanup interval. RDDs seem to be deleted every 4-5 minutes after being checkpointed.
> Running on top of Spark 1.5.1.
> There are at least two possible issues here:
> - In case of a driver restart the new driver tries to load checkpointed RDDs which the previous driver had just deleted;
> - Spark loads stale checkpointed data - the logs show that the deleted RDD was initially checkpointed 4 minutes and 31 seconds before deletion, and 4 minutes and 47 seconds before the new driver tries to load it. Given the fact the checkpointing interval is 120 seconds, it makes no sense to load data older than that.
> P.S. Looking at the source code with the event loop that handles checkpoint updates and cleanup, nothing seems to have changed in more recent versions of Spark, so the bug is likely present in 2.0.1 as well.
> P.P.S. The issue is difficult to reproduce - it only occurs once in every 10 or so restarts, and only in clusters with high-load.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-17885) Spark Streaming deletes checkpointed RDD then tries to load it after restart
[ https://issues.apache.org/jira/browse/SPARK-17885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16189403#comment-16189403 ]

Vishal John commented on SPARK-17885:
-------------------------------------

Hello all,

Our application also suffers from the same problem. Our application uses Spark state (mapWithState), and checkpointed RDDs are created in the specified checkpoint folder. But when the application is killed, the directory containing the checkpointed RDDs is cleared, so when I launch the application again it fails because it cannot find the checkpoint directory. This is the error:

'java.lang.IllegalArgumentException: requirement failed: Checkpoint directory does not exist: hdfs://nameservice1/user/my-user/checkpoints/my-application/77b1dd15-f904-4e80-a5ed-5018224b4df0/rdd-6833'

The application uses Spark 2.0.2 and is deployed on Cloudera YARN (2.5.0-cdh5.2.0). Because of this error we are unable to use the checkpointed RDDs and Spark state. Can this issue be taken up as a priority? Please let me know if you require any additional information. [~tdas] [~srowen]

thanks a lot,
Vishal

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
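For context, the standard way such an application resumes from the checkpoint directory is {{StreamingContext.getOrCreate}}; a minimal sketch with a hypothetical pipeline. The pattern only works if the directory survives shutdown, which is exactly what this comment reports is not happening:

{code:scala}
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}

val checkpointDir =
  "hdfs://nameservice1/user/my-user/checkpoints/my-application"

// Build a fresh context (and define the streaming pipeline) only when
// no usable checkpoint exists; otherwise the context is reconstructed
// from the checkpoint data.
def createContext(): StreamingContext = {
  val conf = new SparkConf().setAppName("state-recovery")
  val ssc = new StreamingContext(conf, Seconds(30))
  ssc.checkpoint(checkpointDir)
  // ... define sources and the mapWithState pipeline here ...
  ssc
}

val ssc = StreamingContext.getOrCreate(checkpointDir, createContext _)
ssc.start()
ssc.awaitTermination()
{code}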
[jira] [Comment Edited] (SPARK-21346) Spark does not use SSL for HTTP File Server and Broadcast Server
[ https://issues.apache.org/jira/browse/SPARK-21346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16079405#comment-16079405 ]

John edited comment on SPARK-21346 at 7/9/17 1:27 AM:
------------------------------------------------------

Sorry, I wasn't aware of that. Do you mind elaborating more on what kind of resources are fetched using HTTPS? I'd be happy to make a PR but I'd just like a little more information.

was (Author: jljlee118):
Sorry, I wasn't aware of that. Do you mind elaborating more on what kind of resources are fetched using HTTPS? I'd be happy to make a PR buy I'd just like a little more information.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-21346) Spark does not use SSL for HTTP File Server and Broadcast Server
[ https://issues.apache.org/jira/browse/SPARK-21346?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16079405#comment-16079405 ]

John commented on SPARK-21346:
------------------------------

Sorry, I wasn't aware of that. Do you mind elaborating more on what kind of resources are fetched using HTTPS? I'd be happy to make a PR but I'd just like a little more information.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-21346) Spark does not use SSL for HTTP File Server and Broadcast Server
John created SPARK-21346:
-----------------------------

             Summary: Spark does not use SSL for HTTP File Server and Broadcast Server
                 Key: SPARK-21346
                 URL: https://issues.apache.org/jira/browse/SPARK-21346
             Project: Spark
          Issue Type: Question
          Components: Documentation, Spark Core
    Affects Versions: 2.1.1
            Reporter: John
            Priority: Minor


SecurityManager states that SSL is used to secure HTTP communication for the broadcast and file server. However, the SSLOptions from the SecurityManager only seem to be used by the SparkUI, the WebUI, and the HistoryServer. According to [Spark-11140|https://issues.apache.org/jira/browse/SPARK-11140] and [Spark-12588|https://issues.apache.org/jira/browse/SPARK-12588], neither the file server nor broadcast use HTTP anymore. It seems that the documentation is inaccurate and that Spark actually uses SASL on the RPC endpoints to secure the file server and broadcast communications.

--
This message was sent by Atlassian JIRA (v6.4.14#64029)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
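For reference, the RPC-level protection the description alludes to is driven by the authentication settings rather than {{spark.ssl.*}}; a sketch of the relevant configuration with a placeholder secret (how the secret is distributed in practice depends on the cluster manager):

{code:scala}
import org.apache.spark.SparkConf

// spark.ssl.* covers the web UIs (SparkUI, History Server); file server
// and broadcast traffic ride on the RPC layer, which is protected by
// authentication plus SASL-based encryption instead.
val conf = new SparkConf()
  .set("spark.authenticate", "true")
  .set("spark.authenticate.secret", "replace-me")          // placeholder
  .set("spark.authenticate.enableSaslEncryption", "true")  // encrypt RPC
{code}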
[jira] [Commented] (SPARK-12763) Spark gets stuck executing SSB query
[ https://issues.apache.org/jira/browse/SPARK-12763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15292263#comment-15292263 ]

Rogers Jeffrey Leo John commented on SPARK-12763:
-------------------------------------------------

I believe it is related to the join order (date, customer, supplier, part, lineorder): date, customer, and supplier are all dimension tables, and joining them in that order would result in a cross product. I believe rewriting the query to use "lineorder, date, customer, supplier, part" in the from clause should get it to work.

> Spark gets stuck executing SSB query
> ------------------------------------
>
>                 Key: SPARK-12763
>                 URL: https://issues.apache.org/jira/browse/SPARK-12763
>             Project: Spark
>          Issue Type: Bug
>          Components: SQL
>    Affects Versions: 1.6.0
>         Environment: Standalone cluster
>            Reporter: Vadim Tkachenko
>         Attachments: Spark shell - Details for Stage 5 (Attempt 0).pdf
>
> I am trying to emulate SSB load. Data generated with https://github.com/Percona-Lab/ssb-dbgen
> generated size is with 1000 scale factor and converted to parquet format.
> Now there is a following script
> val pLineOrder = sqlContext.read.parquet("/mnt/i3600/spark/ssb-1000/lineorder").cache()
> val pDate = sqlContext.read.parquet("/mnt/i3600/spark/ssb-1000/date").cache()
> val pPart = sqlContext.read.parquet("/mnt/i3600/spark/ssb-1000/part").cache()
> val pSupplier = sqlContext.read.parquet("/mnt/i3600/spark/ssb-1000/supplier").cache()
> val pCustomer = sqlContext.read.parquet("/mnt/i3600/spark/ssb-1000/customer").cache()
> pLineOrder.registerTempTable("lineorder")
> pDate.registerTempTable("date")
> pPart.registerTempTable("part")
> pSupplier.registerTempTable("supplier")
> pCustomer.registerTempTable("customer")
> query
> val sql41=sqlContext.sql("select D_YEAR, C_NATION, sum(LO_REVENUE - LO_SUPPLYCOST) as profit from date, customer, supplier, part, lineorder where LO_CUSTKEY = C_CUSTKEY and LO_SUPPKEY = S_SUPPKEY and LO_PARTKEY = P_PARTKEY and LO_ORDERDATE = D_DATEKEY and C_REGION = 'AMERICA' and S_REGION = 'AMERICA' and (P_MFGR = 'MFGR#1' or P_MFGR = 'MFGR#2') group by D_YEAR, C_NATION order by D_YEAR, C_NATION")
> and
> sql41.show()
> gets stuck: at some point there is no progress and the server is fully idle, but the job stays at the same stage.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
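The rewrite suggested in the comment, spelled out against the query from the report: same predicates, with {{lineorder}} (the fact table) listed first in the FROM clause so each successive join has a join key:

{code:scala}
val sql41 = sqlContext.sql("""
  select D_YEAR, C_NATION, sum(LO_REVENUE - LO_SUPPLYCOST) as profit
  from lineorder, date, customer, supplier, part
  where LO_CUSTKEY = C_CUSTKEY
    and LO_SUPPKEY = S_SUPPKEY
    and LO_PARTKEY = P_PARTKEY
    and LO_ORDERDATE = D_DATEKEY
    and C_REGION = 'AMERICA'
    and S_REGION = 'AMERICA'
    and (P_MFGR = 'MFGR#1' or P_MFGR = 'MFGR#2')
  group by D_YEAR, C_NATION
  order by D_YEAR, C_NATION
""")
sql41.show()
{code}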
[jira] [Created] (SPARK-6388) Spark 1.3 + Hadoop 2.6 Can't work on Java 8_40
John created SPARK-6388:
---------------------------

             Summary: Spark 1.3 + Hadoop 2.6 Can't work on Java 8_40
                 Key: SPARK-6388
                 URL: https://issues.apache.org/jira/browse/SPARK-6388
             Project: Spark
          Issue Type: Bug
          Components: Block Manager, Spark Submit, YARN
    Affects Versions: 1.3.0
         Environment: 1. Linux version 3.16.0-30-generic (buildd@komainu) (gcc version 4.9.1 (Ubuntu 4.9.1-16ubuntu6) ) #40-Ubuntu SMP Mon Jan 12 22:06:37 UTC 2015
2. Oracle Java 8 update 40 for Linux X64
3. Scala 2.10.5
            Reporter: John


I built Apache Spark 1.3 manually:
---
JAVA_HOME=PATH_TO_JAVA8 mvn clean package -Pyarn -Phadoop-2.4 -Dhadoop.version=2.6.0 -DskipTests
---

Something goes wrong; akka always tells me:
---
15/03/17 21:28:10 WARN remote.ReliableDeliverySupervisor: Association with remote system [akka.tcp://sparkYarnAM@Server2:42161] has failed, address is now gated for [5000] ms. Reason is: [Disassociated].
---

I built another version of Spark 1.3 + Hadoop 2.6 under Java 7 and everything works well.

Logs:
---
15/03/17 21:27:06 INFO spark.SparkContext: Running Spark version 1.3.0
15/03/17 21:27:07 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
15/03/17 21:27:08 INFO spark.SecurityManager: Changing view Servers to: hduser
15/03/17 21:27:08 INFO spark.SecurityManager: Changing modify Servers to: hduser
15/03/17 21:27:08 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui Servers disabled; users with view permissions: Set(hduser); users with modify permissions: Set(hduser)
15/03/17 21:27:08 INFO slf4j.Slf4jLogger: Slf4jLogger started
15/03/17 21:27:08 INFO Remoting: Starting remoting
15/03/17 21:27:09 INFO Remoting: Remoting started; listening on addresses :[akka.tcp://sparkDriver@Server3:37951]
15/03/17 21:27:09 INFO util.Utils: Successfully started service 'sparkDriver' on port 37951.
15/03/17 21:27:09 INFO spark.SparkEnv: Registering MapOutputTracker
15/03/17 21:27:09 INFO spark.SparkEnv: Registering BlockManagerMaster
15/03/17 21:27:09 INFO storage.DiskBlockManager: Created local directory at /tmp/spark-0db692bb-cd02-40c8-a8f0-3813c6da18e2/blockmgr-a1d0ad23-ab76-4177-80a0-a6f982a64d80
15/03/17 21:27:09 INFO storage.MemoryStore: MemoryStore started with capacity 265.1 MB
15/03/17 21:27:09 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-502ef3f8-b8cd-45cf-b1df-97df297cdb35/httpd-6303e24d-4b2b-4614-bb1d-74e8d331189b
15/03/17 21:27:09 INFO spark.HttpServer: Starting HTTP Server
15/03/17 21:27:09 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/03/17 21:27:10 INFO server.AbstractConnector: Started SocketConnector@0.0.0.0:48000
15/03/17 21:27:10 INFO util.Utils: Successfully started service 'HTTP file server' on port 48000.
15/03/17 21:27:10 INFO spark.SparkEnv: Registering OutputCommitCoordinator
15/03/17 21:27:10 INFO server.Server: jetty-8.y.z-SNAPSHOT
15/03/17 21:27:10 INFO server.AbstractConnector: Started SelectChannelConnector@0.0.0.0:4040
15/03/17 21:27:10 INFO util.Utils: Successfully started service 'SparkUI' on port 4040.
15/03/17 21:27:10 INFO ui.SparkUI: Started SparkUI at http://Server3:4040
15/03/17 21:27:10 INFO spark.SparkContext: Added JAR file:/home/hduser/spark-java2.jar at http://192.168.11.42:48000/jars/spark-java2.jar with timestamp 1426598830307
15/03/17 21:27:10 INFO client.RMProxy: Connecting to ResourceManager at Server3/192.168.11.42:8050
15/03/17 21:27:11 INFO yarn.Client: Requesting a new application from cluster with 3 NodeManagers
15/03/17 21:27:11 INFO yarn.Client: Verifying our application has not requested more than the maximum memory capability of the cluster (8192 MB per container)
15/03/17 21:27:11 INFO yarn.Client: Will allocate AM container, with 896 MB memory including 384 MB overhead
15/03/17 21:27:11 INFO yarn.Client: Setting up container launch context for our AM
15/03/17 21:27:11 INFO yarn.Client: Preparing resources for our AM container
15/03/17 21:27:12 INFO yarn.Client: Uploading resource file:/home/hduser/spark-1.3.0/assembly/target/scala-2.10/spark-assembly-1.3.0-hadoop2.6.0.jar - hdfs://Server3:9000/user/hduser/.sparkStaging/application_1426595477608_0002/spark-assembly-1.3.0-hadoop2.6.0.jar
15/03/17 21:27:21 INFO yarn.Client: Setting up the launch environment for our AM container
15/03/17 21:27:21 INFO spark.SecurityManager: Changing view Servers to: hduser
15/03/17 21:27:21 INFO spark.SecurityManager: Changing modify Servers to: hduser
15/03/17 21:27:21 INFO spark.SecurityManager: SecurityManager: authentication disabled; ui Servers disabled; users with view permissions: Set(hduser); users with modify permissions: Set(hduser)
15/03/17 21:27:21 INFO yarn.Client: Submitting application 2 to ResourceManager
15/03/17
[jira] [Closed] (SPARK-6388) Spark 1.3 + Hadoop 2.6 Can't work on Java 8_40
[ https://issues.apache.org/jira/browse/SPARK-6388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John closed SPARK-6388.
-----------------------
    Resolution: Not a Problem
[jira] [Commented] (SPARK-6388) Spark 1.3 + Hadoop 2.6 Can't work on Java 8_40
[ https://issues.apache.org/jira/browse/SPARK-6388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365148#comment-14365148 ]

John commented on SPARK-6388:
-----------------------------

OK, it looks like my problem. I'll try it later. Sorry for opening an issue; let me close it.
[jira] [Updated] (SPARK-6388) Spark 1.3 + Hadoop 2.6 Can't work on Java 8_40
[ https://issues.apache.org/jira/browse/SPARK-6388?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John updated SPARK-6388:
------------------------
    Environment: 
1. Linux version 3.16.0-30-generic (buildd@komainu) (gcc version 4.9.1 (Ubuntu 4.9.1-16ubuntu6) ) #40-Ubuntu SMP Mon Jan 12 22:06:37 UTC 2015
2. Oracle Java 8 update 40 for Linux X64
3. Scala 2.10.5
4. Hadoop 2.6 (pre-built version)

  was:
1. Linux version 3.16.0-30-generic (buildd@komainu) (gcc version 4.9.1 (Ubuntu 4.9.1-16ubuntu6) ) #40-Ubuntu SMP Mon Jan 12 22:06:37 UTC 2015
2. Oracle Java 8 update 40 for Linux X64
3. Scala 2.10.5
[jira] [Commented] (SPARK-6388) Spark 1.3 + Hadoop 2.6 Can't work on Java 8_40
[ https://issues.apache.org/jira/browse/SPARK-6388?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14365211#comment-14365211 ]

John commented on SPARK-6388:
-----------------------------

Thanks, I will try it later.