[jira] [Updated] (SPARK-22146) FileNotFoundException while reading ORC files containing '%'
[ https://issues.apache.org/jira/browse/SPARK-22146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li updated SPARK-22146:
----------------------------
    Fix Version/s: 2.3.0

> FileNotFoundException while reading ORC files containing '%'
>
> Key: SPARK-22146
> URL: https://issues.apache.org/jira/browse/SPARK-22146
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Marco Gaido
> Fix For: 2.3.0
>
> Reading ORC files containing "strange" characters like '%' fails with a
> FileNotFoundException. For instance, if you have:
> {noformat}
> /tmp/orc_test/folder %3Aa/orc1.orc
> /tmp/orc_test/folder %3Ab/orc2.orc
> {noformat}
> and you try to read the ORC files with:
> {noformat}
> spark.read.format("orc").load("/tmp/orc_test/*/*").show
> {noformat}
> you will get a:
> {noformat}
> java.io.FileNotFoundException: File file:/tmp/orc_test/folder%20%253Aa/orc1.orc does not exist
>   at org.apache.hadoop.fs.RawLocalFileSystem.deprecatedGetFileStatus(RawLocalFileSystem.java:611)
>   at org.apache.hadoop.fs.RawLocalFileSystem.getFileLinkStatusInternal(RawLocalFileSystem.java:824)
>   at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:601)
>   at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:421)
>   at org.apache.spark.deploy.SparkHadoopUtil.listLeafStatuses(SparkHadoopUtil.scala:194)
>   at org.apache.spark.sql.hive.orc.OrcFileOperator$.listOrcFiles(OrcFileOperator.scala:94)
>   at org.apache.spark.sql.hive.orc.OrcFileOperator$.getFileReader(OrcFileOperator.scala:67)
>   at org.apache.spark.sql.hive.orc.OrcFileOperator$$anonfun$readSchema$1.apply(OrcFileOperator.scala:77)
>   at org.apache.spark.sql.hive.orc.OrcFileOperator$$anonfun$readSchema$1.apply(OrcFileOperator.scala:77)
>   at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
>   at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:241)
>   at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
>   at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
>   at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:241)
>   at scala.collection.AbstractTraversable.flatMap(Traversable.scala:104)
>   at org.apache.spark.sql.hive.orc.OrcFileOperator$.readSchema(OrcFileOperator.scala:77)
>   at org.apache.spark.sql.hive.orc.OrcFileFormat.inferSchema(OrcFileFormat.scala:60)
>   at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$8.apply(DataSource.scala:197)
>   at org.apache.spark.sql.execution.datasources.DataSource$$anonfun$8.apply(DataSource.scala:197)
>   at scala.Option.orElse(Option.scala:289)
>   at org.apache.spark.sql.execution.datasources.DataSource.getOrInferFileFormatSchema(DataSource.scala:196)
>   at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:387)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:190)
>   at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:168)
>   ... 48 elided
> {noformat}
> Note that the same code works for Parquet and text files.

--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
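The shape of the failing path in the exception suggests what went wrong: the directory is literally named "folder %3Aa", but the exception shows "folder%20%253Aa", i.e. the name was percent-encoded a second time (the space became %20 and the literal '%' became %25). The following standalone Python sketch (not Spark's code; it just uses the standard `urllib.parse` helpers) reproduces the double-escaping arithmetic:

```python
from urllib.parse import quote, unquote

# The directory on disk is literally named "folder %3Aa": it contains a space
# and a literal percent sign. Percent-encoding that name once turns the space
# into %20 and the '%' into %25, producing exactly the doubly-escaped form
# that appears in the FileNotFoundException.
raw_name = "folder %3Aa"
escaped_once = quote(raw_name)
print(escaped_once)            # folder%20%253Aa

# Decoding once recovers the on-disk name; the file system was instead asked
# for the escaped form, which does not exist.
print(unquote(escaped_once))   # folder %3Aa
```

This is consistent with a listing step that takes an already-escaped path string and escapes it again before handing it to the file system.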
[jira] [Resolved] (SPARK-22146) FileNotFoundException while reading ORC files containing '%'
[ https://issues.apache.org/jira/browse/SPARK-22146?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiao Li resolved SPARK-22146.
-----------------------------
    Resolution: Fixed
      Assignee: Marco Gaido

> FileNotFoundException while reading ORC files containing '%'
>
> Key: SPARK-22146
> URL: https://issues.apache.org/jira/browse/SPARK-22146
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Marco Gaido
> Assignee: Marco Gaido
> Fix For: 2.3.0
[jira] [Commented] (SPARK-22165) Type conflicts between dates, timestamps and date in partition column
[ https://issues.apache.org/jira/browse/SPARK-22165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16185388#comment-16185388 ]

Apache Spark commented on SPARK-22165:
--------------------------------------

User 'HyukjinKwon' has created a pull request for this issue:
https://github.com/apache/spark/pull/19389

> Type conflicts between dates, timestamps and date in partition column
>
> Key: SPARK-22165
> URL: https://issues.apache.org/jira/browse/SPARK-22165
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.1.1, 2.2.0, 2.3.0
> Reporter: Hyukjin Kwon
> Priority: Minor
>
> It looks like we have some bugs when resolving type conflicts in partition
> columns. I found a few corner cases, as below.
>
> Case 1: a timestamp should be inferred, but a date type is inferred:
> {code}
> val df = Seq((1, "2015-01-01"), (2, "2016-01-01 00:00:00")).toDF("i", "ts")
> df.write.format("parquet").partitionBy("ts").save("/tmp/foo")
> spark.read.load("/tmp/foo").printSchema()
> {code}
> {code}
> root
>  |-- i: integer (nullable = true)
>  |-- ts: date (nullable = true)
> {code}
>
> Case 2: a decimal should be inferred, but an integer is inferred:
> {code}
> val df = Seq((1, "1"), (2, "1" * 30)).toDF("i", "decimal")
> df.write.format("parquet").partitionBy("decimal").save("/tmp/bar")
> spark.read.load("/tmp/bar").printSchema()
> {code}
> {code}
> root
>  |-- i: integer (nullable = true)
>  |-- decimal: integer (nullable = true)
> {code}
>
> It looks like we should de-duplicate the type resolution logic if possible,
> rather than keep a separate numeric precedence-like comparison alone.
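Both corner cases come down to inferring a type for each partition value independently and then merging the per-value types with a widening rule. The Python sketch below is purely illustrative (none of these names are Spark's; the real logic lives in Spark's partitioning utilities): it models a 32-bit integer column that widens to decimal, and a date that widens to timestamp, which is the behavior the report says should hold:

```python
from datetime import datetime
from functools import reduce

def infer_one(value: str) -> str:
    """Infer a type for a single partition-column string (illustrative only)."""
    try:
        n = int(value)
        # model a 32-bit integer column; wider literals widen to decimal
        return "int" if -2**31 <= n < 2**31 else "decimal"
    except ValueError:
        pass
    for typ, fmt in (("date", "%Y-%m-%d"), ("timestamp", "%Y-%m-%d %H:%M:%S")):
        try:
            datetime.strptime(value, fmt)
            return typ
        except ValueError:
            pass
    return "string"

def widen(a: str, b: str) -> str:
    """Resolve a conflict by taking the wider of the two types."""
    if a == b:
        return a
    if {a, b} == {"date", "timestamp"}:
        return "timestamp"   # Case 1: date + timestamp should merge to timestamp
    if {a, b} == {"int", "decimal"}:
        return "decimal"     # Case 2: int + wide literal should merge to decimal
    return "string"          # no common specific type: fall back to string

print(reduce(widen, map(infer_one, ["2015-01-01", "2016-01-01 00:00:00"])))  # timestamp
print(reduce(widen, map(infer_one, ["1", "1" * 30])))                        # decimal
```

The reported bug is that Spark's merge step resolved these conflicts to `date` and `integer` respectively instead of the wider types.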
[jira] [Assigned] (SPARK-22165) Type conflicts between dates, timestamps and date in partition column
[ https://issues.apache.org/jira/browse/SPARK-22165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-22165:
------------------------------------
    Assignee: Apache Spark

> Type conflicts between dates, timestamps and date in partition column
>
> Key: SPARK-22165
> URL: https://issues.apache.org/jira/browse/SPARK-22165
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.1.1, 2.2.0, 2.3.0
> Reporter: Hyukjin Kwon
> Assignee: Apache Spark
> Priority: Minor
[jira] [Assigned] (SPARK-22165) Type conflicts between dates, timestamps and date in partition column
[ https://issues.apache.org/jira/browse/SPARK-22165?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Apache Spark reassigned SPARK-22165:
------------------------------------
    Assignee: (was: Apache Spark)

> Type conflicts between dates, timestamps and date in partition column
>
> Key: SPARK-22165
> URL: https://issues.apache.org/jira/browse/SPARK-22165
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 2.1.1, 2.2.0, 2.3.0
> Reporter: Hyukjin Kwon
> Priority: Minor
[jira] [Created] (SPARK-22165) Type conflicts between dates, timestamps and date in partition column
Hyukjin Kwon created SPARK-22165:
------------------------------------
             Summary: Type conflicts between dates, timestamps and date in partition column
                 Key: SPARK-22165
                 URL: https://issues.apache.org/jira/browse/SPARK-22165
             Project: Spark
          Issue Type: Bug
          Components: SQL
    Affects Versions: 2.2.0, 2.1.1, 2.3.0
            Reporter: Hyukjin Kwon
            Priority: Minor
[jira] [Resolved] (SPARK-22160) Allow changing sample points per partition in range shuffle exchange
[ https://issues.apache.org/jira/browse/SPARK-22160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reynold Xin resolved SPARK-22160.
---------------------------------
       Resolution: Fixed
    Fix Version/s: 2.3.0

> Allow changing sample points per partition in range shuffle exchange
>
> Key: SPARK-22160
> URL: https://issues.apache.org/jira/browse/SPARK-22160
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Reynold Xin
> Assignee: Reynold Xin
> Fix For: 2.3.0
>
> Spark's RangePartitioner hard-codes the number of sampling points per
> partition to be 20. This is sometimes too low. This ticket makes it
> configurable, via spark.sql.execution.rangeExchange.sampleSizePerPartition,
> and raises the default in Spark SQL to 100.
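A range partitioner samples a handful of keys from each input partition and picks evenly spaced values from the pooled, sorted sample as range boundaries; more samples per partition give boundaries closer to the true quantiles. The sketch below is a simplified illustration of that idea (the function and its parameters are hypothetical, not Spark's RangePartitioner internals):

```python
import random

def range_bounds(partitions, num_output_partitions, sample_size_per_partition):
    """Estimate range-partition boundaries from per-partition samples.

    Illustrative sketch only: sample a fixed number of keys from each input
    partition, pool and sort them, then take evenly spaced sample values as
    the upper bounds of the first n-1 output ranges.
    """
    rng = random.Random(42)  # fixed seed so the sketch is deterministic
    samples = []
    for part in partitions:
        k = min(sample_size_per_partition, len(part))
        samples.extend(rng.sample(part, k))
    samples.sort()
    step = len(samples) / num_output_partitions
    return [samples[int(step * i)] for i in range(1, num_output_partitions)]

# four input partitions holding the key ranges 0-999, 1000-1999, ...
parts = [list(range(p * 1000, (p + 1) * 1000)) for p in range(4)]

# with only 20 samples per partition the boundary estimates are coarser
# than with 100, which is the motivation for making the value configurable
print(range_bounds(parts, 4, 20))
print(range_bounds(parts, 4, 100))
```

With skewed or large inputs, 20 samples per partition can misplace boundaries badly enough to produce unbalanced output partitions, which is the problem the ticket addresses.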
[jira] [Closed] (SPARK-15687) Columnar execution engine
[ https://issues.apache.org/jira/browse/SPARK-15687?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Reynold Xin closed SPARK-15687.
-------------------------------
    Resolution: Later

Actually closing this, since with whole-stage code generation it's unclear what
it means to have a columnar execution engine. In many cases whole-stage code
generation is more advanced than what a column-oriented execution engine would
do. There are still certain cases in which it might make sense (e.g. late
binding of dictionary-encoded data) to use techniques from a column engine, but
those would be smaller things to track and build.

> Columnar execution engine
>
> Key: SPARK-15687
> URL: https://issues.apache.org/jira/browse/SPARK-15687
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Reporter: Reynold Xin
> Priority: Critical
>
> This ticket tracks progress in making the entire engine columnar, especially
> in the context of nested data type support.
> In Spark 2.0, we have used the internal column batch interface in Parquet
> reading (via a vectorized Parquet decoder) and in low-cardinality
> aggregation. Other parts of the engine already use whole-stage code
> generation, which is in many ways more efficient than a columnar execution
> engine for flat data types.
> The goal here is to figure out a story to work towards making the column
> batch the common data exchange format between operators outside whole-stage
> code generation, as well as with external systems (e.g. Pandas).
> Some of the important questions to answer are:
> From the architectural perspective:
> - What is the end-state architecture?
> - Should aggregation be columnar?
> - Should sorting be columnar?
> - How do we encode nested data? What are the operations on nested data, and
>   how do we handle these operations in a columnar format?
> - What is the transition plan towards the end state?
> From an external API perspective:
> - Can we expose a more efficient column batch user-defined function API?
> - How do we leverage this to integrate with 3rd-party tools?
> - Can we have a spec for a fixed version of the column batch format that can
>   be externalized and used in the data source API v2?
[jira] [Commented] (SPARK-18935) Use Mesos "Dynamic Reservation" resource for Spark
[ https://issues.apache.org/jira/browse/SPARK-18935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16185307#comment-16185307 ]

jackyoh commented on SPARK-18935:
---------------------------------

It's clear to me now. Thank you for your kind assistance.

> Use Mesos "Dynamic Reservation" resource for Spark
>
> Key: SPARK-18935
> URL: https://issues.apache.org/jira/browse/SPARK-18935
> Project: Spark
> Issue Type: Bug
> Affects Versions: 2.0.0, 2.0.1, 2.0.2
> Reporter: jackyoh
>
> I'm running Spark on Apache Mesos. Please follow these steps to reproduce
> the issue:
> 1. First, run the Mesos resource reservation:
> {noformat}
> curl -i -d slaveId=c24d1cfb-79f3-4b07-9f8b-c7b19543a333-S0 -d resources='[{"name":"cpus","type":"SCALAR","scalar":{"value":20},"role":"spark","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar":{"value":4096},"role":"spark","reservation":{"principal":""}}]' -X POST http://192.168.1.118:5050/master/reserve
> {noformat}
> 2. Then run the spark-submit command:
> {noformat}
> ./spark-submit --class org.apache.spark.examples.SparkPi --master mesos://192.168.1.118:5050 --conf spark.mesos.role=spark ../examples/jars/spark-examples_2.11-2.0.2.jar 1
> {noformat}
> The console will keep logging the same warning message, as shown below:
> {noformat}
> 16/12/19 22:33:28 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
> {noformat}
[jira] [Closed] (SPARK-21971) Too many open files in Spark due to concurrent files being opened
[ https://issues.apache.org/jira/browse/SPARK-21971?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Takeshi Yamamuro closed SPARK-21971.
------------------------------------
    Resolution: Not A Problem

> Too many open files in Spark due to concurrent files being opened
>
> Key: SPARK-21971
> URL: https://issues.apache.org/jira/browse/SPARK-21971
> Project: Spark
> Issue Type: Bug
> Components: Spark Core, SQL
> Affects Versions: 2.1.0, 2.2.0
> Reporter: Rajesh Balamohan
> Priority: Minor
>
> When running Q67 of TPC-DS on a 1 TB dataset on a multi-node cluster, it
> consistently fails with a "too many open files" exception:
> {noformat}
> INFO scheduler.TaskSetManager: Finished task 25.0 in stage 844.0 (TID 243786) in 394 ms on machine111.xyz (executor 2) (189/200)
> 17/08/20 10:33:45 INFO scheduler.TaskSetManager: Finished task 172.0 in stage 844.0 (TID 243932) in 11996 ms on cn116-10.l42scl.hortonworks.com (executor 6) (190/200)
> 17/08/20 10:37:40 WARN scheduler.TaskSetManager: Lost task 144.0 in stage 844.0 (TID 243904, machine1.xyz, executor 1): java.nio.file.FileSystemException: /grid/3/hadoop/yarn/local/usercache/rbalamohan/appcache/application_1490656001509_7207/blockmgr-5180e3f0-f7ed-44bb-affc-8f99f09ba7bc/28/temp_local_690afbf7-172d-4fdb-8492-3e2ebd8d5183: Too many open files
>   at sun.nio.fs.UnixException.translateToIOException(UnixException.java:91)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
>   at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
>   at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
>   at java.nio.channels.FileChannel.open(FileChannel.java:287)
>   at java.nio.channels.FileChannel.open(FileChannel.java:335)
>   at org.apache.spark.io.NioBufferedFileInputStream.(NioBufferedFileInputStream.java:43)
>   at org.apache.spark.util.collection.unsafe.sort.UnsafeSorterSpillReader.(UnsafeSorterSpillReader.java:75)
>   at org.apache.spark.util.collection.unsafe.sort.UnsafeSorterSpillWriter.getReader(UnsafeSorterSpillWriter.java:150)
>   at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.getIterator(UnsafeExternalSorter.java:607)
>   at org.apache.spark.sql.execution.ExternalAppendOnlyUnsafeRowArray.generateIterator(ExternalAppendOnlyUnsafeRowArray.scala:169)
>   at org.apache.spark.sql.execution.ExternalAppendOnlyUnsafeRowArray.generateIterator(ExternalAppendOnlyUnsafeRowArray.scala:173)
> {noformat}
> The cluster was configured with multiple cores per executor.
> The window function uses "spark.sql.windowExec.buffer.spill.threshold=4096",
> which causes a large number of spills on larger datasets. With multiple
> cores per executor, this reproduces easily.
> {{UnsafeExternalSorter::getIterator()}} invokes {{spillWriter.getReader}} for
> all the available spill writers. {{UnsafeSorterSpillReader}} opens the file
> in its constructor and closes it later as part of its close() call. This
> causes the "too many open files" issue.
> Note that this is not a file leak, but rather a matter of how many files are
> open concurrently at any given time, depending on the dataset being
> processed.
> One option could be to increase "spark.sql.windowExec.buffer.spill.threshold"
> so that fewer spill files are generated, but it is hard to determine the
> sweet spot for all workloads. Another option is to set the ulimit on files
> to "unlimited", but that would not be a good production setting. It would be
> good to consider reducing the number of readers opened concurrently by
> {{UnsafeExternalSorter::getIterator}}.
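The report's root cause is eager opening: every spill reader opens its file in the constructor, so the number of open descriptors grows with the number of spill files. An illustrative alternative (a sketch in Python, not Spark's code) is to open each spill file lazily, only when its portion of the merged iterator is actually consumed, so at most one file is open at a time:

```python
import os
import tempfile

def read_spills_lazily(paths):
    """Yield rows from a list of spill files, opening each file only when
    reached and closing it before the next one opens. This bounds the number
    of concurrently open files at one, instead of len(paths)."""
    for path in paths:
        with open(path) as f:
            for line in f:
                yield line.rstrip("\n")

# build a few small stand-in "spill files" for demonstration
tmp = tempfile.mkdtemp()
paths = []
for i in range(3):
    p = os.path.join(tmp, f"spill{i}.txt")
    with open(p, "w") as f:
        f.write(f"row{i}a\nrow{i}b\n")
    paths.append(p)

rows = list(read_spills_lazily(paths))
print(rows)  # ['row0a', 'row0b', 'row1a', 'row1b', 'row2a', 'row2b']
```

The trade-off is that a merge that must interleave rows across spill files (as a sorter does) genuinely needs several files open simultaneously, so the real fix space is narrower than this sketch suggests.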
[jira] [Resolved] (SPARK-21999) ConcurrentModificationException - Spark Streaming
[ https://issues.apache.org/jira/browse/SPARK-21999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-21999.
-------------------------------
    Resolution: Not A Problem

> ConcurrentModificationException - Spark Streaming
>
> Key: SPARK-21999
> URL: https://issues.apache.org/jira/browse/SPARK-21999
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.1.0
> Reporter: Michael N
> Priority: Critical
>
> Hi,
> I am using Spark Streaming v2.1.0 with Kafka 0.8. I am getting
> ConcurrentModificationException intermittently. When it occurs, Spark does
> not honor the specified value of spark.task.maxFailures; it aborts the
> current batch and fetches the next batch, resulting in lost data. Its
> exception stack is listed below.
> This instance of ConcurrentModificationException is similar to the issue at
> https://issues.apache.org/jira/browse/SPARK-17463, which was about
> serialization of accumulators in heartbeats. However, my Spark streaming app
> does not use accumulators.
> The stack trace listed below occurred on the Spark master, in the Spark
> streaming driver, at the time of data loss.
> From the line of code in the first stack trace, can you tell which object
> Spark was trying to serialize? What is the root cause of this issue?
> Because this issue results in lost data as described above, could you have
> it fixed ASAP?
> Thanks.
> Michael N.
>
> Stack trace of the Spark Streaming driver:
> {noformat}
> ERROR JobScheduler:91: Error generating jobs for time 150522493 ms
> org.apache.spark.SparkException: Task not serializable
>   at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298)
>   at org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288)
>   at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108)
>   at org.apache.spark.SparkContext.clean(SparkContext.scala:2094)
>   at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:793)
>   at org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:792)
>   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
>   at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
>   at org.apache.spark.rdd.RDD.withScope(RDD.scala:362)
>   at org.apache.spark.rdd.RDD.mapPartitions(RDD.scala:792)
>   at org.apache.spark.streaming.dstream.MapPartitionedDStream$$anonfun$compute$1.apply(MapPartitionedDStream.scala:37)
>   at org.apache.spark.streaming.dstream.MapPartitionedDStream$$anonfun$compute$1.apply(MapPartitionedDStream.scala:37)
>   at scala.Option.map(Option.scala:146)
>   at org.apache.spark.streaming.dstream.MapPartitionedDStream.compute(MapPartitionedDStream.scala:37)
>   at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:341)
>   at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:341)
>   at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58)
>   at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:340)
>   at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:340)
>   at org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:415)
>   at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:335)
>   at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:333)
>   at scala.Option.orElse(Option.scala:289)
>   at org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:330)
>   at org.apache.spark.streaming.dstream.TransformedDStream$$anonfun$6.apply(TransformedDStream.scala:42)
>   at org.apache.spark.streaming.dstream.TransformedDStream$$anonfun$6.apply(TransformedDStream.scala:42)
>   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
>   at scala.collection.immutable.List.foreach(List.scala:381)
>   at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
>   at scala.collection.immutable.List.map(List.scala:285)
>   at org.apache.spark.streaming.dstream.TransformedDStream.compute(TransformedDStream.scala:42)
>   at org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:341)
> {noformat}
[jira] [Commented] (SPARK-21999) ConcurrentModificationException - Spark Streaming
[ https://issues.apache.org/jira/browse/SPARK-21999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16185177#comment-16185177 ]

Sean Owen commented on SPARK-21999:
-----------------------------------

How would Spark do this synchronously w.r.t. your app -- how can it lock your
locks?

> ConcurrentModificationException - Spark Streaming
>
> Key: SPARK-21999
> URL: https://issues.apache.org/jira/browse/SPARK-21999
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.1.0
> Reporter: Michael N
> Priority: Critical
[jira] [Resolved] (SPARK-22163) Design Issue of Spark Streaming that Causes Random Run-time Exception
[ https://issues.apache.org/jira/browse/SPARK-22163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-22163. --- Resolution: Duplicate Please don't fork the issue. This is not a bug. > Design Issue of Spark Streaming that Causes Random Run-time Exception > - > > Key: SPARK-22163 > URL: https://issues.apache.org/jira/browse/SPARK-22163 > Project: Spark > Issue Type: Bug > Components: DStreams, Structured Streaming >Affects Versions: 2.2.0 > Environment: Spark Streaming > Kafka > Linux >Reporter: Michael N >Priority: Critical > > The application objects can contain Lists and can be modified dynamically as well. However, the Spark Streaming framework asynchronously serializes the > application's objects as the application runs. Therefore, it causes random > run-time exceptions on a List when the Spark Streaming framework happens to > serialize the application's objects while the application modifies a List in > its own object. > In fact, there are multiple bugs reported about > Caused by: java.util.ConcurrentModificationException > at java.util.ArrayList.writeObject > that are permutations of the same root cause. The design issue of the Spark > Streaming framework is that it does this serialization asynchronously. > Instead, it should either > 1. do this serialization synchronously. This is preferred, to eliminate the > issue completely. Or > 2. allow it to be configured per application whether to do this serialization > synchronously or asynchronously, depending on the nature of each application. > Also, the Spark documentation should describe the conditions that trigger Spark > to do this type of serialization asynchronously, so applications can work > around them until a fix is provided. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16185169#comment-16185169 ] Hyukjin Kwon commented on SPARK-21190: -- I will keep this in mind and suggest a fix, or fix it myself, when we next touch the code around here (before the release, of course). > SPIP: Vectorized UDFs in Python > --- > > Key: SPARK-21190 > URL: https://issues.apache.org/jira/browse/SPARK-21190 > Project: Spark > Issue Type: New Feature > Components: PySpark, SQL >Affects Versions: 2.2.0 >Reporter: Reynold Xin >Assignee: Bryan Cutler > Labels: SPIP > Fix For: 2.3.0 > > Attachments: SPIPVectorizedUDFsforPython (1).pdf > > > *Background and Motivation* > Python is one of the most popular programming languages among Spark users. > Spark currently exposes a row-at-a-time interface for defining and executing > user-defined functions (UDFs). This introduces high overhead in serialization > and deserialization, and also makes it difficult to leverage Python libraries > (e.g. numpy, Pandas) that are written in native code. > > This proposal advocates introducing new APIs to support vectorized UDFs in > Python, in which a block of data is transferred over to Python in some > columnar format for execution. > > > *Target Personas* > Data scientists, data engineers, library developers. > > *Goals* > - Support vectorized UDFs that apply on chunks of the data frame > - Low system overhead: Substantially reduce serialization and deserialization > overhead when compared with row-at-a-time interface > - UDF performance: Enable users to leverage native libraries in Python (e.g. > numpy, Pandas) for data manipulation in these UDFs > > *Non-Goals* > The following are explicitly out of scope for the current SPIP, and should be > done in future SPIPs. Nonetheless, it would be good to consider these future > use cases during API design, so we can achieve some consistency when rolling > out new APIs. 
> > - Define block oriented UDFs in other languages (that are not Python). > - Define aggregate UDFs > - Tight integration with machine learning frameworks > > *Proposed API Changes* > The following sketches some possibilities. I haven’t spent a lot of time > thinking about the API (wrote it down in 5 mins) and I am not attached to > this design at all. The main purpose of the SPIP is to get feedback on use > cases and see how they can impact API design. > > A few things to consider are: > > 1. Python is dynamically typed, whereas DataFrames/SQL requires static, > analysis time typing. This means users would need to specify the return type > of their UDFs. > > 2. Ratio of input rows to output rows. We propose initially we require number > of output rows to be the same as the number of input rows. In the future, we > can consider relaxing this constraint with support for vectorized aggregate > UDFs. > 3. How do we handle null values, since Pandas doesn't have the concept of > nulls? > > Proposed API sketch (using examples): > > Use case 1. A function that defines all the columns of a DataFrame (similar > to a “map” function): > > {code} > @spark_udf(some way to describe the return schema) > def my_func_on_entire_df(input): > """ Some user-defined function. > > :param input: A Pandas DataFrame with two columns, a and b. > :return: :class: A Pandas data frame. > """ > input[c] = input[a] + input[b] > input[d] = input[a] - input[b] > return input > > spark.range(1000).selectExpr("id a", "id / 2 b") > .mapBatches(my_func_on_entire_df) > {code} > > Use case 2. A function that defines only one column (similar to existing > UDFs): > > {code} > @spark_udf(some way to describe the return schema) > def my_func_that_returns_one_column(input): > """ Some user-defined function. > > :param input: A Pandas DataFrame with two columns, a and b. 
> :return: :class: A numpy array > """ > return input[a] + input[b] > > my_func = udf(my_func_that_returns_one_column) > > df = spark.range(1000).selectExpr("id a", "id / 2 b") > df.withColumn("c", my_func(df.a, df.b)) > {code} > > > > *Optional Design Sketch* > I’m more concerned about getting proper feedback for API design. The > implementation should be pretty straightforward and is not a huge concern at > this point. We can leverage the same implementation for faster toPandas > (using Arrow). > > > *Optional Rejected Designs* > See above. > > > >
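The row-at-a-time vs. vectorized contrast motivating the SPIP above can be illustrated without Spark, using plain pandas (a minimal sketch; the actual `pandas_udf` machinery is not involved, and all names here are illustrative):

```python
import pandas as pd

# Row-at-a-time: the function is invoked once per row, so Python call
# overhead is paid for every single row.
def add_row(a, b):
    return a + b

# Vectorized: the function receives whole pandas Series and returns a Series,
# so the arithmetic runs over the entire batch in native code.
def add_vectorized(a: pd.Series, b: pd.Series) -> pd.Series:
    return a + b

df = pd.DataFrame({"a": range(1000), "b": range(1000)})

row_result = [add_row(a, b) for a, b in zip(df["a"], df["b"])]
vec_result = add_vectorized(df["a"], df["b"])

# Both produce the same values; only the per-element overhead differs.
assert row_result == list(vec_result)
```

The same contract appears in the proposal: the vectorized function consumes and produces columnar batches of equal length, rather than scalar values.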
[jira] [Created] (SPARK-22164) support histogram in estimating the cardinality of aggregate (or group-by) operator
Ron Hu created SPARK-22164: -- Summary: support histogram in estimating the cardinality of aggregate (or group-by) operator Key: SPARK-22164 URL: https://issues.apache.org/jira/browse/SPARK-22164 Project: Spark Issue Type: Sub-task Components: SQL Affects Versions: 2.2.0 Reporter: Ron Hu Histograms are effective in dealing with skewed distributions. After we generate histogram information for column statistics, we need to adjust aggregate (or group-by) cardinality estimation based on equi-height histogram information.
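The idea can be sketched as follows (a hypothetical illustration, not Spark's implementation; the bucket layout and the `ndv` field are assumptions): with an equi-height histogram, a group-by cardinality estimate for a column is bounded by the sum of per-bucket distinct-value counts, clipped by the row count.

```python
# Hypothetical equi-height histogram: each bucket covers roughly the same
# number of rows and records the number of distinct values (ndv) in its range.
buckets = [
    {"lo": 0, "hi": 10, "ndv": 8},
    {"lo": 10, "hi": 20, "ndv": 2},   # skewed region: few distinct values
    {"lo": 20, "hi": 30, "ndv": 10},
]

def estimate_groupby_cardinality(buckets, row_count):
    """Estimate the output cardinality of GROUP BY on this column:
    at most the total number of distinct values across buckets,
    and never more than the number of input rows."""
    total_ndv = sum(b["ndv"] for b in buckets)
    return min(total_ndv, row_count)

print(estimate_groupby_cardinality(buckets, row_count=1000))  # 20
print(estimate_groupby_cardinality(buckets, row_count=15))    # 15
```

Because the histogram records skew explicitly (the middle bucket contributes only 2 distinct values), this estimate is tighter than one derived from a uniform-distribution assumption.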
[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16185166#comment-16185166 ] Reynold Xin commented on SPARK-21190: - OK, it would be great to have a better error message: e.g. remove "currently", and tell users that, to work around the limitation, they can create a 1-arg UDF and ignore the arg. > SPIP: Vectorized UDFs in Python > --- > > Key: SPARK-21190 > URL: https://issues.apache.org/jira/browse/SPARK-21190 > Project: Spark > Issue Type: New Feature > Components: PySpark, SQL >Affects Versions: 2.2.0 >Reporter: Reynold Xin >Assignee: Bryan Cutler > Labels: SPIP > Fix For: 2.3.0 > > Attachments: SPIPVectorizedUDFsforPython (1).pdf
[jira] [Comment Edited] (SPARK-21190) SPIP: Vectorized UDFs in Python
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16185164#comment-16185164 ] Hyukjin Kwon edited comment on SPARK-21190 at 9/29/17 12:33 AM: [~rxin], I suggested to disallow it [here|https://github.com/apache/spark/pull/18659#discussion_r139856165] and I think reviewers and committers agreed with it for the similar reasons. So, it was separately fixed in [this commit|https://github.com/apache/spark/commit/d8e825e3bc5fdb8ba00eba431512fa7f771417f1]. Therefore, the cases below are not allowed: {code} >>> from pyspark.sql.functions import pandas_udf >>> from pyspark.sql.types import * >>> @pandas_udf(returnType=LongType()) ... def add_one(): ... return 1 ... NotImplementedError: 0-parameter pandas_udfs are not currently supported >>> pandas_udf(lambda: 1, "long") ... NotImplementedError: 0-parameter pandas_udfs are not currently supported >>> pandas_udf(lambda: 1, LongType()) ... NotImplementedError: 0-parameter pandas_udfs are not currently supported {code} was (Author: hyukjin.kwon): [~rxin], I suggested to disallow it [here|https://github.com/apache/spark/pull/18659#discussion_r139856165] and I think reviewers and committers agreed with it for the similar reasons. So, it was separately fixed in [this commit|https://github.com/apache/spark/commit/d8e825e3bc5fdb8ba00eba431512fa7f771417f1]. Therefore, the cases below are not allowed: {code} >>> from pyspark.sql.functions import pandas_udf >>> from pyspark.sql.types import * >>> @pandas_udf(returnType=LongType()) ... def add_one(): ... return 1 ... 
Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/.../spark/python/pyspark/sql/functions.py", line 2133, in _udf raise NotImplementedError("0-parameter pandas_udfs are not currently supported") NotImplementedError: 0-parameter pandas_udfs are not currently supported >>> pandas_udf(lambda: 1, "long") Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/.../spark/python/pyspark/sql/functions.py", line 2209, in pandas_udf wrapped_udf = _create_udf(f, returnType=returnType, vectorized=True) File "/.../spark/python/pyspark/sql/functions.py", line 2144, in _create_udf return _udf(f=f, returnType=returnType, vectorized=vectorized) File "/.../spark/python/pyspark/sql/functions.py", line 2133, in _udf raise NotImplementedError("0-parameter pandas_udfs are not currently supported") NotImplementedError: 0-parameter pandas_udfs are not currently supported >>> pandas_udf(lambda: 1, LongType()) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/.../spark/python/pyspark/sql/functions.py", line 2209, in pandas_udf wrapped_udf = _create_udf(f, returnType=returnType, vectorized=True) File "/.../spark/python/pyspark/sql/functions.py", line 2144, in _create_udf return _udf(f=f, returnType=returnType, vectorized=vectorized) File "/.../spark/python/pyspark/sql/functions.py", line 2133, in _udf raise NotImplementedError("0-parameter pandas_udfs are not currently supported") NotImplementedError: 0-parameter pandas_udfs are not currently supported {code} > SPIP: Vectorized UDFs in Python > --- > > Key: SPARK-21190 > URL: https://issues.apache.org/jira/browse/SPARK-21190 > Project: Spark > Issue Type: New Feature > Components: PySpark, SQL >Affects Versions: 2.2.0 >Reporter: Reynold Xin >Assignee: Bryan Cutler > Labels: SPIP > Fix For: 2.3.0 > > Attachments: SPIPVectorizedUDFsforPython (1).pdf > > > *Background and Motivation* > Python is one of the most popular programming languages among Spark users. 
[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16185164#comment-16185164 ] Hyukjin Kwon commented on SPARK-21190: -- [~rxin], I suggested to disallow it [here|https://github.com/apache/spark/pull/18659#discussion_r139856165] and I think reviewers and committers agreed with it for the similar reasons. So, it was separately fixed in [this commit|https://github.com/apache/spark/commit/d8e825e3bc5fdb8ba00eba431512fa7f771417f1]. Therefore, the cases below are not allowed: {code} >>> from pyspark.sql.functions import pandas_udf >>> from pyspark.sql.types import * >>> @pandas_udf(returnType=LongType()) ... def add_one(): ... return 1 ... Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/.../spark/python/pyspark/sql/functions.py", line 2133, in _udf raise NotImplementedError("0-parameter pandas_udfs are not currently supported") NotImplementedError: 0-parameter pandas_udfs are not currently supported >>> pandas_udf(lambda: 1, "long") Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/.../spark/python/pyspark/sql/functions.py", line 2209, in pandas_udf wrapped_udf = _create_udf(f, returnType=returnType, vectorized=True) File "/.../spark/python/pyspark/sql/functions.py", line 2144, in _create_udf return _udf(f=f, returnType=returnType, vectorized=vectorized) File "/.../spark/python/pyspark/sql/functions.py", line 2133, in _udf raise NotImplementedError("0-parameter pandas_udfs are not currently supported") NotImplementedError: 0-parameter pandas_udfs are not currently supported >>> pandas_udf(lambda: 1, LongType()) Traceback (most recent call last): File "<stdin>", line 1, in <module> File "/.../spark/python/pyspark/sql/functions.py", line 2209, in pandas_udf wrapped_udf = _create_udf(f, returnType=returnType, vectorized=True) File "/.../spark/python/pyspark/sql/functions.py", line 2144, in _create_udf return _udf(f=f, returnType=returnType, vectorized=vectorized) File 
"/.../spark/python/pyspark/sql/functions.py", line 2133, in _udf raise NotImplementedError("0-parameter pandas_udfs are not currently supported") NotImplementedError: 0-parameter pandas_udfs are not currently supported {code} > SPIP: Vectorized UDFs in Python > --- > > Key: SPARK-21190 > URL: https://issues.apache.org/jira/browse/SPARK-21190 > Project: Spark > Issue Type: New Feature > Components: PySpark, SQL >Affects Versions: 2.2.0 >Reporter: Reynold Xin >Assignee: Bryan Cutler > Labels: SPIP > Fix For: 2.3.0 > > Attachments: SPIPVectorizedUDFsforPython (1).pdf > > > *Background and Motivation* > Python is one of the most popular programming languages among Spark users. > Spark currently exposes a row-at-a-time interface for defining and executing > user-defined functions (UDFs). This introduces high overhead in serialization > and deserialization, and also makes it difficult to leverage Python libraries > (e.g. numpy, Pandas) that are written in native code. > > This proposal advocates introducing new APIs to support vectorized UDFs in > Python, in which a block of data is transferred over to Python in some > columnar format for execution. > > > *Target Personas* > Data scientists, data engineers, library developers. > > *Goals* > - Support vectorized UDFs that apply on chunks of the data frame > - Low system overhead: Substantially reduce serialization and deserialization > overhead when compared with row-at-a-time interface > - UDF performance: Enable users to leverage native libraries in Python (e.g. > numpy, Pandas) for data manipulation in these UDFs > > *Non-Goals* > The following are explicitly out of scope for the current SPIP, and should be > done in future SPIPs. Nonetheless, it would be good to consider these future > use cases during API design, so we can achieve some consistency when rolling > out new APIs. > > - Define block oriented UDFs in other languages (that are not Python). 
> - Define aggregate UDFs > - Tight integration with machine learning frameworks > > *Proposed API Changes* > The following sketches some possibilities. I haven’t spent a lot of time > thinking about the API (wrote it down in 5 mins) and I am not attached to > this design at all. The main purpose of the SPIP is to get feedback on use > cases and see how they can impact API design. > > A few things to consider are: > > 1. Python is dynamically typed, whereas DataFrames/SQL requires static, > analysis time typing. This means users would need to specify the return type > of their UDFs. > > 2. Ratio of input rows to output rows. We propose initially we require number > of output rows to be the same as the number of input rows. In the future, we > can consider relaxing thi
[jira] [Commented] (SPARK-21190) SPIP: Vectorized UDFs in Python
[ https://issues.apache.org/jira/browse/SPARK-21190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16185155#comment-16185155 ] Reynold Xin commented on SPARK-21190: - Where did we settle on 0-arg UDFs? I think we should just disallow it, since users can trivially work around it by creating a 1-arg UDF that ignores the arg. > SPIP: Vectorized UDFs in Python > --- > > Key: SPARK-21190 > URL: https://issues.apache.org/jira/browse/SPARK-21190 > Project: Spark > Issue Type: New Feature > Components: PySpark, SQL >Affects Versions: 2.2.0 >Reporter: Reynold Xin >Assignee: Bryan Cutler > Labels: SPIP > Fix For: 2.3.0 > > Attachments: SPIPVectorizedUDFsforPython (1).pdf
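The suggested workaround — a 1-arg UDF that ignores its argument — can be sketched at the pandas level (illustrative only; this is plain pandas, not the `pandas_udf` API): the function accepts a column merely to learn the batch length, then returns a constant Series of that length so input and output row counts match.

```python
import pandas as pd

# Hypothetical stand-in for a 0-parameter vectorized UDF: accept one column,
# ignore its values, and return a constant of the same length so that the
# "output rows == input rows" contract still holds.
def const_one(ignored: pd.Series) -> pd.Series:
    return pd.Series(1, index=ignored.index)

df = pd.DataFrame({"a": [10, 20, 30]})
out = const_one(df["a"])     # the values of column "a" are never read
assert list(out) == [1, 1, 1]
```

The input column serves only as a length carrier, which is why disallowing 0-arg UDFs costs users essentially nothing.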
[jira] [Commented] (SPARK-21999) ConcurrentModificationException - Spark Streaming
[ https://issues.apache.org/jira/browse/SPARK-21999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16185130#comment-16185130 ] Michael N commented on SPARK-21999: --- Thanks Shixiong Zhu for tracking this issue to DStream.mapPartitions. My code contains a list of objects and that list can be changed dynamically at run-time. That explains this issue. However, there is a broader aspect about the design of the Spark Streaming framework, which should not serialize the application's objects asynchronously; that behavior triggers this issue. Therefore, I created another ticket at https://issues.apache.org/jira/browse/SPARK-22163 to address that aspect. > ConcurrentModificationException - Spark Streaming > - > > Key: SPARK-21999 > URL: https://issues.apache.org/jira/browse/SPARK-21999 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.1.0 >Reporter: Michael N >Priority: Critical > > Hi, > I am using Spark Streaming v2.1.0 with Kafka 0.8. I am getting > ConcurrentModificationException intermittently. When it occurs, Spark does > not honor the specified value of spark.task.maxFailures. So Spark aborts the > current batch and fetches the next batch, resulting in lost data. Its > exception stack is listed below. > This instance of ConcurrentModificationException is similar to the issue at > https://issues.apache.org/jira/browse/SPARK-17463, which was about > serialization of accumulators in heartbeats. However, my Spark Streaming app > does not use accumulators. > The stack trace listed below occurred on the Spark master in the Spark Streaming > driver at the time of data loss. > From the line of code in the first stack trace, can you tell which object > Spark was trying to serialize? What is the root cause of this issue? > Because this issue results in lost data as described above, could you have > this issue fixed ASAP? > Thanks. 
> Michael N., > > Stack trace of Spark Streaming driver > ERROR JobScheduler:91: Error generating jobs for time 150522493 ms > org.apache.spark.SparkException: Task not serializable > at > org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:298) > at > org.apache.spark.util.ClosureCleaner$.org$apache$spark$util$ClosureCleaner$$clean(ClosureCleaner.scala:288) > at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:108) > at org.apache.spark.SparkContext.clean(SparkContext.scala:2094) > at > org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:793) > at > org.apache.spark.rdd.RDD$$anonfun$mapPartitions$1.apply(RDD.scala:792) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151) > at > org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112) > at org.apache.spark.rdd.RDD.withScope(RDD.scala:362) > at org.apache.spark.rdd.RDD.mapPartitions(RDD.scala:792) > at > org.apache.spark.streaming.dstream.MapPartitionedDStream$$anonfun$compute$1.apply(MapPartitionedDStream.scala:37) > at > org.apache.spark.streaming.dstream.MapPartitionedDStream$$anonfun$compute$1.apply(MapPartitionedDStream.scala:37) > at scala.Option.map(Option.scala:146) > at > org.apache.spark.streaming.dstream.MapPartitionedDStream.compute(MapPartitionedDStream.scala:37) > at > org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:341) > at > org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1$$anonfun$apply$7.apply(DStream.scala:341) > at scala.util.DynamicVariable.withValue(DynamicVariable.scala:58) > at > org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:340) > at > org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1$$anonfun$1.apply(DStream.scala:340) > at > org.apache.spark.streaming.dstream.DStream.createRDDWithLocalProperties(DStream.scala:415) > 
at > org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:335) > at > org.apache.spark.streaming.dstream.DStream$$anonfun$getOrCompute$1.apply(DStream.scala:333) > at scala.Option.orElse(Option.scala:289) > at > org.apache.spark.streaming.dstream.DStream.getOrCompute(DStream.scala:330) > at > org.apache.spark.streaming.dstream.TransformedDStream$$anonfun$6.apply(TransformedDStream.scala:42) > at > org.apache.spark.streaming.dstream.TransformedDStream$$anonfun$6.apply(TransformedDStream.scala:42) > at > scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234) > at > scala.collection.TraversableLike$$anonfun
[jira] [Created] (SPARK-22163) Design Issue of Spark Streaming that Causes Random Run-time Exception
Michael N created SPARK-22163: - Summary: Design Issue of Spark Streaming that Causes Random Run-time Exception Key: SPARK-22163 URL: https://issues.apache.org/jira/browse/SPARK-22163 Project: Spark Issue Type: Bug Components: DStreams, Structured Streaming Affects Versions: 2.2.0 Environment: Spark Streaming Kafka Linux Reporter: Michael N Priority: Critical The application objects can contain Lists and can be modified dynamically as well. However, the Spark Streaming framework asynchronously serializes the application's objects as the application runs. Therefore, it causes random run-time exceptions on a List when the Spark Streaming framework happens to serialize the application's objects while the application modifies a List in its own object. In fact, there are multiple bugs reported about Caused by: java.util.ConcurrentModificationException at java.util.ArrayList.writeObject that are permutations of the same root cause. The design issue of the Spark Streaming framework is that it does this serialization asynchronously. Instead, it should either 1. do this serialization synchronously. This is preferred, to eliminate the issue completely. Or 2. allow it to be configured per application whether to do this serialization synchronously or asynchronously, depending on the nature of each application. Also, the Spark documentation should describe the conditions that trigger Spark to do this type of serialization asynchronously, so applications can work around them until a fix is provided.
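Independently of how the framework resolves this, a common application-side mitigation (offered here as a sketch, not something prescribed by the ticket) is to avoid sharing a live mutable list with code that may serialize it concurrently, by handing over a defensive snapshot instead:

```python
import copy
import pickle

class State:
    """Hypothetical application state whose list is mutated at run time."""
    def __init__(self):
        self.items = [1, 2, 3]

    def snapshot(self):
        # Deep-copied snapshot: serializing this can never race with
        # later mutations of self.items, because no one else holds it.
        return copy.deepcopy(self.items)

state = State()
frozen = state.snapshot()
state.items.append(4)           # later mutation does not affect the snapshot

payload = pickle.dumps(frozen)  # safe: 'frozen' is private to this call
assert pickle.loads(payload) == [1, 2, 3]
```

The same snapshot-before-handing-off idea applies to any framework that may serialize application objects on a background thread.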
[jira] [Assigned] (SPARK-22162) Executors and the driver use inconsistent Job IDs during the new RDD commit protocol
[ https://issues.apache.org/jira/browse/SPARK-22162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22162: Assignee: Apache Spark
> Executors and the driver use inconsistent Job IDs during the new RDD commit protocol
>
> Key: SPARK-22162
> URL: https://issues.apache.org/jira/browse/SPARK-22162
> Project: Spark
> Issue Type: Bug
> Components: Spark Core
> Affects Versions: 2.2.0, 2.3.0
> Reporter: Reza Safi
> Assignee: Apache Spark
>
> After the SPARK-18191 commit in pull request 15769, with the new commit protocol it is possible that the driver and executors use different job IDs during an RDD commit.
> In the old code, the variable stageId is part of the closure used to define the task, as you can see here:
> [https://github.com/apache/spark/blob/9c8deef64efee20a0ddc9b612f90e77c80aede60/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala#L1098]
> As a result, a TaskAttemptId is constructed on the executors using the same "stageId" as the driver, since it is a value serialized in the driver. The value of stageId is actually the rdd.id, which is assigned here:
> [https://github.com/apache/spark/blob/9c8deef64efee20a0ddc9b612f90e77c80aede60/core/src/main/scala/org/apache/spark/rdd/PairRDDFunctions.scala#L1084]
> However, after the change in pull request 15769, the value is no longer part of the task closure that the driver serializes. Instead, it is pulled from the taskContext, as you can see here:
> [https://github.com/apache/spark/pull/15769/files#diff-dff185cb90c666bce445e3212a21d765R103]
> and that value is then used to construct the TaskAttemptId on the executors:
> [https://github.com/apache/spark/pull/15769/files#diff-dff185cb90c666bce445e3212a21d765R134]
> taskContext has a stageId value that is set in DAGScheduler. So after the change, unlike the old code where an rdd.id was used, an actual stage.id is used, which can differ between the executors and the driver since it is no longer serialized.
> In summary, the old code consistently used rddId and just incorrectly named it "stageId". The new code uses a mix of rddId and stageId. There should be a consistent ID between the executors and the driver.
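The crux of the report is the difference between a value captured in a serialized task closure (identical everywhere, because it travels with the task) and a value read from executor-local context at run time (free to diverge). A minimal sketch of that distinction using plain Java serialization; ClosureCaptureDemo, contextStageId, and driverRddId are illustrative names, not Spark code:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.function.Supplier;

class ClosureCaptureDemo {
    // Simulates executor-local TaskContext state, which need not match the driver.
    static int contextStageId = 7;

    interface SerSupplier extends Supplier<Integer>, Serializable {}

    // Serialize and deserialize a closure, as shipping a task to an executor would.
    static SerSupplier roundTrip(SerSupplier s) throws Exception {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(s);
        oos.flush();
        ObjectInputStream ois =
            new ObjectInputStream(new ByteArrayInputStream(bos.toByteArray()));
        return (SerSupplier) ois.readObject();
    }

    static int[] demo() throws Exception {
        int driverRddId = 42;                           // old behavior: captured in the closure
        SerSupplier captured = () -> driverRddId;       // value travels with the task
        SerSupplier fromContext = () -> contextStageId; // new behavior: read at the "executor"

        SerSupplier onExecutor1 = roundTrip(captured);
        SerSupplier onExecutor2 = roundTrip(fromContext);

        contextStageId = 99; // the executor's local context diverges from the driver's
        return new int[] { onExecutor1.get(), onExecutor2.get() };
    }

    public static void main(String[] args) throws Exception {
        int[] r = demo();
        System.out.println(r[0]); // 42: consistent with the driver
        System.out.println(r[1]); // 99: diverged from the driver
    }
}
```

The captured value survives the round trip unchanged, while the context-derived value reflects whatever the local side holds, which is exactly why the commit protocol's IDs stopped matching.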
[jira] [Assigned] (SPARK-22162) Executors and the driver use inconsistent Job IDs during the new RDD commit protocol
[ https://issues.apache.org/jira/browse/SPARK-22162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22162: Assignee: (was: Apache Spark)
[jira] [Commented] (SPARK-22162) Executors and the driver use inconsistent Job IDs during the new RDD commit protocol
[ https://issues.apache.org/jira/browse/SPARK-22162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16185096#comment-16185096 ] Apache Spark commented on SPARK-22162: -- User 'rezasafi' has created a pull request for this issue: https://github.com/apache/spark/pull/19388
[jira] [Assigned] (SPARK-22160) Allow changing sample points per partition in range shuffle exchange
[ https://issues.apache.org/jira/browse/SPARK-22160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22160: Assignee: Reynold Xin (was: Apache Spark)
> Allow changing sample points per partition in range shuffle exchange
>
> Key: SPARK-22160
> URL: https://issues.apache.org/jira/browse/SPARK-22160
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 2.2.0
> Reporter: Reynold Xin
> Assignee: Reynold Xin
>
> Spark's RangePartitioner hard-codes the number of sampling points per partition at 20, which is sometimes too low. This ticket makes it configurable, via spark.sql.execution.rangeExchange.sampleSizePerPartition, and raises the default in Spark SQL to 100.
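To see why the per-partition sample count matters: a range partitioner sorts the sampled keys and picks evenly spaced split points from them, so more samples means bounds that track the true key distribution more closely. A minimal sketch of that bound computation; computeBounds is an illustrative helper, not Spark's actual RangePartitioner code:

```java
import java.util.Arrays;

class RangeBoundsSketch {
    // Pick (numPartitions - 1) split points from the sorted samples so each
    // output partition receives roughly the same number of sampled keys.
    static int[] computeBounds(int[] samples, int numPartitions) {
        int[] sorted = samples.clone();
        Arrays.sort(sorted);
        int[] bounds = new int[numPartitions - 1];
        for (int i = 1; i < numPartitions; i++) {
            bounds[i - 1] = sorted[i * sorted.length / numPartitions];
        }
        return bounds;
    }

    public static void main(String[] args) {
        // With only a handful of samples per partition, skewed input yields
        // skewed bounds; raising the sample size (the 20 -> 100 change)
        // tightens the estimate.
        int[] samples = {1, 3, 5, 7, 9, 11, 13, 15};
        System.out.println(Arrays.toString(computeBounds(samples, 4))); // [5, 9, 13]
    }
}
```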
[jira] [Assigned] (SPARK-22160) Allow changing sample points per partition in range shuffle exchange
[ https://issues.apache.org/jira/browse/SPARK-22160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22160: Assignee: Apache Spark (was: Reynold Xin)
[jira] [Updated] (SPARK-22162) Executors and the driver should use inconsistent Job IDs during the new RDD commit protocol
[ https://issues.apache.org/jira/browse/SPARK-22162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reza Safi updated SPARK-22162: -- Summary: Executors and the driver should use inconsistent Job IDs during the new RDD commit protocol (was: Executors and the driver use inconsistent Job IDs during the new RDD commit protocol)
[jira] [Updated] (SPARK-22162) Executors and the driver use inconsistent Job IDs during the new RDD commit protocol
[ https://issues.apache.org/jira/browse/SPARK-22162?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reza Safi updated SPARK-22162: -- Summary: Executors and the driver use inconsistent Job IDs during the new RDD commit protocol (was: Executors and the driver should use inconsistent Job IDs during the new RDD commit protocol)
[jira] [Commented] (SPARK-22160) Allow changing sample points per partition in range shuffle exchange
[ https://issues.apache.org/jira/browse/SPARK-22160?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16185051#comment-16185051 ] Apache Spark commented on SPARK-22160: -- User 'rxin' has created a pull request for this issue: https://github.com/apache/spark/pull/19387
[jira] [Created] (SPARK-22162) Executors and the driver use inconsistent Job IDs during the new RDD commit protocol
Reza Safi created SPARK-22162: - Summary: Executors and the driver use inconsistent Job IDs during the new RDD commit protocol Key: SPARK-22162 URL: https://issues.apache.org/jira/browse/SPARK-22162 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 2.2.0, 2.3.0 Reporter: Reza Safi
[jira] [Commented] (SPARK-22162) Executors and the driver use inconsistent Job IDs during the new RDD commit protocol
[ https://issues.apache.org/jira/browse/SPARK-22162?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16185036#comment-16185036 ] Reza Safi commented on SPARK-22162: --- I will send a pull request shortly for this issue.
[jira] [Commented] (SPARK-18660) Parquet complains "Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImp
[ https://issues.apache.org/jira/browse/SPARK-18660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16185024#comment-16185024 ] Randy Tidd commented on SPARK-18660: I am experiencing this problem now. However, it occurs at a time when Spark is writing the wrong number of rows to the Parquet files: my data set has 6 rows, and it is writing 532. So I am not sure it is a frivolous error. It is logged as an error in ParquetRecordReader but eventually logged as a warning in Spark.
> Parquet complains "Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl"
>
> Key: SPARK-18660
> URL: https://issues.apache.org/jira/browse/SPARK-18660
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Reporter: Yin Huai
>
> The Parquet record reader always complains "Can not initialize counter due to context is not a instance of TaskInputOutputContext, but is org.apache.hadoop.mapreduce.task.TaskAttemptContextImpl". It looks like we always create a TaskAttemptContextImpl (https://github.com/apache/spark/blob/2f7461f31331cfc37f6cfa3586b7bbefb3af5547/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/parquet/ParquetFileFormat.scala#L368), but Parquet wants to use TaskInputOutputContext, which is a subclass of TaskAttemptContextImpl.
[jira] [Commented] (SPARK-18935) Use Mesos "Dynamic Reservation" resource for Spark
[ https://issues.apache.org/jira/browse/SPARK-18935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184979#comment-16184979 ] Stavros Kontopoulos commented on SPARK-18935: - Using a principal does not make any difference:
{noformat}
I0928 22:39:34.785213 11269 master.cpp:6532] Sending status update TASK_ERROR for task 0 of framework f20de49b-dee3-45dd-a3c1-73418b7de891- 'Total resources cpus(spark-prive)(allocated: spark-prive):8; mem(spark-prive)(allocated: spark-prive):1408 required by task and its executor is more than available ports(spark-prive, test)(allocated: spark-prive):[31000-32000]; disk(spark-prive, test)(allocated: spark-prive):1000; cpus(spark-prive, test)(allocated: spark-prive):8; mem(spark-prive, test)(allocated: spark-prive):10024; mem(*)(allocated: spark-prive):4590; disk(*)(allocated: spark-prive):103216'
I0928 22:39:34.785848 11270 hierarchical.cpp:850] Updated allocation of framework f20de49b-dee3-45dd-a3c1-73418b7de891- on agent 433038b9-80aa-43ef-b6eb-0075f5028d37-S0 from ports(spark-prive, test)(allocated: spark-prive):[31000-32000]; disk(spark-prive, test)(allocated: spark-prive):1000; cpus(spark-prive, test)(allocated: spark-prive):8; mem(spark-prive, test)(allocated: spark-prive):10024; mem(*)(allocated: spark-prive):4590; disk(*)(allocated: spark-prive):103216 to ports(spark-prive, test)(allocated: spark-prive):[31000-32000]; disk(spark-prive, test)(allocated: spark-prive):1000; cpus(spark-prive, test)(allocated: spark-prive):8; mem(spark-prive, test)(allocated: spark-prive):10024; mem(*)(allocated: spark-prive):4590; disk(*)(allocated: spark-prive):103216
./spark-submit --verbose --class org.apache.spark.examples.SparkPi --master mesos://universe:5050 --conf spark.mesos.role=spark-prive --conf spark.mesos.principal=test --conf spark.mesos.secret=test ../examples/jars/spark-examples_2.11-2.3.0-SNAPSHOT.jar 10
{noformat}
The problem was on the scheduler side: we don't propagate the reservationInfo to the used resources. We receive resources of the form:
{noformat}
name: "disk" type: SCALAR scalar { value: 1000.0 } role: "spark-prive" reservation { principal: "test" } allocation_info { role: "spark-prive" }
{noformat}
and when we launch the task we pass resources of the form:
{noformat}
type: SCALAR scalar { value: 8.0 } role: "spark-prive" , name: "mem" type: SCALAR scalar { value: 1408.0 } role: "spark-prive"
{noformat}
I made a fix and verified that it works with and without an explicit principal. The method here:
https://github.com/apache/spark/blob/d74dee1336e7152cc0fb7d2b3bf1a44f4f452025/resource-managers/mesos/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerUtils.scala#L178
just needs to preserve the reservation and allocation info. I will have a PR shortly.
> Use Mesos "Dynamic Reservation" resource for Spark
>
> Key: SPARK-18935
> URL: https://issues.apache.org/jira/browse/SPARK-18935
> Project: Spark
> Issue Type: Bug
> Affects Versions: 2.0.0, 2.0.1, 2.0.2
> Reporter: jackyoh
>
> I'm running Spark on Apache Mesos. Please follow these steps to reproduce the issue:
> 1. First, run the Mesos resource reserve:
> curl -i -d slaveId=c24d1cfb-79f3-4b07-9f8b-c7b19543a333-S0 -d resources='[{"name":"cpus","type":"SCALAR","scalar":{"value":20},"role":"spark","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar":{"value":4096},"role":"spark","reservation":{"principal":""}}]' -X POST http://192.168.1.118:5050/master/reserve
> 2. Then run the spark-submit command:
> ./spark-submit --class org.apache.spark.examples.SparkPi --master mesos://192.168.1.118:5050 --conf spark.mesos.role=spark ../examples/jars/spark-examples_2.11-2.0.2.jar 1
> And the console will keep logging the same warning message shown below:
> 16/12/19 22:33:28 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources
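The fix described in the comment (preserving reservation and allocation info when carving "used" resources out of an offer) can be sketched without Mesos protobufs. The Resource POJO, field names, and partition helper below are all illustrative assumptions, not Spark's MesosSchedulerUtils code; the point is only that the metadata must be copied onto both halves of the split:

```java
class Resource {
    final String name;
    final double value;
    final String role;
    final String reservationPrincipal; // null if the resource is unreserved

    Resource(String name, double value, String role, String principal) {
        this.name = name;
        this.value = value;
        this.role = role;
        this.reservationPrincipal = principal;
    }
}

class PartitionDemo {
    // Split `amount` off a resource, preserving role and reservation info on
    // both the used and remaining parts. The reported bug was returning the
    // used part without its reservation info, so the Mesos master no longer
    // recognized it as coming from the dynamically reserved pool.
    static Resource[] partition(Resource r, double amount) {
        Resource used = new Resource(r.name, amount, r.role, r.reservationPrincipal);
        Resource remaining =
            new Resource(r.name, r.value - amount, r.role, r.reservationPrincipal);
        return new Resource[] { used, remaining };
    }

    public static void main(String[] args) {
        Resource offer = new Resource("cpus", 8.0, "spark-prive", "test");
        Resource[] parts = PartitionDemo.partition(offer, 2.0);
        System.out.println(parts[0].reservationPrincipal); // "test": preserved on the used part
        System.out.println(parts[1].value);                // 6.0 cpus remain in the offer
    }
}
```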
[jira] [Assigned] (SPARK-22161) Add Impala-modified TPC-DS queries
[ https://issues.apache.org/jira/browse/SPARK-22161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22161: Assignee: Apache Spark (was: Xiao Li)
> Add Impala-modified TPC-DS queries
>
> Key: SPARK-22161
> URL: https://issues.apache.org/jira/browse/SPARK-22161
> Project: Spark
> Issue Type: Test
> Components: SQL
> Affects Versions: 2.3.0
> Reporter: Xiao Li
> Assignee: Apache Spark
>
> Added IMPALA-modified TPCDS queries to TPC-DS query suites.
> - Ref: https://github.com/cloudera/impala-tpcds-kit/tree/master/queries
[jira] [Assigned] (SPARK-22161) Add Impala-modified TPC-DS queries
[ https://issues.apache.org/jira/browse/SPARK-22161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22161: Assignee: Xiao Li (was: Apache Spark)
[jira] [Commented] (SPARK-22161) Add Impala-modified TPC-DS queries
[ https://issues.apache.org/jira/browse/SPARK-22161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184987#comment-16184987 ] Apache Spark commented on SPARK-22161: -- User 'gatorsmile' has created a pull request for this issue: https://github.com/apache/spark/pull/19386
[jira] [Created] (SPARK-22161) Add Impala-modified TPC-DS queries
Xiao Li created SPARK-22161: --- Summary: Add Impala-modified TPC-DS queries Key: SPARK-22161 URL: https://issues.apache.org/jira/browse/SPARK-22161 Project: Spark Issue Type: Test Components: SQL Affects Versions: 2.3.0 Reporter: Xiao Li Assignee: Xiao Li
[jira] [Assigned] (SPARK-11034) Launcher: add support for monitoring Mesos apps
[ https://issues.apache.org/jira/browse/SPARK-11034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11034: Assignee: (was: Apache Spark)
> Launcher: add support for monitoring Mesos apps
>
> Key: SPARK-11034
> URL: https://issues.apache.org/jira/browse/SPARK-11034
> Project: Spark
> Issue Type: Sub-task
> Components: Spark Core
> Affects Versions: 1.6.0
> Reporter: Marcelo Vanzin
>
> The code to monitor apps launched using the launcher library was added in SPARK-8673, but the backend does not support monitoring apps launched through Mesos yet.
[jira] [Commented] (SPARK-11034) Launcher: add support for monitoring Mesos apps
[ https://issues.apache.org/jira/browse/SPARK-11034?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184972#comment-16184972 ] Apache Spark commented on SPARK-11034: -- User 'devaraj-kavali' has created a pull request for this issue: https://github.com/apache/spark/pull/19385
[jira] [Assigned] (SPARK-11034) Launcher: add support for monitoring Mesos apps
[ https://issues.apache.org/jira/browse/SPARK-11034?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-11034: Assignee: Apache Spark > Launcher: add support for monitoring Mesos apps > --- > > Key: SPARK-11034 > URL: https://issues.apache.org/jira/browse/SPARK-11034 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 1.6.0 >Reporter: Marcelo Vanzin >Assignee: Apache Spark > > The code to monitor apps launched using the launcher library was added in > SPARK-8673, but the backend does not support monitoring apps launched through > Mesos yet.
[jira] [Created] (SPARK-22160) Allow changing sample points per partition in range shuffle exchange
Reynold Xin created SPARK-22160: --- Summary: Allow changing sample points per partition in range shuffle exchange Key: SPARK-22160 URL: https://issues.apache.org/jira/browse/SPARK-22160 Project: Spark Issue Type: New Feature Components: SQL Affects Versions: 2.2.0 Reporter: Reynold Xin Assignee: Reynold Xin Spark's RangePartitioner hard-codes the number of sampling points per partition to 20, which is sometimes too low. This ticket makes it configurable via spark.sql.execution.rangeExchange.sampleSizePerPartition and raises the default in Spark SQL to 100.
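For context, a range partitioner of this kind works by drawing a fixed number of sample keys from each input partition and choosing split points at even quantiles of the pooled sample; more sample points per partition give boundaries closer to the true quantiles. A minimal illustrative sketch in Python (not Spark's actual implementation; all names here are made up):

```python
import random

def range_boundaries(partitions, num_output_parts, sample_size_per_partition):
    """Pool a fixed-size sample from every input partition, then pick
    num_output_parts - 1 split points at even quantiles of the sample."""
    pooled = []
    for part in partitions:
        k = min(sample_size_per_partition, len(part))
        pooled.extend(random.sample(part, k))
    pooled.sort()
    step = len(pooled) / num_output_parts
    return [pooled[int(step * i)] for i in range(1, num_output_parts)]
```

Raising sample_size_per_partition makes the pooled sample larger, so the chosen boundaries track the real key distribution more closely, at the cost of sampling more data.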
[jira] [Assigned] (SPARK-22159) spark.sql.execution.arrow.enable and spark.sql.codegen.aggregate.map.twolevel.enable -> enabled
[ https://issues.apache.org/jira/browse/SPARK-22159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22159: Assignee: Apache Spark (was: Reynold Xin) > spark.sql.execution.arrow.enable and > spark.sql.codegen.aggregate.map.twolevel.enable -> enabled > --- > > Key: SPARK-22159 > URL: https://issues.apache.org/jira/browse/SPARK-22159 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.2.0 >Reporter: Reynold Xin >Assignee: Apache Spark > > We should make the config names consistent. They are supposed to end with > ".enabled", rather than "enable".
[jira] [Assigned] (SPARK-22159) spark.sql.execution.arrow.enable and spark.sql.codegen.aggregate.map.twolevel.enable -> enabled
[ https://issues.apache.org/jira/browse/SPARK-22159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22159: Assignee: Reynold Xin (was: Apache Spark) > spark.sql.execution.arrow.enable and > spark.sql.codegen.aggregate.map.twolevel.enable -> enabled > --- > > Key: SPARK-22159 > URL: https://issues.apache.org/jira/browse/SPARK-22159 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.2.0 >Reporter: Reynold Xin >Assignee: Reynold Xin > > We should make the config names consistent. They are supposed to end with > ".enabled", rather than "enable".
[jira] [Commented] (SPARK-22159) spark.sql.execution.arrow.enable and spark.sql.codegen.aggregate.map.twolevel.enable -> enabled
[ https://issues.apache.org/jira/browse/SPARK-22159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184634#comment-16184634 ] Apache Spark commented on SPARK-22159: -- User 'rxin' has created a pull request for this issue: https://github.com/apache/spark/pull/19384 > spark.sql.execution.arrow.enable and > spark.sql.codegen.aggregate.map.twolevel.enable -> enabled > --- > > Key: SPARK-22159 > URL: https://issues.apache.org/jira/browse/SPARK-22159 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.2.0 >Reporter: Reynold Xin >Assignee: Reynold Xin > > We should make the config names consistent. They are supposed to end with > ".enabled", rather than "enable".
[jira] [Created] (SPARK-22159) spark.sql.execution.arrow.enable and spark.sql.codegen.aggregate.map.twolevel.enable -> enabled
Reynold Xin created SPARK-22159: --- Summary: spark.sql.execution.arrow.enable and spark.sql.codegen.aggregate.map.twolevel.enable -> enabled Key: SPARK-22159 URL: https://issues.apache.org/jira/browse/SPARK-22159 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.2.0 Reporter: Reynold Xin Assignee: Reynold Xin We should make the config names consistent. They are supposed to end with ".enabled", rather than "enable".
[jira] [Updated] (SPARK-11412) Support merge schema for ORC
[ https://issues.apache.org/jira/browse/SPARK-11412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Dongjoon Hyun updated SPARK-11412: -- Affects Version/s: 1.6.3 2.0.0 2.2.0 > Support merge schema for ORC > > > Key: SPARK-11412 > URL: https://issues.apache.org/jira/browse/SPARK-11412 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 1.6.3, 2.0.0, 2.1.1, 2.2.0 >Reporter: Dave > > When I tried to load partitioned ORC files with a slight difference in a > nested column, say > column > -- request: struct (nullable = true) > ||-- datetime: string (nullable = true) > ||-- host: string (nullable = true) > ||-- ip: string (nullable = true) > ||-- referer: string (nullable = true) > ||-- request_uri: string (nullable = true) > ||-- uri: string (nullable = true) > ||-- useragent: string (nullable = true) > and then there's a page_url_lists attribute in the later partitions. > I tried to use > val s = sqlContext.read.format("orc").option("mergeSchema", > "true").load("/data/warehouse/") to load the data. > But the schema doesn't show request.page_url_lists. > I am wondering if schema merge doesn't work for ORC?
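Schema merging for a case like this amounts to recursively unioning struct fields across partitions: fields present in any partition survive, and nested structs are merged field by field. A hypothetical sketch of that union, with nested dicts standing in for struct schemas (this is not Spark's code):

```python
def merge_schema(a, b):
    """Recursively union two struct schemas, each represented as a dict
    mapping field name -> type (where a nested dict denotes a struct)."""
    merged = dict(a)
    for name, typ in b.items():
        if name in merged and isinstance(merged[name], dict) and isinstance(typ, dict):
            # Both sides have a struct here: merge their fields recursively.
            merged[name] = merge_schema(merged[name], typ)
        else:
            # New field from b is added; an existing non-struct field is kept as-is.
            merged.setdefault(name, typ)
    return merged
```

Applied to the report above, a later partition's request.page_url_lists field would be added to the merged request struct while the shared fields are kept once.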
[jira] [Comment Edited] (SPARK-18935) Use Mesos "Dynamic Reservation" resource for Spark
[ https://issues.apache.org/jira/browse/SPARK-18935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184588#comment-16184588 ] Stavros Kontopoulos edited comment on SPARK-18935 at 9/28/17 6:20 PM: -- I verified the example; the error is the same, and the reason is the same as in the cluster-mode case: {noformat} 17/09/28 21:07:34 INFO MesosCoarseGrainedSchedulerBackend: Mesos task 1 is now TASK_ERROR 17/09/28 21:07:34 INFO MesosCoarseGrainedSchedulerBackend: Blacklisting Mesos slave 433038b9-80aa-43ef-b6eb-0075f5028d37-S0 due to too many failures; is Spark installed on it? 17/09/28 21:07:34 DEBUG CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to remove executor 1 with reason Executor finished with state LOST 17/09/28 21:07:34 INFO BlockManagerMaster: Removal of executor 1 requested 17/09/28 21:07:34 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to remove non-existent executor 1 17/09/28 21:07:34 INFO BlockManagerMasterEndpoint: Trying to remove executor 1 from BlockManagerMaster. {noformat} The task is failing and the agent is blacklisted, which leads to starvation. 
In the scheduler we check the task failures for a slave in order to avoid future launches there: {code:java} slaves.get(slaveId).map(_.taskFailures).getOrElse(0) < MAX_SLAVE_FAILURES {code} The task is failing due to: {noformat} I0928 21:07:34.621839 5559 master.cpp:6532] Sending status update TASK_ERROR for task 0 of framework e46985fe-1392-4d39-a3d5-e7ec77810695-0004 'Total resources cpus(spark-prive)(allocated: spark-prive):8; mem(spark-prive)(allocated: spark-prive):1408 required by task and its executor is more than available ports(spark-prive, )(allocated: spark-prive):[31000-32000]; disk(spark-prive, )(allocated: spark-prive):1000; cpus(spark-prive, )(allocated: spark-prive):8; mem(spark-prive, )(allocated: spark-prive):10024; mem(*)(allocated: spark-prive):4590; disk(*)(allocated: spark-prive):103216' I0928 21:07:34.622593 5559 hierarchical.cpp:850] Updated allocation of framework e46985fe-1392-4d39-a3d5-e7ec77810695-0004 on agent 433038b9-80aa-43ef-b6eb-0075f5028d37-S0 from ports(spark-prive, )(allocated: spark-prive):[31000-32000]; disk(spark-prive, )(allocated: spark-prive):1000; cpus(spark-prive, )(allocated: spark-prive):8; mem(spark-prive, )(allocated: spark-prive):10024; mem(*)(allocated: spark-prive):4590; disk(*)(allocated: spark-prive):103216 to ports(spark-prive, )(allocated: spark-prive):[31000-32000]; disk(spark-prive, )(allocated: spark-prive):1000; cpus(spark-prive, )(allocated: spark-prive):8; mem(spark-prive, )(allocated: spark-prive):10024; mem(*)(allocated: spark-prive):4590; disk(*)(allocated: spark-prive):103216 I0928 21:07:34.647950 5559 master.cpp:4941] Processing REVIVE call for framework e46985fe-1392-4d39-a3d5-e7ec77810695-0004 (Spark Pi) at scheduler-df433215-b87c-4b9b-993c-a3253c5f11a8@127.0.1.1:34775 {noformat} So again it's the same reason as I have seen before. 
If I try to set a principal, the Spark Mesos framework will not be able to register, because even if I set a secret the driver will be aborted: {noformat} I0928 21:19:54.793844 7363 sched.cpp:232] Version: 1.3.0 I0928 21:19:54.795897 7355 sched.cpp:336] New master detected at master@127.0.1.1:5050 I0928 21:19:54.796042 7355 sched.cpp:407] Authenticating with master master@127.0.1.1:5050 I0928 21:19:54.796052 7355 sched.cpp:414] Using default CRAM-MD5 authenticatee I0928 21:19:54.796152 7361 authenticatee.cpp:97] Initializing client SASL I0928 21:19:54.814299 7361 authenticatee.cpp:121] Creating new client SASL connection I0928 21:19:54.815407 7357 authenticatee.cpp:213] Received SASL authentication mechanisms: CRAM-MD5 I0928 21:19:54.815421 7357 authenticatee.cpp:239] Attempting to authenticate with mechanism 'CRAM-MD5' I0928 21:19:54.815757 7360 authenticatee.cpp:259] Received SASL authentication step E0928 21:19:54.816179 7355 sched.cpp:507] Master master@127.0.1.1:5050 refused authentication I0928 21:19:54.816193 7355 sched.cpp:1187] Got error 'Master refused authentication' I0928 21:19:54.816200 7355 sched.cpp:2055] Asked to abort the driver 17/09/28 21:19:54 ERROR MesosCoarseGrainedSchedulerBackend: Mesos error: Master refused authentication Exception in thread "Thread-12" org.apache.spark.SparkException: Exiting due to error from cluster scheduler: Master refused authentication at org.apache.spark.scheduler.TaskSchedulerImpl.error(TaskSchedulerImpl.scala:500) at org.apache.spark.scheduler.cluster.mesos.MesosCoarseGrainedSchedulerBackend.error(MesosCoarseGrainedSchedulerBackend.scala:599) I0928 21:19:54.817343 7355 sched.cpp:2055] Asked to abort the driver I0928 21:19:54.817355 7355 sched.cpp:1233] Aborting framework {noformat}
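The guard quoted above explains the starvation: once an agent's failure count reaches the threshold, the scheduler never launches on it again, and on a single-agent cluster that means no executor can ever start. A small Python model of that logic (the threshold value here is assumed for illustration, and the names are made up):

```python
MAX_SLAVE_FAILURES = 2  # threshold chosen for illustration only

def can_launch(task_failures, slave_id):
    # Mirrors the Scala guard:
    # slaves.get(slaveId).map(_.taskFailures).getOrElse(0) < MAX_SLAVE_FAILURES
    return task_failures.get(slave_id, 0) < MAX_SLAVE_FAILURES

def record_failure(task_failures, slave_id):
    # Each TASK_ERROR/TASK_LOST bumps the per-agent failure counter.
    task_failures[slave_id] = task_failures.get(slave_id, 0) + 1
```

With one agent, two consecutive TASK_ERRORs permanently blacklist it and the job starves, which matches the "Blacklisting Mesos slave ... due to too many failures" log line.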
[jira] [Commented] (SPARK-18935) Use Mesos "Dynamic Reservation" resource for Spark
[ https://issues.apache.org/jira/browse/SPARK-18935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184588#comment-16184588 ] Stavros Kontopoulos commented on SPARK-18935: - I verified the example and error is the same yet the reason is different: {noformat} 17/09/28 21:07:34 INFO MesosCoarseGrainedSchedulerBackend: Mesos task 1 is now TASK_ERROR 17/09/28 21:07:34 INFO MesosCoarseGrainedSchedulerBackend: Blacklisting Mesos slave 433038b9-80aa-43ef-b6eb-0075f5028d37-S0 due to too many failures; is Spark installed on it? 17/09/28 21:07:34 DEBUG CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to remove executor 1 with reason Executor finished with state LOST 17/09/28 21:07:34 INFO BlockManagerMaster: Removal of executor 1 requested 17/09/28 21:07:34 INFO CoarseGrainedSchedulerBackend$DriverEndpoint: Asked to remove non-existent executor 1 17/09/28 21:07:34 INFO BlockManagerMasterEndpoint: Trying to remove executor 1 from BlockManagerMaster. {noformat} The task is failing and the agent is blacklisted. 
The task is failing due to: {noformat} I0928 21:07:34.621839 5559 master.cpp:6532] Sending status update TASK_ERROR for task 0 of framework e46985fe-1392-4d39-a3d5-e7ec77810695-0004 'Total resources cpus(spark-prive)(allocated: spark-prive):8; mem(spark-prive)(allocated: spark-prive):1408 required by task and its executor is more than available ports(spark-prive, )(allocated: spark-prive):[31000-32000]; disk(spark-prive, )(allocated: spark-prive):1000; cpus(spark-prive, )(allocated: spark-prive):8; mem(spark-prive, )(allocated: spark-prive):10024; mem(*)(allocated: spark-prive):4590; disk(*)(allocated: spark-prive):103216' I0928 21:07:34.622593 5559 hierarchical.cpp:850] Updated allocation of framework e46985fe-1392-4d39-a3d5-e7ec77810695-0004 on agent 433038b9-80aa-43ef-b6eb-0075f5028d37-S0 from ports(spark-prive, )(allocated: spark-prive):[31000-32000]; disk(spark-prive, )(allocated: spark-prive):1000; cpus(spark-prive, )(allocated: spark-prive):8; mem(spark-prive, )(allocated: spark-prive):10024; mem(*)(allocated: spark-prive):4590; disk(*)(allocated: spark-prive):103216 to ports(spark-prive, )(allocated: spark-prive):[31000-32000]; disk(spark-prive, )(allocated: spark-prive):1000; cpus(spark-prive, )(allocated: spark-prive):8; mem(spark-prive, )(allocated: spark-prive):10024; mem(*)(allocated: spark-prive):4590; disk(*)(allocated: spark-prive):103216 I0928 21:07:34.647950 5559 master.cpp:4941] Processing REVIVE call for framework e46985fe-1392-4d39-a3d5-e7ec77810695-0004 (Spark Pi) at scheduler-df433215-b87c-4b9b-993c-a3253c5f11a8@127.0.1.1:34775 {noformat} So again its the same reason as I have seen before. > Use Mesos "Dynamic Reservation" resource for Spark > -- > > Key: SPARK-18935 > URL: https://issues.apache.org/jira/browse/SPARK-18935 > Project: Spark > Issue Type: Bug >Affects Versions: 2.0.0, 2.0.1, 2.0.2 >Reporter: jackyoh > > I'm running spark on Apache Mesos > Please follow these steps to reproduce the issue: > 1. 
First, run Mesos resource reserve: > curl -i -d slaveId=c24d1cfb-79f3-4b07-9f8b-c7b19543a333-S0 -d > resources='[{"name":"cpus","type":"SCALAR","scalar":{"value":20},"role":"spark","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar":{"value":4096},"role":"spark","reservation":{"principal":""}}]' > -X POST http://192.168.1.118:5050/master/reserve > 2. Then run spark-submit command: > ./spark-submit --class org.apache.spark.examples.SparkPi --master > mesos://192.168.1.118:5050 --conf spark.mesos.role=spark > ../examples/jars/spark-examples_2.11-2.0.2.jar 1 > And the console will keep logging the same warning message as shown below: > 16/12/19 22:33:28 WARN TaskSchedulerImpl: Initial job has not accepted any > resources; check your cluster UI to ensure that workers are registered and > have sufficient resources -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
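The TASK_ERROR in the logs above comes from Mesos validating that a task's resources fit the offer. A minimal sketch of that check, using plain Scala maps rather than Mesos types (all names here are illustrative, not Mesos source):

```scala
// Toy model of Mesos's "required is more than available" task validation.
// Scalar resources (cpus, mem, disk) are modeled as a simple name -> quantity map.
case class Resources(scalars: Map[String, Double]) {
  // A request fits an offer if every requested scalar is covered.
  def contains(required: Resources): Boolean =
    required.scalars.forall { case (name, qty) =>
      scalars.getOrElse(name, 0.0) >= qty
    }
}

// Mirrors the reserved offer in the log above (role "spark-prive").
val offered  = Resources(Map("cpus" -> 8.0, "mem" -> 10024.0, "disk" -> 1000.0))
// Task + executor demand from the log: cpus 8, mem 1408.
val required = Resources(Map("cpus" -> 8.0, "mem" -> 1408.0))

// When this is false, Mesos sends TASK_ERROR with the
// "Total resources ... is more than available ..." message quoted above.
val fits = offered.contains(required)
```

By these raw numbers the request fits, which is why the reporter suspects the mismatch is in how role-reserved vs unreserved resources are compared, not in the quantities themselves.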
[jira] [Assigned] (SPARK-20643) Implement listener for saving application status data in key-value store
[ https://issues.apache.org/jira/browse/SPARK-20643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20643: Assignee: (was: Apache Spark) > Implement listener for saving application status data in key-value store > > > Key: SPARK-20643 > URL: https://issues.apache.org/jira/browse/SPARK-20643 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 2.3.0 >Reporter: Marcelo Vanzin > > See spec in parent issue (SPARK-18085) for more details. > This task tracks adding a new listener that will save application state to > the key-value store added in SPARK-20641; the listener will eventually > replace the existing listeners (such as JobProgressListener and > StatusListener), and the UI code will read data directly from the key-value > store instead of being coupled to the listener implementation.
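The idea in SPARK-20643 can be sketched in miniature: a listener persists application status into a key-value store, and the UI later reads the store instead of holding references to the listener. The types and event names below are made up for illustration; Spark's real listener bus and events are richer.

```scala
import scala.collection.mutable

// Hypothetical key-value store interface (SPARK-20641 adds the real one).
trait KVStore {
  def write(key: String, value: String): Unit
  def read(key: String): Option[String]
}

class InMemoryKVStore extends KVStore {
  private val data = mutable.Map.empty[String, String]
  def write(key: String, value: String): Unit = data(key) = value
  def read(key: String): Option[String] = data.get(key)
}

// Hypothetical events; real Spark listener events carry far more state.
case class JobStart(jobId: Int)
case class JobEnd(jobId: Int, succeeded: Boolean)

// The listener writes status into the store instead of keeping it in memory,
// so UI code can be decoupled from the listener implementation.
class AppStatusListener(store: KVStore) {
  def onJobStart(e: JobStart): Unit =
    store.write(s"job/${e.jobId}", "RUNNING")
  def onJobEnd(e: JobEnd): Unit =
    store.write(s"job/${e.jobId}", if (e.succeeded) "SUCCEEDED" else "FAILED")
}

val store = new InMemoryKVStore
val listener = new AppStatusListener(store)
listener.onJobStart(JobStart(1))
listener.onJobEnd(JobEnd(1, succeeded = true))
// The UI would now query the store, e.g. store.read("job/1")
```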
[jira] [Assigned] (SPARK-20643) Implement listener for saving application status data in key-value store
[ https://issues.apache.org/jira/browse/SPARK-20643?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-20643: Assignee: Apache Spark > Implement listener for saving application status data in key-value store > > > Key: SPARK-20643 > URL: https://issues.apache.org/jira/browse/SPARK-20643 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 2.3.0 >Reporter: Marcelo Vanzin >Assignee: Apache Spark > > See spec in parent issue (SPARK-18085) for more details. > This task tracks adding a new listener that will save application state to > the key-value store added in SPARK-20641; the listener will eventually > replace the existing listeners (such as JobProgressListener and > StatusListener), and the UI code will read data directly from the key-value > store instead of being coupled to the listener implementation.
[jira] [Commented] (SPARK-20643) Implement listener for saving application status data in key-value store
[ https://issues.apache.org/jira/browse/SPARK-20643?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184548#comment-16184548 ] Apache Spark commented on SPARK-20643: -- User 'vanzin' has created a pull request for this issue: https://github.com/apache/spark/pull/19383 > Implement listener for saving application status data in key-value store > > > Key: SPARK-20643 > URL: https://issues.apache.org/jira/browse/SPARK-20643 > Project: Spark > Issue Type: Sub-task > Components: Spark Core >Affects Versions: 2.3.0 >Reporter: Marcelo Vanzin > > See spec in parent issue (SPARK-18085) for more details. > This task tracks adding a new listener that will save application state to > the key-value store added in SPARK-20641; the listener will eventually > replace the existing listeners (such as JobProgressListener and > StatusListener), and the UI code will read data directly from the key-value > store instead of being coupled to the listener implementation.
[jira] [Assigned] (SPARK-22158) convertMetastoreOrc should not ignore table properties
[ https://issues.apache.org/jira/browse/SPARK-22158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22158: Assignee: (was: Apache Spark) > convertMetastoreOrc should not ignore table properties > -- > > Key: SPARK-22158 > URL: https://issues.apache.org/jira/browse/SPARK-22158 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.0.0, 2.1.0, 2.2.0 >Reporter: Dongjoon Hyun > > From the beginning, convertMetastoreOrc ignores table properties and uses an > empty map instead. > {code} > val options = Map[String, String]() > {code} > - SPARK-14070: > https://github.com/apache/spark/pull/11891/files#diff-ee66e11b56c21364760a5ed2b783f863R650 > - master: > https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala#L197
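The shape of the fix the ticket asks for can be sketched with plain maps rather than Spark's internal types (TableMeta and the function names here are illustrative, not Spark's actual API):

```scala
// Sketch of the convertMetastoreOrc options issue with plain Scala maps.
case class TableMeta(properties: Map[String, String])

// Buggy behavior described in the ticket: the conversion hard-codes
// `val options = Map[String, String]()`, dropping all table properties.
def optionsIgnoringTable(table: TableMeta): Map[String, String] =
  Map.empty[String, String]

// Intended behavior: carry the table's properties through, so ORC settings
// such as "orc.compress" actually reach the file format.
def optionsFromTable(table: TableMeta): Map[String, String] =
  table.properties

val table = TableMeta(Map("orc.compress" -> "ZLIB"))
```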
[jira] [Commented] (SPARK-22158) convertMetastoreOrc should not ignore table properties
[ https://issues.apache.org/jira/browse/SPARK-22158?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184521#comment-16184521 ] Apache Spark commented on SPARK-22158: -- User 'dongjoon-hyun' has created a pull request for this issue: https://github.com/apache/spark/pull/19382 > convertMetastoreOrc should not ignore table properties > -- > > Key: SPARK-22158 > URL: https://issues.apache.org/jira/browse/SPARK-22158 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.0.0, 2.1.0, 2.2.0 >Reporter: Dongjoon Hyun > > From the beginning, convertMetastoreOrc ignores table properties and uses an > empty map instead. > {code} > val options = Map[String, String]() > {code} > - SPARK-14070: > https://github.com/apache/spark/pull/11891/files#diff-ee66e11b56c21364760a5ed2b783f863R650 > - master: > https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala#L197
[jira] [Assigned] (SPARK-22158) convertMetastoreOrc should not ignore table properties
[ https://issues.apache.org/jira/browse/SPARK-22158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22158: Assignee: Apache Spark > convertMetastoreOrc should not ignore table properties > -- > > Key: SPARK-22158 > URL: https://issues.apache.org/jira/browse/SPARK-22158 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.0.0, 2.1.0, 2.2.0 >Reporter: Dongjoon Hyun >Assignee: Apache Spark > > From the beginning, convertMetastoreOrc ignores table properties and uses an > empty map instead. > {code} > val options = Map[String, String]() > {code} > - SPARK-14070: > https://github.com/apache/spark/pull/11891/files#diff-ee66e11b56c21364760a5ed2b783f863R650 > - master: > https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala#L197
[jira] [Commented] (SPARK-22152) Add Dataset flatten function
[ https://issues.apache.org/jira/browse/SPARK-22152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184511#comment-16184511 ] Wenchen Fan commented on SPARK-22152: - I don't have a strong opinion; if someone already has a PR, I can review and merge it. > Add Dataset flatten function > > > Key: SPARK-22152 > URL: https://issues.apache.org/jira/browse/SPARK-22152 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 2.2.0 >Reporter: Drew Robb >Priority: Minor > > Currently you can use an identity flatMap to flatten a Dataset, for example > to get from a Dataset[Option[T]] to a Dataset[T], but adding flatten directly > would allow for an API more similar to Scala collections.
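The workaround the ticket describes, shown here on plain Scala collections (the Dataset version is analogous; a built-in `Dataset.flatten` would be sugar for the same flatMap):

```scala
// Flattening Seq[Option[Int]] to Seq[Int], mirroring the proposed
// Dataset[Option[T]] -> Dataset[T] flatten.
val nested: Seq[Option[Int]] = Seq(Some(1), None, Some(3))

// The identity-style flatMap workaround: each Option becomes a 0- or
// 1-element sequence, and flatMap concatenates them.
val viaFlatMap: Seq[Int] = nested.flatMap(_.toSeq)

// Scala collections already provide flatten as the shorthand, which is
// the API symmetry the ticket asks Dataset to gain.
val viaFlatten: Seq[Int] = nested.flatten
```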
[jira] [Created] (SPARK-22158) convertMetastoreOrc should not ignore table properties
Dongjoon Hyun created SPARK-22158: - Summary: convertMetastoreOrc should not ignore table properties Key: SPARK-22158 URL: https://issues.apache.org/jira/browse/SPARK-22158 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.2.0, 2.1.0, 2.0.0 Reporter: Dongjoon Hyun From the beginning, convertMetastoreOrc ignores table properties and uses an empty map instead. {code} val options = Map[String, String]() {code} - SPARK-14070: https://github.com/apache/spark/pull/11891/files#diff-ee66e11b56c21364760a5ed2b783f863R650 - master: https://github.com/apache/spark/blob/master/sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala#L197
[jira] [Updated] (SPARK-22143) OffHeapColumnVector may leak memory
[ https://issues.apache.org/jira/browse/SPARK-22143?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Herman van Hovell updated SPARK-22143: -- Fix Version/s: 2.2.1 > OffHeapColumnVector may leak memory > --- > > Key: SPARK-22143 > URL: https://issues.apache.org/jira/browse/SPARK-22143 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.2.0 >Reporter: Herman van Hovell >Assignee: Herman van Hovell > Fix For: 2.2.1, 2.3.0 > > > ColumnVector does not clean up its children on close. This means that we are > sure to leak memory when we are using OffHeapColumnVectors.
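A toy model of the leak described in SPARK-22143 (these are not Spark's actual classes): a column vector for a nested type owns child vectors, and a close() that only frees its own buffer leaks whatever the children allocated off-heap.

```scala
// Illustrative stand-in for an off-heap column vector with child vectors.
class ToyColumnVector(val children: Seq[ToyColumnVector]) {
  var freed = false
  // Stands in for freeing this vector's off-heap allocation.
  private def freeOwnMemory(): Unit = freed = true

  // Buggy close, as described in the ticket: children are never freed.
  def closeShallow(): Unit = freeOwnMemory()

  // Fixed close: recurse into children before (or after) freeing self.
  def close(): Unit = {
    children.foreach(_.close())
    freeOwnMemory()
  }
}

val child  = new ToyColumnVector(Nil)
val parent = new ToyColumnVector(Seq(child))
parent.close() // frees both parent and child

val leakyChild  = new ToyColumnVector(Nil)
val leakyParent = new ToyColumnVector(Seq(leakyChild))
leakyParent.closeShallow() // leakyChild.freed stays false: the leak
```

With on-heap vectors the garbage collector eventually reclaims the children, which is why the bug only bites the off-heap variant.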
[jira] [Commented] (SPARK-22152) Add Dataset flatten function
[ https://issues.apache.org/jira/browse/SPARK-22152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184296#comment-16184296 ] Drew Robb commented on SPARK-22152: --- There is also a ticket for adding it to RDD: https://issues.apache.org/jira/browse/SPARK-18855 > Add Dataset flatten function > > > Key: SPARK-22152 > URL: https://issues.apache.org/jira/browse/SPARK-22152 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 2.2.0 >Reporter: Drew Robb >Priority: Minor > > Currently you can use an identify flatMap to flatten a Dataset, for example > to get from a Dataset[Option[T]] to a Dataset[T], but adding flatten directly > would allow for a more similar API to scala collections. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-22074) Task killed by other attempt task should not be resubmitted
[ https://issues.apache.org/jira/browse/SPARK-22074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-22074: - Component/s: Scheduler > Task killed by other attempt task should not be resubmitted > --- > > Key: SPARK-22074 > URL: https://issues.apache.org/jira/browse/SPARK-22074 > Project: Spark > Issue Type: Bug > Components: Scheduler, Spark Core >Affects Versions: 2.1.0, 2.2.0 >Reporter: Li Yuanjian > Labels: speculation > > When a task is killed by another task attempt, the task is still resubmitted when > its executor is lost. There is a certain probability that this unnecessary resubmit > causes the stage to hang forever (see the scenario description below). Although the patch > https://issues.apache.org/jira/browse/SPARK-13931 resolves the hanging problem > (thx [~GavinGavinNo1] :) ), the unnecessary resubmit should be abandoned. > Detail scenario description: > 1. A ShuffleMapStage has many tasks, some of which finish successfully. > 2. An executor is lost; this triggers resubmission of a new TaskSet that > includes all missing partitions. > 3. Before the resubmitted TaskSet completes, another executor, which only > contains the task killed by the other attempt, is lost; this triggers the > Resubmitted event, and the current stage's pendingPartitions is not empty. > 4. The resubmitted TaskSet ends; shuffleMapStage.isAvailable == true, but > pendingPartitions is not empty, so execution never steps into submitWaitingChildStages. 
> Leave the key logs of this scenario below: > {noformat} > 393332:17/09/11 13:45:24 [dag-scheduler-event-loop] INFO DAGScheduler: > Submitting 120 missing tasks from ShuffleMapStage 1046 > (MapPartitionsRDD[5321] at rdd at AFDEntry.scala:116) > 39:17/09/11 13:45:24 [dag-scheduler-event-loop] INFO > YarnClusterScheduler: Adding task set 1046.0 with 120 tasks > 408766:17/09/11 13:46:25 [dispatcher-event-loop-5] INFO TaskSetManager: > Starting task 66.0 in stage 1046.0 (TID 110761, hidden-baidu-host.baidu.com, > executor 15, partition 66, PROCESS_LOCAL, 6237 bytes) > [1] Executor 15 lost, task 66.0 and 90.0 on it > 410532:17/09/11 13:46:32 [dispatcher-event-loop-47] INFO > YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 15. > 410900:17/09/11 13:46:33 [dispatcher-event-loop-34] INFO TaskSetManager: > Starting task 66.1 in stage 1046.0 (TID 111400, hidden-baidu-host.baidu.com, > executor 70, partition 66, PROCESS_LOCAL, 6237 bytes) > [2] Task 66.0 killed by 66.1 > 411315:17/09/11 13:46:37 [task-result-getter-2] INFO TaskSetManager: Killing > attempt 0 for task 66.0 in stage 1046.0 (TID 110761) on > hidden-baidu-host.baidu.com as the attempt 1 succeeded on > hidden-baidu-host.baidu.com > 411316:17/09/11 13:46:37 [task-result-getter-2] INFO TaskSetManager: Finished > task 66.1 in stage 1046.0 (TID 111400) in 3545 ms on > hidden-baidu-host.baidu.com (executor 70) (115/120) > [3] Executor 7 lost, task 0.0 72.0 7.0 on it > 411390:17/09/11 13:46:37 [dispatcher-event-loop-24] INFO > YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 7. 
> 416014:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: > ShuffleMapStage 1046 (rdd at AFDEntry.scala:116) finished in 94.577 s > [4] ShuffleMapStage 1046.0 finished, missing partition trigger resubmitted > 1046.1 > 416019:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: > Resubmitting ShuffleMapStage 1046 (rdd at AFDEntry.scala:116) because some of > its tasks had failed: 0, 72, 79 > 416020:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: > Submitting ShuffleMapStage 1046 (MapPartitionsRDD[5321] at rdd at > AFDEntry.scala:116), which has no missing parents > 416030:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: > Submitting 3 missing tasks from ShuffleMapStage 1046 (MapPartitionsRDD[5321] > at rdd at AFDEntry.scala:116) > 416032:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO > YarnClusterScheduler: Adding task set 1046.1 with 3 tasks > 416034:17/09/11 13:46:59 [dispatcher-event-loop-21] INFO TaskSetManager: > Starting task 0.0 in stage 1046.1 (TID 112788, hidden-baidu-host.baidu.com, > executor 37, partition 0, PROCESS_LOCAL, 6237 bytes) > 416037:17/09/11 13:46:59 [dispatcher-event-loop-23] INFO TaskSetManager: > Starting task 1.0 in stage 1046.1 (TID 112789, > yq01-inf-nmg01-spark03-20160817113538.yq01.baidu.com, executor 69, partition > 72, PROCESS_LOCAL, 6237 bytes) > 416039:17/09/11 13:46:59 [dispatcher-event-loop-23] INFO TaskSetManager: > Starting task 2.0 in stage 1046.1 (TID 112790, hidden-baidu-host.baidu.com, > executor 26, partition 79, PROCESS_LOCAL, 6237 bytes) > [5] ShuffleMapStage 1046.1 still running, the attempted task killed by other > trigger the Resubmitted event > 416646:17/09/11 13:47:01 [dispatcher-event-loop-26]
[jira] [Updated] (SPARK-22074) Task killed by other attempt task should not be resubmitted
[ https://issues.apache.org/jira/browse/SPARK-22074?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Imran Rashid updated SPARK-22074: - Labels: speculation (was: ) > Task killed by other attempt task should not be resubmitted > --- > > Key: SPARK-22074 > URL: https://issues.apache.org/jira/browse/SPARK-22074 > Project: Spark > Issue Type: Bug > Components: Spark Core >Affects Versions: 2.1.0, 2.2.0 >Reporter: Li Yuanjian > Labels: speculation > > When a task is killed by another task attempt, the task is still resubmitted when > its executor is lost. There is a certain probability that this unnecessary resubmit > causes the stage to hang forever (see the scenario description below). Although the patch > https://issues.apache.org/jira/browse/SPARK-13931 resolves the hanging problem > (thx [~GavinGavinNo1] :) ), the unnecessary resubmit should be abandoned. > Detail scenario description: > 1. A ShuffleMapStage has many tasks, some of which finish successfully. > 2. An executor is lost; this triggers resubmission of a new TaskSet that > includes all missing partitions. > 3. Before the resubmitted TaskSet completes, another executor, which only > contains the task killed by the other attempt, is lost; this triggers the > Resubmitted event, and the current stage's pendingPartitions is not empty. > 4. The resubmitted TaskSet ends; shuffleMapStage.isAvailable == true, but > pendingPartitions is not empty, so execution never steps into submitWaitingChildStages. 
> Leave the key logs of this scenario below: > {noformat} > 393332:17/09/11 13:45:24 [dag-scheduler-event-loop] INFO DAGScheduler: > Submitting 120 missing tasks from ShuffleMapStage 1046 > (MapPartitionsRDD[5321] at rdd at AFDEntry.scala:116) > 39:17/09/11 13:45:24 [dag-scheduler-event-loop] INFO > YarnClusterScheduler: Adding task set 1046.0 with 120 tasks > 408766:17/09/11 13:46:25 [dispatcher-event-loop-5] INFO TaskSetManager: > Starting task 66.0 in stage 1046.0 (TID 110761, hidden-baidu-host.baidu.com, > executor 15, partition 66, PROCESS_LOCAL, 6237 bytes) > [1] Executor 15 lost, task 66.0 and 90.0 on it > 410532:17/09/11 13:46:32 [dispatcher-event-loop-47] INFO > YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 15. > 410900:17/09/11 13:46:33 [dispatcher-event-loop-34] INFO TaskSetManager: > Starting task 66.1 in stage 1046.0 (TID 111400, hidden-baidu-host.baidu.com, > executor 70, partition 66, PROCESS_LOCAL, 6237 bytes) > [2] Task 66.0 killed by 66.1 > 411315:17/09/11 13:46:37 [task-result-getter-2] INFO TaskSetManager: Killing > attempt 0 for task 66.0 in stage 1046.0 (TID 110761) on > hidden-baidu-host.baidu.com as the attempt 1 succeeded on > hidden-baidu-host.baidu.com > 411316:17/09/11 13:46:37 [task-result-getter-2] INFO TaskSetManager: Finished > task 66.1 in stage 1046.0 (TID 111400) in 3545 ms on > hidden-baidu-host.baidu.com (executor 70) (115/120) > [3] Executor 7 lost, task 0.0 72.0 7.0 on it > 411390:17/09/11 13:46:37 [dispatcher-event-loop-24] INFO > YarnSchedulerBackend$YarnDriverEndpoint: Disabling executor 7. 
> 416014:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: > ShuffleMapStage 1046 (rdd at AFDEntry.scala:116) finished in 94.577 s > [4] ShuffleMapStage 1046.0 finished, missing partition trigger resubmitted > 1046.1 > 416019:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: > Resubmitting ShuffleMapStage 1046 (rdd at AFDEntry.scala:116) because some of > its tasks had failed: 0, 72, 79 > 416020:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: > Submitting ShuffleMapStage 1046 (MapPartitionsRDD[5321] at rdd at > AFDEntry.scala:116), which has no missing parents > 416030:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO DAGScheduler: > Submitting 3 missing tasks from ShuffleMapStage 1046 (MapPartitionsRDD[5321] > at rdd at AFDEntry.scala:116) > 416032:17/09/11 13:46:59 [dag-scheduler-event-loop] INFO > YarnClusterScheduler: Adding task set 1046.1 with 3 tasks > 416034:17/09/11 13:46:59 [dispatcher-event-loop-21] INFO TaskSetManager: > Starting task 0.0 in stage 1046.1 (TID 112788, hidden-baidu-host.baidu.com, > executor 37, partition 0, PROCESS_LOCAL, 6237 bytes) > 416037:17/09/11 13:46:59 [dispatcher-event-loop-23] INFO TaskSetManager: > Starting task 1.0 in stage 1046.1 (TID 112789, > yq01-inf-nmg01-spark03-20160817113538.yq01.baidu.com, executor 69, partition > 72, PROCESS_LOCAL, 6237 bytes) > 416039:17/09/11 13:46:59 [dispatcher-event-loop-23] INFO TaskSetManager: > Starting task 2.0 in stage 1046.1 (TID 112790, hidden-baidu-host.baidu.com, > executor 26, partition 79, PROCESS_LOCAL, 6237 bytes) > [5] ShuffleMapStage 1046.1 still running, the attempted task killed by other > trigger the Resubmitted event > 416646:17/09/11 13:47:01 [dispatcher-event-loop-26] WARN
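The hang condition in SPARK-22074 can be modeled in a few lines (a toy model, not the real DAGScheduler): a stage finishes only when pendingPartitions drains, so a spurious Resubmitted event for an already-completed partition leaves the stage "available" yet never finishing.

```scala
import scala.collection.mutable

// Toy stage: tracks which partitions still have a pending task and which
// partitions have produced map output.
class ToyStage(numPartitions: Int) {
  val pendingPartitions: mutable.Set[Int] =
    mutable.Set.empty[Int] ++ (0 until numPartitions)
  private val outputs = mutable.Set.empty[Int]

  def taskSucceeded(p: Int): Unit = { outputs += p; pendingPartitions -= p }

  // The unnecessary resubmit from the ticket: the partition is put back
  // into pending even though its output already exists.
  def resubmitted(p: Int): Unit = pendingPartitions += p

  def isAvailable: Boolean = outputs.size == numPartitions
  def canFinish: Boolean = pendingPartitions.isEmpty
}

val stage = new ToyStage(3)
Seq(0, 1, 2).foreach(stage.taskSucceeded)
// Task killed by the other attempt triggers a spurious Resubmitted event
// when its executor is lost:
stage.resubmitted(1)
// Now isAvailable is true but canFinish is false: the scheduler never
// steps into submitWaitingChildStages, matching the reported hang.
```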
[jira] [Comment Edited] (SPARK-18935) Use Mesos "Dynamic Reservation" resource for Spark
[ https://issues.apache.org/jira/browse/SPARK-18935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184025#comment-16184025 ] Stavros Kontopoulos edited comment on SPARK-18935 at 9/28/17 2:12 PM: -- I can reproduce it locally in client mode according to the example, I am looking into this. Btw I tested this a while ago with spark latest version (master) and cluster mode (if I recall correctly) and stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: {code:java} curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar": {"value":1000} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges": { "range": [{"begin": 31000, "end": 32000}] },"role":"spark-prive","reservation":{"principal":""}}, {"name":"cpus","type":"SCALAR","scalar": {"value":1} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar": {"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST http://172.17.0.1:5050/master/reserve {code} The error comes from here: [https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339]. 
I haven't checked with a principal yet, but it seems optional if you have no authorization (https://github.com/apache/mesos/commit/efbdef8dfd96ff08c1342b171ef89dcb266bdce7). However the comments here require it: https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L274-L278 The error itself indicates that for some reason the task resources requested are more than the available resources on the slave. I checked; the numbers do not make sense unless there is a mismatch for the principal empty vs "" or something. I will test. Since the numbers are fine, it seems the failure is somewhere here, where the resources are compared: https://github.com/apache/mesos/blob/d47641039f5e2dd18af007250ef7ae2a34258a2d/src/common/resources.cpp#L438 was (Author: skonto): I can reproduce it locally according to the example, I am looking at it. Btw I tested this a while ago with spark latest version (master) and cluster mode (if I recall correctly) and stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: {code:java} curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar": {"value":1000} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges": { "range": [{"begin": 31000, "end": 32000}] 
},"role":"spark-prive","reservation":{"principal":""}}, {"name":"cpus","type":"SCALAR","scalar": {"value":1} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar": {"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST http://172.17.0.1:5050/master/reserve {code} The error comes from here: [https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339]. I havent checked with a principal yet, but seems optional if you have no authorization (https://github.com/apache/mesos/commit/efbdef8dfd96ff08c1342b171ef89dcb266bdce7. However the comments here require it: https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L274-L278 The error itself indicates that for some reason the task resources requested are less than the available on the slave. I checked the numbers does not make sense unless th
[jira] [Comment Edited] (SPARK-18935) Use Mesos "Dynamic Reservation" resource for Spark
[ https://issues.apache.org/jira/browse/SPARK-18935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184025#comment-16184025 ] Stavros Kontopoulos edited comment on SPARK-18935 at 9/28/17 2:11 PM: -- I can reproduce it locally according to the example, I am looking at it. Btw I tested this a while ago with spark latest version (master) and cluster mode (if I recall correctly) and stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: {code:java} curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar": {"value":1000} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges": { "range": [{"begin": 31000, "end": 32000}] },"role":"spark-prive","reservation":{"principal":""}}, {"name":"cpus","type":"SCALAR","scalar": {"value":1} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar": {"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST http://172.17.0.1:5050/master/reserve {code} The error comes from here: [https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339]. 
I haven't checked with a principal yet, but it seems optional if you have no authorization (https://github.com/apache/mesos/commit/efbdef8dfd96ff08c1342b171ef89dcb266bdce7). However the comments here require it: https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L274-L278 The error itself indicates that for some reason the task resources requested are more than the available resources on the slave. I checked; the numbers do not make sense unless there is a mismatch for the principal empty vs "" or something. I will test. Since the numbers are fine, it seems the failure is somewhere here, where the resources are compared: https://github.com/apache/mesos/blob/d47641039f5e2dd18af007250ef7ae2a34258a2d/src/common/resources.cpp#L438 was (Author: skonto): I tested this a while ago with spark latest version (master) and stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: {code:java} curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar": {"value":1000} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges": { "range": [{"begin": 31000, "end": 32000}] },"role":"spark-prive","reservation":{"principal":""}}, {"name":"cpus","type":"SCALAR","scalar": {"value":1} 
,"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar": {"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST http://172.17.0.1:5050/master/reserve {code} The error comes from here: [https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339]. I havent checked with a principal yet, but seems optional if you have no authorization (https://github.com/apache/mesos/commit/efbdef8dfd96ff08c1342b171ef89dcb266bdce7. However the comments here require it: https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L274-L278 The error itself indicates that for some reason the task resources requested are less than the available on the slave. I checked the numbers does not make sense unless there is a mismatch for the principal empty vs "" or something. I will test. Since numbers are fine it seems the failure is somewhere here wh
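As a toy illustration of the metadata mismatch suspected above: this is a simplified Python model, not Mesos's actual resources.cpp logic, but it shows how a check that matches resources on (name, role, principal) can fail on an empty-string vs. unset principal even when the numeric values are sufficient.

```python
# Illustrative model (not Mesos source) of why a reservation-metadata
# mismatch can make a resource check fail even when the numbers add up.
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Resource:
    name: str
    value: float
    role: str = "*"
    principal: Optional[str] = None  # reservation principal, if any

def contains(available, requested):
    """True if every requested resource fits into available resources with
    the *same* name, role, and principal (strict metadata matching)."""
    for req in requested:
        pool = sum(a.value for a in available
                   if (a.name, a.role, a.principal)
                   == (req.name, req.role, req.principal))
        if pool < req.value:
            return False
    return True

avail = [Resource("cpus", 1.0, "spark-prive", principal="")]
task = [Resource("cpus", 1.0, "spark-prive", principal=None)]

# Values are identical, but principal "" vs. unset does not match:
assert not contains(avail, task)
# With consistent metadata, the same numbers pass:
assert contains(avail, [Resource("cpus", 1.0, "spark-prive", principal="")])
```

Under this model, "the numbers are fine" is exactly what you would observe: the failure comes from the comparison keys, not the quantities.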
[jira] [Assigned] (SPARK-18016) Code Generation: Constant Pool Past Limit for Wide/Nested Dataset
[ https://issues.apache.org/jira/browse/SPARK-18016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18016: Assignee: Apache Spark (was: Aleksander Eskilson) > Code Generation: Constant Pool Past Limit for Wide/Nested Dataset > - > > Key: SPARK-18016 > URL: https://issues.apache.org/jira/browse/SPARK-18016 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: Aleksander Eskilson >Assignee: Apache Spark > Fix For: 2.3.0 > > > When attempting to encode collections of large Java objects to Datasets > having very wide or deeply nested schemas, code generation can fail, yielding: > {code} > Caused by: org.codehaus.janino.JaninoRuntimeException: Constant pool for > class > org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection > has grown past JVM limit of 0x > at > org.codehaus.janino.util.ClassFile.addToConstantPool(ClassFile.java:499) > at > org.codehaus.janino.util.ClassFile.addConstantNameAndTypeInfo(ClassFile.java:439) > at > org.codehaus.janino.util.ClassFile.addConstantMethodrefInfo(ClassFile.java:358) > at > org.codehaus.janino.UnitCompiler.writeConstantMethodrefInfo(UnitCompiler.java:4) > at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:4547) > at org.codehaus.janino.UnitCompiler.access$7500(UnitCompiler.java:206) > at > org.codehaus.janino.UnitCompiler$12.visitMethodInvocation(UnitCompiler.java:3774) > at > org.codehaus.janino.UnitCompiler$12.visitMethodInvocation(UnitCompiler.java:3762) > at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:4328) > at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3762) > at > org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:4933) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:3180) > at org.codehaus.janino.UnitCompiler.access$5000(UnitCompiler.java:206) > at > org.codehaus.janino.UnitCompiler$9.visitMethodInvocation(UnitCompiler.java:3151) 
> at > org.codehaus.janino.UnitCompiler$9.visitMethodInvocation(UnitCompiler.java:3139) > at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:4328) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:3139) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2112) > at org.codehaus.janino.UnitCompiler.access$1700(UnitCompiler.java:206) > at > org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1377) > at > org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1370) > at org.codehaus.janino.Java$ExpressionStatement.accept(Java.java:2558) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1370) > at > org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1450) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:2811) > at > org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1262) > at > org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1234) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:538) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:890) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:894) > at org.codehaus.janino.UnitCompiler.access$600(UnitCompiler.java:206) > at > org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:377) > at > org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:369) > at > org.codehaus.janino.Java$MemberClassDeclaration.accept(Java.java:1128) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:369) > at > org.codehaus.janino.UnitCompiler.compileDeclaredMemberTypes(UnitCompiler.java:1209) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:564) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:420) > at org.codehaus.janino.UnitCompiler.access$400(UnitCompiler.java:206) > at > 
org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:374) > at > org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:369) > at > org.codehaus.janino.Java$AbstractPackageMemberClassDeclaration.accept(Java.java:1309) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:369) > at org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:345) > at > org.codehaus.janino.SimpleCompiler.compileToClassLoader(SimpleCompiler.java:396) > at >
[jira] [Assigned] (SPARK-18016) Code Generation: Constant Pool Past Limit for Wide/Nested Dataset
[ https://issues.apache.org/jira/browse/SPARK-18016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-18016: Assignee: Aleksander Eskilson (was: Apache Spark) > Code Generation: Constant Pool Past Limit for Wide/Nested Dataset > - > > Key: SPARK-18016 > URL: https://issues.apache.org/jira/browse/SPARK-18016 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: Aleksander Eskilson >Assignee: Aleksander Eskilson > Fix For: 2.3.0 > > > When attempting to encode collections of large Java objects to Datasets > having very wide or deeply nested schemas, code generation can fail, yielding: > {code} > Caused by: org.codehaus.janino.JaninoRuntimeException: Constant pool for > class > org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection > has grown past JVM limit of 0x > at > org.codehaus.janino.util.ClassFile.addToConstantPool(ClassFile.java:499) > at > org.codehaus.janino.util.ClassFile.addConstantNameAndTypeInfo(ClassFile.java:439) > at > org.codehaus.janino.util.ClassFile.addConstantMethodrefInfo(ClassFile.java:358) > at > org.codehaus.janino.UnitCompiler.writeConstantMethodrefInfo(UnitCompiler.java:4) > at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:4547) > at org.codehaus.janino.UnitCompiler.access$7500(UnitCompiler.java:206) > at > org.codehaus.janino.UnitCompiler$12.visitMethodInvocation(UnitCompiler.java:3774) > at > org.codehaus.janino.UnitCompiler$12.visitMethodInvocation(UnitCompiler.java:3762) > at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:4328) > at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3762) > at > org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:4933) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:3180) > at org.codehaus.janino.UnitCompiler.access$5000(UnitCompiler.java:206) > at > 
org.codehaus.janino.UnitCompiler$9.visitMethodInvocation(UnitCompiler.java:3151) > at > org.codehaus.janino.UnitCompiler$9.visitMethodInvocation(UnitCompiler.java:3139) > at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:4328) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:3139) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2112) > at org.codehaus.janino.UnitCompiler.access$1700(UnitCompiler.java:206) > at > org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1377) > at > org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1370) > at org.codehaus.janino.Java$ExpressionStatement.accept(Java.java:2558) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1370) > at > org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1450) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:2811) > at > org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1262) > at > org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1234) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:538) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:890) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:894) > at org.codehaus.janino.UnitCompiler.access$600(UnitCompiler.java:206) > at > org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:377) > at > org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:369) > at > org.codehaus.janino.Java$MemberClassDeclaration.accept(Java.java:1128) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:369) > at > org.codehaus.janino.UnitCompiler.compileDeclaredMemberTypes(UnitCompiler.java:1209) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:564) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:420) > at 
org.codehaus.janino.UnitCompiler.access$400(UnitCompiler.java:206) > at > org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:374) > at > org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:369) > at > org.codehaus.janino.Java$AbstractPackageMemberClassDeclaration.accept(Java.java:1309) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:369) > at org.codehaus.janino.UnitCompiler.compileUnit(UnitCompiler.java:345) > at > org.codehaus.janino.SimpleCompiler.compileToClassLoader(SimpleCompiler.java:396) >
[jira] [Commented] (SPARK-18016) Code Generation: Constant Pool Past Limit for Wide/Nested Dataset
[ https://issues.apache.org/jira/browse/SPARK-18016?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184206#comment-16184206 ] Kazuaki Ishizaki commented on SPARK-18016: -- Thank you for reporting this again. Although I pinged the original author in [this PR|https://github.com/apache/spark/pull/16648], it has not moved forward yet. > Code Generation: Constant Pool Past Limit for Wide/Nested Dataset > - > > Key: SPARK-18016 > URL: https://issues.apache.org/jira/browse/SPARK-18016 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: Aleksander Eskilson >Assignee: Aleksander Eskilson > Fix For: 2.3.0 > > > When attempting to encode collections of large Java objects to Datasets > having very wide or deeply nested schemas, code generation can fail, yielding: > {code} > Caused by: org.codehaus.janino.JaninoRuntimeException: Constant pool for > class > org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection > has grown past JVM limit of 0x > at > org.codehaus.janino.util.ClassFile.addToConstantPool(ClassFile.java:499) > at > org.codehaus.janino.util.ClassFile.addConstantNameAndTypeInfo(ClassFile.java:439) > at > org.codehaus.janino.util.ClassFile.addConstantMethodrefInfo(ClassFile.java:358) > at > org.codehaus.janino.UnitCompiler.writeConstantMethodrefInfo(UnitCompiler.java:4) > at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:4547) > at org.codehaus.janino.UnitCompiler.access$7500(UnitCompiler.java:206) > at > org.codehaus.janino.UnitCompiler$12.visitMethodInvocation(UnitCompiler.java:3774) > at > org.codehaus.janino.UnitCompiler$12.visitMethodInvocation(UnitCompiler.java:3762) > at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:4328) > at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3762) > at > org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:4933) > at
org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:3180) > at org.codehaus.janino.UnitCompiler.access$5000(UnitCompiler.java:206) > at > org.codehaus.janino.UnitCompiler$9.visitMethodInvocation(UnitCompiler.java:3151) > at > org.codehaus.janino.UnitCompiler$9.visitMethodInvocation(UnitCompiler.java:3139) > at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:4328) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:3139) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2112) > at org.codehaus.janino.UnitCompiler.access$1700(UnitCompiler.java:206) > at > org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1377) > at > org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1370) > at org.codehaus.janino.Java$ExpressionStatement.accept(Java.java:2558) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1370) > at > org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1450) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:2811) > at > org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1262) > at > org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1234) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:538) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:890) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:894) > at org.codehaus.janino.UnitCompiler.access$600(UnitCompiler.java:206) > at > org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:377) > at > org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:369) > at > org.codehaus.janino.Java$MemberClassDeclaration.accept(Java.java:1128) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:369) > at > org.codehaus.janino.UnitCompiler.compileDeclaredMemberTypes(UnitCompiler.java:1209) > at 
org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:564) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:420) > at org.codehaus.janino.UnitCompiler.access$400(UnitCompiler.java:206) > at > org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:374) > at > org.codehaus.janino.UnitCompiler$2.visitPackageMemberClassDeclaration(UnitCompiler.java:369) > at > org.codehaus.janino.Java$AbstractPackageMemberClassDeclaration.accept(Java.java:1309) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:369) > at org.codehaus
[jira] [Reopened] (SPARK-18016) Code Generation: Constant Pool Past Limit for Wide/Nested Dataset
[ https://issues.apache.org/jira/browse/SPARK-18016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Ash reopened SPARK-18016: // reopening issue One PR addressing this bug has been merged -- https://github.com/apache/spark/pull/18075 -- but the second PR hasn't gone in yet: https://github.com/apache/spark/pull/16648 I'm still observing the error message on a dataset with ~3000 columns: {noformat} java.lang.RuntimeException: Error while encoding: org.codehaus.janino.JaninoRuntimeException: failed to compile: org.codehaus.janino.JaninoRuntimeException: Constant pool for class org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection$NestedClass2 has grown past JVM limit of 0x {noformat} even when running with the first PR, and [~jamcon] reported similarly at https://issues.apache.org/jira/browse/SPARK-18016?focusedCommentId=16103853&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16103853 > Code Generation: Constant Pool Past Limit for Wide/Nested Dataset > - > > Key: SPARK-18016 > URL: https://issues.apache.org/jira/browse/SPARK-18016 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.1.0 >Reporter: Aleksander Eskilson >Assignee: Aleksander Eskilson > Fix For: 2.3.0 > > > When attempting to encode collections of large Java objects to Datasets > having very wide or deeply nested schemas, code generation can fail, yielding: > {code} > Caused by: org.codehaus.janino.JaninoRuntimeException: Constant pool for > class > org.apache.spark.sql.catalyst.expressions.GeneratedClass$SpecificUnsafeProjection > has grown past JVM limit of 0x > at > org.codehaus.janino.util.ClassFile.addToConstantPool(ClassFile.java:499) > at > org.codehaus.janino.util.ClassFile.addConstantNameAndTypeInfo(ClassFile.java:439) > at > org.codehaus.janino.util.ClassFile.addConstantMethodrefInfo(ClassFile.java:358) > at > org.codehaus.janino.UnitCompiler.writeConstantMethodrefInfo(UnitCompiler.java:4) 
> at org.codehaus.janino.UnitCompiler.compileGet2(UnitCompiler.java:4547) > at org.codehaus.janino.UnitCompiler.access$7500(UnitCompiler.java:206) > at > org.codehaus.janino.UnitCompiler$12.visitMethodInvocation(UnitCompiler.java:3774) > at > org.codehaus.janino.UnitCompiler$12.visitMethodInvocation(UnitCompiler.java:3762) > at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:4328) > at org.codehaus.janino.UnitCompiler.compileGet(UnitCompiler.java:3762) > at > org.codehaus.janino.UnitCompiler.compileGetValue(UnitCompiler.java:4933) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:3180) > at org.codehaus.janino.UnitCompiler.access$5000(UnitCompiler.java:206) > at > org.codehaus.janino.UnitCompiler$9.visitMethodInvocation(UnitCompiler.java:3151) > at > org.codehaus.janino.UnitCompiler$9.visitMethodInvocation(UnitCompiler.java:3139) > at org.codehaus.janino.Java$MethodInvocation.accept(Java.java:4328) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:3139) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:2112) > at org.codehaus.janino.UnitCompiler.access$1700(UnitCompiler.java:206) > at > org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1377) > at > org.codehaus.janino.UnitCompiler$6.visitExpressionStatement(UnitCompiler.java:1370) > at org.codehaus.janino.Java$ExpressionStatement.accept(Java.java:2558) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:1370) > at > org.codehaus.janino.UnitCompiler.compileStatements(UnitCompiler.java:1450) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:2811) > at > org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1262) > at > org.codehaus.janino.UnitCompiler.compileDeclaredMethods(UnitCompiler.java:1234) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:538) > at org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:890) > at 
org.codehaus.janino.UnitCompiler.compile2(UnitCompiler.java:894) > at org.codehaus.janino.UnitCompiler.access$600(UnitCompiler.java:206) > at > org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:377) > at > org.codehaus.janino.UnitCompiler$2.visitMemberClassDeclaration(UnitCompiler.java:369) > at > org.codehaus.janino.Java$MemberClassDeclaration.accept(Java.java:1128) > at org.codehaus.janino.UnitCompiler.compile(UnitCompiler.java:369) > at > org.codehaus.janino.UnitCompiler.compileDeclaredMemberTypes(UnitCompiler.java:1209) > at org.code
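The reopened report above (a ~3000-column dataset still overflowing) comes down to a hard class-file limit: the JVM caps a class's constant pool at 0xFFFF (65535) entries. A rough back-of-envelope calculation shows why very wide generated projections hit it; the entries-per-field figure below is an assumption for illustration, not a measured Spark constant.

```python
# Back-of-envelope: why a very wide schema can blow the JVM constant pool.
# constant_pool_count in a class file is a u2, so a single class is limited
# to 0xFFFF (65535) entries.
JVM_CONSTANT_POOL_LIMIT = 0xFFFF  # 65535 entries per class

# Assumed figure: each projected field contributes method refs, name/type
# entries, string literals, etc. in the generated code.
ENTRIES_PER_FIELD = 25

columns = 3000
needed = columns * ENTRIES_PER_FIELD  # 75000 entries

# More entries than one class can hold, so compilation fails unless the
# generated code is split across multiple (nested) classes:
assert needed > JVM_CONSTANT_POOL_LIMIT
```

This is also why the follow-up error names a `NestedClass2`: splitting generated code into nested classes only helps if each resulting class stays under the per-class limit.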
[jira] [Comment Edited] (SPARK-18935) Use Mesos "Dynamic Reservation" resource for Spark
[ https://issues.apache.org/jira/browse/SPARK-18935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184025#comment-16184025 ] Stavros Kontopoulos edited comment on SPARK-18935 at 9/28/17 1:13 PM: -- I tested this a while ago with the latest Spark version (master) and got stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: {code:java} curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar": {"value":1000} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges": { "range": [{"begin": 31000, "end": 32000}] },"role":"spark-prive","reservation":{"principal":""}}, {"name":"cpus","type":"SCALAR","scalar": {"value":1} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar": {"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST http://172.17.0.1:5050/master/reserve {code} The error comes from here: [https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339]. I haven't checked with a principal yet, but it seems optional if you have no authorization (https://github.com/apache/mesos/commit/efbdef8dfd96ff08c1342b171ef89dcb266bdce7).
However, the comments here require it: https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L274-L278 The error itself indicates that for some reason the task resources requested are more than those available on the slave. I checked, and the numbers do not make sense unless there is a mismatch for the principal, empty vs "" or something. I will test. Since the numbers are fine, it seems the failure is somewhere here, where the resources are compared: https://github.com/apache/mesos/blob/d47641039f5e2dd18af007250ef7ae2a34258a2d/src/common/resources.cpp#L438 was (Author: skonto): I tested this a while ago with the latest Spark version (master) and got stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: {code:java} curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar": {"value":1000} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges": { "range": [{"begin": 31000, "end": 32000}] },"role":"spark-prive","reservation":{"principal":""}}, {"name":"cpus","type":"SCALAR","scalar": {"value":1} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar": {"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST
http://172.17.0.1:5050/master/reserve {code} The error comes from here: [https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339]. I haven't checked with a principal yet, but it seems optional if you have no authorization (https://github.com/apache/mesos/commit/efbdef8dfd96ff08c1342b171ef89dcb266bdce7). However, the comments here require it: https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L274-L278 The error itself indicates that for some reason the task resources requested are more than those available on the slave. I checked, and the numbers do not make sense unless there is a mismatch for the principal, empty vs "" or something. I will test. > Use Mesos "Dynamic Reservation" resource for Spark > -- > > Key: SPARK-18935 > URL: https://issues.a
[jira] [Commented] (SPARK-10884) Support prediction on single instance for regression and classification related models
[ https://issues.apache.org/jira/browse/SPARK-10884?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184126#comment-16184126 ] Apache Spark commented on SPARK-10884: -- User 'WeichenXu123' has created a pull request for this issue: https://github.com/apache/spark/pull/19381 > Support prediction on single instance for regression and classification > related models > -- > > Key: SPARK-10884 > URL: https://issues.apache.org/jira/browse/SPARK-10884 > Project: Spark > Issue Type: Sub-task > Components: ML >Reporter: Yanbo Liang >Assignee: Yanbo Liang > Labels: 2.2.0 > > Support prediction on single instance for regression and classification > related models (i.e., PredictionModel, ClassificationModel and their sub > classes). > Add corresponding test cases. > See parent issue for more details. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
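A hypothetical sketch of the API shape this sub-task proposes: a single-instance predict() alongside the usual DataFrame-based transform(). The class, the toy linear model, and the plain-tuple "rows" are illustrative assumptions, not Spark's implementation.

```python
# Sketch: single-instance prediction next to batch transform (illustrative,
# not Spark ML code).

class LinearRegressionModel:
    def __init__(self, coefficients, intercept):
        self.coefficients = coefficients
        self.intercept = intercept

    def predict(self, features):
        """Score one instance directly, without building a DataFrame."""
        return sum(c * x for c, x in zip(self.coefficients, features)) + self.intercept

    def transform(self, rows):
        """Batch path: append a prediction column to every row."""
        return [row + (self.predict(row),) for row in rows]

model = LinearRegressionModel([2.0, -1.0], 0.5)
assert model.predict([3.0, 1.0]) == 5.5
assert model.transform([(3.0, 1.0)]) == [(3.0, 1.0, 5.5)]
```

The point of the change is the predict() path: callers scoring one record at a time avoid the overhead of wrapping it in a one-row DataFrame.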
[jira] [Commented] (SPARK-21737) Create communication channel between arbitrary clients and the Spark AM in YARN mode
[ https://issues.apache.org/jira/browse/SPARK-21737?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184120#comment-16184120 ] Thomas Graves commented on SPARK-21737: --- We definitely still want to do this. I don't think [~yoonlee95] has time currently, so I'm hoping to pick it back up. If someone else has time, feel free. > Create communication channel between arbitrary clients and the Spark AM in > YARN mode > > > Key: SPARK-21737 > URL: https://issues.apache.org/jira/browse/SPARK-21737 > Project: Spark > Issue Type: Improvement > Components: Spark Core >Affects Versions: 2.1.1 >Reporter: Jong Yoon Lee >Priority: Minor > > In this JIRA, I develop code to create a communication channel between > arbitrary clients and a Spark AM on YARN. This code can be utilized to send > commands such as getting application status, getting history info from the CLI, > killing the application, and pushing new tokens. > Design Doc: > https://docs.google.com/document/d/1QMbWhg13ocIoADywZQBRRVj-b9Zf8CnBrruP5JhcOOY/edit?usp=sharing -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-22151) PYTHONPATH not picked up from the spark.yarn.appMasterEnv properly
[ https://issues.apache.org/jira/browse/SPARK-22151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Thomas Graves updated SPARK-22151: -- Description: Running in yarn cluster mode and trying to set the pythonpath via spark.yarn.appMasterEnv.PYTHONPATH doesn't work. The YARN Client code looks at the env variables: val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath) But when you set spark.yarn.appMasterEnv it puts it into the local env. So the python path set in spark.yarn.appMasterEnv isn't properly set. You can work around this if you are running in cluster mode by setting it on the client like: PYTHONPATH=./addon/python/ spark-submit was: the code looks at the env variables: val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath) But when you set spark.yarn.appMasterEnv it puts it into the local env. So the python path set in spark.yarn.appMasterEnv isn't properly set. You can work around this if you are running in cluster mode by setting it on the client like: PYTHONPATH=./addon/python/ spark-submit > PYTHONPATH not picked up from the spark.yarn.appMasterEnv properly > -- > > Key: SPARK-22151 > URL: https://issues.apache.org/jira/browse/SPARK-22151 > Project: Spark > Issue Type: Bug > Components: YARN >Affects Versions: 2.1.1 >Reporter: Thomas Graves > > Running in yarn cluster mode and trying to set the pythonpath via > spark.yarn.appMasterEnv.PYTHONPATH doesn't work. > The YARN Client code looks at the env variables: > val pythonPathStr = (sys.env.get("PYTHONPATH") ++ pythonPath) > But when you set spark.yarn.appMasterEnv it puts it into the local env. > So the python path set in spark.yarn.appMasterEnv isn't properly set. > You can work around this if you are running in cluster mode by setting it on the > client like: > PYTHONPATH=./addon/python/ spark-submit -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
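A minimal sketch of the clobbering behaviour described above, under the simplified assumption that the client-derived PYTHONPATH (built from `sys.env.get("PYTHONPATH") ++ pythonPath`) overwrites whatever appMasterEnv contributed to the container environment. This is an illustrative Python model, not Spark's actual Client code.

```python
import os

# Toy model (not Spark source): the YARN client computes the AM's
# PYTHONPATH from the *client's* local environment, so a value set only
# via spark.yarn.appMasterEnv.PYTHONPATH gets overwritten.

def build_am_env(client_env, app_master_env, python_path=()):
    # Mirrors `sys.env.get("PYTHONPATH") ++ pythonPath` on the client:
    parts = [client_env["PYTHONPATH"]] if "PYTHONPATH" in client_env else []
    parts += list(python_path)
    env = dict(app_master_env)                   # appMasterEnv entries copied...
    env["PYTHONPATH"] = os.pathsep.join(parts)   # ...then clobbered here
    return env

# Setting it only via appMasterEnv is lost:
env = build_am_env({}, {"PYTHONPATH": "./addon/python/"})
assert env["PYTHONPATH"] == ""

# The workaround: export it in the client shell before spark-submit,
# i.e. `PYTHONPATH=./addon/python/ spark-submit ...`:
env = build_am_env({"PYTHONPATH": "./addon/python/"}, {})
assert env["PYTHONPATH"] == "./addon/python/"
```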
[jira] [Commented] (SPARK-22157) The unix_timestamp method drops the millisecond part of the time field
[ https://issues.apache.org/jira/browse/SPARK-22157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184088#comment-16184088 ] Sean Owen commented on SPARK-22157: --- This is invalid. A UNIX timestamp is in whole seconds. > The unix_timestamp method drops the millisecond part of the time field > -- > > Key: SPARK-22157 > URL: https://issues.apache.org/jira/browse/SPARK-22157 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.0.2, 2.1.0, 2.2.0 >Reporter: hantiantian > > 1. Create table test, and execute the following command: > select s1 from test; > result: 2014-10-10 19:30:10.222 > 2. When using the native unix_timestamp method, execute the following command: > select unix_timestamp(s1,"yyyy-MM-dd HH:mm:ss.SSS") from test; > result: 1412940610 > Obviously, the millisecond part of the time field has been lost. > 3. After the repair, execute the command again: > select unix_timestamp(s1,"yyyy-MM-dd HH:mm:ss.SSS") from test; > result: 1412940610.222 > Conclusion: After the repair, we keep the millisecond part of the time field. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
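The behaviour the reporter describes can be reproduced outside Spark with the standard library: the fractional part survives parsing, but a whole-second UNIX timestamp (which is what unix_timestamp is defined to return, per the comment above) necessarily discards it.

```python
from datetime import datetime

s = "2014-10-10 19:30:10.222"
dt = datetime.strptime(s, "%Y-%m-%d %H:%M:%S.%f")

# The millisecond part survives parsing...
assert dt.microsecond == 222_000

ts = dt.timestamp()  # epoch seconds as a float, in the local time zone
# ...but truncating to whole seconds, as a UNIX timestamp does, drops it:
assert int(ts) != ts
assert round((ts - int(ts)) * 1000) == 222
```

So the disagreement in the thread is about the contract, not the mechanics: the milliseconds are parseable, but a whole-seconds return type cannot carry them.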
[jira] [Assigned] (SPARK-22157) The unix_timestamp method drops the millisecond part of the time field
[ https://issues.apache.org/jira/browse/SPARK-22157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22157: Assignee: (was: Apache Spark) > The unix_timestamp method drops the millisecond part of the time field > -- > > Key: SPARK-22157 > URL: https://issues.apache.org/jira/browse/SPARK-22157 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.0.2, 2.1.0, 2.2.0 >Reporter: hantiantian > > 1. Create table test, and execute the following command: > select s1 from test; > result: 2014-10-10 19:30:10.222 > 2. When using the native unix_timestamp method, execute the following command: > select unix_timestamp(s1,"yyyy-MM-dd HH:mm:ss.SSS") from test; > result: 1412940610 > Obviously, the millisecond part of the time field has been lost. > 3. After the repair, execute the command again: > select unix_timestamp(s1,"yyyy-MM-dd HH:mm:ss.SSS") from test; > result: 1412940610.222 > Conclusion: After the repair, we keep the millisecond part of the time field. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-22157) The unix_timestamp method drops the millisecond part of the time field
[ https://issues.apache.org/jira/browse/SPARK-22157?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184086#comment-16184086 ] Apache Spark commented on SPARK-22157: -- User 'httfighter' has created a pull request for this issue: https://github.com/apache/spark/pull/19380 > The unix_timestamp method drops the millisecond part of the time field > -- > > Key: SPARK-22157 > URL: https://issues.apache.org/jira/browse/SPARK-22157 > Project: Spark > Issue Type: Bug > Components: SQL >Affects Versions: 2.0.2, 2.1.0, 2.2.0 >Reporter: hantiantian > > 1. Create table test, and execute the following command: > select s1 from test; > result: 2014-10-10 19:30:10.222 > 2. When using the native unix_timestamp method, execute the following command: > select unix_timestamp(s1,"yyyy-MM-dd HH:mm:ss.SSS") from test; > result: 1412940610 > Obviously, the millisecond part of the time field has been lost. > 3. After the repair, execute the command again: > select unix_timestamp(s1,"yyyy-MM-dd HH:mm:ss.SSS") from test; > result: 1412940610.222 > Conclusion: After the repair, we keep the millisecond part of the time field. -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Assigned] (SPARK-22157) The unix_timestamp method loses the millisecond part of the time field
[ https://issues.apache.org/jira/browse/SPARK-22157?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22157: Assignee: Apache Spark -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-22157) The unix_timestamp method loses the millisecond part of the time field
hantiantian created SPARK-22157: --- Summary: The unix_timestamp method loses the millisecond part of the time field Key: SPARK-22157 URL: https://issues.apache.org/jira/browse/SPARK-22157 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 2.2.0, 2.1.0, 2.0.2 Reporter: hantiantian -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
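The truncation reported above can be illustrated outside Spark. A minimal Python sketch (not Spark code; the timezone is assumed UTC here, so the absolute epoch value differs from the reporter's, which was evidently computed in a UTC+8 zone):

```python
from datetime import datetime, timezone

s = "2014-10-10 19:30:10.222"
dt = datetime.strptime(s, "%Y-%m-%d %H:%M:%S.%f").replace(tzinfo=timezone.utc)

# Truncating to whole seconds reproduces the "before" behaviour:
# the .222 millisecond part is dropped from the epoch value.
seconds_only = int(dt.timestamp())

# Keeping the fractional part corresponds to the "after the fix" result.
with_millis = dt.timestamp()
```

The difference between the two values is exactly the 222 milliseconds that the original `unix_timestamp` discards.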
[jira] [Comment Edited] (SPARK-18935) Use Mesos "Dynamic Reservation" resource for Spark
[ https://issues.apache.org/jira/browse/SPARK-18935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184025#comment-16184025 ] Stavros Kontopoulos edited comment on SPARK-18935 at 9/28/17 12:19 PM: --- I tested this a while ago with the latest Spark version (master) and got stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: {code:java} curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar":{"value":1000},"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges":{"range":[{"begin": 31000, "end": 32000}]},"role":"spark-prive","reservation":{"principal":""}},{"name":"cpus","type":"SCALAR","scalar":{"value":1},"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar":{"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST http://172.17.0.1:5050/master/reserve {code} The error comes from here: [https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339]. I haven't checked with a principal yet, but it seems optional if you have no authorization (https://github.com/apache/mesos/commit/efbdef8dfd96ff08c1342b171ef89dcb266bdce7); however, the comments here require it: https://github.com/apache/mesos/blob/master/include/mesos/mesos.proto#L274-L278. The error itself claims that the resources required by the task and its executor exceed what is available on the slave, but I checked and the numbers do not add up, unless there is a mismatch between an empty principal and "" or something. I will test. > Use Mesos "Dynamic Reservation" resource for Spark > -- > > Key: SPARK-18935 > URL: https://issues.apache.org/jira/browse/SPARK-18935 > Project: Spark > Issue Type: Bug >Affects Versions: 2.0.0, 2.0.1, 2.0.2 >Reporter: jackyoh > > I'm running Spark on Apache Mesos.
> Please follow these steps to reproduce the issue:
> 1. First, run the Mesos resource reservation:
> curl -i -d slaveId=c24d1cfb-79f3-4b07-9f8b-c7b19543a333-S0 -d resources='[{"name":"cpus","type":"SCALAR","scalar":{"value":20},"role":"spark","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar":{"value":4096},"role":"spark","reservation":{"principal":""}}]' -X POST http://192.168.1.118:5050/master/reserve
> 2. Then run the spark-submit command:
> ./spark-submit --class org.apache.spark.examples.SparkPi --master mesos://192.168.1.118:5050 --conf spark.mesos.role=spark ../examples/jars/spark-examples_2.11-2.0.2.jar 1
> The console keeps logging the same warning message as shown below:
> 16/12/19 22:33:28 WARN TaskSchedulerImpl: Initial job has not accepted any resources; check your cluster UI to ensure that workers are registered and have sufficient resources -- This message was sent by Atlassian JIRA (v6.4.14#64029) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
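The commenter's doubt about the totals can be checked with plain arithmetic on the figures quoted in the TASK_ERROR log (a hypothetical tally, not Mesos code; reserved and unreserved amounts for each resource are summed):

```python
# Totals from "Total resources ... required by task and its executor":
# cpus(spark-prive):1 + cpus(*):1, mem(spark-prive):1024 + mem(*):384
required = {"cpus": 1 + 1, "mem": 1024 + 384}

# Totals from the "available" list in the same log line:
# cpus(spark-prive):1 + cpus(*):1, mem(spark-prive):1024 + mem(*):976
available = {"cpus": 1 + 1, "mem": 1024 + 976}

exceeds = {name: required[name] > available[name] for name in required}
# cpus: 2 vs 2, mem: 1408 vs 2000 -- neither total exceeds what is
# available, which is why the "more than available" validation failure
# looks inconsistent at first glance.
```

This supports the suspicion that the mismatch lies elsewhere (for example, in how the empty principal partitions the reserved resources) rather than in the raw quantities.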
[jira] [Comment Edited] (SPARK-18935) Use Mesos "Dynamic Reservation" resource for Spark
[ https://issues.apache.org/jira/browse/SPARK-18935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184025#comment-16184025 ] Stavros Kontopoulos edited comment on SPARK-18935 at 9/28/17 12:13 PM: --- I tested this a while ago with spark latest version (master) and stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: {code:java} curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar": {"value":1000} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges": { "range": [{"begin": 31000, "end": 32000}] },"role":"spark-prive","reservation":{"principal":""}}, {"name":"cpus","type":"SCALAR","scalar": {"value":1} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar": {"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST http://172.17.0.1:5050/master/reserve {code} The error comes from here: [https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339]. I havent checked with a principal yet, but seems optional if you have no authorization (https://github.com/apache/mesos/commit/efbdef8dfd96ff08c1342b171ef89dcb266bdce7). 
Btw the error indicates that for some reason the task resources requested are less than the available on the slave. I checked the numbers does not make sense unless there is a mismatch for the principal empty vs "" or something. was (Author: skonto): I tested this a while ago with spark latest version (master) and stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: {code:java} curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar": {"value":1000} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges": { "range": [{"begin": 31000, "end": 32000}] },"role":"spark-prive","reservation":{"principal":""}}, {"name":"cpus","type":"SCALAR","scalar": {"value":1} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar": {"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST http://172.17.0.1:5050/master/reserve {code} The error comes from here: [https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339]. I havent checked with a principal yet, but seems optional if you have no authorization (https://github.com/apache/mesos/commit/efbdef8dfd96ff08c1342b171ef89dcb266bdce7). 
Btw the error indicates that the scheduler received some offers let say from * role and when it tries to launch a task that offer is not valid anymore as resource partition has changed due to the reservation on the slave the task was launched. > Use Mesos "Dynamic Reservation" resource for Spark > -- > > Key: SPARK-18935 > URL: https://issues.apache.org/jira/browse/SPARK-18935 > Project: Spark > Issue Type: Bug >Affects Versions: 2.0.0, 2.0.1, 2.0.2 >Reporter: jackyoh > > I'm running spark on Apache Mesos > Please follow these steps to reproduce the issue: > 1. First, run Mesos resource reserve: > curl -i -d slaveId=c24d1cfb-79f3-4b07-9f8b-c7b19543a333-S0 -d > resources='[{"name":"cpus","type":"SCALAR","scalar":{"value":20},"role":"spark","reservation":{"p
[jira] [Comment Edited] (SPARK-18935) Use Mesos "Dynamic Reservation" resource for Spark
[ https://issues.apache.org/jira/browse/SPARK-18935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184025#comment-16184025 ] Stavros Kontopoulos edited comment on SPARK-18935 at 9/28/17 12:00 PM: --- I tested this a while ago with spark latest version (master) and stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: {code:java} curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar": {"value":1000} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges": { "range": [{"begin": 31000, "end": 32000}] },"role":"spark-prive","reservation":{"principal":""}}, {"name":"cpus","type":"SCALAR","scalar": {"value":1} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar": {"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST http://172.17.0.1:5050/master/reserve {code} The error comes from here: [https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339]. I havent checked with a principal yet, but seems optional if you have no authorization (https://github.com/apache/mesos/commit/efbdef8dfd96ff08c1342b171ef89dcb266bdce7). 
Btw the error indicates that the scheduler received some offers let say from * role and when it tries to launch a task that offer is not valid anymore as resource partition has changed due to the reservation on the slave the task was launched. was (Author: skonto): I tested this a while ago with spark latest version (master) and stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: {code:java} curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar": {"value":1000} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges": { "range": [{"begin": 31000, "end": 32000}] },"role":"spark-prive","reservation":{"principal":""}}, {"name":"cpus","type":"SCALAR","scalar": {"value":1} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar": {"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST http://172.17.0.1:5050/master/reserve {code} The error comes from here: [https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339]. I havent checked with a principal yet, but seems optional if you have no authorization. 
Btw the error indicates that the scheduler received some offers let say from * role and when it tries to launch a task that offer is not valid anymore as resource partition has changed due to the reservation on the slave the task was launched. > Use Mesos "Dynamic Reservation" resource for Spark > -- > > Key: SPARK-18935 > URL: https://issues.apache.org/jira/browse/SPARK-18935 > Project: Spark > Issue Type: Bug >Affects Versions: 2.0.0, 2.0.1, 2.0.2 >Reporter: jackyoh > > I'm running spark on Apache Mesos > Please follow these steps to reproduce the issue: > 1. First, run Mesos resource reserve: > curl -i -d slaveId=c24d1cfb-79f3-4b07-9f8b-c7b19543a333-S0 -d > resources='[{"name":"cpus","type":"SCALAR","scalar":{"value":20},"role":"spark","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar":{"value":4096
[jira] [Comment Edited] (SPARK-18935) Use Mesos "Dynamic Reservation" resource for Spark
[ https://issues.apache.org/jira/browse/SPARK-18935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184025#comment-16184025 ] Stavros Kontopoulos edited comment on SPARK-18935 at 9/28/17 11:58 AM: --- I tested this a while ago with spark latest version (master) and stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: {code:java} curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar": {"value":1000} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges": { "range": [{"begin": 31000, "end": 32000}] },"role":"spark-prive","reservation":{"principal":""}}, {"name":"cpus","type":"SCALAR","scalar": {"value":1} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar": {"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST http://172.17.0.1:5050/master/reserve {code} The error comes from here: [https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339]. I havent checked with a principal yet, but seems optional if you have no authorization. 
Btw the error indicates that the scheduler received some offers let say from * role and when it tries to launch a task that offer is not valid anymore as resource partition has changed due to the reservation. was (Author: skonto): I tested this a while ago with spark latest version (master) and stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: {code:java} curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar": {"value":1000} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges": { "range": [{"begin": 31000, "end": 32000}] },"role":"spark-prive","reservation":{"principal":""}}, {"name":"cpus","type":"SCALAR","scalar": {"value":1} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar": {"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST http://172.17.0.1:5050/master/reserve {code} The error comes from here: [https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339]. I havent checked with a principal yet, but seem soptional if you have no authorization. 
Btw the error indicates that the scheduler received some offers let say from * role and when it tries to launch a task that offer is not valid anymore as resource partition has changed due to the reservation. > Use Mesos "Dynamic Reservation" resource for Spark > -- > > Key: SPARK-18935 > URL: https://issues.apache.org/jira/browse/SPARK-18935 > Project: Spark > Issue Type: Bug >Affects Versions: 2.0.0, 2.0.1, 2.0.2 >Reporter: jackyoh > > I'm running spark on Apache Mesos > Please follow these steps to reproduce the issue: > 1. First, run Mesos resource reserve: > curl -i -d slaveId=c24d1cfb-79f3-4b07-9f8b-c7b19543a333-S0 -d > resources='[{"name":"cpus","type":"SCALAR","scalar":{"value":20},"role":"spark","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar":{"value":4096},"role":"spark","reservation":{"principal":""}}]' > -X POST http://192.168.1.118:5050/master/reserve > 2. Then run spark-submit command: > ./spark-submit
[jira] [Comment Edited] (SPARK-18935) Use Mesos "Dynamic Reservation" resource for Spark
[ https://issues.apache.org/jira/browse/SPARK-18935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184025#comment-16184025 ] Stavros Kontopoulos edited comment on SPARK-18935 at 9/28/17 11:58 AM: --- I tested this a while ago with spark latest version (master) and stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: {code:java} curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar": {"value":1000} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges": { "range": [{"begin": 31000, "end": 32000}] },"role":"spark-prive","reservation":{"principal":""}}, {"name":"cpus","type":"SCALAR","scalar": {"value":1} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar": {"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST http://172.17.0.1:5050/master/reserve {code} The error comes from here: [https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339]. I havent checked with a principal yet, but seem soptional if you have no authorization. 
Btw the error indicates that the scheduler received some offers let say from * role and when it tries to launch a task that offer is not valid anymore as resource partition has changed due to the reservation. was (Author: skonto): I tested this a while ago with spark latest version (master) and stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: {code:java} curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar": {"value":1000} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges": { "range": [{"begin": 31000, "end": 32000}] },"role":"spark-prive","reservation":{"principal":""}}, {"name":"cpus","type":"SCALAR","scalar": {"value":1} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar": {"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST http://172.17.0.1:5050/master/reserve {code} The error comes from here: [https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339]. I havent checked with a principal yet, but probably I need to do so. 
> Use Mesos "Dynamic Reservation" resource for Spark > -- > > Key: SPARK-18935 > URL: https://issues.apache.org/jira/browse/SPARK-18935 > Project: Spark > Issue Type: Bug >Affects Versions: 2.0.0, 2.0.1, 2.0.2 >Reporter: jackyoh > > I'm running spark on Apache Mesos > Please follow these steps to reproduce the issue: > 1. First, run Mesos resource reserve: > curl -i -d slaveId=c24d1cfb-79f3-4b07-9f8b-c7b19543a333-S0 -d > resources='[{"name":"cpus","type":"SCALAR","scalar":{"value":20},"role":"spark","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar":{"value":4096},"role":"spark","reservation":{"principal":""}}]' > -X POST http://192.168.1.118:5050/master/reserve > 2. Then run spark-submit command: > ./spark-submit --class org.apache.spark.examples.SparkPi --master > mesos://192.168.1.118:5050 --conf spark.mesos.role=spark > ../examples/jars/spark-examples_2.11-2.0.2.jar 1 > And the console will keep loging same warning message as sho
[jira] [Comment Edited] (SPARK-18935) Use Mesos "Dynamic Reservation" resource for Spark
[ https://issues.apache.org/jira/browse/SPARK-18935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184025#comment-16184025 ] Stavros Kontopoulos edited comment on SPARK-18935 at 9/28/17 11:58 AM: --- I tested this a while ago with spark latest version (master) and stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: {code:java} curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar": {"value":1000} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges": { "range": [{"begin": 31000, "end": 32000}] },"role":"spark-prive","reservation":{"principal":""}}, {"name":"cpus","type":"SCALAR","scalar": {"value":1} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar": {"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST http://172.17.0.1:5050/master/reserve {code} The error comes from here: [https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339]. I havent checked with a principal yet, but seems optional if you have no authorization. 
Btw the error indicates that the scheduler received some offers let say from * role and when it tries to launch a task that offer is not valid anymore as resource partition has changed due to the reservation on the slave the task was launched. was (Author: skonto): I tested this a while ago with spark latest version (master) and stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: {code:java} curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar": {"value":1000} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges": { "range": [{"begin": 31000, "end": 32000}] },"role":"spark-prive","reservation":{"principal":""}}, {"name":"cpus","type":"SCALAR","scalar": {"value":1} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar": {"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST http://172.17.0.1:5050/master/reserve {code} The error comes from here: [https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339]. I havent checked with a principal yet, but seems optional if you have no authorization. 
Btw the error indicates that the scheduler received some offers let say from * role and when it tries to launch a task that offer is not valid anymore as resource partition has changed due to the reservation. > Use Mesos "Dynamic Reservation" resource for Spark > -- > > Key: SPARK-18935 > URL: https://issues.apache.org/jira/browse/SPARK-18935 > Project: Spark > Issue Type: Bug >Affects Versions: 2.0.0, 2.0.1, 2.0.2 >Reporter: jackyoh > > I'm running spark on Apache Mesos > Please follow these steps to reproduce the issue: > 1. First, run Mesos resource reserve: > curl -i -d slaveId=c24d1cfb-79f3-4b07-9f8b-c7b19543a333-S0 -d > resources='[{"name":"cpus","type":"SCALAR","scalar":{"value":20},"role":"spark","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar":{"value":4096},"role":"spark","reservation":{"principal":""}}]' > -X POST http://192.168.1.118:5050/master/reserve > 2. Then run sp
[jira] [Comment Edited] (SPARK-18935) Use Mesos "Dynamic Reservation" resource for Spark
[ https://issues.apache.org/jira/browse/SPARK-18935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184025#comment-16184025 ] Stavros Kontopoulos edited comment on SPARK-18935 at 9/28/17 11:47 AM: --- I tested this a while ago with spark latest version (master) and stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: {code:java} curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar": {"value":1000} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges": { "range": [{"begin": 31000, "end": 32000}] },"role":"spark-prive","reservation":{"principal":""}}, {"name":"cpus","type":"SCALAR","scalar": {"value":1} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar": {"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST http://172.17.0.1:5050/master/reserve {code} The error comes from here: [https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339]. I havent checked with a principal yet, but probably I need to do so. 
> Use Mesos "Dynamic Reservation" resource for Spark > -- > > Key: SPARK-18935 > URL: https://issues.apache.org/jira/browse/SPARK-18935 > Project: Spark > Issue Type: Bug >Affects Versions: 2.0.0, 2.0.1, 2.0.2 >Reporter: jackyoh > > I'm running Spark on Apache Mesos. > Please follow these steps to reproduce the issue: > 1. First, run the Mesos resource reserve: > curl -i -d slaveId=c24d1cfb-79f3-4b07-9f8b-c7b19543a333-S0 -d > resources='[{"name":"cpus","type":"SCALAR","scalar":{"value":20},"role":"spark","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar":{"value":4096},"role":"spark","reservation":{"principal":""}}]' > -X POST http://192.168.1.118:5050/master/reserve > 2. Then run the spark-submit command: > ./spark-submit --class org.apache.spark.examples.SparkPi --master > mesos://192.168.1.118:5050 --conf spark.mesos.role=spark > ../examples/jars/spark-examples_2.11-2.0.2.jar 1 > And the console will keep logging the same warning message as shown below: > 16/12/19 22:33:28 WARN TaskSchedulerImpl: Initial job has not accepted any > resources; check your cluster UI to ensure that workers are registered and > have sufficient resources -- This message was sent by Atlassian JIRA (v6.4.14#64029) -
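The TASK_ERROR quoted earlier is Mesos rejecting a launch because the resources the task and its executor need, summed per role, no longer fit in what the (possibly stale) offer still holds after the dynamic reservation. A simplified model of that check follows; it is an illustration only, not the actual Mesos logic, which lives in validation.cpp and also covers ranges, disk, and allocation-role matching.

```python
# Simplified model of the Mesos launch-validation check: resources
# required by a task, keyed by (resource name, role), must fit within
# the resources remaining in the offer. Illustration only.

def fits(required, available):
    """required/available: dicts mapping (name, role) -> scalar amount."""
    return all(amount <= available.get(key, 0.0)
               for key, amount in required.items())

# An offer whose unreserved ("*") partition shrank after a dynamic
# reservation...
available = {("cpus", "spark"): 1.0, ("mem", "spark"): 1024.0,
             ("cpus", "*"): 0.0, ("mem", "*"): 0.0}
# ...can no longer satisfy a task that still expects unreserved
# resources alongside the reserved ones.
required = {("cpus", "spark"): 1.0, ("mem", "spark"): 1024.0,
            ("cpus", "*"): 1.0, ("mem", "*"): 384.0}

print(fits(required, available))  # the launch would be rejected
```

In this toy model a task that asks only for resources reserved to the role still fits, which matches the intent of running spark-submit with spark.mesos.role set.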
[jira] [Comment Edited] (SPARK-18935) Use Mesos "Dynamic Reservation" resource for Spark
[ https://issues.apache.org/jira/browse/SPARK-18935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16184025#comment-16184025 ] Stavros Kontopoulos edited comment on SPARK-18935 at 9/28/17 11:23 AM: --- I tested this a while ago and stuck here: {noformat} I0829 14:05:56.342872 21756 master.cpp:6532] Sending status update TASK_ERROR for task 1 of framework f0a1a46a-e404-4faa-87f7-29479f30b57e-0009 'Total resources cpus(spark-prive)(allocated: spark-prive):1; cpus(*)(allocated: spark-prive):1; mem(spark-prive)(allocated: spark-prive):1024; mem(*)(allocated: spark-prive):384 required by task and its executor is more than available disk(spark-prive, )(allocated: spark-prive):1000; ports(spark-prive, )(allocated: spark-prive):[31000-32000]; cpus(spark-prive, )(allocated: spark-prive):1; mem(spark-prive, )(allocated: spark-prive):1024; cpus(*)(allocated: spark-prive):1; mem(*)(allocated: spark-prive):976; disk(*)(allocated: spark-prive):9000' {noformat} To reserve resources I used: curl -i -d slaveId=cf885682-8a28-4e82-b5db-a01277edfafc-S0 -d resources='[{"name":"disk","type":"SCALAR","scalar": {"value":1000} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"ports","type":"RANGES","ranges": { "range": [{"begin": 31000, "end": 32000}] },"role":"spark-prive","reservation":{"principal":""}}, {"name":"cpus","type":"SCALAR","scalar": {"value":1} ,"role":"spark-prive","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar": {"value":1024},"role":"spark-prive","reservation":{"principal":""}}]' -X POST http://172.17.0.1:5050/master/reserve The error comes from here: https://github.com/apache/mesos/blob/11ee081ee578ea12e85799e00c5fe8b89eb6ea5f/src/master/validation.cpp#L1339. I havent check with a principal but I suspect its optional anyway. 
> Use Mesos "Dynamic Reservation" resource for Spark > -- > > Key: SPARK-18935 > URL: https://issues.apache.org/jira/browse/SPARK-18935 > Project: Spark > Issue Type: Bug >Affects Versions: 2.0.0, 2.0.1, 2.0.2 >Reporter: jackyoh > > I'm running Spark on Apache Mesos. > Please follow these steps to reproduce the issue: > 1. First, run Mesos resource reserve: > curl -i -d slaveId=c24d1cfb-79f3-4b07-9f8b-c7b19543a333-S0 -d > resources='[{"name":"cpus","type":"SCALAR","scalar":{"value":20},"role":"spark","reservation":{"principal":""}},{"name":"mem","type":"SCALAR","scalar":{"value":4096},"role":"spark","reservation":{"principal":""}}]' > -X POST http://192.168.1.118:5050/master/reserve > 2. Then run the spark-submit command: > ./spark-submit --class org.apache.spark.examples.SparkPi --master > mesos://192.168.1.118:5050 --conf spark.mesos.role=spark > ../examples/jars/spark-examples_2.11-2.0.2.jar 1 > And the console will keep logging the same warning message as shown below: > 16/12/19 22:33:28 WARN TaskSchedulerImpl: Initial job has not accepted any > resources; check your cluster UI to ensure that workers are registered and > have sufficient resources
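The reservation requests above all pass an empty "principal", which the commenter suspects (but has not confirmed) may be optional. As a minimal sketch, here is how the same kind of scalar-resource payload could be built with the principal made explicit. This is a hand-rolled illustration only: `ReservationPayload`, the role, and the principal value are assumptions, not code from the ticket or from Mesos.

```scala
// Sketch: assemble the JSON payload POSTed to /master/reserve, with an
// explicit principal instead of the empty string used in the ticket.
// Role, values, and principal are illustrative assumptions.
object ReservationPayload {
  // One SCALAR resource entry (e.g. cpus or mem) reserved for a role.
  def scalar(name: String, value: Double, role: String, principal: String): String =
    s"""{"name":"$name","type":"SCALAR","scalar":{"value":$value},""" +
      s""""role":"$role","reservation":{"principal":"$principal"}}"""

  // The full resources array, mirroring the cpus/mem reservation above.
  def payload(role: String, principal: String): String =
    "[" + Seq(
      scalar("cpus", 20, role, principal),
      scalar("mem", 4096, role, principal)
    ).mkString(",") + "]"

  def main(args: Array[String]): Unit =
    println(payload("spark", "spark-principal"))
}
```

The resulting string would be passed as the `resources=` form field of the same `curl ... -X POST http://<master>:5050/master/reserve` call shown in the reproduction steps.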
[jira] [Commented] (SPARK-22152) Add Dataset flatten function
[ https://issues.apache.org/jira/browse/SPARK-22152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16183996#comment-16183996 ] Sean Owen commented on SPARK-22152: --- (It does not have a flatten method. I was suggesting it makes sense to add it to both or neither, but I'm not sure how much it adds) > Add Dataset flatten function > > > Key: SPARK-22152 > URL: https://issues.apache.org/jira/browse/SPARK-22152 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 2.2.0 >Reporter: Drew Robb >Priority: Minor > > Currently you can use an identity flatMap to flatten a Dataset, for example > to get from a Dataset[Option[T]] to a Dataset[T], but adding flatten directly > would allow for a more similar API to Scala collections.
[jira] [Commented] (SPARK-22152) Add Dataset flatten function
[ https://issues.apache.org/jira/browse/SPARK-22152?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16183989#comment-16183989 ] Wenchen Fan commented on SPARK-22152: - If RDD has flatten, I'm OK to add it to Dataset. > Add Dataset flatten function > > > Key: SPARK-22152 > URL: https://issues.apache.org/jira/browse/SPARK-22152 > Project: Spark > Issue Type: New Feature > Components: SQL >Affects Versions: 2.2.0 >Reporter: Drew Robb >Priority: Minor > > Currently you can use an identity flatMap to flatten a Dataset, for example > to get from a Dataset[Option[T]] to a Dataset[T], but adding flatten directly > would allow for a more similar API to Scala collections.
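The parity the ticket asks for is with Scala collections, where `flatten` on a collection of Options is exactly the identity flatMap described above. A minimal sketch of that collections behavior (plain Scala, no Spark involved; `FlattenSketch` is an illustrative name, not proposed API):

```scala
// Scala-collections behavior the ticket wants mirrored on Dataset:
// flatten drops Nones and unwraps Somes, and is equivalent to an
// identity flatMap.
object FlattenSketch {
  def main(args: Array[String]): Unit = {
    val xs: List[Option[Int]] = List(Some(1), None, Some(2))
    val flattened  = xs.flatten          // List(1, 2)
    val viaFlatMap = xs.flatMap(x => x)  // identity flatMap, same result
    assert(flattened == viaFlatMap)
    println(flattened)
  }
}
```

On a `Dataset[Option[T]]` today only the `flatMap(x => x)` form is available; the proposal is to expose the `flatten` spelling as well so the two APIs read alike.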
[jira] [Assigned] (SPARK-22156) incorrect learning rate update equation when numIterations > 1
[ https://issues.apache.org/jira/browse/SPARK-22156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22156: Assignee: (was: Apache Spark) > incorrect learning rate update equation when numIterations > 1 > -- > > Key: SPARK-22156 > URL: https://issues.apache.org/jira/browse/SPARK-22156 > Project: Spark > Issue Type: Improvement > Components: MLlib >Affects Versions: 2.2.0 >Reporter: kento nozawa >Priority: Minor > > Current equation of learning rate is incorrect when numIterations > 1. > Original code: > https://github.com/tmikolov/word2vec/blob/master/word2vec.c#L393
[jira] [Assigned] (SPARK-22156) incorrect learning rate update equation when numIterations > 1
[ https://issues.apache.org/jira/browse/SPARK-22156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Apache Spark reassigned SPARK-22156: Assignee: Apache Spark > incorrect learning rate update equation when numIterations > 1 > -- > > Key: SPARK-22156 > URL: https://issues.apache.org/jira/browse/SPARK-22156 > Project: Spark > Issue Type: Improvement > Components: MLlib >Affects Versions: 2.2.0 >Reporter: kento nozawa >Assignee: Apache Spark >Priority: Minor > > Current equation of learning rate is incorrect when numIterations > 1. > Original code: > https://github.com/tmikolov/word2vec/blob/master/word2vec.c#L393
[jira] [Commented] (SPARK-22156) incorrect learning rate update equation when numIterations > 1
[ https://issues.apache.org/jira/browse/SPARK-22156?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16183981#comment-16183981 ] Apache Spark commented on SPARK-22156: -- User 'nzw0301' has created a pull request for this issue: https://github.com/apache/spark/pull/19372 > incorrect learning rate update equation when numIterations > 1 > -- > > Key: SPARK-22156 > URL: https://issues.apache.org/jira/browse/SPARK-22156 > Project: Spark > Issue Type: Improvement > Components: MLlib >Affects Versions: 2.2.0 >Reporter: kento nozawa >Priority: Minor > > Current equation of learning rate is incorrect when numIterations > 1. > Original code: > https://github.com/tmikolov/word2vec/blob/master/word2vec.c#L393
[jira] [Updated] (SPARK-22156) incorrect learning rate update equation when numIterations > 1
[ https://issues.apache.org/jira/browse/SPARK-22156?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] kento nozawa updated SPARK-22156: - Description: Current equation of learning rate is incorrect when numIterations > 1. Original code: https://github.com/tmikolov/word2vec/blob/master/word2vec.c#L393 was: Current equation of learning rate is incorrect when numIterations > 1. [original code](https://github.com/tmikolov/word2vec/blob/master/word2vec.c#L393) > incorrect learning rate update equation when numIterations > 1 > -- > > Key: SPARK-22156 > URL: https://issues.apache.org/jira/browse/SPARK-22156 > Project: Spark > Issue Type: Improvement > Components: MLlib >Affects Versions: 2.2.0 >Reporter: kento nozawa >Priority: Minor > > Current equation of learning rate is incorrect when numIterations > 1. > Original code: > https://github.com/tmikolov/word2vec/blob/master/word2vec.c#L393
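For reference, the linked word2vec.c line decays the learning rate linearly against the word count accumulated across all training iterations (iter * train_words), floored at 1e-4 of the starting value, rather than resetting the decay each iteration. Below is a sketch of that reference schedule as I read the C code; the names are illustrative and this is not Spark's actual implementation.

```scala
// Sketch of the reference learning-rate schedule from word2vec.c:
//   alpha = startingAlpha * (1 - wordCountActual / (iter * trainWords + 1))
// floored at startingAlpha * 1e-4. The key point for the ticket: the
// denominator spans ALL iterations, so alpha keeps decaying across
// passes instead of restarting when numIterations > 1.
object Word2VecAlpha {
  def alpha(startingAlpha: Double,
            wordCountActual: Long,
            numIterations: Int,
            trainWords: Long): Double = {
    val raw = startingAlpha *
      (1.0 - wordCountActual.toDouble / (numIterations.toLong * trainWords + 1))
    math.max(raw, startingAlpha * 0.0001)
  }

  def main(args: Array[String]): Unit = {
    // With 2 iterations over 1000 words, alpha is only about halfway
    // decayed after the first full pass, not already at its floor.
    println(Word2VecAlpha.alpha(0.025, 1000L, 2, 1000L))
  }
}
```

A per-iteration schedule would instead divide by `trainWords + 1` alone, reaching the floor by the end of every pass; that mismatch is what the ticket (and PR 19372) addresses.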