[jira] [Comment Edited] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code

2023-01-23 Thread Mayank Asthana (Jira)


[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679807#comment-17679807 ]

Mayank Asthana edited comment on SPARK-26365 at 1/23/23 2:06 PM:

{quote}Spark submit command exit code ($?) as 0 is okay as there is no error in 
job submission.
{quote}
spark-submit in cluster mode with master yarn exits with status code `1` on a 
job failure, and that is equally a job submission, just to YARN instead of 
Kubernetes.

So, this should also be considered a bug.
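
To illustrate what callers depend on, here is a minimal sketch (master URL, 
class, and jar path are placeholders) of the wrapper pattern schedulers and CI 
systems rely on; all they can see is the exit status of the spark-submit 
process itself:
{code:java}
import scala.sys.process._

// Placeholder values throughout. With --master yarn the exit code below
// already reflects the application's outcome; with k8s it is currently 0
// even when the application fails.
val exit = Seq(
  "spark-submit",
  "--master", "k8s://https://example.com:6443",
  "--deploy-mode", "cluster",
  "--class", "com.example.MyApp",
  "local:///opt/spark/jars/my-app.jar"
).!
sys.exit(exit)
{code}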


was (Author: masthana):
{quote}Spark submit command exit code ($?) as 0 is okay as there is no error in 
job submission.
{quote}

> spark-submit for k8s cluster doesn't propagate exit code
> 
>
> Key: SPARK-26365
> URL: https://issues.apache.org/jira/browse/SPARK-26365
> Project: Spark
>  Issue Type: Bug
>  Components: Kubernetes, Spark Core, Spark Submit
>Affects Versions: 2.3.2, 2.4.0, 3.0.0, 3.1.0
>Reporter: Oscar Bonilla
>Priority: Major
> Attachments: spark-2.4.5-raise-exception-k8s-failure.patch, 
> spark-3.0.0-raise-exception-k8s-failure.patch
>
>
> When launching apps using spark-submit in a Kubernetes cluster, if the Spark 
> application fails (returns exit code = 1, for example), spark-submit will 
> still exit gracefully and return exit code = 0.
> This is problematic, since there's no way to know whether there's been a 
> problem with the Spark application.
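
Until this is fixed, a workaround used in practice is to read the driver pod's 
exit code from Kubernetes after spark-submit returns. A hedged sketch; the pod 
name and namespace are hypothetical, and real driver pod names carry a 
generated suffix:
{code:java}
import scala.sys.process._

// Hedged workaround sketch: ask Kubernetes for the driver container's
// terminated exit code. "my-app-driver" and namespace "spark" are placeholders.
val driverExit = Seq(
  "kubectl", "get", "pod", "my-app-driver", "-n", "spark",
  "-o", "jsonpath={.status.containerStatuses[0].state.terminated.exitCode}"
).!!.trim
sys.exit(if (driverExit == "0") 0 else 1)
{code}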






[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code

2023-01-23 Thread Mayank Asthana (Jira)


[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17679807#comment-17679807 ]

Mayank Asthana commented on SPARK-26365:


{quote}Spark submit command exit code ($?) as 0 is okay as there is no error in 
job submission.
{quote}







[jira] [Comment Edited] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code

2022-04-29 Thread Mayank Asthana (Jira)


[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529867#comment-17529867 ]

Mayank Asthana edited comment on SPARK-26365 at 4/29/22 8:51 AM:

[~unnamed101] Your change looks good. Can you open a pull request on 
[https://github.com/apache/spark] for an official review?


was (Author: masthana):
[~oscar.bonilla] Your change looks good. Can you open a pull request on 
[https://github.com/apache/spark] for an official review?







[jira] [Commented] (SPARK-26365) spark-submit for k8s cluster doesn't propagate exit code

2022-04-29 Thread Mayank Asthana (Jira)


[ https://issues.apache.org/jira/browse/SPARK-26365?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17529867#comment-17529867 ]

Mayank Asthana commented on SPARK-26365:


[~oscar.bonilla] Your change looks good. Can you open a pull request on 
[https://github.com/apache/spark] for an official review?







[jira] [Commented] (SPARK-18683) REST APIs for standalone Master、Workers and Applications

2021-05-20 Thread Mayank Asthana (Jira)


[ https://issues.apache.org/jira/browse/SPARK-18683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17348708#comment-17348708 ]

Mayank Asthana commented on SPARK-18683:


Looking through the code, I found that there is a `/json` endpoint on the 
master UI which returns a JSON representation of everything on that page. 
However, I don't think this is documented anywhere.
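
For anyone else looking for it, a minimal sketch of querying that endpoint 
(the host is hypothetical; the standalone master UI defaults to port 8080):
{code:java}
import scala.io.Source

// Hypothetical master address. /json returns the same data the HTML status
// page renders: workers, running and completed applications, and so on.
val json = Source.fromURL("http://spark-master:8080/json").mkString
println(json)
{code}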

> REST APIs for standalone Master、Workers and Applications
> 
>
> Key: SPARK-18683
> URL: https://issues.apache.org/jira/browse/SPARK-18683
> Project: Spark
>  Issue Type: Improvement
>Reporter: Shixiong Zhu
>Priority: Major
>  Labels: bulk-closed
>
> It would be great to have some REST APIs to access Master、Workers and 
> Applications information. Right now the only way to get it is through the Web 
> UI.






[jira] [Commented] (SPARK-18683) REST APIs for standalone Master、Workers and Applications

2021-05-18 Thread Mayank Asthana (Jira)


[ https://issues.apache.org/jira/browse/SPARK-18683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17346739#comment-17346739 ]

Mayank Asthana commented on SPARK-18683:


Can we reopen this? My use case is Spark Streaming applications that run all 
the time; we want to check through a REST API whether these applications are 
still running, and restart them if they are not.
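
As a stopgap, that check can be scripted against the undocumented `/json` 
endpoint mentioned in the 2021-05-20 comment above. A hedged sketch, with a 
hypothetical master host and application name:
{code:java}
import scala.io.Source

// Hypothetical host and app name; the standalone master UI listens on port
// 8080 by default. A missing app name in the /json output means the
// application is gone and should be restarted.
val status = Source.fromURL("http://spark-master:8080/json").mkString
if (!status.contains("MyStreamingApp")) {
  // Restart logic, e.g. re-invoking spark-submit, would go here.
  println("MyStreamingApp is not running; restarting it")
}
{code}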







[jira] [Commented] (SPARK-29301) Removing block is not reflected to the driver/executor's storage memory

2020-03-16 Thread Mayank Asthana (Jira)


[ https://issues.apache.org/jira/browse/SPARK-29301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=1706#comment-1706 ]

Mayank Asthana commented on SPARK-29301:


This seems to be done. Can we close the ticket?

> Removing block is not reflected to the driver/executor's storage memory
> ---
>
> Key: SPARK-29301
> URL: https://issues.apache.org/jira/browse/SPARK-29301
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.4.4, 3.0.0
>Reporter: Jungtaek Lim
>Priority: Major
>
> While investigating SPARK-29055, I found that heap memory in the driver doesn't 
> increase (which would have indicated an actual memory leak), but the reported 
> storage memory doesn't decrease even when broadcast blocks are removed - these 
> values keep increasing.
> Please refer to SPARK-29055 for the steps to reproduce. The issue is very easy 
> to see, as running a simple query repeatedly reproduces it.
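
A hedged repro sketch along the lines the description suggests (the exact 
query, join, and iteration count are arbitrary choices, not taken from 
SPARK-29055):
{code:java}
import org.apache.spark.sql.SparkSession

// Arbitrary repro loop: a small join (below the broadcast threshold, so it
// creates and later removes broadcast blocks) run repeatedly while watching
// the driver's storage memory on the web UI's Executors page.
val spark = SparkSession.builder.appName("storage-memory-repro").getOrCreate()
(1 to 10000).foreach { _ =>
  spark.range(0, 10).join(spark.range(0, 10), "id").collect()
}
{code}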






[jira] [Commented] (SPARK-30926) Same SQL on CSV and on Parquet gives different result

2020-02-24 Thread Mayank Asthana (Jira)


[ https://issues.apache.org/jira/browse/SPARK-30926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17043791#comment-17043791 ]

Mayank Asthana commented on SPARK-30926:


 

In your CSV code,
{code:java}
val result = session.sql("SELECT * FROM airQuality WHERE P1 > 20")
  .map(ParticleAirQuality.mappingFunction)
{code}
the _WHERE P1 > 20_ clause runs on _P1_ while it is still a _String_ column; 
only after the filtering and selecting is the mappingFunction (which converts 
_P1_ to _double_) applied.

In the parquet code, on the other hand, the predicate is applied to the _P1_ 
column when it is already of _double_ type.
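
One way to sanity-check this explanation (a hedged sketch, reusing _session_ 
and _originalDataset_ from the snippets in the description below) is to cast 
_P1_ to _double_ before filtering in the CSV path:
{code:java}
import org.apache.spark.sql.functions.col

// Hedged check: cast P1 to double *before* the filter, so the CSV query
// compares numbers the way the Parquet query does. session/originalDataset
// are the ones from the issue description.
val typed = session.read.option("header", "true").csv(originalDataset)
  .withColumn("P1", col("P1").cast("double"))
typed.createTempView("airQualityTyped")
println(session.sql("SELECT * FROM airQualityTyped WHERE P1 > 20").count())
{code}
Any count difference that remains would likely come from rows where _P1_ was 
null: the mapping function turns those into Double.NaN, and Spark SQL orders 
NaN above every other double value, so such rows pass the _> 20_ filter once 
the data has gone through the mapping.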

 

 

> Same SQL on CSV and on Parquet gives different result
> -
>
> Key: SPARK-30926
> URL: https://issues.apache.org/jira/browse/SPARK-30926
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.4.4
> Environment: I run this locally on a Windows 10 machine.
> The java runtime is:
> openjdk 11.0.5 2019-10-15
> OpenJDK Runtime Environment AdoptOpenJDK (build 11.0.5+10)
> OpenJDK 64-Bit Server VM AdoptOpenJDK (build 11.0.5+10, mixed mode)
>Reporter: Bozhidar Karaargirov
>Priority: Major
>
> So I played around with a data set from here: 
> [https://www.kaggle.com/hmavrodiev/sofia-air-quality-dataset]
> I ran the same query for the base CSVs and for a parquet version of them:
> SELECT * FROM airQualityP WHERE P1 > 20
> Here is the csv code:
> {code:java}
> import session.sqlContext.implicits._
>
> val df = session.read.option("header", "true").csv(originalDataset)
> df.createTempView("airQuality")
>
> val result = session.sql("SELECT * FROM airQuality WHERE P1 > 20")
>   .map(ParticleAirQuality.mappingFunction)
> println(result.count())
> {code}
> Here is the parquet code:
> {code:java}
> import session.sqlContext.implicits._
>
> val df = session.read.option("header", "true").parquet(bigParquetDataset)
> df.createTempView("airQualityP")
>
> val result = session
>   .sql("SELECT * FROM airQualityP WHERE P1 > 20")
>   .map(ParticleAirQuality.namedMappingFunction)
> println(result.count())
> {code}
> And this is how I transform the csv into parquets:
> {code:java}
> import session.sqlContext.implicits._
>
> val df = session.read.option("header", "true")
>   .csv(originalDataset)
>   .map(ParticleAirQuality.mappingFunction)
> df.write.parquet(bigParquetDataset)
> {code}
> These are the two mapping functions:
> {code:java}
> val mappingFunction = {
>   r: Row => ParticleAirQuality(
>     r.getString(1),
>     r.getString(2),
>     r.getString(3),
>     r.getString(4),
>     r.getString(5),
>     {
>       val p1 = r.getString(6)
>       if (p1 == null) Double.NaN else p1.toDouble
>     },
>     {
>       val p2 = r.getString(7)
>       if (p2 == null) Double.NaN else p2.toDouble
>     }
>   )
> }
>
> val namedMappingFunction = {
>   r: Row => ParticleAirQuality(
>     r.getAs[String]("sensor_id"),
>     r.getAs[String]("location"),
>     r.getAs[String]("lat"),
>     r.getAs[String]("lon"),
>     r.getAs[String]("timestamp"),
>     r.getAs[Double]("P1"),
>     r.getAs[Double]("P2")
>   )
> }
> {code}
> If it matters, these are the paths (note that I actually use double \ instead 
> of / since it is Windows, but that doesn't really matter):
> val originalDataset =