[jira] [Resolved] (SPARK-48517) PythonWorkerFactory does not print error stream in case the daemon fails before the main daemon.py#main()

2024-06-04 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph resolved SPARK-48517.
---
Resolution: Invalid

Driver logs have correctly captured the error stream. Apologies for the spam.

> PythonWorkerFactory does not print error stream in case the daemon fails 
> before the main daemon.py#main()
> -
>
> Key: SPARK-48517
> URL: https://issues.apache.org/jira/browse/SPARK-48517
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 3.3.2
>Reporter: Prabhu Joseph
>Priority: Major
>
> PythonWorkerFactory does not print the error stream in case the daemon fails 
> before executing the main daemon.py#main(). It throws a SparkException like 
> the one below but does not print the error stream, which contains the reason 
> why pyspark.daemon failed to start.
> {code:java}
> org.apache.spark.SparkException: 
> 2024-05-07T16:04:53.169524256Z stderr F Bad data in pyspark.daemon's standard 
> output. Invalid port number:
> 2024-05-07T16:04:53.169530703Z stderr F   1097887852 (0x4170706c){code}
> The error stream is being 
> [read|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala#L303] 
> only after the SparkException is thrown. It should also be captured when the 
> exception is raised.
>  
> *Simple Repro:*
> 1. Run a sample PySpark job with spark.python.daemon.module set to a wrong 
> value such as pyspark.wrongdaemon instead of the default pyspark.daemon.
>  
> 2. The forked python process will fail with 
> *"/opt/python/3.9.2/bin/python: No module named pyspark.wrongdaemon"*, but 
> the message is not captured by PythonWorkerFactory.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-48517) PythonWorkerFactory does not print error stream in case the daemon fails before the main daemon.py#main()

2024-06-03 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated SPARK-48517:
--
Description: 
PythonWorkerFactory does not print the error stream in case the daemon fails 
before executing the main daemon.py#main(). It throws a SparkException like the 
one below but does not print the error stream, which contains the reason why 
pyspark.daemon failed to start.
{code:java}
org.apache.spark.SparkException: 
2024-05-07T16:04:53.169524256Z stderr F Bad data in pyspark.daemon's standard 
output. Invalid port number:
2024-05-07T16:04:53.169530703Z stderr F   1097887852 (0x4170706c){code}
The error stream is being 
[read|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala#L303] 
only after the SparkException is thrown. It should also be captured when the 
exception is raised.

 

*Simple Repro:*

1. Run a sample PySpark job with spark.python.daemon.module set to a wrong 
value such as pyspark.wrongdaemon instead of the default pyspark.daemon.
 
2. The forked python process will fail with 
*"/opt/python/3.9.2/bin/python: No module named pyspark.wrongdaemon"*, but the 
message is not captured by PythonWorkerFactory.
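The missing capture can be sketched in isolation. The following is a hypothetical, self-contained Java illustration (not Spark's actual PythonWorkerFactory code; class and method names are invented) of draining a failed child process's stderr and attaching it to the thrown exception instead of discarding it:

```java
import java.io.BufferedReader;
import java.io.InputStreamReader;

// Hypothetical sketch, not Spark's code: if the launched daemon dies before
// printing its port, drain its stderr and put it into the exception message
// so the real failure (e.g. "No module named pyspark.wrongdaemon") is
// visible to the user instead of being lost.
public class DaemonStderrCapture {

    static String launchOrThrow(String... cmd) throws Exception {
        Process p = new ProcessBuilder(cmd).start();
        StringBuilder stderr = new StringBuilder();
        try (BufferedReader r = new BufferedReader(
                new InputStreamReader(p.getErrorStream()))) {
            String line;
            while ((line = r.readLine()) != null) {
                stderr.append(line).append('\n');
            }
        }
        int exit = p.waitFor();
        if (exit != 0) {
            // The key point: include the drained stderr in the exception.
            throw new RuntimeException(
                "daemon exited with code " + exit + "; stderr:\n" + stderr);
        }
        return stderr.toString();
    }

    public static void main(String[] args) throws Exception {
        try {
            // Simulate a daemon that fails at startup and explains why on stderr.
            launchOrThrow("sh", "-c",
                "echo 'No module named pyspark.wrongdaemon' 1>&2; exit 1");
        } catch (RuntimeException e) {
            System.out.println(e.getMessage()); // the stderr text survives
        }
    }
}
```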
 

 

  was:
PythonWorkerFactory does not print the error stream in case the daemon fails 
before calling the main daemon.py#method.

 It throws SparkException like below but does not print the error stream which 
has the failure why pyspark.daemon failed to start.
{code:java}
org.apache.spark.SparkException: 
2024-05-07T16:04:53.169524256Z stderr F Bad data in pyspark.daemon's standard 
output. Invalid port number:
2024-05-07T16:04:53.169530703Z stderr F   1097887852 (0x4170706c){code}
The error stream is being 
[read|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala#L303]
 after throwing SparkException. It has to be captured during the exception as 
well.

 

*Simple Repro:*

1. Run a sample pyspark job by setting wrong
spark.python.daemon.module like pyspark.wrongdaemon instead of default one 
pyspark.daemon.
 
2. The forked  python process will fail with 
*"/opt/python/3.9.2/bin/python: No module named pyspark.wrongdaemon"* 
but it is not captured by PythonWorkerFactory.
 

 


> PythonWorkerFactory does not print error stream in case the daemon fails 
> before the main daemon.py#main()
> -
>
> Key: SPARK-48517
> URL: https://issues.apache.org/jira/browse/SPARK-48517
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 3.3.2
>Reporter: Prabhu Joseph
>Priority: Major
>
> PythonWorkerFactory does not print the error stream in case the daemon fails 
> before executing the main daemon.py#main(). It throws a SparkException like 
> the one below but does not print the error stream, which contains the reason 
> why pyspark.daemon failed to start.
> {code:java}
> org.apache.spark.SparkException: 
> 2024-05-07T16:04:53.169524256Z stderr F Bad data in pyspark.daemon's standard 
> output. Invalid port number:
> 2024-05-07T16:04:53.169530703Z stderr F   1097887852 (0x4170706c){code}
> The error stream is being 
> [read|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala#L303] 
> only after the SparkException is thrown. It should also be captured when the 
> exception is raised.
>  
> *Simple Repro:*
> 1. Run a sample PySpark job with spark.python.daemon.module set to a wrong 
> value such as pyspark.wrongdaemon instead of the default pyspark.daemon.
>  
> 2. The forked python process will fail with 
> *"/opt/python/3.9.2/bin/python: No module named pyspark.wrongdaemon"*, but 
> the message is not captured by PythonWorkerFactory.
>  
>  






[jira] [Updated] (SPARK-48517) PythonWorkerFactory does not print error stream in case the daemon fails before the main daemon.py#main()

2024-06-03 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-48517?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated SPARK-48517:
--
Summary: PythonWorkerFactory does not print error stream in case the daemon 
fails before the main daemon.py#main()  (was: PythonWorkerFactory does not 
print error stream in case the daemon fails before the main daemon.py#method)

> PythonWorkerFactory does not print error stream in case the daemon fails 
> before the main daemon.py#main()
> -
>
> Key: SPARK-48517
> URL: https://issues.apache.org/jira/browse/SPARK-48517
> Project: Spark
>  Issue Type: Bug
>  Components: PySpark
>Affects Versions: 3.3.2
>Reporter: Prabhu Joseph
>Priority: Major
>
> PythonWorkerFactory does not print the error stream in case the daemon fails 
> before calling the main daemon.py#method.
>  It throws a SparkException like the one below but does not print the error 
> stream, which contains the reason why pyspark.daemon failed to start.
> {code:java}
> org.apache.spark.SparkException: 
> 2024-05-07T16:04:53.169524256Z stderr F Bad data in pyspark.daemon's standard 
> output. Invalid port number:
> 2024-05-07T16:04:53.169530703Z stderr F   1097887852 (0x4170706c){code}
> The error stream is being 
> [read|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala#L303] 
> only after the SparkException is thrown. It should also be captured when the 
> exception is raised.
>  
> *Simple Repro:*
> 1. Run a sample PySpark job with spark.python.daemon.module set to a wrong 
> value such as pyspark.wrongdaemon instead of the default pyspark.daemon.
>  
> 2. The forked python process will fail with 
> *"/opt/python/3.9.2/bin/python: No module named pyspark.wrongdaemon"*, but 
> the message is not captured by PythonWorkerFactory.
>  
>  






[jira] [Created] (SPARK-48517) PythonWorkerFactory does not print error stream in case the daemon fails before the main daemon.py#method

2024-06-03 Thread Prabhu Joseph (Jira)
Prabhu Joseph created SPARK-48517:
-

 Summary: PythonWorkerFactory does not print error stream in case 
the daemon fails before the main daemon.py#method
 Key: SPARK-48517
 URL: https://issues.apache.org/jira/browse/SPARK-48517
 Project: Spark
  Issue Type: Bug
  Components: PySpark
Affects Versions: 3.3.2
Reporter: Prabhu Joseph


PythonWorkerFactory does not print the error stream in case the daemon fails 
before calling the main daemon.py#method.

 It throws a SparkException like the one below but does not print the error 
stream, which contains the reason why pyspark.daemon failed to start.
{code:java}
org.apache.spark.SparkException: 
2024-05-07T16:04:53.169524256Z stderr F Bad data in pyspark.daemon's standard 
output. Invalid port number:
2024-05-07T16:04:53.169530703Z stderr F   1097887852 (0x4170706c){code}
The error stream is being 
[read|https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/api/python/PythonWorkerFactory.scala#L303] 
only after the SparkException is thrown. It should also be captured when the 
exception is raised.

 

*Simple Repro:*

1. Run a sample PySpark job with spark.python.daemon.module set to a wrong 
value such as pyspark.wrongdaemon instead of the default pyspark.daemon.
 
2. The forked python process will fail with 
*"/opt/python/3.9.2/bin/python: No module named pyspark.wrongdaemon"*, but the 
message is not captured by PythonWorkerFactory.
 

 






[jira] [Updated] (SPARK-36328) HadoopRDD#getPartitions fetches FileSystem Delegation Token for every partition

2021-07-28 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated SPARK-36328:
--
Description: 
Spark Job creates a separate JobConf for every RDD (every hive table partition) 
in HadoopRDD#getPartitions.

{code}
  override def getPartitions: Array[Partition] = {
val jobConf = getJobConf()
// add the credentials here as this can be called before SparkContext 
initialized
SparkHadoopUtil.get.addCredentials(jobConf)
{code}

Hadoop FileSystem fetches a FileSystem Delegation Token and sets it into the 
Credentials object that is part of the JobConf. On further requests, it reuses 
the token from the Credentials if one already exists.

{code}
   if (serviceName != null) { // fs has token, grab it
  final Text service = new Text(serviceName);
  Token token = credentials.getToken(service);
  if (token == null) {
token = getDelegationToken(renewer);
if (token != null) {
  tokens.add(token);
  credentials.addToken(service, token);
}
  }
}
{code}

 But since the Spark job creates a new JobConf (with new Credentials) for every 
Hive table partition, the token is not reused and is fetched again for every 
partition. This slows down the query, since each delegation token fetch has to 
go through a KDC and SSL handshake on secure clusters.

*Improvement:*

Spark can add the credentials from the previous JobConf into the new JobConf 
to reuse the FileSystem Delegation Token, similar to how the user credentials 
are added into the JobConf after construction.

{code}
 val jobConf = getJobConf()
// add the credentials here as this can be called before SparkContext 
initialized
SparkHadoopUtil.get.addCredentials(jobConf)
{code}



*Repro*

{code}
beeline>
create table parttable (key char(1), value int) partitioned by (p int);
insert into table parttable partition(p=100) values ('d', 1), ('e', 2), ('f', 
3);
insert into table parttable partition(p=200) values ('d', 1), ('e', 2), ('f', 
3);
insert into table parttable partition(p=300) values ('d', 1), ('e', 2), ('f', 
3);

spark-sql>
select value, count(*) from parttable group by value
{code}
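The effect of the proposed change can be shown with a simplified, self-contained sketch. Note this is not Hadoop's real Credentials/JobConf API; the classes and helper below are minimal stand-ins used only to illustrate why merging the previous credentials into each new JobConf avoids repeated token fetches:

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in model (not Hadoop's API): one shared Credentials object is merged
// into each freshly created per-partition "JobConf" credentials, so the
// delegation token fetched for the first partition is reused by the rest.
public class TokenReuseSketch {

    static int kdcRoundTrips = 0; // counts expensive token fetches

    static class Credentials {
        final Map<String, String> tokens = new HashMap<>();
        void addAll(Credentials other) { tokens.putAll(other.tokens); }
    }

    // Mirrors the FileSystem logic quoted above: reuse a cached token if
    // present, otherwise pay a (simulated) KDC round trip.
    static String getToken(Credentials creds, String service) {
        return creds.tokens.computeIfAbsent(service, s -> {
            kdcRoundTrips++;
            return "token-for-" + s;
        });
    }

    public static void main(String[] args) {
        Credentials shared = new Credentials();
        for (int partition = 0; partition < 3; partition++) {
            Credentials jobConfCreds = new Credentials(); // new JobConf per partition
            jobConfCreds.addAll(shared);                  // the proposed fix
            getToken(jobConfCreds, "hdfs://nn:8020");
            shared.addAll(jobConfCreds);                  // publish for later partitions
        }
        // Without the addAll calls this would be 3; with them it stays at 1.
        System.out.println("KDC round trips: " + kdcRoundTrips);
    }
}
```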

  was:
Spark Job creates a separate JobConf for every RDD (every hive table partition) 
in HadoopRDD#getPartitions.

{code}
  override def getPartitions: Array[Partition] = {
val jobConf = getJobConf()
// add the credentials here as this can be called before SparkContext 
initialized
SparkHadoopUtil.get.addCredentials(jobConf)
{code}

Hadoop FileSystem fetches FileSystem Delegation Token and sets into the 
Credentials which is part of JobConf. On further call, will reuse the token 
from the Credentials if already exists.

{code}
   if (serviceName != null) { // fs has token, grab it
  final Text service = new Text(serviceName);
  Token token = credentials.getToken(service);
  if (token == null) {
token = getDelegationToken(renewer);
if (token != null) {
  tokens.add(token);
  credentials.addToken(service, token);
}
  }
}
{code}

 But since Spark Job creates a new JobConf (which will have a new Credentials) 
for every hive table partition, the token is not reused and gets fetched for 
every partition. This is slowing down the query as each delegation token has to 
go through KDC and SSL handshake on Secure Clusters.

*Improvement:*

Spark can add the credentials from previous JobConf into the new JobConf to 
reuse the FileSystem Delegation Token similar to how the User Credentials are 
added into JobConf after construction.

{code}
 val jobConf = getJobConf()
// add the credentials here as this can be called before SparkContext 
initialized
SparkHadoopUtil.get.addCredentials(jobConf)
{code}



*Repro*

{code}
beeline>
create table parttable (key char(1), value int) partitioned by (p int);
insert into table parttable partition(p=100) values ('d', 1), ('e', 2), ('f', 
3);
insert into table parttable partition(p=200) values ('d', 1), ('e', 2), ('f', 
3);
insert into table parttable partition(p=300) values ('d', 1), ('e', 2), ('f', 
3);

spark-sql>
select value, count(*) from parttable group by value
{code}


> HadoopRDD#getPartitions fetches FileSystem Delegation Token for every 
> partition
> ---
>
> Key: SPARK-36328
> URL: https://issues.apache.org/jira/browse/SPARK-36328
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.1.2
>Reporter: Prabhu Joseph
>Priority: Major
>
> Spark Job creates a separate JobConf for every RDD (every hive table 
> partition) in HadoopRDD#getPartitions.
> {code}
>   override def getPartitions: Array[Partition] = {
> val jobConf = getJobConf()
> // add the credentials here as this can be 

[jira] [Updated] (SPARK-36328) HadoopRDD#getPartitions fetches FileSystem Delegation Token for every partition

2021-07-28 Thread Prabhu Joseph (Jira)


 [ 
https://issues.apache.org/jira/browse/SPARK-36328?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated SPARK-36328:
--
Summary: HadoopRDD#getPartitions fetches FileSystem Delegation Token for 
every partition  (was: HadoopRDD#getPartitions fetches FileSystem Delegation 
Token for evert partition)

> HadoopRDD#getPartitions fetches FileSystem Delegation Token for every 
> partition
> ---
>
> Key: SPARK-36328
> URL: https://issues.apache.org/jira/browse/SPARK-36328
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 3.1.2
>Reporter: Prabhu Joseph
>Priority: Major
>
> Spark Job creates a separate JobConf for every RDD (every hive table 
> partition) in HadoopRDD#getPartitions.
> {code}
>   override def getPartitions: Array[Partition] = {
> val jobConf = getJobConf()
> // add the credentials here as this can be called before SparkContext 
> initialized
> SparkHadoopUtil.get.addCredentials(jobConf)
> {code}
> Hadoop FileSystem fetches a FileSystem Delegation Token and sets it into the 
> Credentials object that is part of the JobConf. On further calls, it reuses 
> the token from the Credentials if one already exists.
> {code}
>if (serviceName != null) { // fs has token, grab it
>   final Text service = new Text(serviceName);
>   Token token = credentials.getToken(service);
>   if (token == null) {
> token = getDelegationToken(renewer);
> if (token != null) {
>   tokens.add(token);
>   credentials.addToken(service, token);
> }
>   }
> }
> {code}
>  But since the Spark job creates a new JobConf (with new Credentials) for 
> every Hive table partition, the token is not reused and is fetched again for 
> every partition. This slows down the query, since each delegation token fetch 
> has to go through a KDC and SSL handshake on secure clusters.
> *Improvement:*
> Spark can add the credentials from the previous JobConf into the new JobConf 
> to reuse the FileSystem Delegation Token, similar to how the user credentials 
> are added into the JobConf after construction.
> {code}
>  val jobConf = getJobConf()
> // add the credentials here as this can be called before SparkContext 
> initialized
> SparkHadoopUtil.get.addCredentials(jobConf)
> {code}
> *Repro*
> {code}
> beeline>
> create table parttable (key char(1), value int) partitioned by (p int);
> insert into table parttable partition(p=100) values ('d', 1), ('e', 2), ('f', 
> 3);
> insert into table parttable partition(p=200) values ('d', 1), ('e', 2), ('f', 
> 3);
> insert into table parttable partition(p=300) values ('d', 1), ('e', 2), ('f', 
> 3);
> spark-sql>
> select value, count(*) from parttable group by value
> {code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-36328) HadoopRDD#getPartitions fetches FileSystem Delegation Token for evert partition

2021-07-28 Thread Prabhu Joseph (Jira)
Prabhu Joseph created SPARK-36328:
-

 Summary: HadoopRDD#getPartitions fetches FileSystem Delegation 
Token for evert partition
 Key: SPARK-36328
 URL: https://issues.apache.org/jira/browse/SPARK-36328
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core
Affects Versions: 3.1.2
Reporter: Prabhu Joseph


Spark Job creates a separate JobConf for every RDD (every hive table partition) 
in HadoopRDD#getPartitions.

{code}
  override def getPartitions: Array[Partition] = {
val jobConf = getJobConf()
// add the credentials here as this can be called before SparkContext 
initialized
SparkHadoopUtil.get.addCredentials(jobConf)
{code}

Hadoop FileSystem fetches a FileSystem Delegation Token and sets it into the 
Credentials object that is part of the JobConf. On further calls, it reuses the 
token from the Credentials if one already exists.

{code}
   if (serviceName != null) { // fs has token, grab it
  final Text service = new Text(serviceName);
  Token token = credentials.getToken(service);
  if (token == null) {
token = getDelegationToken(renewer);
if (token != null) {
  tokens.add(token);
  credentials.addToken(service, token);
}
  }
}
{code}

 But since the Spark job creates a new JobConf (with new Credentials) for every 
Hive table partition, the token is not reused and is fetched again for every 
partition. This slows down the query, since each delegation token fetch has to 
go through a KDC and SSL handshake on secure clusters.

*Improvement:*

Spark can add the credentials from the previous JobConf into the new JobConf 
to reuse the FileSystem Delegation Token, similar to how the user credentials 
are added into the JobConf after construction.

{code}
 val jobConf = getJobConf()
// add the credentials here as this can be called before SparkContext 
initialized
SparkHadoopUtil.get.addCredentials(jobConf)
{code}



*Repro*

{code}
beeline>
create table parttable (key char(1), value int) partitioned by (p int);
insert into table parttable partition(p=100) values ('d', 1), ('e', 2), ('f', 
3);
insert into table parttable partition(p=200) values ('d', 1), ('e', 2), ('f', 
3);
insert into table parttable partition(p=300) values ('d', 1), ('e', 2), ('f', 
3);

spark-sql>
select value, count(*) from parttable group by value
{code}






[jira] [Comment Edited] (SPARK-26867) Spark Support of YARN Placement Constraint

2019-03-01 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782270#comment-16782270
 ] 

Prabhu Joseph edited comment on SPARK-26867 at 3/2/19 3:52 AM:
---

[~srowen] Spark can allow users to configure placement constraints so that 
users have more control over where the executors are placed. For example:

1. A Spark job wants to run on machines where the Python version is x or the 
Java version is y (Node Attributes).
2. A Spark job needs / does not need its executors placed on machines where an 
HBase RegionServer, ZooKeeper, or any other service is running (Affinity / 
Anti-Affinity).
3. A Spark job wants no more than 2 of its executors on the same node 
(Cardinality).
4. Executors of Spark job A want / do not want to run where containers of 
another job B run (Application_Tag namespace).




was (Author: prabhu joseph):
Spark can allow users to configure placement constraints so that users have 
more control over where the executors are placed. For example:

1. A Spark job wants to run on machines where the Python version is x or the 
Java version is y (Node Attributes).
2. A Spark job needs / does not need its executors placed on machines where an 
HBase RegionServer, ZooKeeper, or any other service is running (Affinity / 
Anti-Affinity).
3. A Spark job wants no more than 2 of its executors on the same node 
(Cardinality).
4. Executors of Spark job A want / do not want to run where containers of 
another job B run (Application_Tag namespace).



> Spark Support of YARN Placement Constraint
> --
>
> Key: SPARK-26867
> URL: https://issues.apache.org/jira/browse/SPARK-26867
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core, YARN
>Affects Versions: 3.0.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> YARN provides Placement Constraint features, where an application can request 
> containers based on affinity / anti-affinity / cardinality to services or 
> other application containers / node attributes. This is a useful feature for 
> Spark jobs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-26867) Spark Support of YARN Placement Constraint

2019-03-01 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-26867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16782270#comment-16782270
 ] 

Prabhu Joseph commented on SPARK-26867:
---

Spark can allow users to configure placement constraints so that users have 
more control over where the executors are placed. For example:

1. A Spark job wants to run on machines where the Python version is x or the 
Java version is y (Node Attributes).
2. A Spark job needs / does not need its executors placed on machines where an 
HBase RegionServer, ZooKeeper, or any other service is running (Affinity / 
Anti-Affinity).
3. A Spark job wants no more than 2 of its executors on the same node 
(Cardinality).
4. Executors of Spark job A want / do not want to run where containers of 
another job B run (Application_Tag namespace).
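The constraint kinds listed above can be modeled as composable predicates that a scheduler checks before placing an executor on a node. This is only a toy illustration, not YARN's actual PlacementConstraints API; all names below are invented:

```java
import java.util.Map;
import java.util.Set;
import java.util.function.Predicate;

// Toy model (not YARN's API): a node carries attributes and running services,
// and each constraint kind becomes a predicate the scheduler evaluates.
public class PlacementSketch {

    static class Node {
        final Map<String, String> attrs; // e.g. "python" -> "3.9"
        final Set<String> services;      // e.g. "zookeeper"
        int executorsPlaced = 0;         // executors of this job already here
        Node(Map<String, String> attrs, Set<String> services) {
            this.attrs = attrs;
            this.services = services;
        }
    }

    // 1. Node attribute: require e.g. a specific Python version.
    static Predicate<Node> attr(String key, String value) {
        return n -> value.equals(n.attrs.get(key));
    }

    // 2. Anti-affinity: avoid nodes running a given service.
    static Predicate<Node> antiAffinity(String service) {
        return n -> !n.services.contains(service);
    }

    // 3. Cardinality: at most k executors of this job per node.
    static Predicate<Node> maxPerNode(int k) {
        return n -> n.executorsPlaced < k;
    }

    public static void main(String[] args) {
        Node n = new Node(Map.of("python", "3.9"), Set.of("zookeeper"));
        Predicate<Node> constraint = attr("python", "3.9")
                .and(antiAffinity("hbase-regionserver"))
                .and(maxPerNode(2));
        System.out.println(constraint.test(n)); // true: attrs match, no HBase, room left
        n.executorsPlaced = 2;
        System.out.println(constraint.test(n)); // false: cardinality limit reached
    }
}
```

An application-tag constraint (case 4) would follow the same shape, with the set holding the tags of jobs whose containers already run on the node.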



> Spark Support of YARN Placement Constraint
> --
>
> Key: SPARK-26867
> URL: https://issues.apache.org/jira/browse/SPARK-26867
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core, YARN
>Affects Versions: 3.0.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> YARN provides Placement Constraint features, where an application can request 
> containers based on affinity / anti-affinity / cardinality to services or 
> other application containers / node attributes. This is a useful feature for 
> Spark jobs.






[jira] [Updated] (SPARK-26867) Spark Support of YARN Placement Constraint

2019-02-13 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated SPARK-26867:
--
Summary: Spark Support of YARN Placement Constraint  (was: Spark Support 
for YARN Placement Constraint)

> Spark Support of YARN Placement Constraint
> --
>
> Key: SPARK-26867
> URL: https://issues.apache.org/jira/browse/SPARK-26867
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core
>Affects Versions: 2.4.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> YARN provides Placement Constraint features, where an application can request 
> containers based on affinity / anti-affinity / cardinality to services or 
> other application containers / node attributes. This is a useful feature for 
> Spark jobs.






[jira] [Updated] (SPARK-26867) Spark Support of YARN Placement Constraint

2019-02-13 Thread Prabhu Joseph (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-26867?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated SPARK-26867:
--
Component/s: YARN

> Spark Support of YARN Placement Constraint
> --
>
> Key: SPARK-26867
> URL: https://issues.apache.org/jira/browse/SPARK-26867
> Project: Spark
>  Issue Type: New Feature
>  Components: Spark Core, YARN
>Affects Versions: 2.4.0
>Reporter: Prabhu Joseph
>Priority: Major
>
> YARN provides Placement Constraint features, where an application can request 
> containers based on affinity / anti-affinity / cardinality to services or 
> other application containers / node attributes. This is a useful feature for 
> Spark jobs.






[jira] [Created] (SPARK-26867) Spark Support for YARN Placement Constraint

2019-02-13 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created SPARK-26867:
-

 Summary: Spark Support for YARN Placement Constraint
 Key: SPARK-26867
 URL: https://issues.apache.org/jira/browse/SPARK-26867
 Project: Spark
  Issue Type: New Feature
  Components: Spark Core
Affects Versions: 2.4.0
Reporter: Prabhu Joseph


YARN provides Placement Constraint features, where an application can request 
containers based on affinity / anti-affinity / cardinality to services or other 
application containers / node attributes. This is a useful feature for Spark 
jobs.






[jira] [Commented] (SPARK-24192) Invalid Spark URL in local spark session since upgrading from org.apache.spark:spark-sql_2.11:2.2.1 to org.apache.spark:spark-sql_2.11:2.3.0

2018-07-19 Thread Prabhu Joseph (JIRA)


[ 
https://issues.apache.org/jira/browse/SPARK-24192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16549707#comment-16549707
 ] 

Prabhu Joseph commented on SPARK-24192:
---

We have faced this issue as well. It happens when the hostname contains an 
underscore.
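The likely mechanism: java.net.URI (which Spark uses when parsing the spark:// URL) follows the RFC hostname grammar, which excludes the underscore, so the host cannot be extracted from such an authority. A small demonstration, assuming standard JDK behavior:

```java
import java.net.URI;

// Why an underscore in the hostname breaks the Spark URL: java.net.URI
// follows the RFC hostname grammar, which excludes '_'. The authority then
// parses as a registry-based name and getHost() returns null, which a URL
// validator (such as Spark's RpcEndpointAddress) rejects as invalid.
public class UnderscoreHost {
    public static void main(String[] args) {
        URI bad = URI.create("spark://HeartbeatReceiver@my_host:44284");
        URI ok  = URI.create("spark://HeartbeatReceiver@my-host:44284");
        System.out.println(bad.getHost()); // null: '_' is not a valid hostname char
        System.out.println(ok.getHost());  // my-host
    }
}
```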

> Invalid Spark URL in local spark session since upgrading from 
> org.apache.spark:spark-sql_2.11:2.2.1 to org.apache.spark:spark-sql_2.11:2.3.0
> 
>
> Key: SPARK-24192
> URL: https://issues.apache.org/jira/browse/SPARK-24192
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, SQL
>Affects Versions: 2.3.0
>Reporter: Tal Barda
>Priority: Major
>  Labels: HeartbeatReceiver, config, session, spark, spark-conf, 
> spark-session
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Since updating to Spark 2.3.0, tests run in my CI (Codeship) fail due to an 
> allegedly invalid Spark URL when creating the (local) Spark context. Here's a 
> log from my _*mvn clean install*_ command execution:
> {quote}{{2018-05-03 13:18:47.668 ERROR 5533 --- [ main] 
> org.apache.spark.SparkContext : Error initializing SparkContext. 
> org.apache.spark.SparkException: Invalid Spark URL: 
> spark://HeartbeatReceiver@railsonfire_61eb1c99-232b-49d0-abb5-a1eb9693516b_52bcc09bb48b:44284
>  at 
> org.apache.spark.rpc.RpcEndpointAddress$.apply(RpcEndpointAddress.scala:66) 
> ~[spark-core_2.11-2.3.0.jar:2.3.0] at 
> org.apache.spark.rpc.netty.NettyRpcEnv.asyncSetupEndpointRefByURI(NettyRpcEnv.scala:134)
>  ~[spark-core_2.11-2.3.0.jar:2.3.0] at 
> org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101) 
> ~[spark-core_2.11-2.3.0.jar:2.3.0] at 
> org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:109) 
> ~[spark-core_2.11-2.3.0.jar:2.3.0] at 
> org.apache.spark.util.RpcUtils$.makeDriverRef(RpcUtils.scala:32) 
> ~[spark-core_2.11-2.3.0.jar:2.3.0] at 
> org.apache.spark.executor.Executor.(Executor.scala:155) 
> ~[spark-core_2.11-2.3.0.jar:2.3.0] at 
> org.apache.spark.scheduler.local.LocalEndpoint.(LocalSchedulerBackend.scala:59)
>  ~[spark-core_2.11-2.3.0.jar:2.3.0] at 
> org.apache.spark.scheduler.local.LocalSchedulerBackend.start(LocalSchedulerBackend.scala:126)
>  ~[spark-core_2.11-2.3.0.jar:2.3.0] at 
> org.apache.spark.scheduler.TaskSchedulerImpl.start(TaskSchedulerImpl.scala:164)
>  ~[spark-core_2.11-2.3.0.jar:2.3.0] at 
> org.apache.spark.SparkContext.(SparkContext.scala:500) 
> ~[spark-core_2.11-2.3.0.jar:2.3.0] at 
> org.apache.spark.SparkContext$.getOrCreate(SparkContext.scala:2486) 
> [spark-core_2.11-2.3.0.jar:2.3.0] at 
> org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:930)
>  [spark-sql_2.11-2.3.0.jar:2.3.0] at 
> org.apache.spark.sql.SparkSession$Builder$$anonfun$7.apply(SparkSession.scala:921)
>  [spark-sql_2.11-2.3.0.jar:2.3.0] at scala.Option.getOrElse(Option.scala:121) 
> [scala-library-2.11.8.jar:na] at 
> org.apache.spark.sql.SparkSession$Builder.getOrCreate(SparkSession.scala:921) 
> [spark-sql_2.11-2.3.0.jar:2.3.0] at 
> com.planck.spark_data_features_extractors.utils.SparkConfig.sparkSession(SparkConfig.java:43)
>  [classes/:na] at 
> com.planck.spark_data_features_extractors.utils.SparkConfig$$EnhancerBySpringCGLIB$$66dd1f72.CGLIB$sparkSession$0()
>  [classes/:na] at 
> com.planck.spark_data_features_extractors.utils.SparkConfig$$EnhancerBySpringCGLIB$$66dd1f72$$FastClassBySpringCGLIB$$a213b647.invoke()
>  [classes/:na] at 
> org.springframework.cglib.proxy.MethodProxy.invokeSuper(MethodProxy.java:228) 
> [spring-core-5.0.2.RELEASE.jar:5.0.2.RELEASE] at 
> org.springframework.context.annotation.ConfigurationClassEnhancer$BeanMethodInterceptor.intercept(ConfigurationClassEnhancer.java:361)
>  [spring-context-5.0.2.RELEASE.jar:5.0.2.RELEASE] at 
> com.planck.spark_data_features_extractors.utils.SparkConfig$$EnhancerBySpringCGLIB$$66dd1f72.sparkSession()
>  [classes/:na] at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> ~[na:1.8.0_171] at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> ~[na:1.8.0_171] at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  ~[na:1.8.0_171] at java.lang.reflect.Method.invoke(Method.java:498) 
> ~[na:1.8.0_171] at 
> org.springframework.beans.factory.support.SimpleInstantiationStrategy.instantiate(SimpleInstantiationStrategy.java:154)
>  [spring-beans-5.0.2.RELEASE.jar:5.0.2.RELEASE] at 
> org.springframework.beans.factory.support.ConstructorResolver.instantiateUsingFactoryMethod(ConstructorResolver.java:579)
>  [spring-beans-5.0.2.RELEASE.jar:5.0.2.RELEASE] at 
> 

[jira] [Updated] (SPARK-24008) SQL/Hive Context fails with NullPointerException

2018-04-18 Thread Prabhu Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated SPARK-24008:
--
Description: 
SQL/Hive Context fails with a NullPointerException while getting configuration 
from SQLConf. This happens when the MemoryStore fills up with many broadcasts 
and starts dropping blocks, and a SQL/Hive Context is then created and 
broadcast. Using this Context to access a table fails with the 
NullPointerException below.

A repro is attached: a Spark example that fills the MemoryStore with 
broadcasts and then creates and accesses a SQL Context.

{code}
java.lang.NullPointerException
at org.apache.spark.sql.SQLConf.getConf(SQLConf.scala:638)
at org.apache.spark.sql.SQLConf.defaultDataSourceName(SQLConf.scala:558)
at 
org.apache.spark.sql.DataFrameReader.<init>(DataFrameReader.scala:362)
at org.apache.spark.sql.SQLContext.read(SQLContext.scala:623)
at SparkHiveExample$.main(SparkHiveExample.scala:76)
at SparkHiveExample.main(SparkHiveExample.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)


18/04/06 14:17:42 ERROR ApplicationMaster: User class threw exception: 
java.lang.NullPointerException 
java.lang.NullPointerException 
at org.apache.spark.sql.SQLConf.getConf(SQLConf.scala:638) 
at org.apache.spark.sql.SQLContext.getConf(SQLContext.scala:153) 
at 
org.apache.spark.sql.hive.HiveContext.hiveMetastoreVersion(HiveContext.scala:166)
 
at 
org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:258)
 
at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:255) 
at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:475) 
at 
org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:475) 
at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:474) 
at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:90) 
at org.apache.spark.sql.SQLContext.table(SQLContext.scala:831) 
at org.apache.spark.sql.SQLContext.table(SQLContext.scala:827) 

{code}

The MemoryStore filled up and started dropping blocks.
{code}
18/04/17 08:03:43 INFO MemoryStore: 2 blocks selected for dropping
18/04/17 08:03:43 INFO MemoryStore: Block broadcast_14 stored as values in 
memory (estimated size 78.1 MB, free 64.4 MB)
18/04/17 08:03:43 INFO MemoryStore: Block broadcast_14_piece0 stored as bytes 
in memory (estimated size 1522.0 B, free 64.4 MB)
18/04/17 08:03:43 INFO MemoryStore: Block broadcast_15 stored as values in 
memory (estimated size 350.9 KB, free 64.1 MB)
18/04/17 08:03:43 INFO MemoryStore: Block broadcast_15_piece0 stored as bytes 
in memory (estimated size 29.9 KB, free 64.0 MB)
18/04/17 08:03:43 INFO MemoryStore: 10 blocks selected for dropping
18/04/17 08:03:43 INFO MemoryStore: Block broadcast_16 stored as values in 
memory (estimated size 78.1 MB, free 64.7 MB)
18/04/17 08:03:43 INFO MemoryStore: Block broadcast_16_piece0 stored as bytes 
in memory (estimated size 1522.0 B, free 64.7 MB)
18/04/17 08:03:43 INFO MemoryStore: Block broadcast_1 stored as values in 
memory (estimated size 136.0 B, free 64.7 MB)
18/04/17 08:03:44 INFO MemoryStore: MemoryStore cleared
18/04/17 08:03:20 INFO MemoryStore: MemoryStore started with capacity 511.1 MB
18/04/17 08:03:44 INFO MemoryStore: MemoryStore cleared
18/04/17 08:03:57 INFO MemoryStore: MemoryStore started with capacity 511.1 MB
18/04/17 08:04:23 INFO MemoryStore: MemoryStore cleared
{code}


The fix is to stop broadcasting the SQL/Hive Context, or to increase the 
driver memory.
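As a minimal, self-contained sketch of why a broadcast context can come back with null internals (plain JVM serialization, not Spark code; the Ctx class and its conf field are hypothetical stand-ins): transient fields are skipped during serialization and deserialize to null, so any method that reads them afterwards throws a NullPointerException.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

public class BroadcastNpeSketch {
    // Hypothetical stand-in for a context whose conf state is transient.
    static class Ctx implements Serializable {
        // transient fields are skipped by serialization and come back null.
        transient Map<String, String> conf;
        Ctx(Map<String, String> conf) { this.conf = conf; }
        String getConf(String key) { return conf.get(key); } // NPE after a round trip
    }

    // Serialize and deserialize, as a broadcast send/receive would.
    static Ctx roundTrip(Ctx c) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(c);
            out.flush();
            ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            return (Ctx) in.readObject();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        Map<String, String> m = new HashMap<>();
        m.put("spark.sql.sources.default", "parquet");
        Ctx copy = roundTrip(new Ctx(m));
        System.out.println(copy.conf == null); // true: transient state is lost
    }
}
```

The same pattern applies when a context object travels inside a broadcast: whatever state was transient on the driver is gone on the receiving side.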


  was:
SQL / Hive Context fails with NullPointerException while getting configuration 
from SQLConf. This happens when the MemoryStore is filled with lot of broadcast 
and started dropping and then SQL / Hive Context is created and broadcast. When 
using this Context to access a table fails with below NullPointerException.

Repro is attached - the Spark Example which fills the MemoryStore with 
broadcasts and then creates and accesses a SQL Context.

{code}
java.lang.NullPointerException
at org.apache.spark.sql.SQLConf.getConf(SQLConf.scala:638)
at org.apache.spark.sql.SQLConf.defaultDataSourceName(SQLConf.scala:558)
at 
org.apache.spark.sql.DataFrameReader.<init>(DataFrameReader.scala:362)
at org.apache.spark.sql.SQLContext.read(SQLContext.scala:623)
at SparkHiveExample$.main(SparkHiveExample.scala:76)
at SparkHiveExample.main(SparkHiveExample.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 

[jira] [Commented] (SPARK-24008) SQL/Hive Context fails with NullPointerException

2018-04-18 Thread Prabhu Joseph (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-24008?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16443484#comment-16443484
 ] 

Prabhu Joseph commented on SPARK-24008:
---

Yes, it's driver-specific. But it would be better to handle this case instead 
of failing.

> SQL/Hive Context fails with NullPointerException 
> -
>
> Key: SPARK-24008
> URL: https://issues.apache.org/jira/browse/SPARK-24008
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.3
>Reporter: Prabhu Joseph
>Priority: Major
> Attachments: Repro
>
>
> SQL / Hive Context fails with NullPointerException while getting 
> configuration from SQLConf. This happens when the MemoryStore is filled with 
> lot of broadcast and started dropping and then SQL / Hive Context is created 
> and broadcast. When using this Context to access a table fails with below 
> NullPointerException.
> Repro is attached - the Spark Example which fills the MemoryStore with 
> broadcasts and then creates and accesses a SQL Context.
> {code}
> java.lang.NullPointerException
> at org.apache.spark.sql.SQLConf.getConf(SQLConf.scala:638)
> at 
> org.apache.spark.sql.SQLConf.defaultDataSourceName(SQLConf.scala:558)
> at 
> org.apache.spark.sql.DataFrameReader.<init>(DataFrameReader.scala:362)
> at org.apache.spark.sql.SQLContext.read(SQLContext.scala:623)
> at SparkHiveExample$.main(SparkHiveExample.scala:76)
> at SparkHiveExample.main(SparkHiveExample.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> 18/04/06 14:17:42 ERROR ApplicationMaster: User class threw exception: 
> java.lang.NullPointerException 
> java.lang.NullPointerException 
> at org.apache.spark.sql.SQLConf.getConf(SQLConf.scala:638) 
> at org.apache.spark.sql.SQLContext.getConf(SQLContext.scala:153) 
> at 
> org.apache.spark.sql.hive.HiveContext.hiveMetastoreVersion(HiveContext.scala:166)
>  
> at 
> org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:258)
>  
> at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:255) 
> at 
> org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:475) 
> at 
> org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:475)
>  
> at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:474) 
> at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:90) 
> at org.apache.spark.sql.SQLContext.table(SQLContext.scala:831) 
> at org.apache.spark.sql.SQLContext.table(SQLContext.scala:827) 
> {code}
> MemoryStore got filled and started dropping the blocks.
> {code}
> 18/04/17 08:03:43 INFO MemoryStore: 2 blocks selected for dropping
> 18/04/17 08:03:43 INFO MemoryStore: Block broadcast_14 stored as values in 
> memory (estimated size 78.1 MB, free 64.4 MB)
> 18/04/17 08:03:43 INFO MemoryStore: Block broadcast_14_piece0 stored as bytes 
> in memory (estimated size 1522.0 B, free 64.4 MB)
> 18/04/17 08:03:43 INFO MemoryStore: Block broadcast_15 stored as values in 
> memory (estimated size 350.9 KB, free 64.1 MB)
> 18/04/17 08:03:43 INFO MemoryStore: Block broadcast_15_piece0 stored as bytes 
> in memory (estimated size 29.9 KB, free 64.0 MB)
> 18/04/17 08:03:43 INFO MemoryStore: 10 blocks selected for dropping
> 18/04/17 08:03:43 INFO MemoryStore: Block broadcast_16 stored as values in 
> memory (estimated size 78.1 MB, free 64.7 MB)
> 18/04/17 08:03:43 INFO MemoryStore: Block broadcast_16_piece0 stored as bytes 
> in memory (estimated size 1522.0 B, free 64.7 MB)
> 18/04/17 08:03:43 INFO MemoryStore: Block broadcast_1 stored as values in 
> memory (estimated size 136.0 B, free 64.7 MB)
> 18/04/17 08:03:44 INFO MemoryStore: MemoryStore cleared
> 18/04/17 08:03:20 INFO MemoryStore: MemoryStore started with capacity 511.1 MB
> 18/04/17 08:03:44 INFO MemoryStore: MemoryStore cleared
> 18/04/17 08:03:57 INFO MemoryStore: MemoryStore started with capacity 511.1 MB
> 18/04/17 08:04:23 INFO MemoryStore: MemoryStore cleared
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-24008) SQL/Hive Context fails with NullPointerException

2018-04-17 Thread Prabhu Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-24008?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated SPARK-24008:
--
Attachment: Repro

> SQL/Hive Context fails with NullPointerException 
> -
>
> Key: SPARK-24008
> URL: https://issues.apache.org/jira/browse/SPARK-24008
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.6.3
>Reporter: Prabhu Joseph
>Priority: Major
> Attachments: Repro
>
>
> SQL / Hive Context fails with NullPointerException while getting 
> configuration from SQLConf. This happens when the MemoryStore is filled with 
> lot of broadcast and started dropping and then SQL / Hive Context is created 
> and broadcast. When using this Context to access a table fails with below 
> NullPointerException.
> Repro is attached - the Spark Example which fills the MemoryStore with 
> broadcasts and then creates and accesses a SQL Context.
> {code}
> java.lang.NullPointerException
> at org.apache.spark.sql.SQLConf.getConf(SQLConf.scala:638)
> at 
> org.apache.spark.sql.SQLConf.defaultDataSourceName(SQLConf.scala:558)
> at 
> org.apache.spark.sql.DataFrameReader.<init>(DataFrameReader.scala:362)
> at org.apache.spark.sql.SQLContext.read(SQLContext.scala:623)
> at SparkHiveExample$.main(SparkHiveExample.scala:76)
> at SparkHiveExample.main(SparkHiveExample.scala)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
> at java.lang.reflect.Method.invoke(Method.java:498)
> 18/04/06 14:17:42 ERROR ApplicationMaster: User class threw exception: 
> java.lang.NullPointerException 
> java.lang.NullPointerException 
> at org.apache.spark.sql.SQLConf.getConf(SQLConf.scala:638) 
> at org.apache.spark.sql.SQLContext.getConf(SQLContext.scala:153) 
> at 
> org.apache.spark.sql.hive.HiveContext.hiveMetastoreVersion(HiveContext.scala:166)
>  
> at 
> org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:258)
>  
> at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:255) 
> at 
> org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:475) 
> at 
> org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:475)
>  
> at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:474) 
> at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:90) 
> at org.apache.spark.sql.SQLContext.table(SQLContext.scala:831) 
> at org.apache.spark.sql.SQLContext.table(SQLContext.scala:827) 
> {code}
> MemoryStore got filled and started dropping the blocks.
> {code}
> 18/04/17 08:03:43 INFO MemoryStore: 2 blocks selected for dropping
> 18/04/17 08:03:43 INFO MemoryStore: Block broadcast_14 stored as values in 
> memory (estimated size 78.1 MB, free 64.4 MB)
> 18/04/17 08:03:43 INFO MemoryStore: Block broadcast_14_piece0 stored as bytes 
> in memory (estimated size 1522.0 B, free 64.4 MB)
> 18/04/17 08:03:43 INFO MemoryStore: Block broadcast_15 stored as values in 
> memory (estimated size 350.9 KB, free 64.1 MB)
> 18/04/17 08:03:43 INFO MemoryStore: Block broadcast_15_piece0 stored as bytes 
> in memory (estimated size 29.9 KB, free 64.0 MB)
> 18/04/17 08:03:43 INFO MemoryStore: 10 blocks selected for dropping
> 18/04/17 08:03:43 INFO MemoryStore: Block broadcast_16 stored as values in 
> memory (estimated size 78.1 MB, free 64.7 MB)
> 18/04/17 08:03:43 INFO MemoryStore: Block broadcast_16_piece0 stored as bytes 
> in memory (estimated size 1522.0 B, free 64.7 MB)
> 18/04/17 08:03:43 INFO MemoryStore: Block broadcast_1 stored as values in 
> memory (estimated size 136.0 B, free 64.7 MB)
> 18/04/17 08:03:44 INFO MemoryStore: MemoryStore cleared
> 18/04/17 08:03:20 INFO MemoryStore: MemoryStore started with capacity 511.1 MB
> 18/04/17 08:03:44 INFO MemoryStore: MemoryStore cleared
> 18/04/17 08:03:57 INFO MemoryStore: MemoryStore started with capacity 511.1 MB
> 18/04/17 08:04:23 INFO MemoryStore: MemoryStore cleared
> {code}






[jira] [Created] (SPARK-24008) SQL/Hive Context fails with NullPointerException

2018-04-17 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created SPARK-24008:
-

 Summary: SQL/Hive Context fails with NullPointerException 
 Key: SPARK-24008
 URL: https://issues.apache.org/jira/browse/SPARK-24008
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.6.3
Reporter: Prabhu Joseph


SQL/Hive Context fails with a NullPointerException while getting configuration 
from SQLConf. This happens when the MemoryStore fills up with many broadcasts 
and starts dropping blocks, and a SQL/Hive Context is then created and 
broadcast. Using this Context to access a table fails with the 
NullPointerException below.

A repro is attached: a Spark example that fills the MemoryStore with 
broadcasts and then creates and accesses a SQL Context.

{code}
java.lang.NullPointerException
at org.apache.spark.sql.SQLConf.getConf(SQLConf.scala:638)
at org.apache.spark.sql.SQLConf.defaultDataSourceName(SQLConf.scala:558)
at 
org.apache.spark.sql.DataFrameReader.<init>(DataFrameReader.scala:362)
at org.apache.spark.sql.SQLContext.read(SQLContext.scala:623)
at SparkHiveExample$.main(SparkHiveExample.scala:76)
at SparkHiveExample.main(SparkHiveExample.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)


18/04/06 14:17:42 ERROR ApplicationMaster: User class threw exception: 
java.lang.NullPointerException 
java.lang.NullPointerException 
at org.apache.spark.sql.SQLConf.getConf(SQLConf.scala:638) 
at org.apache.spark.sql.SQLContext.getConf(SQLContext.scala:153) 
at 
org.apache.spark.sql.hive.HiveContext.hiveMetastoreVersion(HiveContext.scala:166)
 
at 
org.apache.spark.sql.hive.HiveContext.metadataHive$lzycompute(HiveContext.scala:258)
 
at org.apache.spark.sql.hive.HiveContext.metadataHive(HiveContext.scala:255) 
at org.apache.spark.sql.hive.HiveContext$$anon$2.<init>(HiveContext.scala:475) 
at 
org.apache.spark.sql.hive.HiveContext.catalog$lzycompute(HiveContext.scala:475) 
at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:474) 
at org.apache.spark.sql.hive.HiveContext.catalog(HiveContext.scala:90) 
at org.apache.spark.sql.SQLContext.table(SQLContext.scala:831) 
at org.apache.spark.sql.SQLContext.table(SQLContext.scala:827) 

{code}

The MemoryStore filled up and started dropping blocks.
{code}
18/04/17 08:03:43 INFO MemoryStore: 2 blocks selected for dropping
18/04/17 08:03:43 INFO MemoryStore: Block broadcast_14 stored as values in 
memory (estimated size 78.1 MB, free 64.4 MB)
18/04/17 08:03:43 INFO MemoryStore: Block broadcast_14_piece0 stored as bytes 
in memory (estimated size 1522.0 B, free 64.4 MB)
18/04/17 08:03:43 INFO MemoryStore: Block broadcast_15 stored as values in 
memory (estimated size 350.9 KB, free 64.1 MB)
18/04/17 08:03:43 INFO MemoryStore: Block broadcast_15_piece0 stored as bytes 
in memory (estimated size 29.9 KB, free 64.0 MB)
18/04/17 08:03:43 INFO MemoryStore: 10 blocks selected for dropping
18/04/17 08:03:43 INFO MemoryStore: Block broadcast_16 stored as values in 
memory (estimated size 78.1 MB, free 64.7 MB)
18/04/17 08:03:43 INFO MemoryStore: Block broadcast_16_piece0 stored as bytes 
in memory (estimated size 1522.0 B, free 64.7 MB)
18/04/17 08:03:43 INFO MemoryStore: Block broadcast_1 stored as values in 
memory (estimated size 136.0 B, free 64.7 MB)
18/04/17 08:03:44 INFO MemoryStore: MemoryStore cleared
18/04/17 08:03:20 INFO MemoryStore: MemoryStore started with capacity 511.1 MB
18/04/17 08:03:44 INFO MemoryStore: MemoryStore cleared
18/04/17 08:03:57 INFO MemoryStore: MemoryStore started with capacity 511.1 MB
18/04/17 08:04:23 INFO MemoryStore: MemoryStore cleared
{code}







[jira] [Created] (SPARK-22587) Spark job fails if fs.defaultFS and application jar are different url

2017-11-22 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created SPARK-22587:
-

 Summary: Spark job fails if fs.defaultFS and application jar are 
different url
 Key: SPARK-22587
 URL: https://issues.apache.org/jira/browse/SPARK-22587
 Project: Spark
  Issue Type: Bug
  Components: Spark Submit
Affects Versions: 1.6.3
Reporter: Prabhu Joseph


A Spark job fails if fs.defaultFS and the URL where the application jar 
resides are different while sharing the same scheme:

spark-submit  --conf spark.master=yarn-cluster wasb://XXX/tmp/test.py

In core-site.xml, fs.defaultFS is set to wasb://YYY. Hadoop listing (hadoop fs 
-ls) works for both URLs, XXX and YYY.

{code}
Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: 
wasb://XXX/tmp/test.py, expected: wasb://YYY 
at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:665) 
at 
org.apache.hadoop.fs.azure.NativeAzureFileSystem.checkPath(NativeAzureFileSystem.java:1251)
 
at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:485) 
at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:396) 
at 
org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:507)
 
at org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:660) 
at 
org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:912)
 
at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:172) 
at org.apache.spark.deploy.yarn.Client.run(Client.scala:1248) 
at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1307) 
at org.apache.spark.deploy.yarn.Client.main(Client.scala) 
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
 
at java.lang.reflect.Method.invoke(Method.java:498) 
at 
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:751)
 
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187) 
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212) 
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126) 
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 
{code}

Client.copyFileToRemote tries to resolve the application jar path (XXX) 
against the FileSystem object created from the fs.defaultFS URL (YYY) instead 
of the jar's actual URL:

val destFs = destDir.getFileSystem(hadoopConf)
val srcFs = srcPath.getFileSystem(hadoopConf)

getFileSystem creates the filesystem based on the URL of the path, so this 
part is fine. But the lines below try to qualify srcPath (the XXX URL) against 
destFs (the YYY URL), and this is where it fails:

var destPath = srcPath
val qualifiedDestPath = destFs.makeQualified(destPath)
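The rejection itself comes from Hadoop's FileSystem.checkPath, which is essentially a scheme-and-authority comparison. A hedged sketch with plain java.net.URI (sameFs is a hypothetical helper, not a Hadoop API), reusing the XXX/YYY authorities from the report:

```java
import java.net.URI;
import java.util.Objects;

public class WrongFsSketch {
    // A path "belongs" to a filesystem when scheme and authority both match.
    static boolean sameFs(URI path, URI fsUri) {
        return Objects.equals(path.getScheme(), fsUri.getScheme())
            && Objects.equals(path.getAuthority(), fsUri.getAuthority());
    }

    public static void main(String[] args) {
        URI srcPath   = URI.create("wasb://XXX/tmp/test.py"); // application jar URL
        URI defaultFs = URI.create("wasb://YYY");             // fs.defaultFS
        System.out.println(sameFs(srcPath, defaultFs));                // false: "Wrong FS"
        System.out.println(sameFs(srcPath, URI.create("wasb://XXX"))); // true: jar's own FS
    }
}
```

Qualifying srcPath against srcFs (the filesystem derived from the path's own URL) rather than destFs avoids the mismatch.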






--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22426) Spark AM launching containers on node where External spark shuffle service failed to initialize

2017-11-03 Thread Prabhu Joseph (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16237769#comment-16237769
 ] 

Prabhu Joseph commented on SPARK-22426:
---

Thanks [~jerryshao], we can close this as a duplicate.

> Spark AM launching containers on node where External spark shuffle service 
> failed to initialize
> ---
>
> Key: SPARK-22426
> URL: https://issues.apache.org/jira/browse/SPARK-22426
> Project: Spark
>  Issue Type: Bug
>  Components: Shuffle, YARN
>Affects Versions: 1.6.3
>Reporter: Prabhu Joseph
>Priority: Major
>
> When Spark External Shuffle Service on a NodeManager fails, the remote 
> executors will fail while fetching the data from the executors launched on 
> this Node. Spark ApplicationMaster should not launch containers on this Node 
> or not use external shuffle service.






[jira] [Commented] (SPARK-22426) Spark AM launching containers on node where External spark shuffle service failed to initialize

2017-11-02 Thread Prabhu Joseph (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16235500#comment-16235500
 ] 

Prabhu Joseph commented on SPARK-22426:
---

The node and the NodeManager process are fine; the external Spark shuffle 
service failed to initialize on that NodeManager for some reason, e.g. 
SPARK-17433 or SPARK-15519.

> Spark AM launching containers on node where External spark shuffle service 
> failed to initialize
> ---
>
> Key: SPARK-22426
> URL: https://issues.apache.org/jira/browse/SPARK-22426
> Project: Spark
>  Issue Type: Bug
>  Components: Shuffle, YARN
>Affects Versions: 1.6.3
>Reporter: Prabhu Joseph
>Priority: Major
>
> When Spark External Shuffle Service on a NodeManager fails, the remote 
> executors will fail while fetching the data from the executors launched on 
> this Node. Spark ApplicationMaster should not launch containers on this Node 
> or not use external shuffle service.






[jira] [Created] (SPARK-22426) Spark AM launching containers on node where External spark shuffle service failed to initialize

2017-11-02 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created SPARK-22426:
-

 Summary: Spark AM launching containers on node where External 
spark shuffle service failed to initialize
 Key: SPARK-22426
 URL: https://issues.apache.org/jira/browse/SPARK-22426
 Project: Spark
  Issue Type: Bug
  Components: Shuffle
Affects Versions: 1.6.3
Reporter: Prabhu Joseph
Priority: Major


When the Spark external shuffle service on a NodeManager fails, remote 
executors will fail while fetching data from the executors launched on this 
node. The Spark ApplicationMaster should either not launch containers on this 
node or not use the external shuffle service.






[jira] [Updated] (SPARK-22426) Spark AM launching containers on node where External spark shuffle service failed to initialize

2017-11-02 Thread Prabhu Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22426?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated SPARK-22426:
--
Component/s: YARN

> Spark AM launching containers on node where External spark shuffle service 
> failed to initialize
> ---
>
> Key: SPARK-22426
> URL: https://issues.apache.org/jira/browse/SPARK-22426
> Project: Spark
>  Issue Type: Bug
>  Components: Shuffle, YARN
>Affects Versions: 1.6.3
>Reporter: Prabhu Joseph
>Priority: Major
>
> When Spark External Shuffle Service on a NodeManager fails, the remote 
> executors will fail while fetching the data from the executors launched on 
> this Node. Spark ApplicationMaster should not launch containers on this Node 
> or not use external shuffle service.






[jira] [Created] (SPARK-22199) Spark Job on YARN fails with executors "Slave registration failed"

2017-10-04 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created SPARK-22199:
-

 Summary: Spark Job on YARN fails with executors "Slave 
registration failed"
 Key: SPARK-22199
 URL: https://issues.apache.org/jira/browse/SPARK-22199
 Project: Spark
  Issue Type: Bug
  Components: YARN
Affects Versions: 1.6.3
Reporter: Prabhu Joseph


A Spark job on YARN failed after reaching the maximum number of executor 
failures.

ApplicationMaster logs:
{code}
17/09/28 04:18:27 INFO ApplicationMaster: Unregistering ApplicationMaster with 
FAILED (diag message: Max number of executor failures (3) reached)
{code}

The failed container logs show "Slave registration failed: Duplicate executor 
ID", whereas the driver logs show it removed those executors because they were 
idle for spark.dynamicAllocation.executorIdleTimeout.

Executor Logs:
{code}
17/09/28 04:18:26 ERROR CoarseGrainedExecutorBackend: Slave registration 
failed: Duplicate executor ID: 122
{code}

Driver logs:
{code}
17/09/28 04:18:21 INFO ExecutorAllocationManager: Removing executor 122 because 
it has been idle for 60 seconds (new desired total will be 133)
{code}

There are two issues here:

1. The error message in the executor, "Slave registration failed: Duplicate 
executor ID", is misleading; the actual reason is that the executor had been 
removed for being idle.
2. The job failed only because executors were idle beyond 
spark.dynamicAllocation.executorIdleTimeout.
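For reference, a hedged sketch of the configuration involved (flag values and the application path are illustrative only, not taken from the report):

```shell
# Illustrative only: dynamic allocation removes executors idle past the
# timeout, while the AM counts executor exits toward the failure limit.
spark-submit \
  --master yarn-cluster \
  --conf spark.dynamicAllocation.enabled=true \
  --conf spark.shuffle.service.enabled=true \
  --conf spark.dynamicAllocation.executorIdleTimeout=60s \
  --conf spark.yarn.max.executor.failures=3 \
  app.py
```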
 






[jira] [Comment Edited] (SPARK-16225) Spark Sql throws ArrayIndexOutOfBoundsException on accessing Null Value

2016-06-27 Thread Prabhu Joseph (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15350718#comment-15350718
 ] 

Prabhu Joseph edited comment on SPARK-16225 at 6/27/16 10:00 AM:
-

Thanks [~srowen]. Changing the code to pass Integer.MAX_VALUE as the limit to 
split(regex, limit) worked:

val pRowRDD = pTel.map(_.split(delimiter, Integer.MAX_VALUE)).map(p => 
Row(p(0),p(1),p(2),p(3),p(4),p(5)))


was (Author: prabhu joseph):
Thanks [~srowen]

> Spark Sql throws ArrayIndexOutOfBoundsException on accessing Null Value
> ---
>
> Key: SPARK-16225
> URL: https://issues.apache.org/jira/browse/SPARK-16225
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.5.2
>Reporter: Prabhu Joseph
>
> Spark Sql throws ArrayIndexOutOfBoundsException on accessing Null value
> Repro:
> {code}
> [root@spark1 spark]# cat repro 
> a,b,c,,,
> ./bin/spark-shell
> scala> val pTel = sc.textFile("file:///usr/hdp/2.3.4.7-4/spark/repro")
> scala> pTel.take(1)
> res0: Array[String] = Array(a,b,c,,,)
> scala> val delimiter = ","
> delimiter: String = ,
> scala> import org.apache.spark.sql._
> import org.apache.spark.sql._
> scala> val pRowRDD = pTel.map(_.split(delimiter)).map(p => 
> Row(p(0),p(1),p(2),p(3),p(4),p(5)))
> pRowRDD: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row] = 
> MapPartitionsRDD[3] at map at :28
> scala> pRowRDD.take(1)
> 16/06/27 08:21:40 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
> java.lang.ArrayIndexOutOfBoundsException: 3
> {code}
> It works fine when the data is a,b,c,d,e,f



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-16225) Spark Sql throws ArrayIndexOutOfBoundsException on accessing Null Value

2016-06-27 Thread Prabhu Joseph (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-16225?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15350718#comment-15350718
 ] 

Prabhu Joseph commented on SPARK-16225:
---

Thanks [~srowen]

> Spark Sql throws ArrayIndexOutOfBoundsException on accessing Null Value
> ---
>
> Key: SPARK-16225
> URL: https://issues.apache.org/jira/browse/SPARK-16225
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 1.5.2
>Reporter: Prabhu Joseph
>
> Spark Sql throws ArrayIndexOutOfBoundsException on accessing Null value
> Repro:
> {code}
> [root@spark1 spark]# cat repro 
> a,b,c,,,
> ./bin/spark-shell
> scala> val pTel = sc.textFile("file:///usr/hdp/2.3.4.7-4/spark/repro")
> scala> pTel.take(1)
> res0: Array[String] = Array(a,b,c,,,)
> scala> val delimiter = ","
> delimiter: String = ,
> scala> import org.apache.spark.sql._
> import org.apache.spark.sql._
> scala> val pRowRDD = pTel.map(_.split(delimiter)).map(p => 
> Row(p(0),p(1),p(2),p(3),p(4),p(5)))
> pRowRDD: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row] = 
> MapPartitionsRDD[3] at map at :28
> scala> pRowRDD.take(1)
> 16/06/27 08:21:40 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
> java.lang.ArrayIndexOutOfBoundsException: 3
> {code}
> It works fine when the data is a,b,c,d,e,f






[jira] [Created] (SPARK-16225) Spark Sql throws ArrayIndexOutOfBoundsException on accessing Null Value

2016-06-27 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created SPARK-16225:
-

 Summary: Spark Sql throws ArrayIndexOutOfBoundsException on 
accessing Null Value
 Key: SPARK-16225
 URL: https://issues.apache.org/jira/browse/SPARK-16225
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.5.2
Reporter: Prabhu Joseph


Spark SQL throws an ArrayIndexOutOfBoundsException on accessing a null value.

Repro:

{code}
[root@spark1 spark]# cat repro 
a,b,c,,,

./bin/spark-shell

scala> val pTel = sc.textFile("file:///usr/hdp/2.3.4.7-4/spark/repro")

scala> pTel.take(1)
res0: Array[String] = Array(a,b,c,,,)

scala> val delimiter = ","
delimiter: String = ,

scala> import org.apache.spark.sql._
import org.apache.spark.sql._

scala> val pRowRDD = pTel.map(_.split(delimiter)).map(p => 
Row(p(0),p(1),p(2),p(3),p(4),p(5)))
pRowRDD: org.apache.spark.rdd.RDD[org.apache.spark.sql.Row] = 
MapPartitionsRDD[3] at map at :28

scala> pRowRDD.take(1)
16/06/27 08:21:40 ERROR Executor: Exception in task 0.0 in stage 1.0 (TID 1)
java.lang.ArrayIndexOutOfBoundsException: 3
{code}

It works fine when the data is a,b,c,d,e,f
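The underlying behavior can be reproduced without Spark: Java's String.split (which Scala strings inherit) drops trailing empty fields when called without a limit, so "a,b,c,,," yields only three elements and p(3) throws ArrayIndexOutOfBoundsException. A minimal sketch of the behavior and the split(delimiter, -1) workaround (plain Scala, no Spark needed):

```scala
// String.split with the default limit (0) removes trailing empty strings,
// which is why indexing past the last non-empty field throws.
val row = "a,b,c,,,"

val dropped = row.split(",")     // trailing empty fields removed
val kept    = row.split(",", -1) // limit = -1 keeps trailing empty fields

println(dropped.length) // 3
println(kept.length)    // 6
println(kept.mkString("[", "|", "]")) // [a|b|c|||]
```

With split(delimiter, -1) in the map shown above, p(3) through p(5) would return empty strings instead of throwing.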






[jira] [Created] (SPARK-15918) unionAll returns wrong result when two dataframes have schemas in different order

2016-06-13 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created SPARK-15918:
-

 Summary: unionAll returns wrong result when two dataframes have 
schemas in different order
 Key: SPARK-15918
 URL: https://issues.apache.org/jira/browse/SPARK-15918
 Project: Spark
  Issue Type: Bug
  Components: SQL
Affects Versions: 1.6.1
 Environment: CentOS
Reporter: Prabhu Joseph
 Fix For: 1.6.1


On applying the unionAll operation to dataframes A and B, which have the same 
schema but with columns in a different order, the result has the column-to-value 
mapping changed, because unionAll matches columns by position rather than by name.

Repro:

{code}

A.show()
+---++---+--+--+-++---+--+---+---+-+
|tag|year_day|tm_hour|tm_min|tm_sec|dtype|time|tm_mday|tm_mon|tm_yday|tm_year|value|
+---++---+--+--+-++---+--+---+---+-+
+---++---+--+--+-++---+--+---+---+-+

B.show()
+-+---+--+---+---+--+--+--+---+---+--++
|dtype|tag|  
time|tm_hour|tm_mday|tm_min|tm_mon|tm_sec|tm_yday|tm_year| value|year_day|
+-+---+--+---+---+--+--+--+---+---+--++
|F|C_FNHXUT701Z.CNSTLO|1443790800| 13|  2| 0|10| 0|
275|   2015|1.2345| 2015275|
|F|C_FNHXUDP713.CNSTHI|1443790800| 13|  2| 0|10| 0|
275|   2015|1.2345| 2015275|
|F| C_FNHXUT718.CNSTHI|1443790800| 13|  2| 0|10| 0|
275|   2015|1.2345| 2015275|
|F|C_FNHXUT703Z.CNSTLO|1443790800| 13|  2| 0|10| 0|
275|   2015|1.2345| 2015275|
|F|C_FNHXUR716A.CNSTLO|1443790800| 13|  2| 0|10| 0|
275|   2015|1.2345| 2015275|
|F|C_FNHXUT803Z.CNSTHI|1443790800| 13|  2| 0|10| 0|
275|   2015|1.2345| 2015275|
|F| C_FNHXUT728.CNSTHI|1443790800| 13|  2| 0|10| 0|
275|   2015|1.2345| 2015275|
|F| C_FNHXUR806.CNSTHI|1443790800| 13|  2| 0|10| 0|
275|   2015|1.2345| 2015275|
+-+---+--+---+---+--+--+--+---+---+--++

A = A.unionAll(B)
A.show()
+---+---+--+--+--+-++---+--+---+---+-+
|tag|   year_day|   
tm_hour|tm_min|tm_sec|dtype|time|tm_mday|tm_mon|tm_yday|tm_year|value|
+---+---+--+--+--+-++---+--+---+---+-+
|  F|C_FNHXUT701Z.CNSTLO|1443790800|13| 2|0|  10|  0|   275|   
2015| 1.2345|2015275.0|
|  F|C_FNHXUDP713.CNSTHI|1443790800|13| 2|0|  10|  0|   275|   
2015| 1.2345|2015275.0|
|  F| C_FNHXUT718.CNSTHI|1443790800|13| 2|0|  10|  0|   275|   
2015| 1.2345|2015275.0|
|  F|C_FNHXUT703Z.CNSTLO|1443790800|13| 2|0|  10|  0|   275|   
2015| 1.2345|2015275.0|
|  F|C_FNHXUR716A.CNSTLO|1443790800|13| 2|0|  10|  0|   275|   
2015| 1.2345|2015275.0|
|  F|C_FNHXUT803Z.CNSTHI|1443790800|13| 2|0|  10|  0|   275|   
2015| 1.2345|2015275.0|
|  F| C_FNHXUT728.CNSTHI|1443790800|13| 2|0|  10|  0|   275|   
2015| 1.2345|2015275.0|
|  F| C_FNHXUR806.CNSTHI|1443790800|13| 2|0|  10|  0|   275|   
2015| 1.2345|2015275.0|
+---+---+--+--+--+-++---+--+---+---+-+
{code}

On reordering the columns of A to match B before doing unionAll, it works fine:

{code}

C = 
A.select("dtype","tag","time","tm_hour","tm_mday","tm_min","tm_mon","tm_sec","tm_yday","tm_year","value","year_day")

A = C.unionAll(B)
A.show()

+-+---+--+---+---+--+--+--+---+---+--++
|dtype|tag|  
time|tm_hour|tm_mday|tm_min|tm_mon|tm_sec|tm_yday|tm_year| value|year_day|
+-+---+--+---+---+--+--+--+---+---+--++
|F|C_FNHXUT701Z.CNSTLO|1443790800| 13|  2| 0|10| 0|
275|   2015|1.2345| 2015275|
|F|C_FNHXUDP713.CNSTHI|1443790800| 13|  2| 0|10| 0|
275|   2015|1.2345| 2015275|
|F| C_FNHXUT718.CNSTHI|1443790800| 13|  2| 0|10| 0|
275|   2015|1.2345| 2015275|
|F|C_FNHXUT703Z.CNSTLO|1443790800| 13|  2| 0|10| 0|
275|   2015|1.2345| 2015275|
|F|C_FNHXUR716A.CNSTLO|1443790800| 13|  2| 0|10| 0|
275|   2015|1.2345| 2015275|
|F|C_FNHXUT803Z.CNSTHI|1443790800| 13|  2| 0|10| 0|
275|   2015|1.2345| 2015275|
|F| C_FNHXUT728.CNSTHI|1443790800| 13|  2| 0|10| 0|
275|   2015|1.2345| 2015275|
|F| C_FNHXUR806.CNSTHI|1443790800| 13|  2| 0|

[jira] [Resolved] (SPARK-13181) Spark delay in task scheduling within executor

2016-02-04 Thread Prabhu Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph resolved SPARK-13181.
---
Resolution: Not A Problem

> Spark delay in task scheduling within executor
> --
>
> Key: SPARK-13181
> URL: https://issues.apache.org/jira/browse/SPARK-13181
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.5.2
>Reporter: Prabhu Joseph
> Fix For: 1.5.2
>
> Attachments: ran3.JPG
>
>
> When a Spark job has some RDDs in memory and some in Hadoop, the tasks within 
> the executor that read from memory are started in parallel, but the task that 
> reads from Hadoop is started after some delay.
> Repro: 
> A logFile of 1.25 GB is given as input (5 RDDs of 256MB each).
> val logData = sc.textFile(logFile, 2).cache()
> var numAs = logData.filter(line => line.contains("a")).count()
> var numBs = logData.filter(line => line.contains("b")).count()
> Run the Spark job with 1 executor with 6GB memory and 12 cores.
> Stage A (reading lines with "a") - the executor starts 5 tasks in parallel and 
> all read data from Hadoop.
> Stage B (reading lines with "b") - as the data is cached (4 RDDs are in memory, 
> 1 is in Hadoop), the executor starts 4 tasks in parallel and, after a 4-second 
> delay, starts the last task to read from Hadoop.
> On running the same Spark job with 12GB memory, all 5 RDDs are in memory and 
> 5 tasks in Stage B start in parallel.
> On running the job with 2GB memory, all 5 RDDs are in Hadoop and 5 tasks in 
> Stage B start in parallel.
> The task delay happens only when some RDDs are in memory and some are in Hadoop.
> Check the attached image.






[jira] [Commented] (SPARK-13181) Spark delay in task scheduling within executor

2016-02-04 Thread Prabhu Joseph (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13181?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15132115#comment-15132115
 ] 

Prabhu Joseph commented on SPARK-13181:
---

Okay, the reason for the task delay within the executor when some RDDs are in 
memory and some are in Hadoop, i.e., with multiple locality levels (NODE_LOCAL 
and ANY): the scheduler waits for spark.locality.wait (3 seconds by default). 
During this period, the scheduler waits to launch a data-local task before 
giving up and launching it on a less-local node. After setting it to 0, all 
tasks started in parallel, but I learned that it is better not to reduce it to 0. 
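As a sketch of the workaround discussed above (a spark-defaults.conf fragment; spark.locality.wait and its per-level variants are standard Spark configuration properties, 3s by default):

```
# Time to wait for a data-local slot before launching the task at a
# less-local level (default: 3s). 0 removes the delay at the cost of locality.
spark.locality.wait       0s
# Finer-grained overrides also exist:
# spark.locality.wait.process  3s
# spark.locality.wait.node     3s
# spark.locality.wait.rack     3s
```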

> Spark delay in task scheduling within executor
> --
>
> Key: SPARK-13181
> URL: https://issues.apache.org/jira/browse/SPARK-13181
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.5.2
>Reporter: Prabhu Joseph
> Fix For: 1.5.2
>
> Attachments: ran3.JPG
>
>
> When a Spark job has some RDDs in memory and some in Hadoop, the tasks within 
> the executor that read from memory are started in parallel, but the task that 
> reads from Hadoop is started after some delay.
> Repro: 
> A logFile of 1.25 GB is given as input (5 RDDs of 256MB each).
> val logData = sc.textFile(logFile, 2).cache()
> var numAs = logData.filter(line => line.contains("a")).count()
> var numBs = logData.filter(line => line.contains("b")).count()
> Run the Spark job with 1 executor with 6GB memory and 12 cores.
> Stage A (reading lines with "a") - the executor starts 5 tasks in parallel and 
> all read data from Hadoop.
> Stage B (reading lines with "b") - as the data is cached (4 RDDs are in memory, 
> 1 is in Hadoop), the executor starts 4 tasks in parallel and, after a 4-second 
> delay, starts the last task to read from Hadoop.
> On running the same Spark job with 12GB memory, all 5 RDDs are in memory and 
> 5 tasks in Stage B start in parallel.
> On running the job with 2GB memory, all 5 RDDs are in Hadoop and 5 tasks in 
> Stage B start in parallel.
> The task delay happens only when some RDDs are in memory and some are in Hadoop.
> Check the attached image.






[jira] [Updated] (SPARK-13181) Spark delay in task scheduling within executor

2016-02-03 Thread Prabhu Joseph (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-13181?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Prabhu Joseph updated SPARK-13181:
--
Attachment: ran3.JPG

> Spark delay in task scheduling within executor
> --
>
> Key: SPARK-13181
> URL: https://issues.apache.org/jira/browse/SPARK-13181
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 1.5.2
>Reporter: Prabhu Joseph
> Fix For: 1.5.2
>
> Attachments: ran3.JPG
>
>
> When a Spark job has some RDDs in memory and some in Hadoop, the tasks within 
> the executor that read from memory are started in parallel, but the task that 
> reads from Hadoop is started after some delay.
> Repro: 
> A logFile of 1.25 GB is given as input (5 RDDs of 256MB each).
> val logData = sc.textFile(logFile, 2).cache()
> var numAs = logData.filter(line => line.contains("a")).count()
> var numBs = logData.filter(line => line.contains("b")).count()
> Run the Spark job with 1 executor with 6GB memory and 12 cores.
> Stage A (reading lines with "a") - the executor starts 5 tasks in parallel and 
> all read data from Hadoop.
> Stage B (reading lines with "b") - as the data is cached (4 RDDs are in memory, 
> 1 is in Hadoop), the executor starts 4 tasks in parallel and, after a 4-second 
> delay, starts the last task to read from Hadoop.
> On running the same Spark job with 12GB memory, all 5 RDDs are in memory and 
> 5 tasks in Stage B start in parallel.
> On running the job with 2GB memory, all 5 RDDs are in Hadoop and 5 tasks in 
> Stage B start in parallel.
> The task delay happens only when some RDDs are in memory and some are in Hadoop.
> Check the attached image.






[jira] [Created] (SPARK-13181) Spark delay in task scheduling within executor

2016-02-03 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created SPARK-13181:
-

 Summary: Spark delay in task scheduling within executor
 Key: SPARK-13181
 URL: https://issues.apache.org/jira/browse/SPARK-13181
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.5.2
Reporter: Prabhu Joseph
 Fix For: 1.5.2


When a Spark job has some RDDs in memory and some in Hadoop, the tasks within 
the executor that read from memory are started in parallel, but the task that 
reads from Hadoop is started after some delay.

Repro: 

A logFile of 1.25 GB is given as input (5 RDDs of 256MB each).

val logData = sc.textFile(logFile, 2).cache()
var numAs = logData.filter(line => line.contains("a")).count()
var numBs = logData.filter(line => line.contains("b")).count()

Run the Spark job with 1 executor with 6GB memory and 12 cores.

Stage A (reading lines with "a") - the executor starts 5 tasks in parallel and 
all read data from Hadoop.

Stage B (reading lines with "b") - as the data is cached (4 RDDs are in memory, 
1 is in Hadoop), the executor starts 4 tasks in parallel and, after a 4-second 
delay, starts the last task to read from Hadoop.

On running the same Spark job with 12GB memory, all 5 RDDs are in memory and 
5 tasks in Stage B start in parallel.

On running the job with 2GB memory, all 5 RDDs are in Hadoop and 5 tasks in 
Stage B start in parallel.

The task delay happens only when some RDDs are in memory and some are in Hadoop.

Check the attached image.












[jira] [Created] (SPARK-13182) Spark Executor retries infinitely

2016-02-03 Thread Prabhu Joseph (JIRA)
Prabhu Joseph created SPARK-13182:
-

 Summary: Spark Executor retries infinitely
 Key: SPARK-13182
 URL: https://issues.apache.org/jira/browse/SPARK-13182
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
Affects Versions: 1.5.2
Reporter: Prabhu Joseph
Priority: Minor
 Fix For: 1.5.2


When a Spark job (Spark 1.5.2) is submitted with a single executor and the 
user passes wrong JVM arguments via spark.executor.extraJavaOptions, the first 
executor fails. But the job keeps retrying, creating a new executor and failing 
every time, until CTRL-C is pressed. 

./spark-submit --class SimpleApp --master "spark://10.10.72.145:7077"  --conf 
"spark.executor.extraJavaOptions=-XX:+PrintGCDetails -XX:+PrintGCTimeStamps 
-XX:+UseG1GC -XX:InitiatingHeapOccupancyPercent=35 -XX:ConcGCThreads=16" 
/SPARK/SimpleApp.jar

Here, the user submits the job with ConcGCThreads=16, which is greater than 
ParallelGCThreads, so the JVM fails to start. But the job does not exit; it 
keeps creating executors and retrying.
..
16/02/01 06:54:28 INFO SparkDeploySchedulerBackend: Granted executor ID 
app-20160201065319-0014/2846 on hostPort 10.10.72.145:36558 with 12 cores, 2.0 
GB RAM
16/02/01 06:54:28 INFO AppClient$ClientEndpoint: Executor updated: 
app-20160201065319-0014/2846 is now LOADING
16/02/01 06:54:28 INFO AppClient$ClientEndpoint: Executor updated: 
app-20160201065319-0014/2846 is now RUNNING
16/02/01 06:54:28 INFO AppClient$ClientEndpoint: Executor updated: 
app-20160201065319-0014/2846 is now EXITED (Command exited with code 1)
16/02/01 06:54:28 INFO SparkDeploySchedulerBackend: Executor 
app-20160201065319-0014/2846 removed: Command exited with code 1
16/02/01 06:54:28 INFO SparkDeploySchedulerBackend: Asked to remove 
non-existent executor 2846
16/02/01 06:54:28 INFO AppClient$ClientEndpoint: Executor added: 
app-20160201065319-0014/2847 on worker-20160131230345-10.10.72.145-36558 
(10.10.72.145:36558) with 12 cores
16/02/01 06:54:28 INFO SparkDeploySchedulerBackend: Granted executor ID 
app-20160201065319-0014/2847 on hostPort 10.10.72.145:36558 with 12 cores, 2.0 
GB RAM
16/02/01 06:54:28 INFO AppClient$ClientEndpoint: Executor updated: 
app-20160201065319-0014/2847 is now LOADING
16/02/01 06:54:28 INFO AppClient$ClientEndpoint: Executor updated: 
app-20160201065319-0014/2847 is now EXITED (Command exited with code 1)
16/02/01 06:54:28 INFO SparkDeploySchedulerBackend: Executor 
app-20160201065319-0014/2847 removed: Command exited with code 1
16/02/01 06:54:28 INFO SparkDeploySchedulerBackend: Asked to remove 
non-existent executor 2847
16/02/01 06:54:28 INFO AppClient$ClientEndpoint: Executor added: 
app-20160201065319-0014/2848 on worker-20160131230345-10.10.72.145-36558 
(10.10.72.145:36558) with 12 cores
16/02/01 06:54:28 INFO SparkDeploySchedulerBackend: Granted executor ID 
app-20160201065319-0014/2848 on hostPort 10.10.72.145:36558 with 12 cores, 2.0 
GB RAM
16/02/01 06:54:28 INFO AppClient$ClientEndpoint: Executor updated: 
app-20160201065319-0014/2848 is now LOADING
16/02/01 06:54:28 INFO AppClient$ClientEndpoint: Executor updated: 
app-20160201065319-0014/2848 is now RUNNING

Spark should not fall into an infinite retry loop on this kind of user error 
on a production cluster.
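Later Spark releases added a standalone-mode guard for exactly this failure loop; as a sketch (spark.deploy.maxExecutorRetries is from later Spark versions, not 1.5.2, so treat its availability here as an assumption):

```
# Standalone master removes the application after this many consecutive
# executor failures, counted only while no executor is running successfully.
spark.deploy.maxExecutorRetries   10
```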


