[jira] [Commented] (SPARK-25128) multiple simultaneous job submissions against k8s backend cause driver pods to hang

2019-06-19 Thread Suman Somasundar (JIRA)


[ https://issues.apache.org/jira/browse/SPARK-25128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16868138#comment-16868138 ]

Suman Somasundar commented on SPARK-25128:
--

I have the same issue. When multiple jobs are submitted, the driver pods start, 
then the executor pods start, but the executors fail because they cannot 
resolve the driver service. The driver is stuck in the running state with the 
warning: "Initial job has not accepted any resources; check your cluster UI to 
ensure that workers are registered and have sufficient resources"
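
A quick way to check the resolution failure in isolation is to attempt the same
lookup the executor performs against the driver's headless service. This is a
hypothetical diagnostic sketch, with the service name copied from the trace
below; substitute your own driver service name:

{code:scala}
import java.net.{InetAddress, UnknownHostException}

// Hypothetical diagnostic: try to resolve the driver service DNS name
// exactly as the executor does before dialing back to the driver.
val driverSvc = "t-f5d67725474036458526157f70bc999c-driver-svc.spark-namespace.svc"
try {
  val addr = InetAddress.getByName(driverSvc)
  println(s"resolved $driverSvc -> ${addr.getHostAddress}")
} catch {
  case e: UnknownHostException =>
    println(s"cannot resolve $driverSvc: $e")  // same failure as the executor
}
{code}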

Error in executor pod:

Exception in thread "main" java.lang.reflect.UndeclaredThrowableException
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1707)
 at org.apache.spark.deploy.SparkHadoopUtil.runAsSparkUser(SparkHadoopUtil.scala:64)
 at org.apache.spark.executor.CoarseGrainedExecutorBackend$.run(CoarseGrainedExecutorBackend.scala:188)
 at org.apache.spark.executor.CoarseGrainedExecutorBackend$.main(CoarseGrainedExecutorBackend.scala:293)
 at org.apache.spark.executor.CoarseGrainedExecutorBackend.main(CoarseGrainedExecutorBackend.scala)
Caused by: org.apache.spark.SparkException: Exception thrown in awaitResult:
 at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:205)
 at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
 at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
 at org.apache.spark.executor.CoarseGrainedExecutorBackend$$anonfun$run$1.apply$mcV$sp(CoarseGrainedExecutorBackend.scala:201)
 at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:65)
 at org.apache.spark.deploy.SparkHadoopUtil$$anon$2.run(SparkHadoopUtil.scala:64)
 at java.security.AccessController.doPrivileged(Native Method)
 at javax.security.auth.Subject.doAs(Subject.java:422)
 at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1692)
 ... 4 more
Caused by: java.io.IOException: Failed to connect to t-f5d67725474036458526157f70bc999c-driver-svc.spark-namespace.svc:7078
 at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:245)
 at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
 at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:198)
 at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194)
 at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190)
 at java.util.concurrent.FutureTask.run(FutureTask.java:266)
 at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
 at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
 at java.lang.Thread.run(Thread.java:748)
Caused by: java.net.UnknownHostException: t-f5d67725474036458526157f70bc999c-driver-svc.spark-namespace.svc
 at java.net.InetAddress.getAllByName0(InetAddress.java:1280)
 at java.net.InetAddress.getAllByName(InetAddress.java:1192)
 at java.net.InetAddress.getAllByName(InetAddress.java:1126)
 at java.net.InetAddress.getByName(InetAddress.java:1076)
 at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:146)
 at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:143)
 at java.security.AccessController.doPrivileged(Native Method)
 at io.netty.util.internal.SocketUtils.addressByName(SocketUtils.java:143)
 at io.netty.resolver.DefaultNameResolver.doResolve(DefaultNameResolver.java:43)
 at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:63)
 at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:55)
 at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:57)
 at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:32)
 at io.netty.resolver.AbstractAddressResolver.resolve(AbstractAddressResolver.java:108)
 at io.netty.bootstrap.Bootstrap.doResolveAndConnect0(Bootstrap.java:208)
 at io.netty.bootstrap.Bootstrap.access$000(Bootstrap.java:49)
 at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:188)
 at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:174)
 at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:511)
 at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:485)
 at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:424)
 at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:103)
 at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84)
 at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetSuccess(AbstractChannel.java:982)
 at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:516)
 at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:427)
 at 

[jira] [Commented] (SPARK-23529) Specify hostpath volume and mount the volume in Spark driver and executor pods in Kubernetes

2018-03-13 Thread Suman Somasundar (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-23529?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16397662#comment-16397662 ]

Suman Somasundar commented on SPARK-23529:
--

I have a couple of use cases for this, as explained below (a configuration 
sketch follows the list):
 # There are multiple Spark jobs running in the cluster and I want to be able 
to extract the log files from these jobs. In this case, I can mount a hostpath 
volume in all the driver and executor pods and have them write logs to that 
path. I can then log in to the nodes and copy the log files from all jobs.
 # My jobs depend on large data files and fat jars. I don't want to copy them 
over (from HDFS) to the driver and executor pods every time I submit a job. 
Packaging these data files and jars in the image is not an option, since the 
image would be large and I don't want images specific to a job. In this case, 
I can download the dependencies to a local volume, mount it as a hostpath on 
the driver and executor pods, and use the dependencies as local files. 
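
To make the second use case concrete, here is a sketch of the kind of job
configuration such a feature could enable. The volume-related property names
are hypothetical, modeled on Spark's existing spark.kubernetes.* namespace;
only that prefix is established convention:

{code:scala}
import org.apache.spark.SparkConf

// Hypothetical sketch: mount a hostPath volume named "deps" into the
// driver and executor pods so pre-downloaded jars and data files are
// visible as local files. Property names are illustrative only.
val conf = new SparkConf()
  .set("spark.kubernetes.driver.volumes.hostPath.deps.mount.path", "/opt/spark/deps")
  .set("spark.kubernetes.driver.volumes.hostPath.deps.options.path", "/mnt/spark-deps")
  .set("spark.kubernetes.executor.volumes.hostPath.deps.mount.path", "/opt/spark/deps")
  .set("spark.kubernetes.executor.volumes.hostPath.deps.options.path", "/mnt/spark-deps")
{code}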

> Specify hostpath volume and mount the volume in Spark driver and executor 
> pods in Kubernetes
> 
>
> Key: SPARK-23529
> URL: https://issues.apache.org/jira/browse/SPARK-23529
> Project: Spark
>  Issue Type: Improvement
>  Components: Kubernetes
>Affects Versions: 2.3.0
>Reporter: Suman Somasundar
>Assignee: Anirudh Ramanathan
>Priority: Minor
>







[jira] [Updated] (SPARK-23529) Specify hostpath volume and mount the volume in Spark driver and executor pods in Kubernetes

2018-02-27 Thread Suman Somasundar (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-23529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suman Somasundar updated SPARK-23529:
-
Issue Type: Improvement  (was: Sub-task)
Parent: (was: SPARK-18278)

> Specify hostpath volume and mount the volume in Spark driver and executor 
> pods in Kubernetes
> 
>
> Key: SPARK-23529
> URL: https://issues.apache.org/jira/browse/SPARK-23529
> Project: Spark
>  Issue Type: Improvement
>  Components: Kubernetes
>Affects Versions: 2.3.0
>Reporter: Suman Somasundar
>Assignee: Anirudh Ramanathan
>Priority: Minor
> Fix For: 2.3.0
>
>







[jira] [Created] (SPARK-23529) Specify hostpath volume and mount the volume in Spark driver and executor pods in Kubernetes

2018-02-27 Thread Suman Somasundar (JIRA)
Suman Somasundar created SPARK-23529:


 Summary: Specify hostpath volume and mount the volume in Spark 
driver and executor pods in Kubernetes
 Key: SPARK-23529
 URL: https://issues.apache.org/jira/browse/SPARK-23529
 Project: Spark
  Issue Type: Sub-task
  Components: Kubernetes
Affects Versions: 2.3.0
Reporter: Suman Somasundar
Assignee: Anirudh Ramanathan
 Fix For: 2.3.0









[jira] [Updated] (SPARK-23529) Specify hostpath volume and mount the volume in Spark driver and executor pods in Kubernetes

2018-02-27 Thread Suman Somasundar (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-23529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suman Somasundar updated SPARK-23529:
-
Priority: Minor  (was: Blocker)

> Specify hostpath volume and mount the volume in Spark driver and executor 
> pods in Kubernetes
> 
>
> Key: SPARK-23529
> URL: https://issues.apache.org/jira/browse/SPARK-23529
> Project: Spark
>  Issue Type: Sub-task
>  Components: Kubernetes
>Affects Versions: 2.3.0
>Reporter: Suman Somasundar
>Assignee: Anirudh Ramanathan
>Priority: Minor
> Fix For: 2.3.0
>
>







[jira] [Commented] (SPARK-14301) Java examples code merge and clean up

2016-09-27 Thread Suman Somasundar (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-14301?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15527557#comment-15527557 ]

Suman Somasundar commented on SPARK-14301:
--

This merge removed the generic JavaKMeans.java and replaced it with 
JavaKMeansExample.java, in which the input file, the number of iterations, and 
the number of clusters are all hardcoded.
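
For comparison, a parameterized variant is small. A minimal sketch in Scala (a
Java version would be analogous), taking the input file, cluster count, and
iteration count from the command line rather than hardcoding them:

{code:scala}
import org.apache.spark.ml.clustering.KMeans
import org.apache.spark.sql.SparkSession

// Minimal sketch of a generic KMeans example: input path, k, and
// max iterations come from the command line instead of constants.
object ParamKMeans {
  def main(args: Array[String]): Unit = {
    val Array(input, k, maxIter) = args
    val spark = SparkSession.builder.appName("ParamKMeans").getOrCreate()
    val data = spark.read.format("libsvm").load(input)
    val model = new KMeans().setK(k.toInt).setMaxIter(maxIter.toInt).fit(data)
    model.clusterCenters.foreach(println)
    spark.stop()
  }
}
{code}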

> Java examples code merge and clean up
> -
>
> Key: SPARK-14301
> URL: https://issues.apache.org/jira/browse/SPARK-14301
> Project: Spark
>  Issue Type: Sub-task
>  Components: Examples
>Reporter: Xusen Yin
>Assignee: Yong Tang
>Priority: Minor
>  Labels: starter
> Fix For: 2.0.0
>
>
> Duplicated code that I found in java/examples/mllib and java/examples/ml:
> * java/ml
> ** JavaCrossValidatorExample.java
> ** JavaDocument.java
> ** JavaLabeledDocument.java
> ** JavaTrainValidationSplitExample.java
> * Unsure code duplications of java/ml, double check
> ** JavaDeveloperApiExample.java
> ** JavaSimpleParamsExample.java
> ** JavaSimpleTextClassificationPipeline.java
> * java/mllib
> ** JavaKMeans.java
> ** JavaLDAExample.java
> ** JavaLR.java
> * Unsure code duplications of java/mllib, double check
> ** JavaALS.java
> ** JavaFPGrowthExample.java
> When merging and cleaning up this code, be sure not to disturb the previous 
> example on and off blocks.






[jira] [Created] (SPARK-17322) 'ANY n' clause for SQL queries to increase the ease of use of WHERE clause predicates

2016-08-30 Thread Suman Somasundar (JIRA)
Suman Somasundar created SPARK-17322:


 Summary: 'ANY n' clause for SQL queries to increase the ease of 
use of WHERE clause predicates
 Key: SPARK-17322
 URL: https://issues.apache.org/jira/browse/SPARK-17322
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Reporter: Suman Somasundar
Priority: Minor


If the user wants results that satisfy any n of m WHERE clause predicates 
(m > n), an 'ANY n' clause greatly simplifies writing the SQL query.

An example is given below:

select symbol from stocks where (market_cap > 5.7b, analysts_recommend > 10, 
moving_avg > 49.2, pe_ratio > 15.4) ANY 3
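
Until such a clause exists, the same filter can be written in standard SQL by
counting the satisfied predicates. A sketch via the DataFrame API is below;
the 5.7b shorthand is spelled out as 5.7e9 (it is not valid SQL), and a
registered stocks table is assumed:

{code:scala}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder.appName("AnyN").getOrCreate()

// Workaround sketch for "ANY 3" over four predicates: each CASE adds 1
// when its predicate holds; rows are kept when at least 3 hold.
val matches = spark.sql("""
  SELECT symbol FROM stocks
  WHERE (CASE WHEN market_cap > 5.7e9       THEN 1 ELSE 0 END
       + CASE WHEN analysts_recommend > 10  THEN 1 ELSE 0 END
       + CASE WHEN moving_avg > 49.2        THEN 1 ELSE 0 END
       + CASE WHEN pe_ratio > 15.4          THEN 1 ELSE 0 END) >= 3
""")
matches.show()
{code}

The proposed 'ANY n' syntax could desugar to exactly this kind of
predicate-count filter.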






[jira] [Comment Edited] (SPARK-16962) Unsafe accesses (Platform.getLong()) not supported on unaligned boundaries in SPARC/Solaris

2016-08-22 Thread Suman Somasundar (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413970#comment-15413970 ]

Suman Somasundar edited comment on SPARK-16962 at 8/22/16 8:01 PM:
---

I am working on a fix for this issue with other Oracle engineers ([~jlhitt] & 
[~erik.oshaughnessy]), who may also comment on and contribute to this issue.


was (Author: sumansomasundar):
We are working on a fix for this issue. I will submit a patch soon.

> Unsafe accesses (Platform.getLong()) not supported on unaligned boundaries in 
> SPARC/Solaris
> ---
>
> Key: SPARK-16962
> URL: https://issues.apache.org/jira/browse/SPARK-16962
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, SQL
>Affects Versions: 2.0.0
> Environment: SPARC/Solaris
>Reporter: Suman Somasundar
>
> Unaligned accesses are not supported on the SPARC architecture. Because of 
> this, Spark applications fail with a core dump on SPARC machines whenever an 
> unaligned access occurs. 






[jira] [Updated] (SPARK-16962) Unsafe accesses (Platform.getLong()) not supported on unaligned boundaries in SPARC/Solaris

2016-08-10 Thread Suman Somasundar (JIRA)

 [ https://issues.apache.org/jira/browse/SPARK-16962?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Suman Somasundar updated SPARK-16962:
-
Component/s: SQL
 Spark Core

> Unsafe accesses (Platform.getLong()) not supported on unaligned boundaries in 
> SPARC/Solaris
> ---
>
> Key: SPARK-16962
> URL: https://issues.apache.org/jira/browse/SPARK-16962
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core, SQL
>Affects Versions: 2.0.0
> Environment: SPARC/Solaris
>Reporter: Suman Somasundar
>
> Unaligned accesses are not supported on the SPARC architecture. Because of 
> this, Spark applications fail with a core dump on SPARC machines whenever an 
> unaligned access occurs. 






[jira] [Commented] (SPARK-16962) Unsafe accesses (Platform.getLong()) not supported on unaligned boundaries in SPARC/Solaris

2016-08-09 Thread Suman Somasundar (JIRA)

[ https://issues.apache.org/jira/browse/SPARK-16962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15413970#comment-15413970 ]

Suman Somasundar commented on SPARK-16962:
--

We are working on a fix for this issue. I will submit a patch soon.

> Unsafe accesses (Platform.getLong()) not supported on unaligned boundaries in 
> SPARC/Solaris
> ---
>
> Key: SPARK-16962
> URL: https://issues.apache.org/jira/browse/SPARK-16962
> Project: Spark
>  Issue Type: Bug
>Affects Versions: 2.0.0
> Environment: SPARC/Solaris
>Reporter: Suman Somasundar
>
> Unaligned accesses are not supported on the SPARC architecture. Because of 
> this, Spark applications fail with a core dump on SPARC machines whenever an 
> unaligned access occurs. 






[jira] [Created] (SPARK-16962) Unsafe accesses (Platform.getLong()) not supported on unaligned boundaries in SPARC/Solaris

2016-08-08 Thread Suman Somasundar (JIRA)
Suman Somasundar created SPARK-16962:


 Summary: Unsafe accesses (Platform.getLong()) not supported on 
unaligned boundaries in SPARC/Solaris
 Key: SPARK-16962
 URL: https://issues.apache.org/jira/browse/SPARK-16962
 Project: Spark
  Issue Type: Bug
Affects Versions: 2.0.0
 Environment: SPARC/Solaris
Reporter: Suman Somasundar


Unaligned accesses are not supported on the SPARC architecture. Because of 
this, Spark applications fail with a core dump on SPARC machines whenever an 
unaligned access occurs. 
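
A minimal sketch of the failure mode, using Spark's Platform wrapper over
sun.misc.Unsafe; the offset is chosen deliberately for illustration:

{code:scala}
import org.apache.spark.unsafe.Platform

// Sketch of the failure mode: an 8-byte read at an address that is not
// 8-byte aligned. On x86 this merely costs performance; on SPARC the
// access can raise SIGBUS and kill the JVM with a core dump.
val bytes = new Array[Byte](16)
val misaligned = Platform.BYTE_ARRAY_OFFSET + 1  // off by one byte on purpose
val v = Platform.getLong(bytes, misaligned)
{code}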


