[jira] [Commented] (SPARK-23187) Accumulator object can not be sent from Executor to Driver

2018-01-22 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23187?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16335428#comment-16335428
 ] 

Saisai Shao commented on SPARK-23187:
-

I will try to investigate it.

> Accumulator object can not be sent from Executor to Driver
> --
>
> Key: SPARK-23187
> URL: https://issues.apache.org/jira/browse/SPARK-23187
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.1, 2.3.0
>Reporter: Lantao Jin
>Priority: Major
>
> In Executor.scala's reportHeartBeat(), task metrics values cannot be sent to 
> the driver (on the receiving side all values are zero).
> I wrote a unit test to demonstrate this:
> {code}
> diff --git a/core/src/test/scala/org/apache/spark/rpc/netty/NettyRpcEnvSuite.scala b/core/src/test/scala/org/apache/spark/rpc/netty/NettyRpcEnvSuite.scala
> index f9481f8..57fb096 100644
> --- a/core/src/test/scala/org/apache/spark/rpc/netty/NettyRpcEnvSuite.scala
> +++ b/core/src/test/scala/org/apache/spark/rpc/netty/NettyRpcEnvSuite.scala
> @@ -17,11 +17,16 @@
>  package org.apache.spark.rpc.netty
> +import scala.collection.mutable.ArrayBuffer
> +
>  import org.scalatest.mockito.MockitoSugar
>  import org.apache.spark._
>  import org.apache.spark.network.client.TransportClient
>  import org.apache.spark.rpc._
> +import org.apache.spark.util.AccumulatorContext
> +import org.apache.spark.util.AccumulatorV2
> +import org.apache.spark.util.LongAccumulator
>  class NettyRpcEnvSuite extends RpcEnvSuite with MockitoSugar {
> @@ -83,5 +88,21 @@ class NettyRpcEnvSuite extends RpcEnvSuite with MockitoSugar {
>      assertRequestMessageEquals(
>        msg3,
>        RequestMessage(nettyEnv, client, msg3.serialize(nettyEnv)))
> +
> +    val acc = new LongAccumulator
> +    val sc = SparkContext.getOrCreate(new SparkConf().setMaster("local").setAppName("testAcc"))
> +    sc.register(acc, "testAcc")
> +    acc.setValue(1)
> +    // val msg4 = new RequestMessage(senderAddress, receiver, acc)
> +    // assertRequestMessageEquals(
> +    //   msg4,
> +    //   RequestMessage(nettyEnv, client, msg4.serialize(nettyEnv)))
> +
> +    val accbuf = new ArrayBuffer[AccumulatorV2[_, _]]()
> +    accbuf += acc
> +    val msg5 = new RequestMessage(senderAddress, receiver, accbuf)
> +    assertRequestMessageEquals(
> +      msg5,
> +      RequestMessage(nettyEnv, client, msg5.serialize(nettyEnv)))
>    }
>  }
> {code}
> Both msg4 and msg5 fail.
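
To look at just the serialization behavior outside the RPC suite, here is a minimal, self-contained sketch (my own illustration, not part of the ticket or the patch above) that round-trips a registered LongAccumulator through Spark's JavaSerializer on the driver; whether the restored value keeps the 1 or comes back as zero is exactly the behavior this ticket questions.

{code}
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.serializer.JavaSerializer
import org.apache.spark.util.LongAccumulator

object AccumulatorRoundTrip {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setMaster("local").setAppName("acc-roundtrip")
    val sc = SparkContext.getOrCreate(conf)

    val acc = new LongAccumulator
    sc.register(acc, "testAcc")   // registration marks the accumulator as driver-side
    acc.setValue(1L)

    // Round-trip through Spark's JavaSerializer, roughly what happens when an
    // RPC message carrying accumulators is serialized on the driver.
    val ser = new JavaSerializer(conf).newInstance()
    val restored = ser.deserialize[LongAccumulator](ser.serialize(acc))

    println(s"before: ${acc.value}, after round-trip: ${restored.value}")
    sc.stop()
  }
}
{code}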



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23187) Accumulator object can not be sent from Executor to Driver

2018-01-22 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23187?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-23187:

Affects Version/s: (was: 2.3.1)
   2.3.0

> Accumulator object can not be sent from Executor to Driver
> --
>
> Key: SPARK-23187
> URL: https://issues.apache.org/jira/browse/SPARK-23187
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.1, 2.3.0
>Reporter: Lantao Jin
>Priority: Major
>
> In Executor.scala's reportHeartBeat(), task metrics values cannot be sent to 
> the driver (on the receiving side all values are zero).
> I wrote a unit test to demonstrate this:
> {code}
> diff --git a/core/src/test/scala/org/apache/spark/rpc/netty/NettyRpcEnvSuite.scala b/core/src/test/scala/org/apache/spark/rpc/netty/NettyRpcEnvSuite.scala
> index f9481f8..57fb096 100644
> --- a/core/src/test/scala/org/apache/spark/rpc/netty/NettyRpcEnvSuite.scala
> +++ b/core/src/test/scala/org/apache/spark/rpc/netty/NettyRpcEnvSuite.scala
> @@ -17,11 +17,16 @@
>  package org.apache.spark.rpc.netty
> +import scala.collection.mutable.ArrayBuffer
> +
>  import org.scalatest.mockito.MockitoSugar
>  import org.apache.spark._
>  import org.apache.spark.network.client.TransportClient
>  import org.apache.spark.rpc._
> +import org.apache.spark.util.AccumulatorContext
> +import org.apache.spark.util.AccumulatorV2
> +import org.apache.spark.util.LongAccumulator
>  class NettyRpcEnvSuite extends RpcEnvSuite with MockitoSugar {
> @@ -83,5 +88,21 @@ class NettyRpcEnvSuite extends RpcEnvSuite with MockitoSugar {
>      assertRequestMessageEquals(
>        msg3,
>        RequestMessage(nettyEnv, client, msg3.serialize(nettyEnv)))
> +
> +    val acc = new LongAccumulator
> +    val sc = SparkContext.getOrCreate(new SparkConf().setMaster("local").setAppName("testAcc"))
> +    sc.register(acc, "testAcc")
> +    acc.setValue(1)
> +    // val msg4 = new RequestMessage(senderAddress, receiver, acc)
> +    // assertRequestMessageEquals(
> +    //   msg4,
> +    //   RequestMessage(nettyEnv, client, msg4.serialize(nettyEnv)))
> +
> +    val accbuf = new ArrayBuffer[AccumulatorV2[_, _]]()
> +    accbuf += acc
> +    val msg5 = new RequestMessage(senderAddress, receiver, accbuf)
> +    assertRequestMessageEquals(
> +      msg5,
> +      RequestMessage(nettyEnv, client, msg5.serialize(nettyEnv)))
>    }
>  }
> {code}
> Both msg4 and msg5 fail.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-22838) Avoid unnecessary copying of data

2018-01-21 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22838?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao resolved SPARK-22838.
-
Resolution: Invalid

> Avoid unnecessary copying of data
> -
>
> Key: SPARK-22838
> URL: https://issues.apache.org/jira/browse/SPARK-22838
> Project: Spark
>  Issue Type: Improvement
>  Components: Spark Core
>Affects Versions: 2.2.1
>Reporter: Xianyang Liu
>Priority: Major
>
> If we read data from a FileChannel into a HeapByteBuffer, the data has to be 
> copied from off-heap to on-heap memory, as you can see in the following JDK code:
> {code:java}
> static int read(FileDescriptor var0, ByteBuffer var1, long var2, 
> NativeDispatcher var4) throws IOException {
> if(var1.isReadOnly()) {
>   throw new IllegalArgumentException("Read-only buffer");
> } else if(var1 instanceof DirectBuffer) {
>   return readIntoNativeBuffer(var0, var1, var2, var4);
> } else {
>   ByteBuffer var5 = Util.getTemporaryDirectBuffer(var1.remaining());
>   int var7;
>   try {
> int var6 = readIntoNativeBuffer(var0, var5, var2, var4);
> var5.flip();
> if(var6 > 0) {
>   var1.put(var5);
> }
> var7 = var6;
>   } finally {
> Util.offerFirstTemporaryDirectBuffer(var5);
>   }
>   return var7;
> }
>   }
> {code}
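
For comparison, a minimal sketch (not Spark code, just an illustration of the JDK behavior quoted above): reading into a direct buffer takes the DirectBuffer branch, so the read goes straight into native memory without the temporary-buffer copy. The file path argument is a placeholder.

{code}
import java.nio.ByteBuffer
import java.nio.channels.FileChannel
import java.nio.file.{Paths, StandardOpenOption}

object DirectReadSketch {
  def main(args: Array[String]): Unit = {
    // args(0): path to any local file, used only for illustration
    val channel = FileChannel.open(Paths.get(args(0)), StandardOpenOption.READ)
    try {
      // A direct buffer hits the `instanceof DirectBuffer` branch above, so the
      // JDK reads directly into native memory instead of staging the bytes in a
      // temporary direct buffer and copying them onto the heap.
      val buf = ByteBuffer.allocateDirect(64 * 1024)
      var read = channel.read(buf)
      while (read != -1) {
        buf.flip()
        // ... consume the bytes in buf here ...
        buf.clear()
        read = channel.read(buf)
      }
    } finally {
      channel.close()
    }
  }
}
{code}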



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-22976) Worker cleanup can remove running driver directories

2018-01-21 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao reassigned SPARK-22976:
---

Assignee: Russell Spitzer

> Worker cleanup can remove running driver directories
> 
>
> Key: SPARK-22976
> URL: https://issues.apache.org/jira/browse/SPARK-22976
> Project: Spark
>  Issue Type: Bug
>  Components: Deploy, Spark Core
>Affects Versions: 1.0.2
>Reporter: Russell Spitzer
>Assignee: Russell Spitzer
>Priority: Major
> Fix For: 2.3.0
>
>
> Spark Standalone worker cleanup finds directories to remove with a listFiles 
> command. This includes both application directories and driver directories 
> from cluster-mode-submitted applications.
> A directory is considered to not be part of a running app if the worker does 
> not have an executor with a matching ID.
> https://github.com/apache/spark/blob/v2.2.1/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala#L432
> {code}
>   val appIds = executors.values.map(_.appId).toSet
>   val isAppStillRunning = appIds.contains(appIdFromDir)
> {code}
> If a driver has been started on a node, but all of the executors are on other 
> nodes, the worker running the driver will always assume that the driver 
> directory is not part of a running app.
> Consider a two node spark cluster with Worker A and Worker B where each node 
> has a single core available. We submit our application in deploy mode 
> cluster, the driver begins running on Worker A while the Executor starts on B.
> Worker A has a cleanup triggered and looks and finds it has a directory
> {code}
> /var/lib/spark/worker/driver-20180105234824-
> {code}
> Worker A checks its executor list and finds no entries that match, since it 
> has no corresponding executors for this application. Worker A then removes the 
> directory even though the driver may still be actively running.
> I think this could be fixed by modifying line 432 to be
> {code}
>   val appIds = executors.values.map(_.appId).toSet ++ 
> drivers.values.map(_.driverId)
> {code}
> I'll run a test and submit a PR soon.
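
As an illustration only (simplified names, not the actual Worker code), the proposed check boils down to keeping a directory if its name matches either a running executor's appId or a running driver's driverId on this worker:

{code}
// Simplified sketch of the proposed cleanup check; ExecutorInfo/DriverInfo and
// shouldKeep are illustrative stand-ins, not the real Worker internals.
case class ExecutorInfo(appId: String)
case class DriverInfo(driverId: String)

object CleanupCheckSketch {
  def shouldKeep(dirName: String,
                 executors: Map[String, ExecutorInfo],
                 drivers: Map[String, DriverInfo]): Boolean = {
    val runningIds = executors.values.map(_.appId).toSet ++ drivers.values.map(_.driverId)
    runningIds.contains(dirName)
  }

  def main(args: Array[String]): Unit = {
    val executors = Map.empty[String, ExecutorInfo]                   // Worker A has no executors,
    val drivers = Map("d1" -> DriverInfo("driver-20180105234824-"))   // but it runs the driver
    println(shouldKeep("driver-20180105234824-", executors, drivers)) // true -> directory is kept
  }
}
{code}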



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-22976) Worker cleanup can remove running driver directories

2018-01-21 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22976?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao resolved SPARK-22976.
-
   Resolution: Fixed
Fix Version/s: 2.3.0

> Worker cleanup can remove running driver directories
> 
>
> Key: SPARK-22976
> URL: https://issues.apache.org/jira/browse/SPARK-22976
> Project: Spark
>  Issue Type: Bug
>  Components: Deploy, Spark Core
>Affects Versions: 1.0.2
>Reporter: Russell Spitzer
>Priority: Major
> Fix For: 2.3.0
>
>
> Spark Standalone worker cleanup finds directories to remove with a listFiles 
> command. This includes both application directories and driver directories 
> from cluster-mode-submitted applications.
> A directory is considered to not be part of a running app if the worker does 
> not have an executor with a matching ID.
> https://github.com/apache/spark/blob/v2.2.1/core/src/main/scala/org/apache/spark/deploy/worker/Worker.scala#L432
> {code}
>   val appIds = executors.values.map(_.appId).toSet
>   val isAppStillRunning = appIds.contains(appIdFromDir)
> {code}
> If a driver has been started on a node, but all of the executors are on other 
> nodes, the worker running the driver will always assume that the driver 
> directory is not part of a running app.
> Consider a two node spark cluster with Worker A and Worker B where each node 
> has a single core available. We submit our application in deploy mode 
> cluster, the driver begins running on Worker A while the Executor starts on B.
> Worker A has a cleanup triggered and looks and finds it has a directory
> {code}
> /var/lib/spark/worker/driver-20180105234824-
> {code}
> Worker A checks its executor list and finds no entries that match, since it 
> has no corresponding executors for this application. Worker A then removes the 
> directory even though the driver may still be actively running.
> I think this could be fixed by modifying line 432 to be
> {code}
>   val appIds = executors.values.map(_.appId).toSet ++ 
> drivers.values.map(_.driverId)
> {code}
> I'll run a test and submit a PR soon.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23097) Migrate text socket source

2018-01-18 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16331735#comment-16331735
 ] 

Saisai Shao commented on SPARK-23097:
-

Thanks, will do.

> Migrate text socket source
> --
>
> Key: SPARK-23097
> URL: https://issues.apache.org/jira/browse/SPARK-23097
> Project: Spark
>  Issue Type: Sub-task
>  Components: Structured Streaming
>Affects Versions: 2.4.0
>Reporter: Jose Torres
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23097) Migrate text socket source

2018-01-18 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16331715#comment-16331715
 ] 

Saisai Shao commented on SPARK-23097:
-

Hi [~joseph.torres], would you mind if I take a shot at this issue (in case you 
don't have a plan for it)? :)

> Migrate text socket source
> --
>
> Key: SPARK-23097
> URL: https://issues.apache.org/jira/browse/SPARK-23097
> Project: Spark
>  Issue Type: Sub-task
>  Components: Structured Streaming
>Affects Versions: 2.4.0
>Reporter: Jose Torres
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (LIVY-434) Livy 0.5-incubating Release

2018-01-18 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/LIVY-434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16331591#comment-16331591
 ] 

Saisai Shao commented on LIVY-434:
--

From the permissions side I think it is OK to do it, but I'm not sure whether 
Apache has restrictions on it. I think it should be fine.

AFAIK Sameer Agarwal, Spark 2.3's release manager, is also not a PMC member, so 
it should be fine to do so.

Please don't forget to publish your PGP key and add it to the file.

> Livy 0.5-incubating Release
> ---
>
> Key: LIVY-434
> URL: https://issues.apache.org/jira/browse/LIVY-434
> Project: Livy
>  Issue Type: Task
>Reporter: Alex Bozarth
>Priority: Major
>
> A location to track the 0.5 release process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (LIVY-434) Livy 0.5-incubating Release

2018-01-18 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/LIVY-434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16331580#comment-16331580
 ] 

Saisai Shao commented on LIVY-434:
--

LGTM.

> Livy 0.5-incubating Release
> ---
>
> Key: LIVY-434
> URL: https://issues.apache.org/jira/browse/LIVY-434
> Project: Livy
>  Issue Type: Task
>Reporter: Alex Bozarth
>Priority: Major
>
> A location to track the 0.5 release process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (LIVY-434) Livy 0.5-incubating Release

2018-01-18 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/LIVY-434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16331559#comment-16331559
 ] 

Saisai Shao commented on LIVY-434:
--

Sure, we can test with Spark 2.3. But my concern is that 2.3 is a big release 
after six months; there are a bunch of big changes that need to stabilize, so 
the release may not go smoothly.

> Livy 0.5-incubating Release
> ---
>
> Key: LIVY-434
> URL: https://issues.apache.org/jira/browse/LIVY-434
> Project: Livy
>  Issue Type: Task
>Reporter: Alex Bozarth
>Priority: Major
>
> A location to track the 0.5 release process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (LIVY-434) Livy 0.5-incubating Release

2018-01-18 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/LIVY-434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16331549#comment-16331549
 ] 

Saisai Shao commented on LIVY-434:
--

IMO we should have a timeline for the 0.5 release that does not depend heavily 
on Spark 2.3. If Spark 2.3 is released before that timeline, then we also 
support it; if not, we should not defer our release.

What do you think? [~ajbozarth] [~zjffdu].

> Livy 0.5-incubating Release
> ---
>
> Key: LIVY-434
> URL: https://issues.apache.org/jira/browse/LIVY-434
> Project: Livy
>  Issue Type: Task
>Reporter: Alex Bozarth
>Priority: Major
>
> A location to track the 0.5 release process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SPARK-23128) A new approach to do adaptive execution in Spark SQL

2018-01-18 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23128?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16331542#comment-16331542
 ] 

Saisai Shao commented on SPARK-23128:
-

I'm wondering if we need a SPIP for such changes?

> A new approach to do adaptive execution in Spark SQL
> 
>
> Key: SPARK-23128
> URL: https://issues.apache.org/jira/browse/SPARK-23128
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.3.0
>Reporter: Carson Wang
>Priority: Major
>
> SPARK-9850 proposed the basic idea of adaptive execution in Spark. In 
> DAGScheduler, a new API is added to support submitting a single map stage.  
> The current implementation of adaptive execution in Spark SQL supports 
> changing the reducer number at runtime. An Exchange coordinator is used to 
> determine the number of post-shuffle partitions for a stage that needs to 
> fetch shuffle data from one or multiple stages. The current implementation 
> adds ExchangeCoordinator while we are adding Exchanges. However there are 
> some limitations. First, it may cause additional shuffles that may decrease 
> the performance. We can see this from EnsureRequirements rule when it adds 
> ExchangeCoordinator.  Secondly, it is not a good idea to add 
> ExchangeCoordinators while we are adding Exchanges because we don’t have a 
> global picture of all shuffle dependencies of a post-shuffle stage. I.e. for 
> a 3-table join in a single stage, the same ExchangeCoordinator should be used 
> in three Exchanges, but currently two separate ExchangeCoordinators will be 
> added. Thirdly, with the current framework it is not easy to implement other 
> features in adaptive execution flexibly like changing the execution plan and 
> handling skewed join at runtime.
> We'd like to introduce a new way to do adaptive execution in Spark SQL and 
> address the limitations. The idea is described at 
> [https://docs.google.com/document/d/1mpVjvQZRAkD-Ggy6-hcjXtBPiQoVbZGe3dLnAKgtJ4k/edit?usp=sharing]



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23147) Stage page will throw exception when there's no complete tasks

2018-01-18 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-23147:

Description: 
The current stage page's task table will throw an exception when there are no 
completed tasks. To reproduce this issue, submit code like:

{code}
sc.parallelize(1 to 20, 20).map { i => Thread.sleep(1); i }.collect()
{code}

Then open the UI and click into stage details. Below is the screenshot.

 !Screen Shot 2018-01-18 at 8.50.08 PM.png! 

Digging into the code, I found that the current UI can only show completed 
tasks, which differs from the 2.2 code.


  was:
The current stage page's task table will throw an exception when there are no 
completed tasks. To reproduce this issue, submit code like:

{code}
sc.parallelize(1 to 20, 20).map { i => Thread.sleep(1); i }.collect()
{code}

Then open the UI and click into stage details.

Digging into the code, I found that the current UI can only show completed 
tasks, which differs from the 2.2 code.



> Stage page will throw exception when there's no complete tasks
> --
>
> Key: SPARK-23147
> URL: https://issues.apache.org/jira/browse/SPARK-23147
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.0
>Reporter: Saisai Shao
>Priority: Major
> Attachments: Screen Shot 2018-01-18 at 8.50.08 PM.png
>
>
> The current stage page's task table will throw an exception when there are no 
> completed tasks. To reproduce this issue, submit code like:
> {code}
> sc.parallelize(1 to 20, 20).map { i => Thread.sleep(1); i }.collect()
> {code}
> Then open the UI and click into stage details. Below is the screenshot.
>  !Screen Shot 2018-01-18 at 8.50.08 PM.png! 
> Digging into the code, I found that the current UI can only show completed 
> tasks, which differs from the 2.2 code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-23147) Stage page will throw exception when there's no complete tasks

2018-01-18 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-23147:
---

 Summary: Stage page will throw exception when there's no complete 
tasks
 Key: SPARK-23147
 URL: https://issues.apache.org/jira/browse/SPARK-23147
 Project: Spark
  Issue Type: Bug
  Components: Web UI
Affects Versions: 2.3.0
Reporter: Saisai Shao
 Attachments: Screen Shot 2018-01-18 at 8.50.08 PM.png

The current stage page's task table will throw an exception when there are no 
completed tasks. To reproduce this issue, submit code like:

{code}
sc.parallelize(1 to 20, 20).map { i => Thread.sleep(1); i }.collect()
{code}

Then open the UI and click into stage details.

Digging into the code, I found that the current UI can only show completed 
tasks, which differs from the 2.2 code.
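
For reference, a slight variant of the repro (the longer sleep is my own tweak so the stage reliably has zero completed tasks while you open its page):

{code}
import org.apache.spark.sql.SparkSession

object StagePageRepro {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("stage-page-repro")
      .getOrCreate()
    val sc = spark.sparkContext

    // While these tasks sleep, open http://localhost:4040, click into the stage,
    // and the task table has no completed tasks yet.
    sc.parallelize(1 to 20, 20).map { i => Thread.sleep(60000); i }.collect()

    spark.stop()
  }
}
{code}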




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23147) Stage page will throw exception when there's no complete tasks

2018-01-18 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-23147:

Attachment: Screen Shot 2018-01-18 at 8.50.08 PM.png

> Stage page will throw exception when there's no complete tasks
> --
>
> Key: SPARK-23147
> URL: https://issues.apache.org/jira/browse/SPARK-23147
> Project: Spark
>  Issue Type: Bug
>  Components: Web UI
>Affects Versions: 2.3.0
>Reporter: Saisai Shao
>Priority: Major
> Attachments: Screen Shot 2018-01-18 at 8.50.08 PM.png
>
>
> The current stage page's task table will throw an exception when there are no 
> completed tasks. To reproduce this issue, submit code like:
> {code}
> sc.parallelize(1 to 20, 20).map { i => Thread.sleep(1); i }.collect()
> {code}
> Then open the UI and click into stage details.
> Digging into the code, I found that the current UI can only show completed 
> tasks, which differs from the 2.2 code.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-23140) DataSourceV2Strategy is missing in HiveSessionStateBuilder

2018-01-17 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-23140:

Description: 
DataSourceV2Strategy is not added to HiveSessionStateBuilder's planner, which 
leads to an exception when running a continuous query:

{noformat}
ERROR ContinuousExecution: Query abc [id = 
5cb6404a-e907-4662-b5d7-20037ccd6947, runId = 
617b8dea-018e-4082-935e-98d98d473fdd] terminated with error
java.lang.AssertionError: assertion failed: No plan for WriteToDataSourceV2 
org.apache.spark.sql.execution.streaming.sources.ContinuousMemoryWriter@3dba6d7c
+- StreamingDataSourceV2Relation [timestamp#15, value#16L], 
org.apache.spark.sql.execution.streaming.continuous.RateStreamContinuousReader@62ceac53

at scala.Predef$.assert(Predef.scala:170)
at 
org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
at 
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:78)
at 
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:75)
at 
scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
at 
scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at 
scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1336)
at 
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:75)
at 
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:67)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at 
org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
at 
org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:84)
at 
org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:80)
at 
org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:89)
at 
org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:89)
at 
org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution$$anonfun$runContinuous$2.apply(ContinuousExecution.scala:221)
at 
org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution$$anonfun$runContinuous$2.apply(ContinuousExecution.scala:212)
at 
org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:271)
at 
org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
at 
org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution.runContinuous(ContinuousExecution.scala:212)
at 
org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution.runActivatedStream(ContinuousExecution.scala:94)
at 
org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:279)
at 
org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:189)
{noformat}
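
A minimal sketch of the kind of query that hits this path, assuming a 2.3-era build with Hive support on the classpath; the rate source, console sink, and trigger interval are my own choices for illustration (the trace above used a memory sink), not necessarily the exact query from the report:

{code}
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.streaming.Trigger

object ContinuousWithHiveSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[2]")
      .appName("continuous-with-hive")
      .enableHiveSupport()   // HiveSessionStateBuilder is used when Hive support is enabled
      .getOrCreate()

    // A continuous-mode query; with DataSourceV2Strategy missing from the
    // Hive planner, planning WriteToDataSourceV2 fails as shown above.
    val query = spark.readStream
      .format("rate")
      .load()
      .writeStream
      .format("console")
      .trigger(Trigger.Continuous("1 second"))
      .start()

    query.awaitTermination(10000)
    spark.stop()
  }
}
{code}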


> DataSourceV2Strategy is missing in HiveSessionStateBuilder
> --
>
> Key: SPARK-23140
> URL: https://issues.apache.org/jira/browse/SPARK-23140
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Structured Streaming
>Affects Versions: 2.3.0
>Reporter: Saisai Shao
>Priority: Major
>
> DataSourceV2Strategy is not added to HiveSessionStateBuilder's planner, 
> which leads to an exception when running a continuous query:
> {noformat}
> ERROR ContinuousExecution: Query abc [id = 
> 5cb6404a-e907-4662-b5d7-20037ccd6947, runId = 
> 617b8dea-018e-4082-935e-98d98d473fdd] terminated with error
> java.lang.AssertionError: assertion failed: No plan for WriteToDataSourceV2 
> org.apache.spark.sql.execution.streaming.sources.ContinuousMemoryWriter@3dba6d7c
> +- StreamingDataSourceV2Relation [timestamp#15, value#16L], 
> org.apache.spark.sql.execution.streaming.continuous.RateStreamContinuousReader@62ceac53
>   at scala.Predef$.assert(Predef.scala:170)
>   at 
> org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
>   at 
> org.apache.spark.sql.catalys

[jira] [Updated] (SPARK-23140) DataSourceV2Strategy is missing in HiveSessionStateBuilder

2018-01-17 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23140?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-23140:

Environment: (was: DataSourceV2Strategy is not added to 
HiveSessionStateBuilder's planner, which leads to an exception when running a 
continuous query:

{noformat}
ERROR ContinuousExecution: Query abc [id = 
5cb6404a-e907-4662-b5d7-20037ccd6947, runId = 
617b8dea-018e-4082-935e-98d98d473fdd] terminated with error
java.lang.AssertionError: assertion failed: No plan for WriteToDataSourceV2 
org.apache.spark.sql.execution.streaming.sources.ContinuousMemoryWriter@3dba6d7c
+- StreamingDataSourceV2Relation [timestamp#15, value#16L], 
org.apache.spark.sql.execution.streaming.continuous.RateStreamContinuousReader@62ceac53

at scala.Predef$.assert(Predef.scala:170)
at 
org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
at 
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:78)
at 
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:75)
at 
scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
at 
scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at 
scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1336)
at 
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:75)
at 
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:67)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at 
org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
at 
org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:84)
at 
org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:80)
at 
org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:89)
at 
org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:89)
at 
org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution$$anonfun$runContinuous$2.apply(ContinuousExecution.scala:221)
at 
org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution$$anonfun$runContinuous$2.apply(ContinuousExecution.scala:212)
at 
org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:271)
at 
org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
at 
org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution.runContinuous(ContinuousExecution.scala:212)
at 
org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution.runActivatedStream(ContinuousExecution.scala:94)
at 
org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:279)
at 
org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:189)
{noformat}
)

> DataSourceV2Strategy is missing in HiveSessionStateBuilder
> --
>
> Key: SPARK-23140
> URL: https://issues.apache.org/jira/browse/SPARK-23140
> Project: Spark
>  Issue Type: Bug
>  Components: SQL, Structured Streaming
>Affects Versions: 2.3.0
>Reporter: Saisai Shao
>Priority: Major
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-23140) DataSourceV2Strategy is missing in HiveSessionStateBuilder

2018-01-17 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-23140:
---

 Summary: DataSourceV2Strategy is missing in HiveSessionStateBuilder
 Key: SPARK-23140
 URL: https://issues.apache.org/jira/browse/SPARK-23140
 Project: Spark
  Issue Type: Bug
  Components: SQL, Structured Streaming
Affects Versions: 2.3.0
 Environment: DataSourceV2Strategy is not added to 
HiveSessionStateBuilder's planner, which leads to an exception when running a 
continuous query:

{noformat}
ERROR ContinuousExecution: Query abc [id = 
5cb6404a-e907-4662-b5d7-20037ccd6947, runId = 
617b8dea-018e-4082-935e-98d98d473fdd] terminated with error
java.lang.AssertionError: assertion failed: No plan for WriteToDataSourceV2 
org.apache.spark.sql.execution.streaming.sources.ContinuousMemoryWriter@3dba6d7c
+- StreamingDataSourceV2Relation [timestamp#15, value#16L], 
org.apache.spark.sql.execution.streaming.continuous.RateStreamContinuousReader@62ceac53

at scala.Predef$.assert(Predef.scala:170)
at 
org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
at 
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:78)
at 
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2$$anonfun$apply$2.apply(QueryPlanner.scala:75)
at 
scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
at 
scala.collection.TraversableOnce$$anonfun$foldLeft$1.apply(TraversableOnce.scala:157)
at scala.collection.Iterator$class.foreach(Iterator.scala:893)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
at 
scala.collection.TraversableOnce$class.foldLeft(TraversableOnce.scala:157)
at scala.collection.AbstractIterator.foldLeft(Iterator.scala:1336)
at 
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:75)
at 
org.apache.spark.sql.catalyst.planning.QueryPlanner$$anonfun$2.apply(QueryPlanner.scala:67)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at 
org.apache.spark.sql.catalyst.planning.QueryPlanner.plan(QueryPlanner.scala:93)
at 
org.apache.spark.sql.execution.QueryExecution.sparkPlan$lzycompute(QueryExecution.scala:84)
at 
org.apache.spark.sql.execution.QueryExecution.sparkPlan(QueryExecution.scala:80)
at 
org.apache.spark.sql.execution.QueryExecution.executedPlan$lzycompute(QueryExecution.scala:89)
at 
org.apache.spark.sql.execution.QueryExecution.executedPlan(QueryExecution.scala:89)
at 
org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution$$anonfun$runContinuous$2.apply(ContinuousExecution.scala:221)
at 
org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution$$anonfun$runContinuous$2.apply(ContinuousExecution.scala:212)
at 
org.apache.spark.sql.execution.streaming.ProgressReporter$class.reportTimeTaken(ProgressReporter.scala:271)
at 
org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:58)
at 
org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution.runContinuous(ContinuousExecution.scala:212)
at 
org.apache.spark.sql.execution.streaming.continuous.ContinuousExecution.runActivatedStream(ContinuousExecution.scala:94)
at 
org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:279)
at 
org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:189)
{noformat}

Reporter: Saisai Shao






--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23123) Unable to run Spark Job with Hadoop NameNode Federation using ViewFS

2018-01-16 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328364#comment-16328364
 ] 

Saisai Shao commented on SPARK-23123:
-

[~ste...@apache.org], any thoughts on this issue?

> Unable to run Spark Job with Hadoop NameNode Federation using ViewFS
> 
>
> Key: SPARK-23123
> URL: https://issues.apache.org/jira/browse/SPARK-23123
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 1.6.3
>Reporter: Nihar Nayak
>Priority: Major
>  Labels: Hadoop, Spark
>
> Added following to core-site.xml in order to make use of ViewFS in a NameNode 
> federated cluster. 
> {noformat}
> <property>
>   <name>fs.defaultFS</name>
>   <value>viewfs:///</value>
> </property>
> <property>
>   <name>fs.viewfs.mounttable.default.link./apps</name>
>   <value>hdfs://nameservice1/apps</value>
> </property>
> <property>
>   <name>fs.viewfs.mounttable.default.link./app-logs</name>
>   <value>hdfs://nameservice2/app-logs</value>
> </property>
> <property>
>   <name>fs.viewfs.mounttable.default.link./tmp</name>
>   <value>hdfs://nameservice2/tmp</value>
> </property>
> <property>
>   <name>fs.viewfs.mounttable.default.link./user</name>
>   <value>hdfs://nameservice2/user</value>
> </property>
> <property>
>   <name>fs.viewfs.mounttable.default.link./ns1/user</name>
>   <value>hdfs://nameservice1/user</value>
> </property>
> <property>
>   <name>fs.viewfs.mounttable.default.link./ns2/user</name>
>   <value>hdfs://nameservice2/user</value>
> </property>
> {noformat}
> Got the following error .
> {noformat}
> spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client 
> --num-executors 3 --driver-memory 512m --executor-memory 512m 
> --executor-cores 1 ${SPARK_HOME}/lib/spark-examples*.jar 10
> 18/01/17 02:14:45 INFO spark.SparkContext: Added JAR 
> file:/home/nayak/hdp26_c4000_stg/spark2/lib/spark-examples_2.11-2.1.1.2.6.2.0-205.jar
>  at spark://x:35633/jars/spark-examples_2.11-2.1.1.2.6.2.0-205.jar with 
> timestamp 1516155285534
> 18/01/17 02:14:46 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
> to rm2
> 18/01/17 02:14:46 INFO yarn.Client: Requesting a new application from cluster 
> with 26 NodeManagers
> 18/01/17 02:14:46 INFO yarn.Client: Verifying our application has not 
> requested more than the maximum memory capability of the cluster (13800 MB 
> per container)
> 18/01/17 02:14:46 INFO yarn.Client: Will allocate AM container, with 896 MB 
> memory including 384 MB overhead
> 18/01/17 02:14:46 INFO yarn.Client: Setting up container launch context for 
> our AM
> 18/01/17 02:14:46 INFO yarn.Client: Setting up the launch environment for our 
> AM container
> 18/01/17 02:14:46 INFO yarn.Client: Preparing resources for our AM container
> 18/01/17 02:14:46 INFO security.HDFSCredentialProvider: getting token for 
> namenode: viewfs:/user/nayak
> 18/01/17 02:14:46 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
> 22488202 for nayak on ha-hdfs:nameservice1
> 18/01/17 02:14:46 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 50 
> for nayak on ha-hdfs:nameservice2
> 18/01/17 02:14:47 INFO hive.metastore: Trying to connect to metastore with 
> URI thrift://:9083
> 18/01/17 02:14:47 INFO hive.metastore: Connected to metastore.
> 18/01/17 02:14:49 INFO security.HiveCredentialProvider: Get Token from hive 
> metastore: Kind: HIVE_DELEGATION_TOKEN, Service: , Ident: 00 29 6e 61 79 61 
> 6b 6e 69 68 61 72 72 61 30 31 40 53 54 47 32 30 30 30 2e 48 41 44 4f 4f 50 2e 
> 52 41 4b 55 54 45 4e 2e 43 4f 4d 04 68 69 76 65 00 8a 01 61 01 e5 be 03 8a 01 
> 61 25 f2 42 03 8d 02 21 bb 8e 02 b7
> 18/01/17 02:14:49 WARN yarn.Client: Neither spark.yarn.jars nor 
> spark.yarn.archive is set, falling back to uploading libraries under 
> SPARK_HOME.
> 18/01/17 02:14:50 INFO yarn.Client: Uploading resource 
> file:/tmp/spark-7498ee81-d22b-426e-9466-3a08f7c827b1/__spark_libs__6643608006679813597.zip
>  -> 
> viewfs:/user/nayak/.sparkStaging/application_1515035441414_275503/__spark_libs__6643608006679813597.zip
> 18/01/17 02:14:55 INFO yarn.Client: Uploading resource 
> file:/tmp/spark-7498ee81-d22b-426e-9466-3a08f7c827b1/__spark_conf__405432153902988742.zip
>  -> 
> viewfs:/user/nayak/.sparkStaging/application_1515035441414_275503/__spark_conf__.zip
> 18/01/17 02:14:55 INFO spark.SecurityManager: Changing view acls to: nayak
> 18/01/17 02:14:55 INFO spark.SecurityManager: Changing modify acls to: 
> nayak
> 18/01/17 02:14:55 INFO spark.SecurityManager: Changing view acls groups to:
> 18/01/17 02:14:55 INFO spark.SecurityManager: Changing modify acls groups to:
> 18/01/17 02:14:55 INFO spark.SecurityManager: SecurityManager: authentication 
> disabled; ui acls disabled; users  with view permissi

[jira] [Commented] (SPARK-23123) Unable to run Spark Job with Hadoop NameNode Federation using ViewFS

2018-01-16 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328361#comment-16328361
 ] 

Saisai Shao commented on SPARK-23123:
-

From the stack, it looks like the issue is in YARN communicating with HDFS to 
distribute application dependencies, so I'm guessing it is more of a YARN issue 
than a Spark one. Why don't you also create a YARN issue? The YARN community 
will have deeper insight into it.
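
One quick sanity check that can be run outside Spark/YARN (a sketch assuming the same core-site.xml is on the classpath): resolve a few viewfs paths and confirm they map to the expected backing nameservices before involving the job submission path.

{code}
import java.net.URI
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

object ViewFsMountCheck {
  def main(args: Array[String]): Unit = {
    val conf = new Configuration()   // picks up core-site.xml from the classpath
    val fs = FileSystem.get(new URI("viewfs:///"), conf)

    // Each path should resolve to hdfs://nameservice1 or hdfs://nameservice2
    // according to the mount table above.
    Seq("/user", "/tmp", "/apps").foreach { p =>
      println(s"$p -> ${fs.resolvePath(new Path(p))}")
    }
  }
}
{code}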

> Unable to run Spark Job with Hadoop NameNode Federation using ViewFS
> 
>
> Key: SPARK-23123
> URL: https://issues.apache.org/jira/browse/SPARK-23123
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 1.6.3
>Reporter: Nihar Nayak
>Priority: Major
>  Labels: Hadoop, Spark
>
> Added following to core-site.xml in order to make use of ViewFS in a NameNode 
> federated cluster. 
> {noformat}
> <property>
>   <name>fs.defaultFS</name>
>   <value>viewfs:///</value>
> </property>
> <property>
>   <name>fs.viewfs.mounttable.default.link./apps</name>
>   <value>hdfs://nameservice1/apps</value>
> </property>
> <property>
>   <name>fs.viewfs.mounttable.default.link./app-logs</name>
>   <value>hdfs://nameservice2/app-logs</value>
> </property>
> <property>
>   <name>fs.viewfs.mounttable.default.link./tmp</name>
>   <value>hdfs://nameservice2/tmp</value>
> </property>
> <property>
>   <name>fs.viewfs.mounttable.default.link./user</name>
>   <value>hdfs://nameservice2/user</value>
> </property>
> <property>
>   <name>fs.viewfs.mounttable.default.link./ns1/user</name>
>   <value>hdfs://nameservice1/user</value>
> </property>
> <property>
>   <name>fs.viewfs.mounttable.default.link./ns2/user</name>
>   <value>hdfs://nameservice2/user</value>
> </property>
> {noformat}
> Got the following error .
> {noformat}
> spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client 
> --num-executors 3 --driver-memory 512m --executor-memory 512m 
> --executor-cores 1 ${SPARK_HOME}/lib/spark-examples*.jar 10
> 18/01/17 02:14:45 INFO spark.SparkContext: Added JAR 
> file:/home/nayak/hdp26_c4000_stg/spark2/lib/spark-examples_2.11-2.1.1.2.6.2.0-205.jar
>  at spark://x:35633/jars/spark-examples_2.11-2.1.1.2.6.2.0-205.jar with 
> timestamp 1516155285534
> 18/01/17 02:14:46 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
> to rm2
> 18/01/17 02:14:46 INFO yarn.Client: Requesting a new application from cluster 
> with 26 NodeManagers
> 18/01/17 02:14:46 INFO yarn.Client: Verifying our application has not 
> requested more than the maximum memory capability of the cluster (13800 MB 
> per container)
> 18/01/17 02:14:46 INFO yarn.Client: Will allocate AM container, with 896 MB 
> memory including 384 MB overhead
> 18/01/17 02:14:46 INFO yarn.Client: Setting up container launch context for 
> our AM
> 18/01/17 02:14:46 INFO yarn.Client: Setting up the launch environment for our 
> AM container
> 18/01/17 02:14:46 INFO yarn.Client: Preparing resources for our AM container
> 18/01/17 02:14:46 INFO security.HDFSCredentialProvider: getting token for 
> namenode: viewfs:/user/nayak
> 18/01/17 02:14:46 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
> 22488202 for nayak on ha-hdfs:nameservice1
> 18/01/17 02:14:46 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 50 
> for nayak on ha-hdfs:nameservice2
> 18/01/17 02:14:47 INFO hive.metastore: Trying to connect to metastore with 
> URI thrift://:9083
> 18/01/17 02:14:47 INFO hive.metastore: Connected to metastore.
> 18/01/17 02:14:49 INFO security.HiveCredentialProvider: Get Token from hive 
> metastore: Kind: HIVE_DELEGATION_TOKEN, Service: , Ident: 00 29 6e 61 79 61 
> 6b 6e 69 68 61 72 72 61 30 31 40 53 54 47 32 30 30 30 2e 48 41 44 4f 4f 50 2e 
> 52 41 4b 55 54 45 4e 2e 43 4f 4d 04 68 69 76 65 00 8a 01 61 01 e5 be 03 8a 01 
> 61 25 f2 42 03 8d 02 21 bb 8e 02 b7
> 18/01/17 02:14:49 WARN yarn.Client: Neither spark.yarn.jars nor 
> spark.yarn.archive is set, falling back to uploading libraries under 
> SPARK_HOME.
> 18/01/17 02:14:50 INFO yarn.Client: Uploading resource 
> file:/tmp/spark-7498ee81-d22b-426e-9466-3a08f7c827b1/__spark_libs__6643608006679813597.zip
>  -> 
> viewfs:/user/nayak/.sparkStaging/application_1515035441414_275503/__spark_libs__6643608006679813597.zip
> 18/01/17 02:14:55 INFO yarn.Client: Uploading resource 
> file:/tmp/spark-7498ee81-d22b-426e-9466-3a08f7c827b1/__spark_conf__405432153902988742.zip
>  -> 
> viewfs:/user/nayak/.sparkStaging/application_1515035441414_275503/__spark_conf__.zip
> 18/01/17 02:14:55 INFO spark.SecurityManager: Changing view acls to: nayak
> 18/01/17 02:14:55 INFO spark.SecurityManager: Changing modify acls to: 
> nayak
> 18/01/17 02:14:55 INFO spark.Securi

[jira] [Commented] (SPARK-23123) Unable to run Spark Job with Hadoop NameNode Federation using ViewFS

2018-01-16 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23123?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328297#comment-16328297
 ] 

Saisai Shao commented on SPARK-23123:
-

From the stack, it looks like a Hadoop issue rather than a Spark issue.

> Unable to run Spark Job with Hadoop NameNode Federation using ViewFS
> 
>
> Key: SPARK-23123
> URL: https://issues.apache.org/jira/browse/SPARK-23123
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 1.6.3
>Reporter: Nihar Nayak
>Priority: Major
>  Labels: Hadoop, Spark
>
> Added following to core-site.xml in order to make use of ViewFS in a NameNode 
> federated cluster. 
> {noformat}
> <property>
>   <name>fs.defaultFS</name>
>   <value>viewfs:///</value>
> </property>
> <property>
>   <name>fs.viewfs.mounttable.default.link./apps</name>
>   <value>hdfs://nameservice1/apps</value>
> </property>
> <property>
>   <name>fs.viewfs.mounttable.default.link./app-logs</name>
>   <value>hdfs://nameservice2/app-logs</value>
> </property>
> <property>
>   <name>fs.viewfs.mounttable.default.link./tmp</name>
>   <value>hdfs://nameservice2/tmp</value>
> </property>
> <property>
>   <name>fs.viewfs.mounttable.default.link./user</name>
>   <value>hdfs://nameservice2/user</value>
> </property>
> <property>
>   <name>fs.viewfs.mounttable.default.link./ns1/user</name>
>   <value>hdfs://nameservice1/user</value>
> </property>
> <property>
>   <name>fs.viewfs.mounttable.default.link./ns2/user</name>
>   <value>hdfs://nameservice2/user</value>
> </property>
> {noformat}
> Got the following error .
> {noformat}
> spark-submit --class org.apache.spark.examples.SparkPi --master yarn-client 
> --num-executors 3 --driver-memory 512m --executor-memory 512m 
> --executor-cores 1 ${SPARK_HOME}/lib/spark-examples*.jar 10
> 18/01/17 02:14:45 INFO spark.SparkContext: Added JAR 
> file:/home/nayak/hdp26_c4000_stg/spark2/lib/spark-examples_2.11-2.1.1.2.6.2.0-205.jar
>  at spark://x:35633/jars/spark-examples_2.11-2.1.1.2.6.2.0-205.jar with 
> timestamp 1516155285534
> 18/01/17 02:14:46 INFO client.ConfiguredRMFailoverProxyProvider: Failing over 
> to rm2
> 18/01/17 02:14:46 INFO yarn.Client: Requesting a new application from cluster 
> with 26 NodeManagers
> 18/01/17 02:14:46 INFO yarn.Client: Verifying our application has not 
> requested more than the maximum memory capability of the cluster (13800 MB 
> per container)
> 18/01/17 02:14:46 INFO yarn.Client: Will allocate AM container, with 896 MB 
> memory including 384 MB overhead
> 18/01/17 02:14:46 INFO yarn.Client: Setting up container launch context for 
> our AM
> 18/01/17 02:14:46 INFO yarn.Client: Setting up the launch environment for our 
> AM container
> 18/01/17 02:14:46 INFO yarn.Client: Preparing resources for our AM container
> 18/01/17 02:14:46 INFO security.HDFSCredentialProvider: getting token for 
> namenode: viewfs:/user/nayak
> 18/01/17 02:14:46 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 
> 22488202 for nayak on ha-hdfs:nameservice1
> 18/01/17 02:14:46 INFO hdfs.DFSClient: Created HDFS_DELEGATION_TOKEN token 50 
> for nayak on ha-hdfs:nameservice2
> 18/01/17 02:14:47 INFO hive.metastore: Trying to connect to metastore with 
> URI thrift://:9083
> 18/01/17 02:14:47 INFO hive.metastore: Connected to metastore.
> 18/01/17 02:14:49 INFO security.HiveCredentialProvider: Get Token from hive 
> metastore: Kind: HIVE_DELEGATION_TOKEN, Service: , Ident: 00 29 6e 61 79 61 
> 6b 6e 69 68 61 72 72 61 30 31 40 53 54 47 32 30 30 30 2e 48 41 44 4f 4f 50 2e 
> 52 41 4b 55 54 45 4e 2e 43 4f 4d 04 68 69 76 65 00 8a 01 61 01 e5 be 03 8a 01 
> 61 25 f2 42 03 8d 02 21 bb 8e 02 b7
> 18/01/17 02:14:49 WARN yarn.Client: Neither spark.yarn.jars nor 
> spark.yarn.archive is set, falling back to uploading libraries under 
> SPARK_HOME.
> 18/01/17 02:14:50 INFO yarn.Client: Uploading resource 
> file:/tmp/spark-7498ee81-d22b-426e-9466-3a08f7c827b1/__spark_libs__6643608006679813597.zip
>  -> 
> viewfs:/user/nayak/.sparkStaging/application_1515035441414_275503/__spark_libs__6643608006679813597.zip
> 18/01/17 02:14:55 INFO yarn.Client: Uploading resource 
> file:/tmp/spark-7498ee81-d22b-426e-9466-3a08f7c827b1/__spark_conf__405432153902988742.zip
>  -> 
> viewfs:/user/nayak/.sparkStaging/application_1515035441414_275503/__spark_conf__.zip
> 18/01/17 02:14:55 INFO spark.SecurityManager: Changing view acls to: nayak
> 18/01/17 02:14:55 INFO spark.SecurityManager: Changing modify acls to: 
> nayak
> 18/01/17 02:14:55 INFO spark.SecurityManager: Changing view acls groups to:
> 18/01/17 02:14:55 INFO spark.SecurityManager: Changing modify acls groups to:
> 18/01/17 02:14:55 INFO spark.SecurityManager: SecurityManager: authentication 
> disabled; ui acls disabled; 

[jira] [Commented] (LIVY-434) Livy 0.5-incubating Release

2018-01-16 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/LIVY-434?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16328089#comment-16328089
 ] 

Saisai Shao commented on LIVY-434:
--

Great, LGTM.:)

> Livy 0.5-incubating Release
> ---
>
> Key: LIVY-434
> URL: https://issues.apache.org/jira/browse/LIVY-434
> Project: Livy
>  Issue Type: Task
>Reporter: Alex Bozarth
>Priority: Major
>
> A location to track the 0.5 release process.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (SPARK-22958) Spark is stuck when the only one executor fails to register with driver

2018-01-15 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16325986#comment-16325986
 ] 

Saisai Shao commented on SPARK-22958:
-

Please see this line:

https://github.com/apache/spark/blob/9a96bfc8bf021cb4b6c62fac6ce1bcf87affcd43/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnAllocator.scala#L572

> Spark is stuck when the only one executor fails to register with driver
> ---
>
> Key: SPARK-22958
> URL: https://issues.apache.org/jira/browse/SPARK-22958
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.1.0
>Reporter: Shaoquan Zhang
>Priority: Major
> Attachments: How new executor is registered.png
>
>
> We have encountered the following scenario. We run a very simple job in yarn 
> cluster mode. This job needs only one executor to complete. In the running, 
> this job was stuck forever.
> After checking the job log, we found an issue in Spark. When an executor fails 
> to register with the driver, YarnAllocator has no way to know it. As a result, 
> the variable (numExecutorsRunning) maintained by YarnAllocator does not reflect 
> reality. When this variable is used to allocate resources to the running job, 
> the mismatch causes problems; for our job, it results in the job being stuck 
> forever.
> The details are as follows. The following figure shows how an executor is 
> allocated when the job starts to run. Now suppose only one executor is needed. 
> In the figure, steps 1, 2, and 3 show how the executor is allocated. After the 
> executor is allocated, it needs to register with the driver (step 4) and the 
> driver responds to it (step 5). After these 5 steps, the executor can be used 
> to run tasks.
> !How new executor is registered.png!
> In YarnAllocator, when step 3 is finished, it increases the variable 
> "numExecutorsRunning" by one, as shown in the following code.
> {code:java}
> def updateInternalState(): Unit = synchronized {
> // increase the numExecutorsRunning 
> numExecutorsRunning += 1
> executorIdToContainer(executorId) = container
> containerIdToExecutorId(container.getId) = executorId
> val containerSet = 
> allocatedHostToContainersMap.getOrElseUpdate(executorHostname,
>   new HashSet[ContainerId])
> containerSet += containerId
> allocatedContainerToHostMap.put(containerId, executorHostname)
>   }
>   if (numExecutorsRunning < targetNumExecutors) {
> if (launchContainers) {
>   launcherPool.execute(new Runnable {
> override def run(): Unit = {
>   try {
> new ExecutorRunnable(
>   Some(container),
>   conf,
>   sparkConf,
>   driverUrl,
>   executorId,
>   executorHostname,
>   executorMemory,
>   executorCores,
>   appAttemptId.getApplicationId.toString,
>   securityMgr,
>   localResources
> ).run()
> // step 3 is finished
> updateInternalState()
>   } catch {
> case NonFatal(e) =>
>   logError(s"Failed to launch executor $executorId on 
> container $containerId", e)
>   // Assigned container should be released immediately to 
> avoid unnecessary resource
>   // occupation.
>   amClient.releaseAssignedContainer(containerId)
>   }
> }
>   })
> } else {
>   // For test only
>   updateInternalState()
> }
>   } else {
> logInfo(("Skip launching executorRunnable as runnning Excecutors 
> count: %d " +
>   "reached target Executors count: %d.").format(numExecutorsRunning, 
> targetNumExecutors))
>   }
> {code}
>
> Imagine step 3 succeeds but step 4 fails for some reason (for example a network 
> fluctuation). The variable "numExecutorsRunning" is equal to 1, but in fact no 
> executor is running, so the variable does not reflect the real number of 
> running executors. Because the variable is equal to 1, YarnAllocator does not 
> allocate any new executor even though none is actually running. If a job only 
> needs one executor to complete, it will be stuck forever since no executor runs 
> its tasks. 
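
A toy model (my own, not Spark code) of the bookkeeping described above: the counter is bumped when the container launch succeeds (step 3), but nothing corrects it if the executor never registers with the driver (step 4), so the allocator never requests a replacement.

{code}
object AllocatorBookkeepingToy {
  def main(args: Array[String]): Unit = {
    val targetNumExecutors = 1
    var numExecutorsRunning = 0
    var registeredExecutors = 0

    // Step 3: the container launch succeeds, so the allocator bumps its counter.
    numExecutorsRunning += 1

    // Step 4: registration with the driver fails (e.g. a network hiccup),
    // but the allocator never learns about it.
    val registrationSucceeded = false
    if (registrationSucceeded) registeredExecutors += 1

    // The allocator sees no shortfall, so it never asks YARN for a replacement,
    // even though no executor can actually run tasks.
    val shouldRequestMore = numExecutorsRunning < targetNumExecutors
    println(s"running=$numExecutorsRunning registered=$registeredExecutors requestMore=$shouldRequestMore")
  }
}
{code}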



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23029) Setting spark.shuffle.file.buffer will make the shuffle fail

2018-01-12 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323872#comment-16323872
 ] 

Saisai Shao commented on SPARK-23029:
-

There should be no issue here, unless you want to fix a documentation issue...

> Setting spark.shuffle.file.buffer will make the shuffle fail
> 
>
> Key: SPARK-23029
> URL: https://issues.apache.org/jira/browse/SPARK-23029
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.1
>Reporter: Fernando Pereira
>
> When setting spark.shuffle.file.buffer, even to its default value, shuffles 
> fail.
> This appears to affect small to medium size partitions. Strangely the error 
> message is OutOfMemoryError, but it works with large partitions (at least 
> >32MB).
> {code}
> pyspark --conf "spark.shuffle.file.buffer=$((32*1024))"
> /gpfs/bbp.cscs.ch/scratch/gss/spykfunc/_sparkenv/lib/python2.7/site-packages/pyspark/bin/spark-submit
>  pyspark-shell-main --name PySparkShell --conf spark.shuffle.file.buffer=32768
> version 2.2.1
> >>> spark.range(1e7, numPartitions=10).sort("id").write.parquet("a", 
> >>> mode="overwrite")
> [Stage 1:>(0 + 10) / 
> 10]18/01/10 19:34:21 ERROR Executor: Exception in task 1.0 in stage 1.0 (TID 
> 11)
> java.lang.OutOfMemoryError: Java heap space
>   at java.io.BufferedOutputStream.(BufferedOutputStream.java:75)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter$ManualCloseBufferedOutputStream$1.(DiskBlockObjectWriter.scala:107)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.initialize(DiskBlockObjectWriter.scala:108)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:116)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:237)
>   at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:151)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>   at org.apache.spark.scheduler.Task.run(Task.scala:108)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23029) Setting spark.shuffle.file.buffer will make the shuffle fail

2018-01-12 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323871#comment-16323871
 ] 

Saisai Shao commented on SPARK-23029:
-

The total buffer size for one executor is {{32MB * C (cores) * R (reducers)}}. If you 
have 10 cores with 200 reducers (the default), the total buffer size is 32MB * 2000, 
which is roughly 64GB. That's probably why you hit this OOM even with 64GB of 
memory.
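
A quick back-of-the-envelope sketch of that arithmetic (the 10-core / 200-reducer figures are just the example numbers above, not measured values):

{code}
// spark.shuffle.file.buffer=32768 is parsed as KB, i.e. 32 MB per file handler.
val bufferPerHandlerMb = 32768 / 1024        // = 32 MB
val cores = 10                               // concurrently running tasks
val reducers = 200                           // shuffle partitions (default)
val totalMb = bufferPerHandlerMb.toLong * cores * reducers
println(s"$totalMb MB")                      // 64000 MB, i.e. roughly 64 GB
{code}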

> Setting spark.shuffle.file.buffer will make the shuffle fail
> 
>
> Key: SPARK-23029
> URL: https://issues.apache.org/jira/browse/SPARK-23029
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.1
>Reporter: Fernando Pereira
>
> When setting spark.shuffle.file.buffer, even to its default 
> value, shuffles fail.
> This appears to affect small to medium-sized partitions. Strangely, the error 
> message is OutOfMemoryError, but it works with large partitions (at least 
> >32MB).
> {code}
> pyspark --conf "spark.shuffle.file.buffer=$((32*1024))"
> /gpfs/bbp.cscs.ch/scratch/gss/spykfunc/_sparkenv/lib/python2.7/site-packages/pyspark/bin/spark-submit
>  pyspark-shell-main --name PySparkShell --conf spark.shuffle.file.buffer=32768
> version 2.2.1
> >>> spark.range(1e7, numPartitions=10).sort("id").write.parquet("a", 
> >>> mode="overwrite")
> [Stage 1:>(0 + 10) / 
> 10]18/01/10 19:34:21 ERROR Executor: Exception in task 1.0 in stage 1.0 (TID 
> 11)
> java.lang.OutOfMemoryError: Java heap space
>   at java.io.BufferedOutputStream.(BufferedOutputStream.java:75)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter$ManualCloseBufferedOutputStream$1.(DiskBlockObjectWriter.scala:107)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.initialize(DiskBlockObjectWriter.scala:108)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:116)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:237)
>   at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:151)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>   at org.apache.spark.scheduler.Task.run(Task.scala:108)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-21475) Change to use NIO's Files API for external shuffle service

2018-01-11 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-21475?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323590#comment-16323590
 ] 

Saisai Shao commented on SPARK-21475:
-

[~zsxwing] is 3.0.0 the valid fix version?

> Change to use NIO's Files API for external shuffle service
> --
>
> Key: SPARK-21475
> URL: https://issues.apache.org/jira/browse/SPARK-21475
> Project: Spark
>  Issue Type: Improvement
>  Components: Shuffle, Spark Core
>Affects Versions: 2.3.0
>Reporter: Saisai Shao
>Assignee: Saisai Shao
>Priority: Minor
> Fix For: 2.3.0, 3.0.0
>
>
> Java's {{FileInputStream}} and {{FileOutputStream}} override {{finalize()}}. 
> Even if the file input/output stream is closed correctly and promptly, it will 
> still leave a memory footprint that only gets cleaned up in a full GC. This 
> introduces two side effects:
> 1. Lots of Finalizer-related objects are kept in memory, which increases the 
> memory overhead. In our use case of the external 
> shuffle service, a busy shuffle service will have a bunch of these objects and 
> can potentially hit an OOM.
> 2. The Finalizer will only be called during a full GC, which increases the 
> overhead of the full GC and leads to long GC pauses.
> So to fix this potential issue, we propose to use NIO's 
> Files#newInput/OutputStream instead on some critical paths like shuffle.
> https://www.cloudbees.com/blog/fileinputstream-fileoutputstream-considered-harmful
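
For illustration, a minimal sketch (not the actual Spark patch) of the substitution being proposed; java.nio.file.Files#newInputStream returns a stream that does not override finalize(), unlike FileInputStream:

{code}
import java.io.{FileInputStream, InputStream}
import java.nio.file.{Files, Paths}

// Old style: each instance registers a Finalizer that is only cleared in a full GC.
def openOld(path: String): InputStream = new FileInputStream(path)

// Proposed style: channel-backed stream with no finalize(), so closing it
// releases everything immediately.
def openNew(path: String): InputStream = Files.newInputStream(Paths.get(path))
{code}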



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22958) Spark is stuck when the only one executor fails to register with driver

2018-01-11 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323569#comment-16323569
 ] 

Saisai Shao commented on SPARK-22958:
-

If an executor fails to register itself with the driver, it will exit after a 
timeout. In your case of Spark on YARN, the exit of the container will be detected 
by the NM and reported back to the RM and AM, and the AM will then readjust the 
running executor count and launch a new executor. So I suspect the issue you hit 
may not be exactly the same as described above.

> Spark is stuck when the only one executor fails to register with driver
> ---
>
> Key: SPARK-22958
> URL: https://issues.apache.org/jira/browse/SPARK-22958
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.1.0
>Reporter: Shaoquan Zhang
> Attachments: How new executor is registered.png
>
>
> We have encountered the following scenario. We run a very simple job in yarn 
> cluster mode. This job needs only one executor to complete. While running, 
> this job was stuck forever.
> After checking the job log, we found an issue in Spark. When an executor 
> fails to register with the driver, YarnAllocator has no way to know it. As a 
> result, the variable (numExecutorsRunning) maintained by YarnAllocator does 
> not reflect the truth. When this variable is used to allocate resources to 
> the running job, a mismatch happens. For our job, that 
> mismatch left it stuck forever.
> The details are as follows. The following figure shows how an executor is 
> allocated when the job starts to run. Now suppose only one executor is 
> needed. In the figure, steps 1, 2 and 3 show how the executor is allocated. After 
> the executor is allocated, it needs to register with the driver (step 4) and 
> the driver responds to it (step 5). After these 5 steps, the executor can be 
> used to run tasks.
> !How new executor is registered.png!
> In YarnAllocator, when step 3 is finished, it increases the variable 
> "numExecutorsRunning" by one, as shown in the following code.
> {code:java}
> def updateInternalState(): Unit = synchronized {
> // increase the numExecutorsRunning 
> numExecutorsRunning += 1
> executorIdToContainer(executorId) = container
> containerIdToExecutorId(container.getId) = executorId
> val containerSet = 
> allocatedHostToContainersMap.getOrElseUpdate(executorHostname,
>   new HashSet[ContainerId])
> containerSet += containerId
> allocatedContainerToHostMap.put(containerId, executorHostname)
>   }
>   if (numExecutorsRunning < targetNumExecutors) {
> if (launchContainers) {
>   launcherPool.execute(new Runnable {
> override def run(): Unit = {
>   try {
> new ExecutorRunnable(
>   Some(container),
>   conf,
>   sparkConf,
>   driverUrl,
>   executorId,
>   executorHostname,
>   executorMemory,
>   executorCores,
>   appAttemptId.getApplicationId.toString,
>   securityMgr,
>   localResources
> ).run()
> // step 3 is finished
> updateInternalState()
>   } catch {
> case NonFatal(e) =>
>   logError(s"Failed to launch executor $executorId on 
> container $containerId", e)
>   // Assigned container should be released immediately to 
> avoid unnecessary resource
>   // occupation.
>   amClient.releaseAssignedContainer(containerId)
>   }
> }
>   })
> } else {
>   // For test only
>   updateInternalState()
> }
>   } else {
> logInfo(("Skip launching executorRunnable as runnning Excecutors 
> count: %d " +
>   "reached target Executors count: %d.").format(numExecutorsRunning, 
> targetNumExecutors))
>   }
> {code}
>
> Imagine that step 3 succeeds but step 4 fails for some reason 
> (for example, a network fluctuation). The variable "numExecutorsRunning" is 
> equal to 1, but in fact no executor is running, so the variable 
> "numExecutorsRunning" does not reflect the real number of running executors. 
> Because the variable is equal to 1, YarnAllocator does not allocate any new 
> executor even though no executor is actually running. If a job only needs one 
> executor to complete, it will be stuck forever since no executor runs its tasks.

[jira] [Resolved] (SPARK-23029) Setting spark.shuffle.file.buffer will make the shuffle fail

2018-01-11 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-23029?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao resolved SPARK-23029.
-
Resolution: Not A Problem

> Setting spark.shuffle.file.buffer will make the shuffle fail
> 
>
> Key: SPARK-23029
> URL: https://issues.apache.org/jira/browse/SPARK-23029
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.1
>Reporter: Fernando Pereira
>
> When setting spark.shuffle.file.buffer, even to its default 
> value, shuffles fail.
> This appears to affect small to medium-sized partitions. Strangely, the error 
> message is OutOfMemoryError, but it works with large partitions (at least 
> >32MB).
> {code}
> pyspark --conf "spark.shuffle.file.buffer=$((32*1024))"
> /gpfs/bbp.cscs.ch/scratch/gss/spykfunc/_sparkenv/lib/python2.7/site-packages/pyspark/bin/spark-submit
>  pyspark-shell-main --name PySparkShell --conf spark.shuffle.file.buffer=32768
> version 2.2.1
> >>> spark.range(1e7, numPartitions=10).sort("id").write.parquet("a", 
> >>> mode="overwrite")
> [Stage 1:>(0 + 10) / 
> 10]18/01/10 19:34:21 ERROR Executor: Exception in task 1.0 in stage 1.0 (TID 
> 11)
> java.lang.OutOfMemoryError: Java heap space
>   at java.io.BufferedOutputStream.(BufferedOutputStream.java:75)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter$ManualCloseBufferedOutputStream$1.(DiskBlockObjectWriter.scala:107)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.initialize(DiskBlockObjectWriter.scala:108)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:116)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:237)
>   at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:151)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>   at org.apache.spark.scheduler.Task.run(Task.scala:108)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-23029) Setting spark.shuffle.file.buffer will make the shuffle fail

2018-01-11 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-23029?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16323496#comment-16323496
 ] 

Saisai Shao commented on SPARK-23029:
-

Please see the comments in the code in {{BypassMergeSortShuffleWriter}}:

{code}
// Use getSizeAsKb (not bytes) to maintain backwards compatibility if no 
units are provided
this.fileBufferSize = (int) conf.getSizeAsKb("spark.shuffle.file.buffer", 
"32k") * 1024;
{code}

With your configuration, this means a 32768KB buffer for each file handler; 
I think you need to set it to "32k" or "32" instead.

It's actually not a problem.
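
A small sketch of the difference, under the assumption (per the code comment above) that a bare number is interpreted as KB:

{code}
// What the user intended vs. what "32768" actually configures.
val intendedBytes   = 32 * 1024      // spark.shuffle.file.buffer=32k   -> 32 KB buffer
val configuredBytes = 32768 * 1024   // spark.shuffle.file.buffer=32768 -> 32768 KB = 32 MB buffer
assert(configuredBytes == intendedBytes * 1024)   // 1024x larger than intended
{code}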

> Setting spark.shuffle.file.buffer will make the shuffle fail
> 
>
> Key: SPARK-23029
> URL: https://issues.apache.org/jira/browse/SPARK-23029
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.1
>Reporter: Fernando Pereira
>
> When setting spark.shuffle.file.buffer, even to its default 
> value, shuffles fail.
> This appears to affect small to medium-sized partitions. Strangely, the error 
> message is OutOfMemoryError, but it works with large partitions (at least 
> >32MB).
> {code}
> pyspark --conf "spark.shuffle.file.buffer=$((32*1024))"
> /gpfs/bbp.cscs.ch/scratch/gss/spykfunc/_sparkenv/lib/python2.7/site-packages/pyspark/bin/spark-submit
>  pyspark-shell-main --name PySparkShell --conf spark.shuffle.file.buffer=32768
> version 2.2.1
> >>> spark.range(1e7, numPartitions=10).sort("id").write.parquet("a", 
> >>> mode="overwrite")
> [Stage 1:>(0 + 10) / 
> 10]18/01/10 19:34:21 ERROR Executor: Exception in task 1.0 in stage 1.0 (TID 
> 11)
> java.lang.OutOfMemoryError: Java heap space
>   at java.io.BufferedOutputStream.(BufferedOutputStream.java:75)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter$ManualCloseBufferedOutputStream$1.(DiskBlockObjectWriter.scala:107)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.initialize(DiskBlockObjectWriter.scala:108)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.open(DiskBlockObjectWriter.scala:116)
>   at 
> org.apache.spark.storage.DiskBlockObjectWriter.write(DiskBlockObjectWriter.scala:237)
>   at 
> org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:151)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:96)
>   at 
> org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
>   at org.apache.spark.scheduler.Task.run(Task.scala:108)
>   at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:338)
>   at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
>   at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
>   at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22921) Merge script should prompt for assigning jiras

2018-01-10 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22921?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16321669#comment-16321669
 ] 

Saisai Shao commented on SPARK-22921:
-

Hi [~irashid], it looks like the changes will throw an exception when the assignee 
is not yet a contributor. Please see the stack trace below.

{code}
Traceback (most recent call last):
  File "./dev/merge_spark_pr.py", line 501, in 
main()
  File "./dev/merge_spark_pr.py", line 487, in main
resolve_jira_issues(title, merged_refs, jira_comment)
  File "./dev/merge_spark_pr.py", line 327, in resolve_jira_issues
resolve_jira_issue(merge_branches, comment, jira_id)
  File "./dev/merge_spark_pr.py", line 245, in resolve_jira_issue
cur_assignee = choose_jira_assignee(issue, asf_jira)
  File "./dev/merge_spark_pr.py", line 317, in choose_jira_assignee
asf_jira.assign_issue(issue.key, assignee.key)
  File "/Library/Python/2.7/site-packages/jira/client.py", line 108, in wrapper
result = func(*arg_list, **kwargs)
{code}

> Merge script should prompt for assigning jiras
> --
>
> Key: SPARK-22921
> URL: https://issues.apache.org/jira/browse/SPARK-22921
> Project: Spark
>  Issue Type: Improvement
>  Components: Project Infra
>Affects Versions: 2.3.0
>Reporter: Imran Rashid
>Assignee: Imran Rashid
>Priority: Trivial
> Fix For: 2.3.0
>
>
> It's a bit of a nuisance to have to go into JIRA to assign the issue when you 
> merge a PR. In general you assign it to either the original reporter or a 
> commenter; it would be nice if the merge script made that easy to do.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-22587) Spark job fails if fs.defaultFS and application jar are different url

2018-01-10 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao resolved SPARK-22587.
-
Resolution: Fixed

> Spark job fails if fs.defaultFS and application jar are different url
> -
>
> Key: SPARK-22587
> URL: https://issues.apache.org/jira/browse/SPARK-22587
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 1.6.3
>Reporter: Prabhu Joseph
>Assignee: Mingjie Tang
>
> A Spark job fails if fs.defaultFS and the URL where the application jar resides are 
> different but have the same scheme:
> spark-submit  --conf spark.master=yarn-cluster wasb://XXX/tmp/test.py
> In core-site.xml, fs.defaultFS is set to wasb:///YYY. Hadoop listing (hadoop 
> fs -ls) works for both URLs, XXX and YYY.
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: 
> wasb://XXX/tmp/test.py, expected: wasb://YYY 
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:665) 
> at 
> org.apache.hadoop.fs.azure.NativeAzureFileSystem.checkPath(NativeAzureFileSystem.java:1251)
>  
> at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:485) 
> at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:396) 
> at 
> org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:507)
>  
> at 
> org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:660) 
> at 
> org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:912)
>  
> at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:172) 
> at org.apache.spark.deploy.yarn.Client.run(Client.scala:1248) 
> at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1307) 
> at org.apache.spark.deploy.yarn.Client.main(Client.scala) 
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
> at java.lang.reflect.Method.invoke(Method.java:498) 
> at 
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:751)
>  
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187) 
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212) 
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126) 
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 
> {code}
> The code in Client.copyFileToRemote tries to resolve the path of the application jar 
> (XXX) against the FileSystem object created from the fs.defaultFS URL (YYY) instead 
> of the actual URL of the application jar.
> val destFs = destDir.getFileSystem(hadoopConf)
> val srcFs = srcPath.getFileSystem(hadoopConf)
> getFileSystem creates the filesystem based on the URL of the path, so 
> this part is fine. But the lines below try to qualify the srcPath (XXX URL) 
> against the destFs (YYY URL), and so it fails.
> var destPath = srcPath
> val qualifiedDestPath = destFs.makeQualified(destPath)
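
A minimal sketch (not the actual Client.scala code; the staging path below is hypothetical and the wasb scheme needs the Azure filesystem on the classpath) of why the qualification step throws, and why qualifying against the source filesystem would not:

{code}
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

val hadoopConf = new Configuration()
val srcPath = new Path("wasb://XXX/tmp/test.py")
val destDir = new Path("wasb://YYY/user/spark/.sparkStaging/app_0001")  // hypothetical

val destFs = destDir.getFileSystem(hadoopConf)   // filesystem for wasb://YYY
val srcFs  = srcPath.getFileSystem(hadoopConf)   // filesystem for wasb://XXX

// destFs.makeQualified(srcPath)  // throws IllegalArgumentException: Wrong FS
srcFs.makeQualified(srcPath)      // fine: the jar is qualified against its own filesystem
{code}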



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-22587) Spark job fails if fs.defaultFS and application jar are different url

2018-01-10 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-22587:

Fix Version/s: 2.3.0

> Spark job fails if fs.defaultFS and application jar are different url
> -
>
> Key: SPARK-22587
> URL: https://issues.apache.org/jira/browse/SPARK-22587
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 1.6.3
>Reporter: Prabhu Joseph
>Assignee: Mingjie Tang
> Fix For: 2.3.0
>
>
> A Spark job fails if fs.defaultFS and the URL where the application jar resides are 
> different but have the same scheme:
> spark-submit  --conf spark.master=yarn-cluster wasb://XXX/tmp/test.py
> In core-site.xml, fs.defaultFS is set to wasb:///YYY. Hadoop listing (hadoop 
> fs -ls) works for both URLs, XXX and YYY.
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: 
> wasb://XXX/tmp/test.py, expected: wasb://YYY 
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:665) 
> at 
> org.apache.hadoop.fs.azure.NativeAzureFileSystem.checkPath(NativeAzureFileSystem.java:1251)
>  
> at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:485) 
> at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:396) 
> at 
> org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:507)
>  
> at 
> org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:660) 
> at 
> org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:912)
>  
> at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:172) 
> at org.apache.spark.deploy.yarn.Client.run(Client.scala:1248) 
> at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1307) 
> at org.apache.spark.deploy.yarn.Client.main(Client.scala) 
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
> at java.lang.reflect.Method.invoke(Method.java:498) 
> at 
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:751)
>  
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187) 
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212) 
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126) 
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 
> {code}
> The code in Client.copyFileToRemote tries to resolve the path of the application jar 
> (XXX) against the FileSystem object created from the fs.defaultFS URL (YYY) instead 
> of the actual URL of the application jar.
> val destFs = destDir.getFileSystem(hadoopConf)
> val srcFs = srcPath.getFileSystem(hadoopConf)
> getFileSystem creates the filesystem based on the URL of the path, so 
> this part is fine. But the lines below try to qualify the srcPath (XXX URL) 
> against the destFs (YYY URL), and so it fails.
> var destPath = srcPath
> val qualifiedDestPath = destFs.makeQualified(destPath)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-22587) Spark job fails if fs.defaultFS and application jar are different url

2018-01-10 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22587?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao reassigned SPARK-22587:
---

Assignee: Mingjie Tang

> Spark job fails if fs.defaultFS and application jar are different url
> -
>
> Key: SPARK-22587
> URL: https://issues.apache.org/jira/browse/SPARK-22587
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 1.6.3
>Reporter: Prabhu Joseph
>Assignee: Mingjie Tang
>
> A Spark job fails if fs.defaultFS and the URL where the application jar resides are 
> different but have the same scheme:
> spark-submit  --conf spark.master=yarn-cluster wasb://XXX/tmp/test.py
> In core-site.xml, fs.defaultFS is set to wasb:///YYY. Hadoop listing (hadoop 
> fs -ls) works for both URLs, XXX and YYY.
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: 
> wasb://XXX/tmp/test.py, expected: wasb://YYY 
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:665) 
> at 
> org.apache.hadoop.fs.azure.NativeAzureFileSystem.checkPath(NativeAzureFileSystem.java:1251)
>  
> at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:485) 
> at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:396) 
> at 
> org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:507)
>  
> at 
> org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:660) 
> at 
> org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:912)
>  
> at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:172) 
> at org.apache.spark.deploy.yarn.Client.run(Client.scala:1248) 
> at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1307) 
> at org.apache.spark.deploy.yarn.Client.main(Client.scala) 
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
> at java.lang.reflect.Method.invoke(Method.java:498) 
> at 
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:751)
>  
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187) 
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212) 
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126) 
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 
> {code}
> The code in Client.copyFileToRemote tries to resolve the path of the application jar 
> (XXX) against the FileSystem object created from the fs.defaultFS URL (YYY) instead 
> of the actual URL of the application jar.
> val destFs = destDir.getFileSystem(hadoopConf)
> val srcFs = srcPath.getFileSystem(hadoopConf)
> getFileSystem creates the filesystem based on the URL of the path, so 
> this part is fine. But the lines below try to qualify the srcPath (XXX URL) 
> against the destFs (YYY URL), and so it fails.
> var destPath = srcPath
> val qualifiedDestPath = destFs.makeQualified(destPath)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



Re: [DRAFT] Incubator PMC Board Report - January 2018

2018-01-08 Thread Saisai Shao
Sorry, I was on paternity leave for the last two weeks and didn't have time to check
it. Maybe you can also help with this next time.

Thanks
Jerry

2018-01-09 4:32 GMT+08:00 Alex Bozarth :

> Saw that we didn't report this month, any reason why, or was everyone just
> on holiday?
>
>
> *Alex Bozarth*
> Software Engineer
> Spark Technology Center
> --
> *E-mail:* *ajboz...@us.ibm.com* 
> *GitHub: **github.com/ajbozarth* 
>
>
> 505 Howard Street
> 
> San Francisco, CA 94105
> 
> United States
> 
>
>
>
>
> From: "John D. Ament" 
> To: "gene...@incubator.apache.org" 
> Cc: "d...@annotator.apache.org" ,
> d...@crail.apache.org, "d...@htrace.incubator.apache.org" <
> d...@htrace.incubator.apache.org>, "dev@livy.incubator.apache.org" <
> dev@livy.incubator.apache.org>, "d...@milagro.apache.org" <
> d...@milagro.apache.org>, "d...@myriad.incubator.apache.org" <
> d...@myriad.incubator.apache.org>, d...@sdap.apache.org,
> d...@senssoft.apache.org, "d...@spot.incubator.apache.org" <
> d...@spot.incubator.apache.org>, d...@airflow.apache.org, "
> odf-...@incubator.apache.org" 
> Date: 01/08/2018 11:06 AM
> Subject: [DRAFT] Incubator PMC Board Report - January 2018
> --
>
>
>
> All,
>
> Below is the current draft on the incubator report.  We have a high number
> of podlings not reporting.  They are CC'd in hopes they can report.
>
> Incubator PMC report for January 2018
>
> The Apache Incubator is the entry path into the ASF for projects and
> codebases wishing to become part of the Foundation's efforts.
>
> There are currently 53 podlings in the incubator.  We added two new podlings
> in December, and executed four podling releases.  Two new PMC members
> joined, in support of mentoring podlings.
>
> * Community
>
>  New IPMC members:
>
>  - Stefan Bodewig
>  - Carl Johan Erik Edstrom
>
>  People who left the IPMC:
>
>
>
> * New Podlings
>
>  - PLC4X
>  - SkyWalking
>
> * Podlings that failed to report, expected next month
>
>  - Annotator
>  - Crail
>  - HTrace
>  - Livy
>  - Milagro
>  - Myriad
>  - SDAP
>  - SensSoft
>  - Spot
>  - Wave
>
> * Podlings missing sign off
>
>  - Airflow
>  - ODF Toolkit
>
> * Graduations
>
>  The board has motions for the following:
>
>  - Your podling here?
>  - Your podling here?
>
> * Releases
>
>  The following releases entered distribution during the month of
>  December:
>
>  - 2017-12-03 Apache MXNet 1.0.0
>  - 2017-12-13 Apache BatchEE 0.5
>  - 2017-12-14 Apache Edgent 1.2.0
>  - 2017-12-17 Apache Pulsar 1.21.0
>
> * IP Clearance
>
>
>
> * Legal / Trademarks
>
>
>
> * Infrastructure
>
>
>
> * Miscellaneous
>
>
>
> * Credits
>
> --
>   Table of Contents
> Airflow
> Amaterasu
> Annotator
> BatchEE
> Crail
> DataFu
> FreeMarker
> Gobblin
> Gossip
> HAWQ
> HTrace
> Livy
> Milagro
> MXNet
> Myriad
> NetBeans
> ODF Toolkit
> PageSpeed
> PLC4X
> Pony Mail
> Rya
> SDAP
> SensSoft
> ServiceComb
> SkyWalking
> Spot
> Traffic Control
> Wave
> Weex
>
> --
>
> 
> Airflow
>
> Airflow is a workflow automation and scheduling system that can be used to
> author and manage data pipelines.
>
> Airflow has been incubating since 2016-03-31.
>
> Three most important issues to address in the move towards graduation:
>
>  1. We have completed 4 apache releases. We are nearing graduation!
>  2.
>  3.
>
> Any issues that the Incubator PMC (IPMC) or ASF Board wish/need to be
> aware of?
> None
>
>
> How has the community developed since the last report?
>  1. Since our last podling report 3 months ago (i.e. between Sept 25 & Dec
> 26, inclusive), we grew our contributors from 315 to 360
>  2. Since our last podling report 3 months ago (i.e. between Sept 25 & Dec
> 26, inclusive), we resolved 221 pull requests (currently at 2018 closed
> PRs)
>  3. Since being accepted into the incubator, the number of companies
> officially using Apache Airflow has risen from 30 to 123, 9 new from
> the last podling report.
>
>
> How has the project developed since the last report?
> See above : 221 PR resolved, 45 new contributors, & 9 new companies
> officially using it.
>
>
> How would you assess the podling's maturity?
> Please feel free to add your own commentary.
>
>  [ ] Initial setup
>  [ ] 

Re: Is Apache Spark-2.2.1 compatible with Hadoop-3.0.0

2018-01-07 Thread Saisai Shao
AFAIK, there's been no large-scale testing with Hadoop 3.0 in the community, so it
is not clear whether it is supported or not (or has some issues). I think on
the download page "Pre-Built for Apache Hadoop 2.7 and later" mostly
means that it supports Hadoop 2.7+ (2.8, ...), but not 3.0 (IIUC).

Thanks
Jerry

2018-01-08 4:50 GMT+08:00 Raj Adyanthaya :

> Hi Akshay
>
> On the Spark Download page when you select Spark 2.2.1 it gives you an
> option to select package type. In that, there is an option to select
> "Pre-Built for Apache Hadoop 2.7 and later". I am assuming it means that it
> does support Hadoop 3.0.
>
> http://spark.apache.org/downloads.html
>
> Thanks,
> Raj A.
>
> On Sat, Jan 6, 2018 at 8:23 PM, akshay naidu 
> wrote:
>
>> Hello users,
>> I need to know whether we can run the latest Spark on the latest Hadoop version,
>> i.e., spark-2.2.1 released on 1st Dec and hadoop-3.0.0 released on 13th Dec.
>> Thanks.
>>
>
>


[jira] [Assigned] (LIVY-429) Livy skips url fragment identifier in spark.yarn.dist.archives setting

2018-01-03 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao reassigned LIVY-429:


Assignee: Artem Plotnikov

> Livy skips url fragment identifier in spark.yarn.dist.archives setting
> --
>
> Key: LIVY-429
> URL: https://issues.apache.org/jira/browse/LIVY-429
> Project: Livy
>  Issue Type: Bug
>  Components: Server
>Affects Versions: 0.3
>Reporter: Artem Plotnikov
>Assignee: Artem Plotnikov
> Fix For: 0.5.0
>
>
> When submitting a Spark session via the Livy API with the following parameters:
> {code}
> {
>   "kind": "pyspark",
>   "name": "my-app",
>   "conf": {
>   "spark.yarn.appMasterEnv.PYSPARK_PYTHON": 
> "./ENVS/custom-python/bin/python"
>   },
>   "archives": ["/path/to/custom-python.zip#ENVS"]
> }
> {code}
> Spark session fails with:
> {code}
> java.io.IOException: Cannot run program "./ENVS/custom-python/bin/python": 
> error=2, No such file or directory
> {code}
> This happens because Livy uses the java.net.URI#getPath method under the hood, which 
> drops the URI fragment identifier, so Spark is submitted with 
> spark.yarn.dist.archives=/path/to/custom-python.zip and does not even 
> extract the archive in this case.
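
A minimal sketch of the java.net.URI behaviour described above: getPath() drops the fragment, which is exactly the part YARN needs as the link name for the archive.

{code}
import java.net.URI

val uri = new URI("/path/to/custom-python.zip#ENVS")
uri.getPath       // "/path/to/custom-python.zip"       -- fragment is dropped
uri.getFragment   // "ENVS"
uri.toString      // "/path/to/custom-python.zip#ENVS"  -- preserves the fragment
{code}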



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (LIVY-429) Livy skips url fragment identifier in spark.yarn.dist.archives setting

2018-01-03 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-429?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao resolved LIVY-429.
--
   Resolution: Fixed
Fix Version/s: 0.5.0

Issue resolved by pull request 71
[https://github.com/apache/incubator-livy/pull/71]

> Livy skips url fragment identifier in spark.yarn.dist.archives setting
> --
>
> Key: LIVY-429
> URL: https://issues.apache.org/jira/browse/LIVY-429
> Project: Livy
>  Issue Type: Bug
>  Components: Server
>Affects Versions: 0.3
>Reporter: Artem Plotnikov
> Fix For: 0.5.0
>
>
> When submitting a Spark session via the Livy API with the following parameters:
> {code}
> {
>   "kind": "pyspark",
>   "name": "my-app",
>   "conf": {
>   "spark.yarn.appMasterEnv.PYSPARK_PYTHON": 
> "./ENVS/custom-python/bin/python"
>   },
>   "archives": ["/path/to/custom-python.zip#ENVS"]
> }
> {code}
> Spark session fails with:
> {code}
> java.io.IOException: Cannot run program "./ENVS/custom-python/bin/python": 
> error=2, No such file or directory
> {code}
> This happens because Livy uses the java.net.URI#getPath method under the hood, which 
> drops the URI fragment identifier, so Spark is submitted with 
> spark.yarn.dist.archives=/path/to/custom-python.zip and does not even 
> extract the archive in this case.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SPARK-22876) spark.yarn.am.attemptFailuresValidityInterval does not work correctly

2018-01-03 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22876?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309363#comment-16309363
 ] 

Saisai Shao commented on SPARK-22876:
-

This looks like a bug; ideally we should get the number of attempts from the RM, not 
by calculating it from the attempt id.

> spark.yarn.am.attemptFailuresValidityInterval does not work correctly
> -
>
> Key: SPARK-22876
> URL: https://issues.apache.org/jira/browse/SPARK-22876
> Project: Spark
>  Issue Type: Bug
>  Components: YARN
>Affects Versions: 2.2.0
> Environment: hadoop version 2.7.3
>Reporter: Jinhan Zhong
>Priority: Minor
>
> I assume we can use spark.yarn.maxAppAttempts together with 
> spark.yarn.am.attemptFailuresValidityInterval to keep a long-running 
> application from stopping after an acceptable number of failures.
> But after testing, I found that the application always stops after failing n 
> times (n is the minimum of spark.yarn.maxAppAttempts and 
> yarn.resourcemanager.am.max-attempts from the client yarn-site.xml).
> For example, the following setup should allow the application master to fail 20 
> times.
> * spark.yarn.am.attemptFailuresValidityInterval=1s
> * spark.yarn.maxAppAttempts=20
> * yarn client: yarn.resourcemanager.am.max-attempts=20
> * yarn resource manager: yarn.resourcemanager.am.max-attempts=3
> And after checking the source code, I found in source file 
> ApplicationMaster.scala 
> https://github.com/apache/spark/blob/master/resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/ApplicationMaster.scala#L293
> there's a ShutdownHook that checks the attempt id against maxAppAttempts: 
> if attempt id >= maxAppAttempts, it will try to unregister the application 
> and the application will finish.
> Is this an expected design or a bug?
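
A minimal sketch (a hypothetical simplification, not the actual ApplicationMaster code) of the check described above; because the attempt id keeps growing, comparing it with maxAppAttempts ignores the validity interval entirely:

{code}
// Hypothetical simplification of the shutdown-hook condition.
def willUnregister(attemptId: Int, maxAppAttempts: Int): Boolean =
  attemptId >= maxAppAttempts   // true on the Nth attempt, even if all earlier
                                // failures fall outside attemptFailuresValidityInterval
{code}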



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Closed] (LIVY-428) Does livy support concurrency for centain spark session?

2018-01-02 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-428?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao closed LIVY-428.

Resolution: Duplicate

> Does livy support concurrency for centain spark session?
> 
>
> Key: LIVY-428
> URL: https://issues.apache.org/jira/browse/LIVY-428
> Project: Livy
>  Issue Type: Bug
>  Components: API
>Affects Versions: 0.4
> Environment: livy 0.4
> spark2.2.0-cloudera1
>Reporter: suheng.cloud
> Attachments: spark-fair.png
>
>
> To use Spark SQL concurrency with the internal fair scheduler, 
> we use the "conf" param in [post] /sessions/{id}/statements:
> {"spark.scheduler.mode":"FAIR","spark.scheduler.allocation.file":"/home/.../spark-schedule.xml"}
> spark-schedule.xml is as follows:
> 
>   
> FAIR
>   
> 
> The log shows that two jobs are indeed added into the fair pool:
> INFO scheduler.FairSchedulableBuilder: Added task set TaskSet_10.0 tasks to 
> pool FAIR
> ...
> INFO scheduler.FairSchedulableBuilder: Added task set TaskSet_11.0 tasks to 
> pool FAIR
> But from the Spark web console, the second job starts only after the first one finishes.
> I also tried with Zeppelin, where concurrency works well with Spark directly but 
> not through Livy.
> Did I miss some configuration? Thanks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (LIVY-428) Does livy support concurrency for centain spark session?

2018-01-02 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/LIVY-428?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309020#comment-16309020
 ] 

Saisai Shao commented on LIVY-428:
--

This issue is similar to LIVY-421. This is the current design choice and we 
don't have a good solution to fix it.

> Does livy support concurrency for centain spark session?
> 
>
> Key: LIVY-428
> URL: https://issues.apache.org/jira/browse/LIVY-428
> Project: Livy
>  Issue Type: Bug
>  Components: API
>Affects Versions: 0.4
> Environment: livy 0.4
> spark2.2.0-cloudera1
>Reporter: suheng.cloud
> Attachments: spark-fair.png
>
>
> To use Spark SQL concurrency with the internal fair scheduler, 
> we use the "conf" param in [post] /sessions/{id}/statements:
> {"spark.scheduler.mode":"FAIR","spark.scheduler.allocation.file":"/home/.../spark-schedule.xml"}
> spark-schedule.xml is as follows:
> 
>   
> FAIR
>   
> 
> The log shows that two jobs are indeed added into the fair pool:
> INFO scheduler.FairSchedulableBuilder: Added task set TaskSet_10.0 tasks to 
> pool FAIR
> ...
> INFO scheduler.FairSchedulableBuilder: Added task set TaskSet_11.0 tasks to 
> pool FAIR
> But from the Spark web console, the second job starts only after the first one finishes.
> I also tried with Zeppelin, where concurrency works well with Spark directly but 
> not through Livy.
> Did I miss some configuration? Thanks.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Can Spark jobs which are submitted via Livy run concurrently?

2017-12-13 Thread Saisai Shao
No, the current Livy interactive session doesn't support running statements
concurrently. Because Livy doesn't know whether multiple statements depend on
each other, running them concurrently could lead to unexpected results, so
submitted statements can only be run one by one.

Thanks
Jerry

2017-12-14 14:25 GMT+08:00 Keiji Yoshida :

> Hi Apache Livy developers,
>
> I would like to ask a question.
>
> Can Spark jobs which are submitted via Livy run concurrently on a single
> Spark application (= Livy session)?
>
> I set up Spark configuration in accordance with
> https://spark.apache.org/docs/2.1.1/job-scheduling.html#
> scheduling-within-an-application
> so that Spark jobs can run concurrently on a FAIR Spark scheduler and a
> FAIR Spark pool.
>
> However, when I submitted multiple Spark jobs (= Livy statements) by
> executing the following command several times, they just ran sequentially
> on a FAIR Spark pool.
>
> [command which I executed several times]
> curl -XPOST -H "Content-Type: application/json"
> mylivyhost.com:/sessions//statements -d '{"code":
> "sc.setLocalProperty(\"spark.scheduler.pool\",
> \"my-fair-pool\")\nspark.sql(\"select count(1) from mytable\").show()"}'
>
>> It seems to me that a single Livy session submits a Spark job (= Livy
>> statement) only after the previously submitted Spark job has
>> finished.
>
> Is my guess true? Can I make Spark jobs which are submitted via Livy run
> concurrently in some way?
>
> I'm using Livy 0.3.0.
>
> Thanks,
> Keiji Yoshida
>


[jira] [Commented] (LIVY-426) Livy stops session quietly without error message

2017-12-06 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/LIVY-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16281417#comment-16281417
 ] 

Saisai Shao commented on LIVY-426:
--

Sorry, I misunderstood your issue. What I meant is that I have also seen this Netty 
exception, but I haven't seen the "Livy session stops quietly" issue.

It seems that the AM is killed somehow, but I cannot figure out why. Did you find 
any clues in the AM log or the Livy server log?

> Livy stops session quietly without error message
> 
>
> Key: LIVY-426
> URL: https://issues.apache.org/jira/browse/LIVY-426
> Project: Livy
>  Issue Type: Bug
>Reporter: Jialiang Tan
>
> Livy stops interactive sessions quietly without any error messages. The YARN 
> log only shows RECEIVED SIGNAL TERM, which I think indicates that Livy wants to 
> terminate the driver in YARN. The Livy log only gives "Client RPC channel 
> closed unexpectedly". No other information was found.
> Spark Version: 2.0.1
> Scala Version: 2.11.8
> Livy Version: 0.4.0-incubating
> Zeppelin Version: 0.7.2
> Hadoop Version: 2.7.3
> YARN log:
> {code:java}
> 17/12/06 00:39:23 INFO storage.BlockManagerMaster: Removed 3 successfully in 
> removeExecutor
> 17/12/06 00:39:23 INFO spark.ExecutorAllocationManager: Existing executor 3 
> has been removed (new total is 0)
> 17/12/06 00:46:07 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL TERM
> 17/12/06 00:46:07 INFO spark.SparkContext: Invoking stop() from shutdown hook
> 17/12/06 00:46:07 INFO server.ServerConnector: Stopped 
> ServerConnector@690ca6ae{HTTP/1.1}{0.0.0.0:0}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@5959d99{/stages/stage/kill,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@5daa4dcf{/api,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@1f30b9f2{/,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@1a36836{/static,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@56c1c4df{/executors/threadDump/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@5ea39f64{/executors/threadDump,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@5be8acbc{/executors/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@17a813b2{/executors,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@48780245{/environment/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@66bc6e53{/environment,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@7b06e14a{/storage/rdd/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@58f0{/storage/rdd,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@4cfc52cb{/storage/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@731dd75e{/storage,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@1716986b{/stages/pool/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@54380417{/stages/pool,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@5b75d33{/stages/stage/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@24b81ae5{/stages/stage,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@68356b10{/stages/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@4422af95{/stages,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@39431dbc{/jobs/job/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@162b5d5e{/jobs/job,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@a84d7e6{/jobs/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler

[jira] [Commented] (LIVY-426) Livy stops session quietly without error message

2017-12-06 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/LIVY-426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16281335#comment-16281335
 ] 

Saisai Shao commented on LIVY-426:
--

I also encountered this exception before. Do you have a fix for this?

> Livy stops session quietly without error message
> 
>
> Key: LIVY-426
> URL: https://issues.apache.org/jira/browse/LIVY-426
> Project: Livy
>  Issue Type: Bug
>Reporter: Jialiang Tan
>
> Livy stops interactive sessions quietly without any error messages. The YARN 
> log only shows RECEIVED SIGNAL TERM, which I think indicates that Livy wants to 
> terminate the driver in YARN. The Livy log only gives "Client RPC channel 
> closed unexpectedly". No other information was found.
> Spark Version: 2.0.1
> Scala Version: 2.11.8
> Livy Version: 0.4.0-incubating
> Zeppelin Version: 0.7.2
> Hadoop Version: 2.7.3
> YARN log:
> {code:java}
> 17/12/06 00:39:23 INFO storage.BlockManagerMaster: Removed 3 successfully in 
> removeExecutor
> 17/12/06 00:39:23 INFO spark.ExecutorAllocationManager: Existing executor 3 
> has been removed (new total is 0)
> 17/12/06 00:46:07 ERROR yarn.ApplicationMaster: RECEIVED SIGNAL TERM
> 17/12/06 00:46:07 INFO spark.SparkContext: Invoking stop() from shutdown hook
> 17/12/06 00:46:07 INFO server.ServerConnector: Stopped 
> ServerConnector@690ca6ae{HTTP/1.1}{0.0.0.0:0}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@5959d99{/stages/stage/kill,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@5daa4dcf{/api,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@1f30b9f2{/,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@1a36836{/static,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@56c1c4df{/executors/threadDump/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@5ea39f64{/executors/threadDump,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@5be8acbc{/executors/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@17a813b2{/executors,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@48780245{/environment/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@66bc6e53{/environment,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@7b06e14a{/storage/rdd/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@58f0{/storage/rdd,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@4cfc52cb{/storage/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@731dd75e{/storage,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@1716986b{/stages/pool/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@54380417{/stages/pool,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@5b75d33{/stages/stage/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@24b81ae5{/stages/stage,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@68356b10{/stages/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@4422af95{/stages,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@39431dbc{/jobs/job/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@162b5d5e{/jobs/job,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@a84d7e6{/jobs/json,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO handler.ContextHandler: Stopped 
> o.s.j.s.ServletContextHandler@76d179b8{/jobs,null,UNAVAILABLE}
> 17/12/06 00:46:07 INFO ui.SparkUI: Stopped Spark web UI at 
> http://10.28.24.141:

Re: How to set driverMemory, driverCores, executorMemory using livy?

2017-12-06 Thread Saisai Shao
You should write them as *Spark configurations*, i.e. keys of the form spark.xxx.
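
For example, a sketch using standard Spark configuration keys (assuming the
session picks these up at creation time, and livyUrl as in your snippet):

new LivyClientBuilder()
    .setURI(new URI(livyUrl))
    .setConf("spark.driver.memory", "4g")
    .setConf("spark.executor.memory", "4g")
    .setConf("spark.driver.cores", "4")
    .build()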

2017-12-06 16:23 GMT+08:00 kant kodali :

> Does the below code look right?
>
> new LivyClientBuilder()
> .setURI(new URI(livyUrl))
> .setConf("driverMemory", "4g").setConf("executorMemory", 
> "4g").setConf("driverCores", "4").build();
>
>
> On Wed, Dec 6, 2017 at 12:20 AM, kant kodali  wrote:
>
>> Thanks much!
>>
>> On Wed, Dec 6, 2017 at 12:16 AM, Saisai Shao 
>> wrote:
>>
>>> Using this API "public LivyClientBuilder setConf(String key, String
>>> value)"  to set Spark configurations you wanted.
>>>
>>> 2017-12-06 15:34 GMT+08:00 kant kodali :
>>>
>>>> Hi All,
>>>>
>>>>
>>>> I do see POST /sessions API where I can pass driverMemory, driverCores,
>>>> executorMemory as part of the request body but I am using programmatic API
>>>> to submit upload the Jar and submit my job so how do I set values
>>>> for driverMemory, driverCores, executorMemory ?
>>>>
>>>> Thanks!
>>>>
>>>>
>>>
>>
>


Re: How to set driverMemory, driverCores, executorMemory using livy?

2017-12-06 Thread Saisai Shao
Use this API, "public LivyClientBuilder setConf(String key, String
value)", to set the Spark configurations you want.

2017-12-06 15:34 GMT+08:00 kant kodali :

> Hi All,
>
>
> I do see POST /sessions API where I can pass driverMemory, driverCores,
> executorMemory as part of the request body but I am using programmatic API
> to submit upload the Jar and submit my job so how do I set values
> for driverMemory, driverCores, executorMemory ?
>
> Thanks!
>
>


[jira] [Commented] (LIVY-423) Adding Scala 2.12 support

2017-12-04 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/LIVY-423?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16277918#comment-16277918
 ] 

Saisai Shao commented on LIVY-423:
--

Based on communication in the Spark community, Scala 2.12 support may not ship 
with Spark 2.3; it still has some pending issues. So this one can be postponed.

> Adding Scala 2.12 support
> -
>
> Key: LIVY-423
> URL: https://issues.apache.org/jira/browse/LIVY-423
> Project: Livy
>  Issue Type: New Feature
>  Components: Build, Core, REPL
>    Reporter: Saisai Shao
>
> Spark 2.3 already integrates Scala 2.12 support, and it will possibly 
> release 2.12 artifacts. So on the Livy side we should support a Scala 2.12 
> build and interpreter. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (LIVY-19) Add Spark SQL support

2017-12-04 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-19?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao reassigned LIVY-19:
---

Assignee: Saisai Shao

> Add Spark SQL support
> -
>
> Key: LIVY-19
> URL: https://issues.apache.org/jira/browse/LIVY-19
> Project: Livy
>  Issue Type: New Feature
>  Components: Core
>Affects Versions: 0.1
>Reporter: Erick Tryzelaar
>Assignee: Saisai Shao
>Priority: Minor
> Fix For: 0.5.0
>
>
> We should have a Spark SQL mode to better support workflows like 
> [Mojito|https://www.appsflyer.com/blog/meet-mojito-our-new-not-just-support-tool/#.VdsgUjlidaA.twitter].



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (LIVY-19) Add Spark SQL support

2017-12-04 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-19?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao resolved LIVY-19.
-
   Resolution: Fixed
Fix Version/s: 0.5.0

Issue resolved by pull request 68
[https://github.com/apache/incubator-livy/pull/68]

> Add Spark SQL support
> -
>
> Key: LIVY-19
> URL: https://issues.apache.org/jira/browse/LIVY-19
> Project: Livy
>  Issue Type: New Feature
>  Components: Core
>Affects Versions: 0.1
>Reporter: Erick Tryzelaar
>Priority: Minor
> Fix For: 0.5.0
>
>
> We should have a Spark SQL mode to better support workflows like 
> [Mojito|https://www.appsflyer.com/blog/meet-mojito-our-new-not-just-support-tool/#.VdsgUjlidaA.twitter].



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: How to use multiple programming languages in the same Spark context in Livy?

2017-12-03 Thread Saisai Shao
This feature is targeted for the Livy 0.5.0 community version, but we have already
back-ported it to HDP 2.6.3, so you can try this feature in HDP 2.6.3.

You can check this doc (
https://github.com/apache/incubator-livy/blob/master/docs/rest-api.md) to
see the API difference for this feature.

2017-12-03 9:55 GMT+08:00 Jeff Zhang :

>
> It is implemented in https://issues.apache.org/jira/browse/LIVY-194
>
> But it is not released in an Apache version; HDP back-ported it in their distribution
>
>
>
> 胡大为(David) 于2017年12月2日周六 上午10:58写道:
>
>> I forgot to add the link reference and here it is.
>>
>> https://hortonworks.com/blog/hdp-2-6-3-dataplane-service/
>>
>> Regards, Dawei
>>
>> On 2 Dec 2017, at 8:24 AM, 胡大为(David)  wrote:
>>
>>
>> Hi all,
>>
>> I was reading the HDP 2.6.3 release notes and it mentions that the Livy
>> service is able to support multiple programming languages in the same Spark
>> context, but I went through all the Livy documentation and examples I can find
>> and so far haven't found out how to get it to work. Currently I am using the
>> latest Livy 0.4 to submit Scala code only, and it would be awesome to mix it
>> with Python or R code in the same session. I would much appreciate it if anyone
>> could give me some clue about this.
>>
>> Thanks in advance and have a good day :)
>>
>> Regards, Dawei
>>
>>
>>


[jira] [Commented] (LIVY-104) Switch to scala 2.11

2017-12-03 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/LIVY-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16276172#comment-16276172
 ] 

Saisai Shao commented on LIVY-104:
--

The Livy server itself requires Spark artifacts to run its unit tests. If we changed 
the Scala version to 2.12, it would also require Spark Scala 2.12 dependencies, which 
are not available currently. So from my understanding a workable solution is to 
change to Scala 2.11, which is available from Spark 1.6 to 2.3.

> Switch to scala 2.11
> 
>
> Key: LIVY-104
> URL: https://issues.apache.org/jira/browse/LIVY-104
> Project: Livy
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.2
>Reporter: Marcelo Vanzin
>
> Livy should switch to Scala 2.11; 2.10 is EOL.
> Note this doesn't mean dropping support for Spark + 2.10; we should still 
> support running on Spark built for 2.10. This just affects the Livy server 
> itself.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (LIVY-421) support multi statements run at the same time

2017-12-01 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/LIVY-421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274174#comment-16274174
 ] 

Saisai Shao commented on LIVY-421:
--

bq. Adding concurrency for sql make sense for me as each sql statements should 
be independent.

Not exactly. For example: create table A, load data, select from table A. These 
three SQL queries should not be run concurrently.

Actually, even with one thread per interpreter there are still minor issues around 
shared variables, e.g. one interpreter registers a table and then another 
interpreter does some transformations on the registered table. 

It's really difficult for Livy to identify which kinds of statements can be run 
concurrently. I think for the resource under-utilization issue, dynamic allocation 
is your friend!
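
For reference, a minimal sketch of turning dynamic allocation on per session
through the creation request's conf map (the values are illustrative, and it
assumes the external shuffle service is already set up on the cluster):

{code}
POST /sessions
{
  "kind": "spark",
  "conf": {
    "spark.dynamicAllocation.enabled": "true",
    "spark.shuffle.service.enabled": "true",
    "spark.dynamicAllocation.minExecutors": "1",
    "spark.dynamicAllocation.maxExecutors": "10"
  }
}
{code}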

> support multi statements run at the same time
> -
>
> Key: LIVY-421
> URL: https://issues.apache.org/jira/browse/LIVY-421
> Project: Livy
>  Issue Type: Improvement
>  Components: REPL
>Affects Versions: 0.4
>Reporter: zhuweibin
>  Labels: performance
>
> In the v0.4.0-incubating implementation, the number of threads used to 
> execute statement code is hardcoded to one. When idle resources are 
> sufficient to execute multiple statements, a lot of resources are wasted and 
> statements are blocked from running. We are considering improvements in one of two ways:
> # 1. Add a configuration in livy.conf to globally control the number of 
> threads in interpreterExecutor for all repl/sessions to execute statements
> # 2. Add a REST API parameter at session creation to set the maximum number 
> of statements that can be run at the same time
> Hope to discuss this issue, thanks



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (LIVY-104) Switch to scala 2.11

2017-12-01 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/LIVY-104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16274123#comment-16274123
 ] 

Saisai Shao commented on LIVY-104:
--

Hi [~ajbozarth], would you please drive this issue if you have time? I'm 
thinking of shifting to 2.11 for Livy 0.5. Any thoughts?

> Switch to scala 2.11
> 
>
> Key: LIVY-104
> URL: https://issues.apache.org/jira/browse/LIVY-104
> Project: Livy
>  Issue Type: Improvement
>  Components: Core
>Affects Versions: 0.2
>Reporter: Marcelo Vanzin
>
> Livy should switch to Scala 2.11; 2.10 is EOL.
> Note this doesn't mean dropping support for Spark + 2.10; we should still 
> support running on Spark built for 2.10. This just affects the Livy server 
> itself.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Created] (LIVY-423) Adding Scala 2.12 support

2017-11-30 Thread Saisai Shao (JIRA)
Saisai Shao created LIVY-423:


 Summary: Adding Scala 2.12 support
 Key: LIVY-423
 URL: https://issues.apache.org/jira/browse/LIVY-423
 Project: Livy
  Issue Type: New Feature
  Components: Build, Core, REPL
Reporter: Saisai Shao


Spark 2.3 already integrates Scala 2.12 support and will possibly release 
2.12 artifacts, so on the Livy side we should support a Scala 2.12 build and 
interpreter. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SPARK-22393) spark-shell can't find imported types in class constructors, extends clause

2017-11-30 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22393?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16273985#comment-16273985
 ] 

Saisai Shao commented on SPARK-22393:
-

Shall we wait for this before 2.2.1 is out? 

> spark-shell can't find imported types in class constructors, extends clause
> ---
>
> Key: SPARK-22393
> URL: https://issues.apache.org/jira/browse/SPARK-22393
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Shell
>Affects Versions: 2.0.2, 2.1.2, 2.2.0
>Reporter: Ryan Williams
>Priority: Minor
>
> {code}
> $ spark-shell
> …
> scala> import org.apache.spark.Partition
> import org.apache.spark.Partition
> scala> class P(p: Partition)
> :11: error: not found: type Partition
>class P(p: Partition)
>   ^
> scala> class P(val index: Int) extends Partition
> :11: error: not found: type Partition
>class P(val index: Int) extends Partition
>^
> {code}
> Any class that I {{import}} gives "not found: type ___" when used as a 
> parameter to a class, or in an extends clause; this applies to classes I 
> import from JARs I provide via {{--jars}} as well as core Spark classes as 
> above.
> This worked in 1.6.3 but has been broken since 2.0.0.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



Re: Session Memory Management

2017-11-30 Thread Saisai Shao
Hi Lauren,

Thanks for the feedback. How did you identify this REPL memory issue? If it
is a problem of the Scala REPL, then I think we don't have many solutions
for it. Do you have any thoughts on it?

BTW, which version of Spark are you using?

@Jeff, if it is a problem of the Scala REPL, then I believe Zeppelin also
suffers from the same issue. Did you see any reports from the Zeppelin community?

Thanks
Jerry

2017-12-01 11:17 GMT+08:00 Lauren Spiegel :

> The spark driver. I think it is related to this since a session is a spark
> shell/scala REPL: https://issues.scala-lang.org/browse/SI-4331
>
>
> On Thu, Nov 30, 2017 at 6:11 PM, Jeff Zhang  wrote:
>
>>
>> Which component's memory keeps growing ? livy server or the spark driver ?
>>
>>
>> Lauren Spiegel wrote on Fri, 1 Dec 2017 at 8:55 AM:
>>
>>> Hi Livyers,
>>>
>>> When I have an active session for a couple hours, the memory keeps
>>> growing and I have to kill it. What do people do to manage ever-growing
>>> memory? Are there any plans to limit the memory growth so that sessions can
>>> be long living?
>>>
>>> Thank you
>>>
>>>
>>> 
>>
>>
>
>
> 


Re: Does Apache Livy support Spark Structured Streaming 2.2.0?

2017-11-28 Thread Saisai Shao
Livy doesn't support WebSocket. I think for your scenario you need to use
Livy's Job API instead of interactive queries; Livy doesn't push results back
to the client in real time. So this may be slightly different from what you want
(IIUC).
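
As a rough sketch of that route (assuming the Livy 0.4+ programmatic client API
and a Livy server at http://livy-host:8998, both placeholders), a Structured
Streaming query can be started from a Job; note that the query output stays on
the cluster side and only the job's return value comes back:

{code}
import java.net.URI

import org.apache.livy.{Job, JobContext, LivyClientBuilder}
import org.apache.spark.sql.SparkSession

// Runs on the cluster: start a trivial rate-source query writing to the console sink.
class StartRateQuery extends Job[String] {
  override def call(jc: JobContext): String = {
    val spark = jc.sparkSession[SparkSession]()
    val query = spark.readStream
      .format("rate")       // built-in test source (Spark 2.2+)
      .load()
      .writeStream
      .format("console")    // output is written on the cluster, not sent to the client
      .start()
    query.id.toString       // return a small identifier instead of the data
  }
}

object LivyStreamingExample {
  def main(args: Array[String]): Unit = {
    val client = new LivyClientBuilder().setURI(new URI("http://livy-host:8998")).build()
    try {
      // JobHandle extends java.util.concurrent.Future, so get() blocks until the job returns.
      val queryId = client.submit(new StartRateQuery).get()
      println(s"Started streaming query $queryId")
    } finally {
      client.stop(true)
    }
  }
}
{code}

In a real setup the jar containing the job class also has to be shipped to the
session first (e.g. via the client's uploadJar/addJar methods).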

2017-11-29 14:34 GMT+08:00 kant kodali :

> Nice! so if I submit a streaming query over REST can I get the results
> back over REST or Websocket?
>
> On Tue, Nov 28, 2017 at 10:29 PM, Saisai Shao 
> wrote:
>
>> Livy doesn't add any restrictions on how users use the Spark API, so of course
>> Structured Streaming is supported.
>>
>> 2017-11-29 14:21 GMT+08:00 kant kodali :
>>
>>> Hi All,
>>>
>>> Does Apache Livy support Spark Structured Streaming 2.2.0? If so, any
>>> examples please? preferably in Java.
>>>
>>> Thanks,
>>> kant
>>>
>>
>>
>


Re: Does Apache Livy support Spark Structured Streaming 2.2.0?

2017-11-28 Thread Saisai Shao
Livy doesn't add any restrictions on how users use the Spark API, so of course
Structured Streaming is supported.

2017-11-29 14:21 GMT+08:00 kant kodali :

> Hi All,
>
> Does Apache Livy support Spark Structured Streaming 2.2.0? If so, any
> examples please? preferably in Java.
>
> Thanks,
> kant
>


Re: Does anyone know how to build spark with scala12.4?

2017-11-28 Thread Saisai Shao
I see, thanks for your quick response.

Best regards,
Jerry

2017-11-29 10:45 GMT+08:00 Sean Owen :

> Until the 2.12 build passes tests, no. There is still a real outstanding
> issue with the closure cleaner and serialization of closures as Java 8
> lambdas. I haven't cracked it, and don't think it's simple, but not
> insurmountable.
>
> The funny thing is most stuff appears to just work without cleaning said
> lambdas, because they don't generally capture references in the same way.
> So it may be reasonable to advertise 2.12 support as experimental and for
> people willing to make their own build. That's why I wanted it in good
> enough shape that the scala-2.12 profile produces something basically
> functional.
>
> On Tue, Nov 28, 2017 at 8:43 PM Saisai Shao 
> wrote:
>
>> Hi Sean,
>>
>> Two questions about Scala 2.12 for release artifacts.
>>
>> Are we planning to ship 2.12 artifacts for Spark 2.3 release? If not,
>> will we only ship 2.11 artifacts?
>>
>> Thanks
>> Jerry
>>
>> 2017-11-28 21:51 GMT+08:00 Sean Owen :
>>
>>> The Scala 2.12 profile mostly works, but not all tests pass. Use
>>> -Pscala-2.12 on the command line to build.
>>>
>>> On Tue, Nov 28, 2017 at 5:36 AM Ofir Manor 
>>> wrote:
>>>
>>>> Hi,
>>>> as far as I know, Spark does not support Scala 2.12.
>>>> There is on-going work to make refactor / fix Spark source code to
>>>> support Scala 2.12 - look for multiple emails on this list in the last
>>>> months from Sean Owen on his progress.
>>>> Once Spark supports Scala 2.12, I think the next target would be JDK 9
>>>> support.
>>>>
>>>> Ofir Manor
>>>>
>>>> Co-Founder & CTO | Equalum
>>>>
>>>> Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io
>>>>
>>>> On Tue, Nov 28, 2017 at 9:20 AM, Zhang, Liyun 
>>>> wrote:
>>>>
>>>>> Hi all:
>>>>>
>>>>>   Does anyone know how to build spark with scala12.4? I want to test
>>>>> whether spark can work on jdk9 or not.  Scala12.4 supports jdk9.  Does
>>>>> anyone try to build spark with scala 12.4 or compile successfully with
>>>>> jdk9.Appreciate to get some feedback from you.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Best Regards
>>>>>
>>>>> Kelly Zhang/Zhang,Liyun
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>


Re: Does anyone know how to build spark with scala12.4?

2017-11-28 Thread Saisai Shao
Hi Sean,

Two questions about Scala 2.12 for release artifacts.

Are we planning to ship 2.12 artifacts for Spark 2.3 release? If not, will
we only ship 2.11 artifacts?

Thanks
Jerry

2017-11-28 21:51 GMT+08:00 Sean Owen :

> The Scala 2.12 profile mostly works, but not all tests pass. Use
> -Pscala-2.12 on the command line to build.
>
> On Tue, Nov 28, 2017 at 5:36 AM Ofir Manor  wrote:
>
>> Hi,
>> as far as I know, Spark does not support Scala 2.12.
>> There is on-going work to make refactor / fix Spark source code to
>> support Scala 2.12 - look for multiple emails on this list in the last
>> months from Sean Owen on his progress.
>> Once Spark supports Scala 2.12, I think the next target would be JDK 9
>> support.
>>
>> Ofir Manor
>>
>> Co-Founder & CTO | Equalum
>>
>> Mobile: +972-54-7801286 | Email: ofir.ma...@equalum.io
>>
>> On Tue, Nov 28, 2017 at 9:20 AM, Zhang, Liyun 
>> wrote:
>>
>>> Hi all:
>>>
>>>   Does anyone know how to build spark with scala12.4? I want to test
>>> whether spark can work on jdk9 or not.  Scala12.4 supports jdk9.  Does
>>> anyone try to build spark with scala 12.4 or compile successfully with
>>> jdk9.Appreciate to get some feedback from you.
>>>
>>>
>>>
>>>
>>>
>>> Best Regards
>>>
>>> Kelly Zhang/Zhang,Liyun
>>>
>>>
>>>
>>
>>


Re: How to access Python script's stdout when running in Yarn cluster mode?

2017-11-27 Thread Saisai Shao
I think you have plenty of ways to get the application log, either via the command
line or programmatically, or even from the YARN RM UI. Since there are already
several ways to get the application log, we currently don't have a plan to
address this.
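
For reference, getting the log from the command line looks like this (assuming
YARN log aggregation is enabled; substitute the real application id):

{code}
yarn logs -applicationId <application ID>
{code}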

2017-11-27 20:04 GMT+08:00 Partridge, Lucas (GE Aviation) <
lucas.partri...@ge.com>:

> Thanks Jerry.
>
>
>
> “Currently there's no Livy api for you to get application log via REST
> API.”
>
> - That’s a real shame. Are there any plans to address this?  Because this
> really limits the usefulness of Livy when I’m using Yarn cluster mode.  It
> seems I have to make a choice between running my client in a scalable
> manner (yarn cluster) or being able to see the logs programmatically (yarn
> client), but not both.  My client is multi-threaded; I don’t want it to
> have to host multiple concurrent Spark driver applications but it looks
> like I might have no choice about this.
>
>
>
> Thanks, Lucas.
>
>
>
> *From:* Saisai Shao [mailto:sai.sai.s...@gmail.com]
> *Sent:* 27 November 2017 02:19
> *To:* user@livy.incubator.apache.org
> *Subject:* EXT: Re: How to access Python script's stdout when running in
> Yarn cluster mode?
>
>
>
> Since you're running in yarn cluster mode, the output from your Python
> script should be part of your YARN application log. You can get it via a YARN
> command like yarn logs -applicationId <application ID>, or other means like the YARN
> UI. Currently there's no Livy API for you to get the application log via the REST
> API.
>
>
>
> Thanks
>
> Jerry
>
>
>
> 2017-11-24 20:27 GMT+08:00 Partridge, Lucas (GE Aviation) <
> lucas.partri...@ge.com>:
>
> Hi,
>
>
>
> I’m using Livy’s GET /batches/{batchId}/log method to fetch the log lines
> from a Python script I’m running on Spark in Yarn cluster mode.
> Unfortunately the stdout from my Python script is not included in the log
> lines returned by GET /batches/{batchId}/log!
>
>
>
> Is this by design, or an unfortunate by-product of running in Yarn cluster
> mode?
>
>
>
> If this is intentional does anyone know how I can access the stdout from
> my Python script via Livy please? Preferably without having to change my
> REST client (a Java app) to use Yarn’s client deployment mode.
>
>
>
> Thanks,
>
> Lucas.
>
>
>
>
>


[jira] [Commented] (LIVY-421) support multi statements run at the same time

2017-11-27 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/LIVY-421?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16266535#comment-16266535
 ] 

Saisai Shao commented on LIVY-421:
--

I agree with what Jeff mentioned: multi-threading is dangerous for statement 
execution, except for one thread per interpreter.

> support multi statements run at the same time
> -
>
> Key: LIVY-421
> URL: https://issues.apache.org/jira/browse/LIVY-421
> Project: Livy
>  Issue Type: Improvement
>  Components: REPL
>Affects Versions: 0.4
>Reporter: zhuweibin
>  Labels: performance
>
> In the v0.4.0-incubating implementation, the number of threads used to 
> execute statement code is hardcoded to one. When idle resources are 
> sufficient to execute multiple statements, a lot of resources are wasted and 
> statements are blocked from running. We are considering improvements in one of two ways:
> # 1. Add a configuration in livy.conf to globally control the number of 
> threads in interpreterExecutor for all repl/sessions to execute statements
> # 2. Add a REST API parameter at session creation to set the maximum number 
> of statements that can be run at the same time
> Hope to discuss this issue, thanks



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (LIVY-421) support multi statements run at the same time

2017-11-27 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated LIVY-421:
-
Priority: Major  (was: Critical)

> support multi statements run at the same time
> -
>
> Key: LIVY-421
> URL: https://issues.apache.org/jira/browse/LIVY-421
> Project: Livy
>  Issue Type: Improvement
>  Components: REPL
>Affects Versions: 0.4
>Reporter: zhuweibin
>  Labels: performance
>
> In the v0.4.0-incubating implementation, the number of threads used to 
> execute statement code is hardcoded to one. When idle resources are 
> sufficient to execute multiple statements, a lot of resources are wasted and 
> statements are blocked from running. We are considering improvements in one of two ways:
> # 1. Add a configuration in livy.conf to globally control the number of 
> threads in interpreterExecutor for all repl/sessions to execute statements
> # 2. Add a REST API parameter at session creation to set the maximum number 
> of statements that can be run at the same time
> Hope to discuss this issue, thanks



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (LIVY-397) Support multiple languages in one session

2017-11-27 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-397?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao resolved LIVY-397.
--
   Resolution: Fixed
Fix Version/s: 0.5.0

> Support multiple languages in one session
> -
>
> Key: LIVY-397
> URL: https://issues.apache.org/jira/browse/LIVY-397
> Project: Livy
>  Issue Type: New Feature
>  Components: REPL
>Affects Versions: 0.4
>    Reporter: Saisai Shao
>Assignee: Saisai Shao
> Fix For: 0.5.0
>
>
> This JIRA aims to support multiple languages in one session.
> Currently in the Livy one session can only have one interpreter, which means 
> user can only choose one language in one session, this limits the use case of 
> using different languages under one Spark application. So here propose to 
> support this feature.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (LIVY-399) Enable real test for PySpark and SparkR interpreters

2017-11-27 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-399?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao resolved LIVY-399.
--
   Resolution: Fixed
Fix Version/s: 0.5.0

> Enable real test for PySpark and SparkR interpreters
> 
>
> Key: LIVY-399
> URL: https://issues.apache.org/jira/browse/LIVY-399
> Project: Livy
>  Issue Type: Sub-task
>  Components: REPL
>Affects Versions: 0.4
>    Reporter: Saisai Shao
>Assignee: Saisai Shao
> Fix For: 0.5.0
>
>  Time Spent: 2h
>  Remaining Estimate: 0h
>
> Currently, because we lack PySpark and SparkR dependencies in our environment, 
> we skip the PySpark and SparkR integration tests; also, the UTs for 
> PySpark and SparkR don't involve Spark at all and only test the plain Python 
> and R REPLs. So we should figure out a way to support real tests for the PySpark 
> and SparkR interpreters.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: How to access Python script's stdout when running in Yarn cluster mode?

2017-11-26 Thread Saisai Shao
Since you're running in yarn cluster mode, the output from your Python
script should be part of your YARN application log. You can get it via a YARN
command like yarn logs -applicationId <application ID>, or other means like the YARN
UI. Currently there's no Livy API for you to get the application log via the REST
API.

Thanks
Jerry

2017-11-24 20:27 GMT+08:00 Partridge, Lucas (GE Aviation) <
lucas.partri...@ge.com>:

> Hi,
>
>
>
> I’m using Livy’s GET /batches/{batchId}/log method to fetch the log lines
> from a Python script I’m running on Spark in Yarn cluster mode.
> Unfortunately the stdout from my Python script is not included in the log
> lines returned by GET /batches/{batchId}/log!
>
>
>
> Is this by design, or an unfortunate by-product of running in Yarn cluster
> mode?
>
>
>
> If this is intentional does anyone know how I can access the stdout from
> my Python script via Livy please? Preferably without having to change my
> REST client (a Java app) to use Yarn’s client deployment mode.
>
>
>
> Thanks,
>
> Lucas.
>
>
>


[jira] [Commented] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2017-11-24 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16265035#comment-16265035
 ] 

Saisai Shao commented on SPARK-2926:


So the 12x-30x boost you mentioned only refers to the reduce stage? Of course 
this solution can boost the reduce stage, but it will also increase the time of the 
map stage, so we'd better use the overall job time to evaluate it. 

> Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle
> --
>
> Key: SPARK-2926
> URL: https://issues.apache.org/jira/browse/SPARK-2926
> Project: Spark
>  Issue Type: Improvement
>  Components: Shuffle
>Affects Versions: 1.1.0
>Reporter: Saisai Shao
>Assignee: Saisai Shao
> Attachments: SortBasedShuffleRead.pdf, SortBasedShuffleReader on 
> Spark 2.x.pdf, Spark Shuffle Test Report(contd).pdf, Spark Shuffle Test 
> Report.pdf
>
>
> Currently Spark has already integrated sort-based shuffle write, which 
> greatly improves the IO performance and reduces the memory consumption when 
> the reducer number is very large. But the reducer side still adopts the 
> implementation of a hash-based shuffle reader, which neglects the ordering 
> attributes of the map output data in some situations.
> Here we propose a MR style sort-merge like shuffle reader for sort-based 
> shuffle to better improve the performance of sort-based shuffle.
> Working in progress code and performance test report will be posted later 
> when some unit test bugs are fixed.
> Any comments would be greatly appreciated. 
> Thanks a lot.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22579) BlockManager.getRemoteValues and BlockManager.getRemoteBytes should be implemented using streaming

2017-11-23 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16265021#comment-16265021
 ] 

Saisai Shao commented on SPARK-22579:
-

I think this issue should have already been fixed by SPARK-22062 and its 
PR (https://github.com/apache/spark/pull/19476); what you need to do is set a 
proper threshold size for large blocks.
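
If I remember correctly, the threshold added by that PR is
{{spark.maxRemoteBlockSizeFetchToMem}} (please double-check the exact name
against the PR); for example, something like the following should make remote
blocks larger than 200 MB be streamed to disk instead of buffered in memory:

{code}
--conf spark.maxRemoteBlockSizeFetchToMem=200m
{code}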

> BlockManager.getRemoteValues and BlockManager.getRemoteBytes should be 
> implemented using streaming
> --
>
> Key: SPARK-22579
> URL: https://issues.apache.org/jira/browse/SPARK-22579
> Project: Spark
>  Issue Type: Improvement
>  Components: Block Manager, Spark Core
>Affects Versions: 2.1.0
>Reporter: Eyal Farago
>
> when an RDD partition is cached on an executor but the task requiring it is 
> running on another executor (process locality ANY), the cached partition is 
> fetched via BlockManager.getRemoteValues which delegates to 
> BlockManager.getRemoteBytes, both calls are blocking.
> in my use case I had a 700GB RDD spread over 1000 partitions on a 6 nodes 
> cluster, cached to disk. rough math shows that average partition size is 
> 700MB.
> looking at spark UI it was obvious that tasks running with process locality 
> 'ANY' are much slower than local tasks (~40 seconds to 8-10 minutes ratio), I 
> was able to capture thread dumps of executors executing remote tasks and got 
> this stake trace:
> {quote}Thread ID  Thread Name Thread StateThread Locks
> 1521  Executor task launch worker-1000WAITING 
> Lock(java.util.concurrent.ThreadPoolExecutor$Worker@196462978})
> sun.misc.Unsafe.park(Native Method)
> java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireSharedInterruptibly(AbstractQueuedSynchronizer.java:997)
> java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1304)
> scala.concurrent.impl.Promise$DefaultPromise.tryAwait(Promise.scala:202)
> scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:218)
> scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
> scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
> scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
> scala.concurrent.Await$.result(package.scala:190)
> org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:190)
> org.apache.spark.network.BlockTransferService.fetchBlockSync(BlockTransferService.scala:104)
> org.apache.spark.storage.BlockManager.getRemoteBytes(BlockManager.scala:582)
> org.apache.spark.storage.BlockManager.getRemoteValues(BlockManager.scala:550)
> org.apache.spark.storage.BlockManager.get(BlockManager.scala:638)
> org.apache.spark.storage.BlockManager.getOrElseUpdate(BlockManager.scala:690)
> org.apache.spark.rdd.RDD.getOrCompute(RDD.scala:334)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:285)
> org.apache.spark.rdd.ZippedPartitionsRDD2.compute(ZippedPartitionsRDD.scala:89)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:287)
> org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
> org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:323)
> org.apache.spark.rdd.RDD.iterator(RDD.scala:287){quote}
> digging into the code showed that the block manager first fetches all the bytes 
> (getRemoteBytes) and then wraps them with a deserialization stream; this has 
> several drawbacks:
> 1. blocking, requesting executor is blocked while the remote executor is 
> serving the block.
> 2. potentially large memory footprint on requesting executor, in my use case 
> a 700mb of raw bytes stored in a ChunkedByteBuffer.
> 3. inefficient, the requesting side usually doesn't need all values at once as it 
> consumes the values via an iterator.
> 4. potentially large memory footprint on serving executor, in case the block 
> is cached in deserialized form the serving executor has to serialize it into 
> a ChunkedByteBuffer (BlockManager.doGetLocalBytes). this is both memory & CPU 
> intensive, memory footprint can be reduced by using a limited buffer for 
> serialization 'spilling' to the response stream.
> I suggest improving this either by implementing full streaming mechanism or 
> some kind of pagination mechanism, in addition the requesting executor should 
> be able to 

[jira] [Commented] (SPARK-22587) Spark job fails if fs.defaultFS and application jar are different url

2017-11-23 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16264986#comment-16264986
 ] 

Saisai Shao commented on SPARK-22587:
-

[~Prabhu Joseph], I think it is because the logic for comparing two FSs 
({{compareFs}}) does not work as expected for wasb: it identifies these two FSs 
as the same FS, but in fact they're two different FSs. That's why the following 
{{makeQualified}} throws an exception.
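
As a quick illustration of why a scheme/host-only comparison is not enough for
wasb URLs (the container and account names below are made up, and this is a
simplified stand-in for the real {{compareFs}} logic, not Spark's actual code):

{code}
import java.net.URI

object WasbUriCheck {
  def main(args: Array[String]): Unit = {
    // Two different wasb "file systems": different containers, same storage account.
    val src  = new URI("wasb://containerA@myaccount.blob.core.windows.net/tmp/test.py")
    val dest = new URI("wasb://containerB@myaccount.blob.core.windows.net/")

    println(src.getHost == dest.getHost)           // true  -> a host-only check says "same FS"
    println(src.getAuthority == dest.getAuthority) // false -> the full authority says otherwise
  }
}
{code}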

> Spark job fails if fs.defaultFS and application jar are different url
> -
>
> Key: SPARK-22587
> URL: https://issues.apache.org/jira/browse/SPARK-22587
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Submit
>Affects Versions: 1.6.3
>Reporter: Prabhu Joseph
>
> Spark Job fails if the fs.defaultFS and the url where the application jar resides are 
> different but have the same scheme,
> spark-submit  --conf spark.master=yarn-cluster wasb://XXX/tmp/test.py
> core-site.xml fs.defaultFS is set to wasb:///YYY. Hadoop listing (hadoop 
> fs -ls) works for both the urls XXX and YYY.
> {code}
> Exception in thread "main" java.lang.IllegalArgumentException: Wrong FS: 
> wasb://XXX/tmp/test.py, expected: wasb://YYY 
> at org.apache.hadoop.fs.FileSystem.checkPath(FileSystem.java:665) 
> at 
> org.apache.hadoop.fs.azure.NativeAzureFileSystem.checkPath(NativeAzureFileSystem.java:1251)
>  
> at org.apache.hadoop.fs.FileSystem.makeQualified(FileSystem.java:485) 
> at org.apache.spark.deploy.yarn.Client.copyFileToRemote(Client.scala:396) 
> at 
> org.apache.spark.deploy.yarn.Client.org$apache$spark$deploy$yarn$Client$$distribute$1(Client.scala:507)
>  
> at 
> org.apache.spark.deploy.yarn.Client.prepareLocalResources(Client.scala:660) 
> at 
> org.apache.spark.deploy.yarn.Client.createContainerLaunchContext(Client.scala:912)
>  
> at org.apache.spark.deploy.yarn.Client.submitApplication(Client.scala:172) 
> at org.apache.spark.deploy.yarn.Client.run(Client.scala:1248) 
> at org.apache.spark.deploy.yarn.Client$.main(Client.scala:1307) 
> at org.apache.spark.deploy.yarn.Client.main(Client.scala) 
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) 
> at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62) 
> at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>  
> at java.lang.reflect.Method.invoke(Method.java:498) 
> at 
> org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:751)
>  
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:187) 
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:212) 
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:126) 
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) 
> {code}
> The code Client.copyFileToRemote tries to resolve the path of application jar 
> (XXX) from the FileSystem object created using fs.defaultFS url (YYY) instead 
> of the actual url of application jar.
> val destFs = destDir.getFileSystem(hadoopConf)
> val srcFs = srcPath.getFileSystem(hadoopConf)
> getFileSystem will create the filesystem based on the url of the path, and so 
> this is fine. But the lines of code below try to get the srcPath (XXX url) 
> from the destFs (YYY url), and so it fails.
> var destPath = srcPath
> val qualifiedDestPath = destFs.makeQualified(destPath)



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (LIVY-299) Only the output of last line is returned

2017-11-23 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao resolved LIVY-299.
--
   Resolution: Fixed
Fix Version/s: 0.5.0

> Only the output of last line is returned
> 
>
> Key: LIVY-299
> URL: https://issues.apache.org/jira/browse/LIVY-299
> Project: Livy
>  Issue Type: Bug
>  Components: Interpreter
>Affects Versions: 0.3
>Reporter: Jeff Zhang
>Assignee: Saisai Shao
> Fix For: 0.5.0
>
>
> Request:
> {code}
> {"code": "print(1);\nprint(1)"}
> {code}
> Response:
> {code}
> {
>   "total_statements": 1,
>   "statements": [
> {
>   "id": 0,
>   "state": "available",
>   "output": {
> "status": "ok",
> "execution_count": 0,
> "data": {
>   "text/plain": "1"
> }
>   }
> }
>   ]
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SPARK-2926) Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle

2017-11-23 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-2926?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16264904#comment-16264904
 ] 

Saisai Shao commented on SPARK-2926:


[~XuanYuan], would you please use spark-perf's micro benchmark 
(https://github.com/databricks/spark-perf) to verify again with the same workload 
as mentioned in the original test report? That would be more comparable. 
Theoretically this solution cannot get a 12x-30x boost according to my tests, 
because it doesn't actually reduce the computation logically; it just moves part 
of the comparison from the reduce side to the map side, which potentially saves 
some CPU cycles and improves cache hits.

Can you please explain the key difference and the reason for such a boost? 
Thanks! 

> Add MR-style (merge-sort) SortShuffleReader for sort-based shuffle
> --
>
> Key: SPARK-2926
> URL: https://issues.apache.org/jira/browse/SPARK-2926
> Project: Spark
>  Issue Type: Improvement
>  Components: Shuffle
>Affects Versions: 1.1.0
>Reporter: Saisai Shao
>Assignee: Saisai Shao
> Attachments: SortBasedShuffleRead.pdf, SortBasedShuffleReader on 
> Spark 2.x.pdf, Spark Shuffle Test Report(contd).pdf, Spark Shuffle Test 
> Report.pdf
>
>
> Currently Spark has already integrated sort-based shuffle write, which 
> greatly improves the IO performance and reduces the memory consumption when 
> the reducer number is very large. But the reducer side still adopts the 
> implementation of a hash-based shuffle reader, which neglects the ordering 
> attributes of the map output data in some situations.
> Here we propose a MR style sort-merge like shuffle reader for sort-based 
> shuffle to better improve the performance of sort-based shuffle.
> Working in progress code and performance test report will be posted later 
> when some unit test bugs are fixed.
> Any comments would be greatly appreciated. 
> Thanks a lot.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (LIVY-416) com.fasterxml.jackson.core.JsonGenerationException is sometimes thrown

2017-11-22 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao reassigned LIVY-416:


Assignee: Keiji Yoshida

> com.fasterxml.jackson.core.JsonGenerationException is sometimes thrown
> --
>
> Key: LIVY-416
> URL: https://issues.apache.org/jira/browse/LIVY-416
> Project: Livy
>  Issue Type: Bug
>Reporter: Keiji Yoshida
>Assignee: Keiji Yoshida
> Fix For: 0.5.0
>
> Attachments: JsonGenerationException.txt
>
>
> com.fasterxml.jackson.core.JsonGenerationException is sometimes thrown. The 
> full stack trace is shown in JsonGenerationException.txt, which is attached to 
> this issue.
> This is because of a Jackson bug 
> (https://github.com/FasterXML/jackson-core/issues/307) which is fixed in 
> Jackson 2.7.7.
> To fix this issue, the version of Jackson should be updated from 2.4.4 to the 
> latest one (2.9.2).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (LIVY-416) com.fasterxml.jackson.core.JsonGenerationException is sometimes thrown

2017-11-22 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-416?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao resolved LIVY-416.
--
   Resolution: Fixed
Fix Version/s: 0.5.0

> com.fasterxml.jackson.core.JsonGenerationException is sometimes thrown
> --
>
> Key: LIVY-416
> URL: https://issues.apache.org/jira/browse/LIVY-416
> Project: Livy
>  Issue Type: Bug
>Reporter: Keiji Yoshida
>Assignee: Keiji Yoshida
> Fix For: 0.5.0
>
> Attachments: JsonGenerationException.txt
>
>
> com.fasterxml.jackson.core.JsonGenerationException is sometimes thrown. The 
> full stack trace is shown in JsonGenerationException.txt, which is attached to 
> this issue.
> This is because of a Jackson bug 
> (https://github.com/FasterXML/jackson-core/issues/307) which is fixed in 
> Jackson 2.7.7.
> To fix this issue, the version of Jackson should be updated from 2.4.4 to the 
> latest one (2.9.2).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (LIVY-418) Respect spark.pyspark.python for PythonInterpreter

2017-11-22 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao reassigned LIVY-418:


Assignee: Jeff Zhang

> Respect spark.pyspark.python for PythonInterpreter
> --
>
> Key: LIVY-418
> URL: https://issues.apache.org/jira/browse/LIVY-418
> Project: Livy
>  Issue Type: Improvement
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Fix For: 0.5.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (LIVY-418) Respect spark.pyspark.python for PythonInterpreter

2017-11-22 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-418?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao resolved LIVY-418.
--
   Resolution: Fixed
Fix Version/s: 0.5.0

> Respect spark.pyspark.python for PythonInterpreter
> --
>
> Key: LIVY-418
> URL: https://issues.apache.org/jira/browse/LIVY-418
> Project: Livy
>  Issue Type: Improvement
>Reporter: Jeff Zhang
>Assignee: Jeff Zhang
> Fix For: 0.5.0
>
>




--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Assigned] (LIVY-299) Only the output of last line is returned

2017-11-22 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-299?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao reassigned LIVY-299:


Assignee: Saisai Shao

> Only the output of last line is returned
> 
>
> Key: LIVY-299
> URL: https://issues.apache.org/jira/browse/LIVY-299
> Project: Livy
>  Issue Type: Bug
>  Components: Interpreter
>Affects Versions: 0.3
>Reporter: Jeff Zhang
>Assignee: Saisai Shao
>
> Request:
> {code}
> {"code": "print(1);\nprint(1)"}
> {code}
> Response:
> {code}
> {
>   "total_statements": 1,
>   "statements": [
> {
>   "id": 0,
>   "state": "available",
>   "output": {
> "status": "ok",
> "execution_count": 0,
> "data": {
>   "text/plain": "1"
> }
>   }
> }
>   ]
> }
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (LIVY-417) Not able to work with dataframes on livy

2017-11-20 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/LIVY-417?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16260351#comment-16260351
 ] 

Saisai Shao commented on LIVY-417:
--

Which version of Livy are you using? I think shared variables are only 
supported in the master branch.
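
If you cannot run on master yet, one workaround (a rough sketch, assuming the
programmatic Job API and that both jobs run in the same Livy session so they
share one SparkSession; the view name and output path are placeholders) is to
keep the DataFrame on the cluster side and pass only a name between jobs:

{code}
import org.apache.livy.{Job, JobContext}
import org.apache.spark.sql.SparkSession

// Job 1: build (or load) the DataFrame and register it under a name.
class RegisterDf extends Job[String] {
  override def call(jc: JobContext): String = {
    val spark = jc.sparkSession[SparkSession]()
    val df = spark.range(0, 1000)        // placeholder for the real data source
    df.createOrReplaceTempView("shared_df")
    "shared_df"                          // return only the name, never the DataFrame itself
  }
}

// Job 2: look the DataFrame up by name in the same SparkSession and persist it.
class PersistDf extends Job[String] {
  override def call(jc: JobContext): String = {
    val spark = jc.sparkSession[SparkSession]()
    val out = "/tmp/shared_df"           // placeholder output path
    spark.table("shared_df").write.mode("overwrite").parquet(out)
    out
  }
}
{code}

The DataFrame itself is never serialized back to the client, which is likely why
shipping it between the client and jobs ends up with null internals.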

> Not able to work with dataframes on livy
> 
>
> Key: LIVY-417
> URL: https://issues.apache.org/jira/browse/LIVY-417
> Project: Livy
>  Issue Type: Bug
>Reporter: Partha Pratim Ghosh
>
> I am using livy's programmatic API. The requirement is to create multiple 
> contexts through livy, pull dataframes and later persist them back. Through a 
> job a DataFrame can be pulled into my application. However, when persistence 
> is required, the dataframe is sent to another Job which receives a DataFrame 
> and tries to persist it. Here the DataFrame's internals are null. 
> So, what is the procedure for extracting a DataFrame from the Spark context into 
> an application using Livy and later persisting it back to the same Spark context 
> through Livy? At present I am not able to find any such route.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SPARK-22528) History service and non-HDFS filesystems

2017-11-15 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22528?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16254669#comment-16254669
 ] 

Saisai Shao commented on SPARK-22528:
-

Hi [~adobe_pmackles], I think this issue might be due to differences between 
FileSystems. I didn't test on ADLS; it works fine on HDFS, so I'm guessing the 
ADLS/Hadoop bindings may not have semantics consistent with HDFS.

Unfortunately I cannot test on ADLS because I don't have such facilities at 
hand; it would be better if you could debug it and find a fix. I 
think you have already found the relevant code; knowing how the access 
permissions differ should be enough.

> History service and non-HDFS filesystems
> 
>
> Key: SPARK-22528
> URL: https://issues.apache.org/jira/browse/SPARK-22528
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: paul mackles
>Priority: Minor
>
> We are using Azure Data Lake (ADL) to store our event logs. This worked fine 
> in 2.1.x but in 2.2.0, the event logs are no longer visible to the history 
> server. I tracked it down to the call to:
> {code}
> SparkHadoopUtil.get.checkAccessPermission()
> {code}
> which was added to "FSHistoryProvider" in 2.2.0.
> I was able to workaround it by:
> * setting the files on ADL to world readable
> * setting HADOOP_PROXY to the Azure objectId of the service principal that 
> owns file
> Neither of those workarounds is particularly desirable in our environment. 
> That said, I am not sure how this should be addressed:
> * Is this an issue with the Azure/Hadoop bindings not setting up the user 
> context correctly so that the "checkAccessPermission()" call succeeds w/out 
> having to use the username under which the process is running?
> * Is this an issue with "checkAccessPermission()" not really accounting for 
> all of the possible FileSystem implementations? If so, I would imagine that 
> there are similar issues when using S3.
> In spite of this check, I know the files are accessible through the 
> underlying FileSystem object so it feels like the latter but I don't think 
> that the FileSystem object alone could be used to implement this check.
> Any thoughts [~jerryshao]?



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (LIVY-415) Use objects and abstract classes in for Kind and SessionState.

2017-11-14 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao reassigned LIVY-415:


Assignee: Arun Allamsetty

> Use objects and abstract classes in for Kind and SessionState.
> --
>
> Key: LIVY-415
> URL: https://issues.apache.org/jira/browse/LIVY-415
> Project: Livy
>  Issue Type: Improvement
>  Components: REPL, Server
>Reporter: Arun Allamsetty
>Assignee: Arun Allamsetty
>Priority: Trivial
> Fix For: 0.5.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Scala lets us define Singleton Objects 
> (https://docs.scala-lang.org/tour/singleton-objects.html) rather than 
> creating (case) classes that just have a default constructor. Also, 
> using abstract classes helps with brevity in this case, as we can assign 
> default definitions to the functions we need.
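
As a generic illustration of the pattern being proposed (not Livy's actual
classes; the state names are made up):

{code}
// Case objects give us singletons without boilerplate constructors, and the
// abstract class carries the shared member with a single definition.
sealed abstract class SessionState(val isActive: Boolean)

object SessionState {
  case object Starting extends SessionState(true)
  case object Idle     extends SessionState(true)
  case object Busy     extends SessionState(true)
  case object Dead     extends SessionState(false)
}
{code}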



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (LIVY-415) Use objects and abstract classes in for Kind and SessionState.

2017-11-14 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao resolved LIVY-415.
--
Resolution: Fixed

> Use objects and abstract classes in for Kind and SessionState.
> --
>
> Key: LIVY-415
> URL: https://issues.apache.org/jira/browse/LIVY-415
> Project: Livy
>  Issue Type: Improvement
>  Components: REPL, Server
>Reporter: Arun Allamsetty
>Priority: Trivial
> Fix For: 0.5.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Scala lets us define Singleton Objects 
> (https://docs.scala-lang.org/tour/singleton-objects.html) rather than 
> creating (case) classes that just have a default constructor. Also, 
> using abstract classes helps with brevity in this case, as we can assign 
> default definitions to the functions we need.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (LIVY-415) Use objects and abstract classes in for Kind and SessionState.

2017-11-14 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated LIVY-415:
-
Fix Version/s: 0.5.0

> Use objects and abstract classes in for Kind and SessionState.
> --
>
> Key: LIVY-415
> URL: https://issues.apache.org/jira/browse/LIVY-415
> Project: Livy
>  Issue Type: Improvement
>  Components: REPL, Server
>Reporter: Arun Allamsetty
>Priority: Trivial
> Fix For: 0.5.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Scala lets us define Singleton Objects 
> (https://docs.scala-lang.org/tour/singleton-objects.html) rather than 
> creating (case) classes that just have a default constructor. Also, 
> using abstract classes helps with brevity in this case, as we can assign 
> default definitions to the functions we need.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Updated] (LIVY-415) Use objects and abstract classes in for Kind and SessionState.

2017-11-14 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated LIVY-415:
-
Component/s: Server
 REPL

> Use objects and abstract classes in for Kind and SessionState.
> --
>
> Key: LIVY-415
> URL: https://issues.apache.org/jira/browse/LIVY-415
> Project: Livy
>  Issue Type: Improvement
>  Components: REPL, Server
>Reporter: Arun Allamsetty
>Priority: Trivial
> Fix For: 0.5.0
>
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> Scala lets us define Singleton Objects 
> (https://docs.scala-lang.org/tour/singleton-objects.html) rather than 
> creating (case) classes that just have a default constructor. Also, 
> using abstract classes helps with brevity in this case, as we can assign 
> default definitions to the functions we need.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Commented] (SPARK-9104) expose network layer memory usage

2017-11-13 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-9104?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16250675#comment-16250675
 ] 

Saisai Shao commented on SPARK-9104:


[~vsr] I think SPARK-21934 already exposed the Netty shuffle metrics to the metrics 
system; you can follow SPARK-21934 for the details.

For other Netty contexts like RPC, I don't feel strongly about supporting 
them, because memory usage is usually not so heavy for a context like NettyRpcEnv.

> expose network layer memory usage
> -
>
> Key: SPARK-9104
> URL: https://issues.apache.org/jira/browse/SPARK-9104
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Reporter: Zhang, Liye
>Assignee: Saisai Shao
> Fix For: 2.3.0
>
>
> The default network transportation is netty, and when transfering blocks for 
> shuffle, the network layer will consume a decent size of memory, we shall 
> collect the memory usage of this part and expose it. 



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (LIVY-414) Support setting environment variables during session creation

2017-11-12 Thread Saisai Shao (JIRA)
Saisai Shao created LIVY-414:


 Summary: Support setting environment variables during session 
creation
 Key: LIVY-414
 URL: https://issues.apache.org/jira/browse/LIVY-414
 Project: Livy
  Issue Type: Improvement
  Components: API, RSC, Server
Affects Versions: 0.5.0
Reporter: Saisai Shao
Assignee: Saisai Shao


In the current Livy we don't support setting environment variables per 
session; by default the Livy server will honor the env variables set in spark-env.sh 
when the SparkSubmit process launches. This is not flexible enough if we want to 
set different env variables per session, so here I propose to support this 
feature.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: [Vote] SPIP: Continuous Processing Mode for Structured Streaming

2017-11-07 Thread Saisai Shao
+1, looking forward to more design details of this feature.

Thanks
Jerry

On Wed, Nov 8, 2017 at 6:40 AM, Shixiong(Ryan) Zhu 
wrote:

> +1
>
> On Tue, Nov 7, 2017 at 1:34 PM, Joseph Bradley 
> wrote:
>
>> +1
>>
>> On Mon, Nov 6, 2017 at 5:11 PM, Michael Armbrust 
>> wrote:
>>
>>> +1
>>>
>>> On Sat, Nov 4, 2017 at 11:02 AM, Xiao Li  wrote:
>>>
 +1

 2017-11-04 11:00 GMT-07:00 Burak Yavuz :

> +1
>
> On Fri, Nov 3, 2017 at 10:02 PM, vaquar khan 
> wrote:
>
>> +1
>>
>> On Fri, Nov 3, 2017 at 8:14 PM, Weichen Xu > > wrote:
>>
>>> +1.
>>>
>>> On Sat, Nov 4, 2017 at 8:04 AM, Matei Zaharia <
>>> matei.zaha...@gmail.com> wrote:
>>>
 +1 from me too.

 Matei

 > On Nov 3, 2017, at 4:59 PM, Wenchen Fan 
 wrote:
 >
 > +1.
 >
 > I think this architecture makes a lot of sense to let executors
 talk to source/sink directly, and bring very low latency.
 >
 > On Thu, Nov 2, 2017 at 9:01 AM, Sean Owen 
 wrote:
 > +0 simply because I don't feel I know enough to have an opinion.
 I have no reason to doubt the change though, from a skim through the 
 doc.
 >
 >
 > On Wed, Nov 1, 2017 at 3:37 PM Reynold Xin 
 wrote:
 > Earlier I sent out a discussion thread for CP in Structured
 Streaming:
 >
 > https://issues.apache.org/jira/browse/SPARK-20928
 >
 > It is meant to be a very small, surgical change to Structured
 Streaming to enable ultra-low latency. This is great timing because we 
 are
 also designing and implementing data source API v2. If designed 
 properly,
 we can have the same data source API working for both streaming and 
 batch.
 >
 >
 > Following the SPIP process, I'm putting this SPIP up for a vote.
 >
 > +1: Let's go ahead and design / implement the SPIP.
 > +0: Don't really care.
 > -1: I do not think this is a good idea for the following reasons.
 >
 >
 >


 
 -
 To unsubscribe e-mail: dev-unsubscr...@spark.apache.org


>>>
>>
>>
>> --
>> Regards,
>> Vaquar Khan
>> +1-224-436-0783
>> Greater Chicago
>>
>
>

>>>
>>
>>
>> --
>>
>> Joseph Bradley
>>
>> Software Engineer - Machine Learning
>>
>> Databricks, Inc.
>>
>> http://databricks.com
>>
>
>


[jira] [Assigned] (LIVY-412) CreateSession should be reject when too many child process "spark-submit" is running

2017-11-07 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao reassigned LIVY-412:


Assignee: Hong Shen

> CreateSession should be reject when too many child process "spark-submit" is 
> running
> 
>
> Key: LIVY-412
> URL: https://issues.apache.org/jira/browse/LIVY-412
> Project: Livy
>  Issue Type: Improvement
>  Components: Server
>Affects Versions: 0.4
>Reporter: Hong Shen
>Assignee: Hong Shen
> Fix For: 0.5.0
>
>
> In our cluster, the Livy server runs with Spark yarn cluster mode. When the 
> createSession request comes too frequently, the Livy server will start too many 
> spark-submit child processes, which can cause the machine to OOM. I think the Livy 
> server should reject the create session request when there are too many 
> spark-submit child processes. I have fixed this in our own cluster.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


[jira] [Resolved] (LIVY-412) CreateSession should be reject when too many child process "spark-submit" is running

2017-11-07 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/LIVY-412?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao resolved LIVY-412.
--
   Resolution: Fixed
Fix Version/s: 0.5.0

Issue resolved by pull request 58
[https://github.com/apache/incubator-livy/pull/58]

> CreateSession should be reject when too many child process "spark-submit" is 
> running
> 
>
> Key: LIVY-412
> URL: https://issues.apache.org/jira/browse/LIVY-412
> Project: Livy
>  Issue Type: Improvement
>  Components: Server
>Affects Versions: 0.4
>Reporter: Hong Shen
> Fix For: 0.5.0
>
>
> In our cluster, the Livy server runs with Spark yarn cluster mode. When the 
> createSession request comes too frequently, the Livy server will start too many 
> spark-submit child processes, which can cause the machine to OOM. I think the Livy 
> server should reject the create session request when there are too many 
> spark-submit child processes. I have fixed this in our own cluster.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)


Re: Livy POST Sessions can not work with conf

2017-11-03 Thread Saisai Shao
I think it should work; can you please test with the 0.4 version of Livy?
Also, "conf" should be a map of string keys to string values.

"conf" : {"spark.dynamicAllocation.enabled":"false","spark.shuffle.
service.enabled":"false"}

Besides, please be aware that the current Livy is only tested on local and
yarn modes; we don't guarantee correct behavior with the Mesos cluster
manager.
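
For example, your original request body with the conf values quoted as strings
would look like this:

{code}
{
  "kind" : "spark",
  "proxyUser" : "root",
  "executorMemory" : "4G",
  "executorCores" : 4,
  "numExecutors" : 4,
  "conf" : {
    "spark.dynamicAllocation.enabled" : "false",
    "spark.shuffle.service.enabled" : "false"
  }
}
{code}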

On Fri, Nov 3, 2017 at 5:36 PM, 王峰  wrote:

> Hello everyone, I have met a problem with Livy 0.3 when I run `POST
> /sessions` to create a new interactive spark session
>
>
> here is the post body
> ```Content-Type: application/json; charset=utf-8
> {
>   "kind" : "spark",
>   "proxyUser" : "root",
>   "executorMemory" : "4G",
>   "executorCores": 4,
>   "numExecutors" : 4,
>"conf" : {"spark.dynamicAllocation.enabled":false,"spark.shuffle.
> service.enabled":false}
> }
> ```
>
> However, I found that this Livy session allocated all of the resources, as the
> Mesos UI picture shows
>
> [image: Inline image 1]
>
> It seems that `conf` did not work, but numExecutors, executorCores
> and executorMemory worked well.
>
> please help me  thanks...
>


[jira] [Commented] (SPARK-22426) Spark AM launching containers on node where External spark shuffle service failed to initialize

2017-11-02 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22426?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16236939#comment-16236939
 ] 

Saisai Shao commented on SPARK-22426:
-

This kind of scenario was handled in SPARK-13669 with regard to the blacklist 
mechanism.

> Spark AM launching containers on node where External spark shuffle service 
> failed to initialize
> ---
>
> Key: SPARK-22426
> URL: https://issues.apache.org/jira/browse/SPARK-22426
> Project: Spark
>  Issue Type: Bug
>  Components: Shuffle, YARN
>Affects Versions: 1.6.3
>Reporter: Prabhu Joseph
>Priority: Major
>
> When Spark External Shuffle Service on a NodeManager fails, the remote 
> executors will fail while fetching the data from the executors launched on 
> this Node. Spark ApplicationMaster should not launch containers on this Node 
> or not use external shuffle service.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22405) Enrich the event information and add new event of ExternalCatalogEvent

2017-11-01 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16235042#comment-16235042
 ] 

Saisai Shao commented on SPARK-22405:
-

Thanks [~hvanhovell] for your comment, let me propose a solution for this.

> Enrich the event information and add new event of ExternalCatalogEvent
> --
>
> Key: SPARK-22405
> URL: https://issues.apache.org/jira/browse/SPARK-22405
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Saisai Shao
>Priority: Minor
>
> We're building a data lineage tool in which we need to monitor the metadata 
> changes in {{ExternalCatalog}}. The current {{ExternalCatalog}} already provides 
> several useful events like "CreateDatabaseEvent" for a custom SparkListener to 
> use. But the information provided by such events is not rich enough; for 
> example {{CreateTablePreEvent}} only provides the "database" name and "table" 
> name, not all the table metadata, which makes it hard for users to get all the 
> table related useful information.
> So here I propose to add a new {{ExternalCatalogEvent}} and enrich the 
> existing events for all the catalog related updates.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Assigned] (SPARK-22172) Worker hangs when the external shuffle service port is already in use

2017-11-01 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao reassigned SPARK-22172:
---

Assignee: Devaraj K

> Worker hangs when the external shuffle service port is already in use
> -
>
> Key: SPARK-22172
> URL: https://issues.apache.org/jira/browse/SPARK-22172
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: Devaraj K
>Assignee: Devaraj K
>Priority: Major
> Fix For: 2.3.0
>
>
> When the external shuffle service port is already in use, the Worker throws
> the BindException below and hangs forever; I think the exception should be
> handled gracefully.
> {code:xml}
> 17/09/29 11:16:30 INFO ExternalShuffleService: Starting shuffle service on 
> port 7337 (auth enabled = false)
> 17/09/29 11:16:30 ERROR Inbox: Ignoring error
> java.net.BindException: Address already in use
> at sun.nio.ch.Net.bind0(Native Method)
> at sun.nio.ch.Net.bind(Net.java:433)
> at sun.nio.ch.Net.bind(Net.java:425)
> at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
> at 
> io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:128)
> at 
> io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:500)
> at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1218)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:495)
> at 
> io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:480)
> at 
> io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:965)
> at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:209)
> at 
> io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:355)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
> {code}
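For context, a generic sketch of the fail-fast behaviour being asked for; it is illustrative only and not necessarily what the eventual fix did:

{code}
import java.net.BindException

// Generic sketch: exit with a clear error instead of hanging when the
// configured port is already bound.
object FailFastOnBind {
  def startOrExit(serviceName: String)(start: => Unit): Unit = {
    try start
    catch {
      case e: BindException =>
        System.err.println(s"$serviceName could not bind its port (${e.getMessage}); exiting")
        sys.exit(1)
    }
  }
}

// Usage sketch: FailFastOnBind.startOrExit("ExternalShuffleService") { server.start() }
{code}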



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Resolved] (SPARK-22172) Worker hangs when the external shuffle service port is already in use

2017-11-01 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22172?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao resolved SPARK-22172.
-
   Resolution: Fixed
Fix Version/s: 2.3.0

Issue resolved by pull request 19396
[https://github.com/apache/spark/pull/19396]

> Worker hangs when the external shuffle service port is already in use
> -
>
> Key: SPARK-22172
> URL: https://issues.apache.org/jira/browse/SPARK-22172
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.2.0
>Reporter: Devaraj K
>Priority: Major
> Fix For: 2.3.0
>
>
> When the external shuffle service port is already in use, the Worker throws
> the BindException below and hangs forever; I think the exception should be
> handled gracefully.
> {code:xml}
> 17/09/29 11:16:30 INFO ExternalShuffleService: Starting shuffle service on 
> port 7337 (auth enabled = false)
> 17/09/29 11:16:30 ERROR Inbox: Ignoring error
> java.net.BindException: Address already in use
> at sun.nio.ch.Net.bind0(Native Method)
> at sun.nio.ch.Net.bind(Net.java:433)
> at sun.nio.ch.Net.bind(Net.java:425)
> at 
> sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
> at 
> io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:128)
> at 
> io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:500)
> at 
> io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1218)
> at 
> io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:495)
> at 
> io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:480)
> at 
> io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:965)
> at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:209)
> at 
> io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:355)
> at 
> io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:399)
> {code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22405) Enrich the event information and add new event of ExternalCatalogEvent

2017-10-31 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16233569#comment-16233569
 ] 

Saisai Shao commented on SPARK-22405:
-

Thanks [~hvanhovell] for your comments.

bq. When implementing this we just wanted to have a way to know that some 
metadata was about to change. A consumer could always retrieve more information 
about the (to-be) changed by querying the catalog

I think this is a feasible approach to satisfy my needs. But it still requires
more events to be posted, like "AlterTableEvent" or "AlterDatabaseEvent", so
that users can query the catalog based on the posted events; without these,
users don't know when a table/db is altered. What do you think?

> Enrich the event information and add new event of ExternalCatalogEvent
> --
>
> Key: SPARK-22405
> URL: https://issues.apache.org/jira/browse/SPARK-22405
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>    Affects Versions: 2.2.0
>Reporter: Saisai Shao
>Priority: Minor
>
> We're building a data lineage tool in which we need to monitor the metadata
> changes in {{ExternalCatalog}}. The current {{ExternalCatalog}} already provides
> several useful events, such as "CreateDatabaseEvent", for a custom SparkListener
> to consume. However, the information carried by these events is not rich enough;
> for example, {{CreateTablePreEvent}} only provides the "database" name and the
> "table" name, not the full table metadata, which makes it hard for users to get
> all the useful table-related information.
> So here I propose to add new {{ExternalCatalogEvent}}s and enrich the existing
> events to cover all catalog-related updates.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22405) Enrich the event information and add new event of ExternalCatalogEvent

2017-10-31 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226724#comment-16226724
 ] 

Saisai Shao commented on SPARK-22405:
-

Thanks [~cloud_fan] for your comments. Using a fake {{ExternalCatalog}} as a
delegate might be one solution, but as far as I know it would also force users
to use only this custom {{ExternalCatalog}}. That might be OK for an end user,
but for us, who deliver packages to users, such a restriction does not seem
feasible.

> Enrich the event information and add new event of ExternalCatalogEvent
> --
>
> Key: SPARK-22405
> URL: https://issues.apache.org/jira/browse/SPARK-22405
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Saisai Shao
>Priority: Minor
>
> We're building a data lineage tool in which we need to monitor the metadata
> changes in {{ExternalCatalog}}. The current {{ExternalCatalog}} already provides
> several useful events, such as "CreateDatabaseEvent", for a custom SparkListener
> to consume. However, the information carried by these events is not rich enough;
> for example, {{CreateTablePreEvent}} only provides the "database" name and the
> "table" name, not the full table metadata, which makes it hard for users to get
> all the useful table-related information.
> So here I propose to add new {{ExternalCatalogEvent}}s and enrich the existing
> events to cover all catalog-related updates.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22405) Enrich the event information and add new event of ExternalCatalogEvent

2017-10-31 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226474#comment-16226474
 ] 

Saisai Shao commented on SPARK-22405:
-

CC [~smilegator], what's your opinion about this proposal? Thanks!

> Enrich the event information and add new event of ExternalCatalogEvent
> --
>
> Key: SPARK-22405
> URL: https://issues.apache.org/jira/browse/SPARK-22405
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Saisai Shao
>Priority: Minor
>
> We're building a data lineage tool in which we need to monitor the metadata
> changes in {{ExternalCatalog}}. The current {{ExternalCatalog}} already provides
> several useful events, such as "CreateDatabaseEvent", for a custom SparkListener
> to consume. However, the information carried by these events is not rich enough;
> for example, {{CreateTablePreEvent}} only provides the "database" name and the
> "table" name, not the full table metadata, which makes it hard for users to get
> all the useful table-related information.
> So here I propose to add new {{ExternalCatalogEvent}}s and enrich the existing
> events to cover all catalog-related updates.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22405) Enrich the event information and add new event of ExternalCatalogEvent

2017-10-31 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22405?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226468#comment-16226468
 ] 

Saisai Shao commented on SPARK-22405:
-

This is the WIP branch 
(https://github.com/jerryshao/apache-spark/tree/SPARK-22405).

[~cloud_fan], do you think this proposal is feasible or not?

> Enrich the event information and add new event of ExternalCatalogEvent
> --
>
> Key: SPARK-22405
> URL: https://issues.apache.org/jira/browse/SPARK-22405
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.2.0
>Reporter: Saisai Shao
>Priority: Minor
>
> We're building a data lineage tool in which we need to monitor the metadata
> changes in {{ExternalCatalog}}. The current {{ExternalCatalog}} already provides
> several useful events, such as "CreateDatabaseEvent", for a custom SparkListener
> to consume. However, the information carried by these events is not rich enough;
> for example, {{CreateTablePreEvent}} only provides the "database" name and the
> "table" name, not the full table metadata, which makes it hard for users to get
> all the useful table-related information.
> So here I propose to add new {{ExternalCatalogEvent}}s and enrich the existing
> events to cover all catalog-related updates.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Created] (SPARK-22405) Enrich the event information of ExternalCatalogEvent

2017-10-31 Thread Saisai Shao (JIRA)
Saisai Shao created SPARK-22405:
---

 Summary: Enrich the event information of ExternalCatalogEvent
 Key: SPARK-22405
 URL: https://issues.apache.org/jira/browse/SPARK-22405
 Project: Spark
  Issue Type: Improvement
  Components: SQL
Affects Versions: 2.2.0
Reporter: Saisai Shao
Priority: Minor


We're building a data lineage tool in which we need to monitor the metadata
changes in {{ExternalCatalog}}. The current {{ExternalCatalog}} already provides
several useful events, such as "CreateDatabaseEvent", for a custom SparkListener
to consume. However, the information carried by these events is not rich enough;
for example, {{CreateTablePreEvent}} only provides the "database" name and the
"table" name, not the full table metadata, which makes it hard for users to get
all the useful table-related information.

So here I propose to add new {{ExternalCatalogEvent}}s and enrich the existing
events to cover all catalog-related updates.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-22405) Enrich the event information and add new event of ExternalCatalogEvent

2017-10-31 Thread Saisai Shao (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-22405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Saisai Shao updated SPARK-22405:

Summary: Enrich the event information and add new event of 
ExternalCatalogEvent  (was: Enrich the event information of 
ExternalCatalogEvent)

> Enrich the event information and add new event of ExternalCatalogEvent
> --
>
> Key: SPARK-22405
> URL: https://issues.apache.org/jira/browse/SPARK-22405
> Project: Spark
>  Issue Type: Improvement
>  Components: SQL
>Affects Versions: 2.2.0
>    Reporter: Saisai Shao
>Priority: Minor
>
> We're building a data lineage tool in which we need to monitor the metadata
> changes in {{ExternalCatalog}}. The current {{ExternalCatalog}} already provides
> several useful events, such as "CreateDatabaseEvent", for a custom SparkListener
> to consume. However, the information carried by these events is not rich enough;
> for example, {{CreateTablePreEvent}} only provides the "database" name and the
> "table" name, not the full table metadata, which makes it hard for users to get
> all the useful table-related information.
> So here I propose to add new {{ExternalCatalogEvent}}s and enrich the existing
> events to cover all catalog-related updates.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



Re: ClassNotFoundException on job submit

2017-10-26 Thread Saisai Shao
You can choose to set "livy.spark.master" to "local" and
"livy.spark.deploy-mode" to "client" to start Spark in local mode; in that
case YARN is not required.

Otherwise, if you plan to run on YARN, you have to install Hadoop and
configure HADOOP_CONF_DIR in livy-env.sh.
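A minimal sketch of the two config files involved, using the values above; the file paths and the HADOOP_CONF_DIR value are placeholders for a typical layout:

```
# conf/livy.conf -- run sessions in local mode, no YARN required
livy.spark.master = local
livy.spark.deploy-mode = client

# conf/livy-env.sh -- only needed when running on YARN instead
# export HADOOP_CONF_DIR=/etc/hadoop/conf
```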

On Thu, Oct 26, 2017 at 9:40 PM, Stefan Miklosovic wrote:

> Hi,
>
> I am running Livy server in connection with Spark without Hadoop. I am
> setting only SPARK_HOME and I am getting this in Livy UI logs after
> job submission.
>
> I am using pretty much standard configuration but
> livy.spark.deploy-mode = cluster
>
> Do I need to run with Hadoop installation as well and specify
> HADOOP_CONF_DIR?
>
> Is not it possible to run Livy with "plain" Spark without YARN?
>
> stderr:
> java.lang.ClassNotFoundException:
> at java.lang.Class.forName0(Native Method)
> at java.lang.Class.forName(Class.java:348)
> at org.apache.spark.util.Utils$.classForName(Utils.scala:230)
> at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$
> deploy$SparkSubmit$$runMain(SparkSubmit.scala:712)
> at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
> at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
> at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
> at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
>
> Thanks!
>
> --
> Stefan Miklosovic
>


[jira] [Commented] (SPARK-22357) SparkContext.binaryFiles ignore minPartitions parameter

2017-10-26 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16220431#comment-16220431
 ] 

Saisai Shao commented on SPARK-22357:
-

Yes, I know this parameter is ignored, but I'm not sure whether that is
intended. If it breaks your case, I think we should fix it anyway.

> SparkContext.binaryFiles ignore minPartitions parameter
> ---
>
> Key: SPARK-22357
> URL: https://issues.apache.org/jira/browse/SPARK-22357
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.2, 2.2.0
>Reporter: Weichen Xu
>
> This is a bug in binaryFiles: even though we give it minPartitions,
> binaryFiles ignores it.
> This is a bug introduced in spark 2.1 from spark 2.0, in file 
> PortableDataStream.scala the argument “minPartitions” is no longer used (with 
> the push to master on 11/7/6):
> {code}
> /**
> Allow minPartitions set by end-user in order to keep compatibility with old 
> Hadoop API
> which is set through setMaxSplitSize
> */
> def setMinPartitions(sc: SparkContext, context: JobContext, minPartitions: 
> Int) {
> val defaultMaxSplitBytes = 
> sc.getConf.get(config.FILES_MAX_PARTITION_BYTES)
> val openCostInBytes = sc.getConf.get(config.FILES_OPEN_COST_IN_BYTES)
> val defaultParallelism = sc.defaultParallelism
> val files = listStatus(context).asScala
> val totalBytes = files.filterNot(_.isDirectory).map(_.getLen +
> openCostInBytes).sum
> val bytesPerCore = totalBytes / defaultParallelism
> val maxSplitSize = Math.min(defaultMaxSplitBytes, 
> Math.max(openCostInBytes, bytesPerCore))
> super.setMaxSplitSize(maxSplitSize)
> }
> {code}
> The code previously, in version 2.0, was:
> {code}
> def setMinPartitions(context: JobContext, minPartitions: Int) {
> val totalLen = 
> listStatus(context).asScala.filterNot(_.isDirectory).map(_.getLen).sum
> val maxSplitSize = math.ceil(totalLen / math.max(minPartitions, 
> 1.0)).toLong
> super.setMaxSplitSize(maxSplitSize)
> }
> {code}
> The new code is very smart, but it ignores what the user passes in and uses 
> the data size, which is kind of a breaking change in some sense
> In our specific case this was a problem, because we initially read in just 
> the files names and only after that the dataframe becomes very large, when 
> reading in the images themselves – and in this case the new code does not 
> handle the partitioning very well.
> I’m not sure if it can be easily fixed because I don’t understand the full 
> context of the change in spark (but at the very least the unused parameter 
> should be removed to avoid confusion).
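For context, a minimal sketch of the user-facing call whose hint is being dropped; the path is a placeholder and sc is an existing SparkContext:

{code}
// On Spark 2.1/2.2 the minPartitions hint is effectively ignored, so the
// resulting partition count is driven by file sizes rather than the request.
val images = sc.binaryFiles("hdfs:///data/images", minPartitions = 1000)
println(s"requested at least 1000 partitions, got ${images.getNumPartitions}")
{code}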



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-22357) SparkContext.binaryFiles ignore minPartitions parameter

2017-10-26 Thread Saisai Shao (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-22357?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16220395#comment-16220395
 ] 

Saisai Shao commented on SPARK-22357:
-

bq. In our specific case this was a problem, because we initially read in just 
the files names and only after that the dataframe becomes very large, when 
reading in the images themselves – and in this case the new code does not 
handle the partitioning very well.

Would you please explain more about this?

> SparkContext.binaryFiles ignore minPartitions parameter
> ---
>
> Key: SPARK-22357
> URL: https://issues.apache.org/jira/browse/SPARK-22357
> Project: Spark
>  Issue Type: Bug
>  Components: Spark Core
>Affects Versions: 2.1.2, 2.2.0
>Reporter: Weichen Xu
>
> This is a bug in binaryFiles: even though we give it minPartitions,
> binaryFiles ignores it.
> This is a bug introduced in spark 2.1 from spark 2.0, in file 
> PortableDataStream.scala the argument “minPartitions” is no longer used (with 
> the push to master on 11/7/6):
> {code}
> /**
> Allow minPartitions set by end-user in order to keep compatibility with old 
> Hadoop API
> which is set through setMaxSplitSize
> */
> def setMinPartitions(sc: SparkContext, context: JobContext, minPartitions: 
> Int) {
> val defaultMaxSplitBytes = 
> sc.getConf.get(config.FILES_MAX_PARTITION_BYTES)
> val openCostInBytes = sc.getConf.get(config.FILES_OPEN_COST_IN_BYTES)
> val defaultParallelism = sc.defaultParallelism
> val files = listStatus(context).asScala
> val totalBytes = files.filterNot(_.isDirectory).map(_.getLen +
> openCostInBytes).sum
> val bytesPerCore = totalBytes / defaultParallelism
> val maxSplitSize = Math.min(defaultMaxSplitBytes, 
> Math.max(openCostInBytes, bytesPerCore))
> super.setMaxSplitSize(maxSplitSize)
> }
> {code}
> The code previously, in version 2.0, was:
> {code}
> def setMinPartitions(context: JobContext, minPartitions: Int) {
> val totalLen = 
> listStatus(context).asScala.filterNot(_.isDirectory).map(_.getLen).sum
> val maxSplitSize = math.ceil(totalLen / math.max(minPartitions, 
> 1.0)).toLong
> super.setMaxSplitSize(maxSplitSize)
> }
> {code}
> The new code is very smart, but it ignores what the user passes in and uses 
> the data size, which is kind of a breaking change in some sense
> In our specific case this was a problem, because we initially read in just 
> the files names and only after that the dataframe becomes very large, when 
> reading in the images themselves – and in this case the new code does not 
> handle the partitioning very well.
> I’m not sure if it can be easily fixed because I don’t understand the full 
> context of the change in spark (but at the very least the unused parameter 
> should be removed to avoid confusion).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org


