[jira] [Commented] (SPARK-15032) When we create a new JDBC session, we may need to create a new session of executionHive

2016-05-08 Thread Sagar (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15275874#comment-15275874
 ] 

Sagar commented on SPARK-15032:
---

At the time of JDBC session creation, we use the Thrift server's 
executionHive, so what you are proposing is to create a new session of 
executionHive, right? 

> When we create a new JDBC session, we may need to create a new session of 
> executionHive
> ---
>
> Key: SPARK-15032
> URL: https://issues.apache.org/jira/browse/SPARK-15032
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Yin Huai
>Priority: Critical
>
> Right now, we only use executionHive in thriftserver. When we create a new 
> jdbc session, we probably need to create a new session of executionHive. I am 
> not sure what will break if we leave the code as is. But, I feel it will be 
> safer to create a new session of executionHive.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-15125) CSV data source recognizes empty quoted strings in the input as null.

2016-05-05 Thread Sagar (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272017#comment-15272017
 ] 

Sagar commented on SPARK-15125:
---

Shouldn't we always infer these as empty strings, and then users can do a 
simple projection to turn them into nulls?
I think we should read these as empty strings, and users can then map those 
empty strings to NULL themselves.
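
A minimal sketch of that projection (assuming the CSV reader is configured so 
that quoted empty fields survive as empty strings; the "comment" column is 
just an example taken from the repro data below):

{code}
import org.apache.spark.sql.functions.when

// Read the CSV; this sketch assumes quoted empty fields come back as "".
val df = sqlContext.read
  .format("csv")
  .option("header", "true")
  .option("inferSchema", "true")
  .load("/tmp/test.csv")

// Turn empty strings in a string column into NULLs with a simple projection.
val withNulls = df.withColumn(
  "comment",
  when(df("comment") === "", null).otherwise(df("comment")))

withNulls.show()
{code}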

> CSV data source recognizes empty quoted strings in the input as null. 
> --
>
> Key: SPARK-15125
> URL: https://issues.apache.org/jira/browse/SPARK-15125
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Affects Versions: 2.0.0
>Reporter: Suresh Thalamati
>
> CSV data source does not differentiate between empty quoted strings and 
> empty fields, treating both as null. In some scenarios users would want to 
> differentiate between these values, especially in the context of SQL, where 
> NULL and the empty string have different meanings. If the input data happens 
> to be a dump from a traditional relational data source, users will see 
> different results for the same SQL queries. 
> {code}
> Repro:
> Test Data: (test.csv)
> year,make,model,comment,price
> 2017,Tesla,Mode 3,looks nice.,35000.99
> 2016,Chevy,Bolt,"",29000.00
> 2015,Porsche,"",,
> scala> val df= sqlContext.read.format("csv").option("header", 
> "true").option("inferSchema", "true").option("nullValue", 
> null).load("/tmp/test.csv")
> df: org.apache.spark.sql.DataFrame = [year: int, make: string ... 3 more 
> fields]
> scala> df.show
> ++---+--+---++
> |year|   make| model|comment|   price|
> ++---+--+---++
> |2017|  Tesla|Mode 3|looks nice.|35000.99|
> |2016|  Chevy|  Bolt|   null| 29000.0|
> |2015|Porsche|  null|   null|null|
> ++---+--+---++
> Expected:
> ++---+--+---++
> |year|   make| model|comment|   price|
> ++---+--+---++
> |2017|  Tesla|Mode 3|looks nice.|35000.99|
> |2016|  Chevy|  Bolt|   | 29000.0|
> |2015|Porsche|  |   null|null|
> ++---+--+---++
> {code}
> Testing a fix for this issue. I will take a shot at submitting a PR for 
> this soon. 






[jira] [Commented] (SPARK-15142) Spark Mesos dispatcher becomes unusable when the Mesos master restarts

2016-05-05 Thread Sagar (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272013#comment-15272013
 ] 

Sagar commented on SPARK-15142:
---

When the Mesos master restarts, the Spark Mesos dispatcher queues all the 
submitted applications and running applications lose their reference to it. I 
think that after the Mesos master restarts, it should pick up the previously 
queued applications and start running them.

> Spark Mesos dispatcher becomes unusable when the Mesos master restarts
> --
>
> Key: SPARK-15142
> URL: https://issues.apache.org/jira/browse/SPARK-15142
> Project: Spark
>  Issue Type: Bug
>  Components: Deploy, Mesos
>Reporter: Devaraj K
>Priority: Minor
>
> While the Spark Mesos dispatcher is running, if the Mesos master gets 
> restarted, the dispatcher keeps running, queues up all the submitted 
> applications, and does not launch them.






[jira] [Commented] (SPARK-15063) filtering and joining back doesn't work

2016-05-04 Thread Sagar (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15270450#comment-15270450
 ] 

Sagar commented on SPARK-15063:
---

Yes, but where we are using seq, we can use t1 to get the required results 
for each filter, as mentioned above. 
In order to refer to its columns we need to involve t1.

val t2a = sc.makeRDD(accounts).toDF("uid", "type", "amount")
val t2s = t2a.filter(t2a("type") <=> "savings")

t1.
  join(t2c, t1("uid") <=> t2c("uid"), "left").
  join(t2s, t1("uid") <=> t2s("uid"), "left").
  take(10)
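
For completeness, a minimal sketch of the full workaround with a fresh 
DataFrame built for each filter, so every join side has its own lineage. The 
names t2c0/t2s0 are just illustrative; people, accounts and the sqlContext 
implicits are assumed to be in scope as in the issue description below.

{code}
val t1 = sc.makeRDD(people).toDF("uid", "name")

// Build each filtered side from its own DataFrame instance.
val t2c0 = sc.makeRDD(accounts).toDF("uid", "type", "amount")
val t2c  = t2c0.filter(t2c0("type") <=> "checking")

val t2s0 = sc.makeRDD(accounts).toDF("uid", "type", "amount")
val t2s  = t2s0.filter(t2s0("type") <=> "savings")

t1.
  join(t2c, t1("uid") <=> t2c("uid"), "left").
  join(t2s, t1("uid") <=> t2s("uid"), "left").
  take(10)
{code}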

> filtering and joining back doesn't work
> ---
>
> Key: SPARK-15063
> URL: https://issues.apache.org/jira/browse/SPARK-15063
> Project: Spark
>  Issue Type: Bug
>Affects Versions: 1.6.1
>Reporter: Neville Kadwa
>
> I'm trying to filter and join to do a simple pivot but getting very odd 
> results.
> {quote} {noformat}
> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
> import sqlContext.implicits._
> val people = Array((1, "sam"), (2, "joe"), (3, "sally"), (4, "joanna"))
> val accounts = Array(
>   (1, "checking", 100.0),
>   (1, "savings", 300.0),
>   (2, "savings", 1000.0),
>   (3, "carloan", 12000.0),
>   (3, "checking", 400.0)
> )
> val t1 = sc.makeRDD(people).toDF("uid", "name")
> val t2 = sc.makeRDD(accounts).toDF("uid", "type", "amount")
> val t2c = t2.filter(t2("type") <=> "checking")
> val t2s = t2.filter(t2("type") <=> "savings")
> t1.
>   join(t2c, t1("uid") <=> t2c("uid"), "left").
>   join(t2s, t1("uid") <=> t2s("uid"), "left").
>   take(10)
> {noformat} {quote}
> The results are wrong:
> {quote} {noformat}
> Array(
>   [1,sam,1,checking,100.0,1,savings,300.0],
>   [1,sam,1,checking,100.0,2,savings,1000.0],
>   [2,joe,null,null,null,null,null,null],
>   [3,sally,3,checking,400.0,1,savings,300.0],
>   [3,sally,3,checking,400.0,2,savings,1000.0],
>   [4,joanna,null,null,null,null,null,null]
> )
> {noformat} {quote}
> The way I can force it to work properly is to create a new df for each filter:
> {quote} {noformat}
> val t2a = sc.makeRDD(accounts).toDF("uid", "type", "amount")
> val t2s = t2a.filter(t2a("type") <=> "savings")
> t1.
>   join(t2c, t1("uid") <=> t2c("uid"), "left").
>   join(t2s, t1("uid") <=> t2s("uid"), "left").
>   take(10)
> {noformat} {quote}
> The results are right:
> {quote} {noformat}
> Array(
>   [1,sam,1,checking,100.0,1,savings,300.0],
>   [2,joe,null,null,null,2,savings,1000.0],
>   [3,sally,3,checking,400.0,null,null,null],
>   [4,joanna,null,null,null,null,null,null]
> )
> {noformat} {quote}






[jira] [Commented] (SPARK-15072) Remove SparkSession.withHiveSupport

2016-05-03 Thread Sagar (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15270155#comment-15270155
 ] 

Sagar commented on SPARK-15072:
---

[~techaddict]  Yes, it fails because assembly/assembly was removed. The test 
is ignored right now; does that mean it is no longer being considered, or 
something else?

> Remove SparkSession.withHiveSupport
> ---
>
> Key: SPARK-15072
> URL: https://issues.apache.org/jira/browse/SPARK-15072
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Reynold Xin
>Assignee: Sandeep Singh
> Fix For: 2.0.0
>
>







[jira] [Commented] (SPARK-15072) Remove SparkSession.withHiveSupport

2016-05-03 Thread Sagar (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15072?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15270032#comment-15270032
 ] 

Sagar commented on SPARK-15072:
---

The following builds test.jar:
$ ./build/sbt -Pyarn -Phadoop-2.3 -Pkinesis-asl -Phive-thriftserver -Phive 
package assembly/assembly streaming-kafka-assembly/assembly 
streaming-flume-assembly/assembly streaming-mqtt-assembly/assembly 
streaming-mqtt/test:assembly streaming-kinesis-asl-assembly/assembly
$ cd sql/hive/src/test/resources/regression-test-SPARK-8489/
$ scalac -classpath 
~/spark/assembly/target/scala-2.11/spark-assembly-2.0.0-SNAPSHOT-hadoop2.3.0.jar
 Main.scala MyCoolClass.scala
$ rm test.jar
$ jar cvf test.jar *.class
$ cd ~/spark
$ ~/bin/spark-submit --conf spark.ui.enabled=false --conf 
spark.master.rest.enabled=false --driver-java-options 
-Dderby.system.durability=test --class Main 
sql/hive/src/test/resources/regression-test-SPARK-8489/test.jar

Let me know if you are still working on it.

> Remove SparkSession.withHiveSupport
> ---
>
> Key: SPARK-15072
> URL: https://issues.apache.org/jira/browse/SPARK-15072
> Project: Spark
>  Issue Type: Sub-task
>  Components: SQL
>Reporter: Reynold Xin
>Assignee: Sandeep Singh
> Fix For: 2.0.0
>
>







[jira] [Commented] (SPARK-15032) When we create a new JDBC session, we may need to create a new session of executionHive

2016-05-03 Thread Sagar (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15270018#comment-15270018
 ] 

Sagar commented on SPARK-15032:
---

You are right! It is safer to create a new session of executionHive while 
creating a JDBC session, but I think the problem is that it terminates the 
executionHive process. Let me know if you have figured out another way; I can 
work on it.

> When we create a new JDBC session, we may need to create a new session of 
> executionHive
> ---
>
> Key: SPARK-15032
> URL: https://issues.apache.org/jira/browse/SPARK-15032
> Project: Spark
>  Issue Type: Bug
>  Components: SQL
>Reporter: Yin Huai
>Priority: Critical
>
> Right now, we only use executionHive in thriftserver. When we create a new 
> jdbc session, we probably need to create a new session of executionHive. I am 
> not sure what will break if we leave the code as is. But, I feel it will be 
> safer to create a new session of executionHive.






[jira] [Commented] (SPARK-15063) filtering and joining back doesn't work

2016-05-03 Thread Sagar (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15063?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15270012#comment-15270012
 ] 

Sagar commented on SPARK-15063:
---

What else is required to do it with a new df for each filter? Can you elaborate?

> filtering and joining back doesn't work
> ---
>
> Key: SPARK-15063
> URL: https://issues.apache.org/jira/browse/SPARK-15063
> Project: Spark
>  Issue Type: Bug
>Affects Versions: 1.6.1
>Reporter: Neville Kadwa
>
> I'm trying to filter and join to do a simple pivot but getting very odd 
> results.
> {quote} {noformat}
> val sqlContext = new org.apache.spark.sql.SQLContext(sc)
> import sqlContext.implicits._
> val people = Array((1, "sam"), (2, "joe"), (3, "sally"), (4, "joanna"))
> val accounts = Array(
>   (1, "checking", 100.0),
>   (1, "savings", 300.0),
>   (2, "savings", 1000.0),
>   (3, "carloan", 12000.0),
>   (3, "checking", 400.0)
> )
> val t1 = sc.makeRDD(people).toDF("uid", "name")
> val t2 = sc.makeRDD(accounts).toDF("uid", "type", "amount")
> val t2c = t2.filter(t2("type") <=> "checking")
> val t2s = t2.filter(t2("type") <=> "savings")
> t1.
>   join(t2c, t1("uid") <=> t2c("uid"), "left").
>   join(t2s, t1("uid") <=> t2s("uid"), "left").
>   take(10)
> {noformat} {quote}
> The results are wrong:
> {quote} {noformat}
> Array(
>   [1,sam,1,checking,100.0,1,savings,300.0],
>   [1,sam,1,checking,100.0,2,savings,1000.0],
>   [2,joe,null,null,null,null,null,null],
>   [3,sally,3,checking,400.0,1,savings,300.0],
>   [3,sally,3,checking,400.0,2,savings,1000.0],
>   [4,joanna,null,null,null,null,null,null]
> )
> {noformat} {quote}
> The way I can force it to work properly is to create a new df for each filter:
> {quote} {noformat}
> val t2a = sc.makeRDD(accounts).toDF("uid", "type", "amount")
> val t2s = t2a.filter(t2a("type") <=> "savings")
> t1.
>   join(t2c, t1("uid") <=> t2c("uid"), "left").
>   join(t2s, t1("uid") <=> t2s("uid"), "left").
>   take(10)
> {noformat} {quote}
> The results are right:
> {quote} {noformat}
> Array(
>   [1,sam,1,checking,100.0,1,savings,300.0],
>   [2,joe,null,null,null,2,savings,1000.0],
>   [3,sally,3,checking,400.0,null,null,null],
>   [4,joanna,null,null,null,null,null,null]
> )
> {noformat} {quote}






[jira] [Commented] (SPARK-15086) Update Java API once the Scala one is finalized

2016-05-03 Thread Sagar (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-15086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15270007#comment-15270007
 ] 

Sagar commented on SPARK-15086:
---

I would like to update the Java API once the Scala one is finalized. Please 
provide more information on what else this includes so I can make it work.

> Update Java API once the Scala one is finalized
> ---
>
> Key: SPARK-15086
> URL: https://issues.apache.org/jira/browse/SPARK-15086
> Project: Spark
>  Issue Type: Sub-task
>  Components: Spark Core
>Reporter: Reynold Xin
> Fix For: 2.0.0
>
>
> We should make sure we update the Java API once the Scala one is finalized. 
> This includes adding the equivalent API in Java as well as deprecating the 
> old ones.






[jira] [Updated] (SPARK-7523) ERROR LiveListenerBus: Listener EventLoggingListener threw an exception

2015-05-11 Thread sagar (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-7523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

sagar updated SPARK-7523:
-
Attachment: schema.txt
spark-0.0.1-SNAPSHOT.jar

Spark jar and schema.txt attached

These are the files I am using while executing the commands.

 ERROR LiveListenerBus: Listener EventLoggingListener threw an exception
 ---

 Key: SPARK-7523
 URL: https://issues.apache.org/jira/browse/SPARK-7523
 Project: Spark
  Issue Type: Bug
  Components: Spark Submit
Affects Versions: 1.3.0
 Environment: Prod
Reporter: sagar
Priority: Blocker
 Attachments: schema.txt, spark-0.0.1-SNAPSHOT.jar


 Hi Team,
 I am using CDH 5.4 with spark 1.3.0.
 I am getting the error below while executing the following command.
 I see JIRAs (SPARK-2906/SPARK-1407) saying the issue is resolved, but I 
 could not find what the fix was. Can you please guide/suggest, as this is a 
 production issue?
 $ spark-submit --master local[4] --class org.sample.spark.SparkFilter 
 --name "Spark Sample Program" spark-0.0.1-SNAPSHOT.jar 
 /user/user1/schema.txt
 ==
 15/05/11 06:28:36 ERROR LiveListenerBus: Listener EventLoggingListener threw 
 an exception
 java.lang.reflect.InvocationTargetException
   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
   at 
 sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
   at 
 sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
   at java.lang.reflect.Method.invoke(Method.java:606)
   at 
 org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
   at 
 org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
   at scala.Option.foreach(Option.scala:236)
   at 
 org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:144)
   at 
 org.apache.spark.scheduler.EventLoggingListener.onJobEnd(EventLoggingListener.scala:169)
   at 
 org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:36)
   at 
 org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
   at 
 org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
   at 
 org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:53)
   at 
 org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:36)
   at 
 org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:76)
   at 
 org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply(AsynchronousListenerBus.scala:61)
   at 
 org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply(AsynchronousListenerBus.scala:61)
   at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617)
   at 
 org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:60)
 Caused by: java.io.IOException: Filesystem closed
   at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:792)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:1998)
   at 
 org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1959)
   at 
 org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
   ... 19 more
 ==






[jira] [Created] (SPARK-7523) ERROR LiveListenerBus: Listener EventLoggingListener threw an exception

2015-05-11 Thread sagar (JIRA)
sagar created SPARK-7523:


 Summary: ERROR LiveListenerBus: Listener EventLoggingListener 
threw an exception
 Key: SPARK-7523
 URL: https://issues.apache.org/jira/browse/SPARK-7523
 Project: Spark
  Issue Type: Bug
  Components: Spark Submit
Affects Versions: 1.3.0
 Environment: Prod
Reporter: sagar
Priority: Blocker


Hi Team,

I am using CDH 5.4 with spark 1.3.0.
I am getting the error below while executing the following command.

I see JIRAs (SPARK-2906/SPARK-1407) saying the issue is resolved, but I could 
not find what the fix was. Can you please guide/suggest, as this is a 
production issue?

$ spark-submit --master local[4] --class org.sample.spark.SparkFilter 
--name "Spark Sample Program" spark-0.0.1-SNAPSHOT.jar /user/user1/schema.txt


==
15/05/11 06:28:36 ERROR LiveListenerBus: Listener EventLoggingListener threw an 
exception
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
at 
org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
at scala.Option.foreach(Option.scala:236)
at 
org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:144)
at 
org.apache.spark.scheduler.EventLoggingListener.onJobEnd(EventLoggingListener.scala:169)
at 
org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:36)
at 
org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at 
org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at 
org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:53)
at 
org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:36)
at 
org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:76)
at 
org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply(AsynchronousListenerBus.scala:61)
at 
org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply(AsynchronousListenerBus.scala:61)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617)
at 
org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:60)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:792)
at 
org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:1998)
at 
org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1959)
at 
org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 19 more
==






[jira] [Commented] (SPARK-1132) Persisting Web UI through refactoring the SparkListener interface

2015-05-10 Thread sagar (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-1132?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14537292#comment-14537292
 ] 

sagar commented on SPARK-1132:
--

Hi Team,

I see the issue is resolved and Fix Version/s is 1.0.0.
Is 1.0.0 the Spark version?

Where can I get Spark version 1.0.0?

Currently I am getting the error below:

15/05/10 08:42:20 INFO deprecation: mapred.tip.id is deprecated. Instead, use 
mapreduce.task.id
15/05/10 08:42:20 INFO deprecation: mapred.task.id is deprecated. Instead, use 
mapreduce.task.attempt.id
15/05/10 08:42:20 INFO deprecation: mapred.task.is.map is deprecated. Instead, 
use mapreduce.task.ismap
15/05/10 08:42:20 INFO deprecation: mapred.task.partition is deprecated. 
Instead, use mapreduce.task.partition
15/05/10 08:42:20 INFO deprecation: mapred.job.id is deprecated. Instead, use 
mapreduce.job.id
15/05/10 08:42:20 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 1750 
bytes result sent to driver
15/05/10 08:42:20 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) 
in 207 ms on localhost (1/1)
15/05/10 08:42:20 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have 
all completed, from pool 
15/05/10 08:42:20 INFO DAGScheduler: Stage 0 (count at SparkFilter.java:22) 
finished in 0.225 s
15/05/10 08:42:20 INFO DAGScheduler: Job 0 finished: count at 
SparkFilter.java:22, took 0.314437 s
0
15/05/10 08:42:20 ERROR LiveListenerBus: Listener EventLoggingListener threw an 
exception
java.lang.reflect.InvocationTargetException
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at 
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at 
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at 
org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
at 
org.apache.spark.scheduler.EventLoggingListener$$anonfun$logEvent$3.apply(EventLoggingListener.scala:144)
at scala.Option.foreach(Option.scala:236)
at 
org.apache.spark.scheduler.EventLoggingListener.logEvent(EventLoggingListener.scala:144)
at 
org.apache.spark.scheduler.EventLoggingListener.onStageCompleted(EventLoggingListener.scala:165)
at 
org.apache.spark.scheduler.SparkListenerBus$class.onPostEvent(SparkListenerBus.scala:32)
at 
org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at 
org.apache.spark.scheduler.LiveListenerBus.onPostEvent(LiveListenerBus.scala:31)
at 
org.apache.spark.util.ListenerBus$class.postToAll(ListenerBus.scala:53)
at 
org.apache.spark.util.AsynchronousListenerBus.postToAll(AsynchronousListenerBus.scala:36)
at 
org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply$mcV$sp(AsynchronousListenerBus.scala:76)
at 
org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply(AsynchronousListenerBus.scala:61)
at 
org.apache.spark.util.AsynchronousListenerBus$$anon$1$$anonfun$run$1.apply(AsynchronousListenerBus.scala:61)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1617)
at 
org.apache.spark.util.AsynchronousListenerBus$$anon$1.run(AsynchronousListenerBus.scala:60)
Caused by: java.io.IOException: Filesystem closed
at org.apache.hadoop.hdfs.DFSClient.checkOpen(DFSClient.java:792)
at 
org.apache.hadoop.hdfs.DFSOutputStream.flushOrSync(DFSOutputStream.java:1998)
at 
org.apache.hadoop.hdfs.DFSOutputStream.hflush(DFSOutputStream.java:1959)
at 
org.apache.hadoop.fs.FSDataOutputStream.hflush(FSDataOutputStream.java:130)
... 19 more


 Persisting Web UI through refactoring the SparkListener interface
 -

 Key: SPARK-1132
 URL: https://issues.apache.org/jira/browse/SPARK-1132
 Project: Spark
  Issue Type: Improvement
  Components: Spark Core, Web UI
Affects Versions: 0.9.0
Reporter: Andrew Or
Assignee: Andrew Or
Priority: Blocker
 Fix For: 1.0.0


 This issue is a spin-off from another issue - 
 https://spark-project.atlassian.net/browse/SPARK-969
 The main issue with the existing Spark Web UI is that its information is lost 
 as soon as the application terminates. This is the direct result of the 
 SparkUI being coupled with SparkContext, which is stopped when the 
 application is finished.
 The attached document proposes to tackle this by logging SparkListenerEvents 
 to persist information displayed on the Web UI. We take this opportunity to 
 replace the existing format for storing this information, HTML, with one that 
 is more flexible, JSON. This allows further post-hoc analysis of a particular