[jira] [Updated] (AMBARI-18542) Ambari API request returns 500 when no stale service need to be restarted

2016-10-05 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-18542?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-18542:
---
Summary: Ambari API request returns 500 when no stale service need to be 
restarted  (was: Ambari API request returns unexpected error code when no stale 
service need to be restarted)

> Ambari API request returns 500 when no stale service need to be restarted
> -
>
> Key: AMBARI-18542
> URL: https://issues.apache.org/jira/browse/AMBARI-18542
> Project: Ambari
>  Issue Type: Bug
>Affects Versions: 2.4.0
>Reporter: Judy Nash
>Priority: Minor
>
> This bug relates to a recent API addition to Ambari, tracked in this JIRA: 
> https://issues.apache.org/jira/browse/AMBARI-14394
> In Ambari 2.4, a new API was introduced to restart all stale services in one 
> call. 
> When the API is called and there is no stale service to restart, the API 
> returns the following status:
> {
>   "status" : 500,
>   "message" : "An internal system exception occurred: Command execution 
> cannot proceed without a resource filter."
> }
> A legitimate call that requires no operation shouldn't return a 500 status; 
> 200 seems more appropriate. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (AMBARI-18542) Ambari API request returns unexpected error code when no stale service need to be restarted

2016-10-05 Thread Judy Nash (JIRA)
Judy Nash created AMBARI-18542:
--

 Summary: Ambari API request returns unexpected error code when no 
stale service need to be restarted
 Key: AMBARI-18542
 URL: https://issues.apache.org/jira/browse/AMBARI-18542
 Project: Ambari
  Issue Type: Bug
Affects Versions: 2.4.0
Reporter: Judy Nash
Priority: Minor


This bug relates to a recent API addition to Ambari, tracked in this JIRA: 
https://issues.apache.org/jira/browse/AMBARI-14394

In Ambari 2.4, a new API was introduced to restart all stale services in one 
call. 
When the API is called and there is no stale service to restart, the API returns 
the following status:
{
  "status" : 500,
  "message" : "An internal system exception occurred: Command execution cannot 
proceed without a resource filter."
}

A legitimate call that requires no operation shouldn't return a 500 status; 
200 seems more appropriate. 
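
As an illustrative sketch of the client-side impact, a caller currently has to 
special-case the 500 response. The request body, cluster name, and credentials 
below are assumptions for the example only, not the exact payload introduced in 
AMBARI-14394:

# Hedged sketch: check the exact "restart all stale components" payload against
# AMBARI-14394; cluster name and credentials are placeholders.
import requests

AMBARI = "http://ambari-host:8080"
CLUSTER = "mycluster"  # placeholder cluster name

body = {
    "RequestInfo": {
        "command": "RESTART",
        "context": "Restart all services with stale configs",
    }
}

resp = requests.post(
    f"{AMBARI}/api/v1/clusters/{CLUSTER}/requests",
    json=body,
    auth=("admin", "admin"),
    headers={"X-Requested-By": "ambari"},
)

if resp.status_code == 500 and "resource filter" in resp.text:
    # current behaviour when nothing is stale; returning 200 here would let
    # callers drop this special case
    print("No stale services to restart")
else:
    resp.raise_for_status()
    print("Restart request accepted:", resp.status_code)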




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


security testing on spark ?

2015-12-15 Thread Judy Nash
Hi all,

Does anyone know of any effort from the community on security testing Spark 
clusters? For example:
Static source code analysis to find security flaws
Penetration testing to identify ways to compromise spark cluster
Fuzzing to crash spark

Thanks,
Judy



[jira] [Commented] (AMBARI-13729) Change the Spark thrift server security configurations

2015-11-05 Thread Judy Nash (JIRA)

[ 
https://issues.apache.org/jira/browse/AMBARI-13729?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14992649#comment-14992649
 ] 

Judy Nash commented on AMBARI-13729:


+1 LGTM. 

> Change the Spark thrift server security configurations
> --
>
> Key: AMBARI-13729
> URL: https://issues.apache.org/jira/browse/AMBARI-13729
> Project: Ambari
>  Issue Type: Bug
>  Components: stacks
>Affects Versions: trunk
>Reporter: Saisai Shao
> Fix For: trunk
>
> Attachments: AMBARI-13729.patch
>
>
> "hive.security.authorization.enabled" would be better changed to 
> "hive.server2.authentication". And "hive.server2.enable.doAs" is not well 
> supported currently.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


RE: Error building Spark on Windows with sbt

2015-10-30 Thread Judy Nash
I have not had any success building using sbt/sbt on Windows.
However, I have been able to build the binary by using the Maven command directly.

From: Richard Eggert [mailto:richard.egg...@gmail.com]
Sent: Sunday, October 25, 2015 12:51 PM
To: Ted Yu 
Cc: User 
Subject: Re: Error building Spark on Windows with sbt

Yes, I know, but it would be nice to be able to test things myself before I 
push commits.

On Sun, Oct 25, 2015 at 3:50 PM, Ted Yu 
> wrote:
If you have a pull request, Jenkins can test your change for you.

FYI

On Oct 25, 2015, at 12:43 PM, Richard Eggert 
> wrote:
Also, if I run the Maven build on Windows or Linux without setting 
-DskipTests=true, it hangs indefinitely when it gets to 
org.apache.spark.JavaAPISuite.

It's hard to test patches when the build doesn't work. :-/

On Sun, Oct 25, 2015 at 3:41 PM, Richard Eggert 
> wrote:
By "it works", I mean, "It gets past that particular error". It still fails 
several minutes later with a different error:

java.lang.IllegalStateException: impossible to get artifacts when data has not 
been loaded. IvyNode = org.scala-lang#scala-library;2.10.3


On Sun, Oct 25, 2015 at 3:38 PM, Richard Eggert 
> wrote:

When I try to start up sbt for the Spark build,  or if I try to import it in 
IntelliJ IDEA as an sbt project, it fails with a "No such file or directory" 
error when it attempts to "git clone" sbt-pom-reader into 
.sbt/0.13/staging/some-sha1-hash.

If I manually create the expected directory before running sbt or importing 
into IntelliJ, then it works. Why is it necessary to do this,  and what can be 
done to make it not necessary?

Rich



--
Rich



--
Rich



--
Rich


[jira] [Updated] (AMBARI-13513) Add cmd option config to spark thrift server

2015-10-26 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13513:
---
Fix Version/s: trunk
   2.1.2

> Add cmd option config to spark thrift server 
> -
>
> Key: AMBARI-13513
> URL: https://issues.apache.org/jira/browse/AMBARI-13513
> Project: Ambari
>  Issue Type: Improvement
>  Components: ambari-server
>Affects Versions: trunk
>    Reporter: Judy Nash
>Assignee: Judy Nash
> Fix For: 2.1.2, trunk
>
> Attachments: SPARK-13513.patch
>
>
> When spark thrift server launches, customers may want to add additional 
> options to configure spark.
> i.e. 
> --driver-memory 
> --jars 
> Some of these configurations can only be submitted as command line options, 
> so adding a new config on ambari to give customers access. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13513) Add cmd option config to spark thrift server

2015-10-26 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13513:
---
Attachment: SPARK-13513.patch

> Add cmd option config to spark thrift server 
> -
>
> Key: AMBARI-13513
> URL: https://issues.apache.org/jira/browse/AMBARI-13513
> Project: Ambari
>  Issue Type: Improvement
>  Components: ambari-server
>Affects Versions: trunk
>    Reporter: Judy Nash
>Assignee: Judy Nash
> Fix For: 2.1.2, trunk
>
> Attachments: SPARK-13513.patch
>
>
> When spark thrift server launches, customers may want to add additional 
> options to configure spark.
> i.e. 
> --driver-memory 
> --jars 
> Some of these configurations can only be submitted as command line options, 
> so adding a new config on ambari to give customers access. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13513) Add cmd option config to spark thrift server

2015-10-26 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13513:
---
Attachment: (was: SPARK-13513.patch)

> Add cmd option config to spark thrift server 
> -
>
> Key: AMBARI-13513
> URL: https://issues.apache.org/jira/browse/AMBARI-13513
> Project: Ambari
>  Issue Type: Improvement
>  Components: ambari-server
>Affects Versions: trunk
>    Reporter: Judy Nash
>Assignee: Judy Nash
>
> When spark thrift server launches, customers may want to add additional 
> options to configure spark.
> i.e. 
> --driver-memory 
> --jars 
> Some of these configurations can only be submitted as command line options, 
> so adding a new config on ambari to give customers access. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 39530: Ambari-13513:[trunk][2.1.2] Add cmd option config to spark thrift server

2015-10-22 Thread Judy Nash


> On Oct. 22, 2015, 1:37 a.m., Sumit Mohanty wrote:
> > ambari-server/src/main/resources/stacks/HDP/2.3/services/SPARK/configuration/spark-thrift-cmd-opts-properties.xml,
> >  line 21
> > <https://reviews.apache.org/r/39530/diff/1/?file=1102701#file1102701line21>
> >
> > What we meant is that you can introduce a property 
> > spark-thrift-cmd-opts in spark-env.xml and have it contain the value user 
> > wants to supply. Then in the scripts you can refer to it as 
> > spark_thrift_cmd_opts_properties = 
> > str(config['configurations']['spark-env']['spark-thrift-cmd-opts'])

Got it! Great feedback. Updating it now.


- Judy
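
The suggestion above maps onto the service scripts roughly as follows. This is a 
minimal sketch rather than the committed patch: config is stubbed here (in the 
real params.py it comes from Script.get_config()), and the start-thriftserver.sh 
path is an assumption for illustration.

# params.py (sketch): read the user-supplied command line options from spark-env
config = {
    "configurations": {
        "spark-env": {
            "spark-thrift-cmd-opts": "--driver-memory 4g --jars /tmp/extra.jar"
        }
    }
}
spark_thrift_cmd_opts_properties = str(
    config["configurations"]["spark-env"].get("spark-thrift-cmd-opts", "")
)

# spark_service.py (sketch): append the options when building the start command
start_cmd = " ".join(
    ["/usr/hdp/current/spark-thriftserver/sbin/start-thriftserver.sh",
     spark_thrift_cmd_opts_properties]
).strip()
print(start_cmd)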


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39530/#review103507
---


On Oct. 21, 2015, 10:33 p.m., Judy Nash wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39530/
> ---
> 
> (Updated Oct. 21, 2015, 10:33 p.m.)
> 
> 
> Review request for Ambari, Alejandro Fernandez and Sumit Mohanty.
> 
> 
> Bugs: AMBARI-13513
> https://issues.apache.org/jira/browse/AMBARI-13513
> 
> 
> Repository: ambari
> 
> 
> Description
> ---
> 
> Ambari-13513:[trunk][2.1.2] Add cmd option config to spark thrift server
> 
> 
> Diffs
> -
> 
>   
> ambari-server/src/main/resources/common-services/SPARK/1.2.0.2.2/package/scripts/params.py
>  518ba6d 
>   
> ambari-server/src/main/resources/common-services/SPARK/1.2.0.2.2/package/scripts/spark_service.py
>  68a395b 
>   
> ambari-server/src/main/resources/stacks/HDP/2.3/services/SPARK/configuration/spark-thrift-cmd-opts-properties.xml
>  PRE-CREATION 
>   ambari-server/src/main/resources/stacks/HDP/2.3/services/SPARK/metainfo.xml 
> 14161b4 
>   ambari-server/src/test/python/stacks/2.3/SPARK/test_spark_thrift_server.py 
> b3b8235 
>   ambari-server/src/test/python/stacks/2.3/configs/spark_default.json 9f3fb90 
> 
> Diff: https://reviews.apache.org/r/39530/diff/
> 
> 
> Testing
> ---
> 
> 1. Updated Unit Tests 
> 2. E2E test on HDInsight cluster
> 
> 
> Thanks,
> 
> Judy Nash
> 
>



Re: Review Request 39530: Ambari-13513:[trunk][2.1.2] Add cmd option config to spark thrift server

2015-10-22 Thread Judy Nash

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39530/
---

(Updated Oct. 22, 2015, 9:40 p.m.)


Review request for Ambari, Alejandro Fernandez and Sumit Mohanty.


Changes
---

Update patch to read value from spark-env


Bugs: AMBARI-13513
https://issues.apache.org/jira/browse/AMBARI-13513


Repository: ambari


Description
---

Ambari-13513:[trunk][2.1.2] Add cmd option config to spark thrift server


Diffs (updated)
-

  
ambari-server/src/main/resources/common-services/SPARK/1.2.0.2.2/configuration/spark-env.xml
 79e3b52 
  
ambari-server/src/main/resources/common-services/SPARK/1.2.0.2.2/package/scripts/params.py
 518ba6d 
  
ambari-server/src/main/resources/common-services/SPARK/1.2.0.2.2/package/scripts/spark_service.py
 68a395b 
  ambari-server/src/test/python/stacks/2.3/SPARK/test_spark_thrift_server.py 
b3b8235 
  ambari-server/src/test/python/stacks/2.3/configs/spark_default.json 9f3fb90 

Diff: https://reviews.apache.org/r/39530/diff/


Testing
---

1. Updated Unit Tests 
2. E2E test on HDInsight cluster


Thanks,

Judy Nash



[jira] [Updated] (AMBARI-13513) Add cmd option config to spark thrift server

2015-10-21 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13513:
---
Summary: Add cmd option config to spark thrift server   (was: Add cmd opt 
config to spark thrift server )

> Add cmd option config to spark thrift server 
> -
>
> Key: AMBARI-13513
> URL: https://issues.apache.org/jira/browse/AMBARI-13513
> Project: Ambari
>  Issue Type: Improvement
>  Components: ambari-server
>Affects Versions: 2.1.2, trunk
>    Reporter: Judy Nash
>Assignee: Judy Nash
>
> When spark thrift server launches, customers may want to add additional 
> options to configure spark.
> i.e. 
> --driver-memory 
> --jars 
> Some of these configurations can only be submitted as command line options, 
> so adding a new config on ambari to give customers access. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (AMBARI-13513) Add cmd opt config to spark thrift server

2015-10-21 Thread Judy Nash (JIRA)
Judy Nash created AMBARI-13513:
--

 Summary: Add cmd opt config to spark thrift server 
 Key: AMBARI-13513
 URL: https://issues.apache.org/jira/browse/AMBARI-13513
 Project: Ambari
  Issue Type: Improvement
  Components: ambari-server
Affects Versions: 2.1.2, trunk
Reporter: Judy Nash
Assignee: Judy Nash


When the Spark Thrift Server launches, customers may want to add additional options 
to configure Spark, e.g.:
--driver-memory 
--jars 

Some of these configurations can only be submitted as command-line options, so 
a new config is being added in Ambari to give customers access to them. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13513) Add cmd option config to spark thrift server

2015-10-21 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13513:
---
Attachment: AMBARI-13513.patch

> Add cmd option config to spark thrift server 
> -
>
> Key: AMBARI-13513
> URL: https://issues.apache.org/jira/browse/AMBARI-13513
> Project: Ambari
>  Issue Type: Improvement
>  Components: ambari-server
>Affects Versions: 2.1.2, trunk
>    Reporter: Judy Nash
>Assignee: Judy Nash
> Attachments: AMBARI-13513.patch
>
>
> When spark thrift server launches, customers may want to add additional 
> options to configure spark.
> i.e. 
> --driver-memory 
> --jars 
> Some of these configurations can only be submitted as command line options, 
> so adding a new config on ambari to give customers access. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Review Request 39530: Ambari-13513:[trunk][2.1.2] Add cmd option config to spark thrift server

2015-10-21 Thread Judy Nash

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39530/
---

Review request for Ambari, Alejandro Fernandez and Sumit Mohanty.


Bugs: AMBARI-13513
https://issues.apache.org/jira/browse/AMBARI-13513


Repository: ambari


Description
---

Ambari-13513:[trunk][2.1.2] Add cmd option config to spark thrift server


Diffs
-

  
ambari-server/src/main/resources/common-services/SPARK/1.2.0.2.2/package/scripts/params.py
 518ba6d 
  
ambari-server/src/main/resources/common-services/SPARK/1.2.0.2.2/package/scripts/spark_service.py
 68a395b 
  
ambari-server/src/main/resources/stacks/HDP/2.3/services/SPARK/configuration/spark-thrift-cmd-opts-properties.xml
 PRE-CREATION 
  ambari-server/src/main/resources/stacks/HDP/2.3/services/SPARK/metainfo.xml 
14161b4 
  ambari-server/src/test/python/stacks/2.3/SPARK/test_spark_thrift_server.py 
b3b8235 
  ambari-server/src/test/python/stacks/2.3/configs/spark_default.json 9f3fb90 

Diff: https://reviews.apache.org/r/39530/diff/


Testing
---

1. Updated Unit Tests 
2. E2E test on HDInsight cluster


Thanks,

Judy Nash



[jira] [Updated] (AMBARI-13513) Add cmd option config to spark thrift server

2015-10-21 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13513:
---
Attachment: SPARK-13513.patch

> Add cmd option config to spark thrift server 
> -
>
> Key: AMBARI-13513
> URL: https://issues.apache.org/jira/browse/AMBARI-13513
> Project: Ambari
>  Issue Type: Improvement
>  Components: ambari-server
>Affects Versions: 2.1.2, trunk
>    Reporter: Judy Nash
>Assignee: Judy Nash
> Attachments: SPARK-13513.patch
>
>
> When spark thrift server launches, customers may want to add additional 
> options to configure spark.
> i.e. 
> --driver-memory 
> --jars 
> Some of these configurations can only be submitted as command line options, 
> so adding a new config on ambari to give customers access. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13513) Add cmd option config to spark thrift server

2015-10-21 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13513:
---
Attachment: (was: SPARK-13513.patch)

> Add cmd option config to spark thrift server 
> -
>
> Key: AMBARI-13513
> URL: https://issues.apache.org/jira/browse/AMBARI-13513
> Project: Ambari
>  Issue Type: Improvement
>  Components: ambari-server
>Affects Versions: 2.1.2, trunk
>    Reporter: Judy Nash
>Assignee: Judy Nash
> Attachments: SPARK-13513.patch
>
>
> When spark thrift server launches, customers may want to add additional 
> options to configure spark.
> i.e. 
> --driver-memory 
> --jars 
> Some of these configurations can only be submitted as command line options, 
> so adding a new config on ambari to give customers access. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13513) Add cmd option config to spark thrift server

2015-10-21 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13513:
---
Attachment: SPARK-13513.patch

> Add cmd option config to spark thrift server 
> -
>
> Key: AMBARI-13513
> URL: https://issues.apache.org/jira/browse/AMBARI-13513
> Project: Ambari
>  Issue Type: Improvement
>  Components: ambari-server
>Affects Versions: 2.1.2, trunk
>    Reporter: Judy Nash
>Assignee: Judy Nash
> Attachments: SPARK-13513.patch
>
>
> When spark thrift server launches, customers may want to add additional 
> options to configure spark.
> i.e. 
> --driver-memory 
> --jars 
> Some of these configurations can only be submitted as command line options, 
> so adding a new config on ambari to give customers access. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13513) Add cmd option config to spark thrift server

2015-10-21 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13513?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13513:
---
Attachment: (was: AMBARI-13513.patch)

> Add cmd option config to spark thrift server 
> -
>
> Key: AMBARI-13513
> URL: https://issues.apache.org/jira/browse/AMBARI-13513
> Project: Ambari
>  Issue Type: Improvement
>  Components: ambari-server
>Affects Versions: 2.1.2, trunk
>    Reporter: Judy Nash
>Assignee: Judy Nash
> Attachments: SPARK-13513.patch
>
>
> When spark thrift server launches, customers may want to add additional 
> options to configure spark.
> i.e. 
> --driver-memory 
> --jars 
> Some of these configurations can only be submitted as command line options, 
> so adding a new config on ambari to give customers access. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AMBARI-13513) Add cmd option config to spark thrift server

2015-10-21 Thread Judy Nash (JIRA)

[ 
https://issues.apache.org/jira/browse/AMBARI-13513?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14968288#comment-14968288
 ] 

Judy Nash commented on AMBARI-13513:


Here it is: https://reviews.apache.org/r/39530/


> Add cmd option config to spark thrift server 
> -
>
> Key: AMBARI-13513
> URL: https://issues.apache.org/jira/browse/AMBARI-13513
> Project: Ambari
>  Issue Type: Improvement
>  Components: ambari-server
>Affects Versions: trunk
>    Reporter: Judy Nash
>Assignee: Judy Nash
> Attachments: SPARK-13513.patch
>
>
> When spark thrift server launches, customers may want to add additional 
> options to configure spark.
> i.e. 
> --driver-memory 
> --jars 
> Some of these configurations can only be submitted as command line options, 
> so adding a new config on ambari to give customers access. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


Re: Review Request 39530: Ambari-13513:[trunk][2.1.2] Add cmd option config to spark thrift server

2015-10-21 Thread Judy Nash


> On Oct. 22, 2015, 12:59 a.m., Alejandro Fernandez wrote:
> > ambari-server/src/main/resources/stacks/HDP/2.3/services/SPARK/configuration/spark-thrift-cmd-opts-properties.xml,
> >  line 21
> > <https://reviews.apache.org/r/39530/diff/1/?file=1102701#file1102701line21>
> >
> > ditto.
> > spark-env file already exists, so can add another property there, 
> > "spark_thrift_content" (or something like it)

spark-env.sh doesn't have any Spark Thrift Server property that can be used for 
this today. Using spark-env would require an Apache Spark commit. 

The property can't be read at runtime when kicking off ./start-thriftserver.sh 
either, as spark-env.sh is not loaded until later, in spark-class.sh. 

Are there other ways to make the property value consumable by Ambari that I 
have not thought of?


- Judy


---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39530/#review103500
-------


On Oct. 21, 2015, 10:33 p.m., Judy Nash wrote:
> 
> ---
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/39530/
> ---
> 
> (Updated Oct. 21, 2015, 10:33 p.m.)
> 
> 
> Review request for Ambari, Alejandro Fernandez and Sumit Mohanty.
> 
> 
> Bugs: AMBARI-13513
> https://issues.apache.org/jira/browse/AMBARI-13513
> 
> 
> Repository: ambari
> 
> 
> Description
> ---
> 
> Ambari-13513:[trunk][2.1.2] Add cmd option config to spark thrift server
> 
> 
> Diffs
> -
> 
>   
> ambari-server/src/main/resources/common-services/SPARK/1.2.0.2.2/package/scripts/params.py
>  518ba6d 
>   
> ambari-server/src/main/resources/common-services/SPARK/1.2.0.2.2/package/scripts/spark_service.py
>  68a395b 
>   
> ambari-server/src/main/resources/stacks/HDP/2.3/services/SPARK/configuration/spark-thrift-cmd-opts-properties.xml
>  PRE-CREATION 
>   ambari-server/src/main/resources/stacks/HDP/2.3/services/SPARK/metainfo.xml 
> 14161b4 
>   ambari-server/src/test/python/stacks/2.3/SPARK/test_spark_thrift_server.py 
> b3b8235 
>   ambari-server/src/test/python/stacks/2.3/configs/spark_default.json 9f3fb90 
> 
> Diff: https://reviews.apache.org/r/39530/diff/
> 
> 
> Testing
> ---
> 
> 1. Updated Unit Tests 
> 2. E2E test on HDInsight cluster
> 
> 
> Thanks,
> 
> Judy Nash
> 
>



Re: Review Request 39200: Lightweight bugfix to correct typo on configuration folder name & copy configuration folder from common-services to HDP stack so Ambari can find the new config files

2015-10-12 Thread Judy Nash

---
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/39200/
---

(Updated Oct. 12, 2015, 11:42 p.m.)


Review request for Ambari, Alejandro Fernandez and Sumit Mohanty.


Bugs: AMBARI-13382
https://issues.apache.org/jira/browse/AMBARI-13382


Repository: ambari


Description
---

AMBARI-13382 (trunk + branch 2.1.2): Spark thrift server cannot load the new 
configuration files added


Diffs
-

  
ambari-server/src/main/resources/common-services/SPARK/1.4.1.2.3/configurations/spark-hive-site-override.xml
 2de64c5 
  
ambari-server/src/main/resources/common-services/SPARK/1.4.1.2.3/configurations/spark-thrift-sparkconf.xml
 c42841f 
  
ambari-server/src/main/resources/stacks/HDP/2.3/services/SPARK/configuration/spark-hive-site-override.xml
 PRE-CREATION 
  
ambari-server/src/main/resources/stacks/HDP/2.3/services/SPARK/configuration/spark-thrift-sparkconf.xml
 PRE-CREATION 

Diff: https://reviews.apache.org/r/39200/diff/


Testing
---

Validated E2E on HDP 2.3 spark 1.4 cluster.


Thanks,

Judy Nash



[jira] [Updated] (AMBARI-13382) spark thrift server cannot load the new configuration files added

2015-10-12 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13382:
---
Attachment: (was: AMBARI-13382.patch)

> spark thrift server cannot load the new configuration files added
> -
>
> Key: AMBARI-13382
> URL: https://issues.apache.org/jira/browse/AMBARI-13382
> Project: Ambari
>  Issue Type: Bug
>    Reporter: Judy Nash
>Assignee: Judy Nash
> Attachments: AMBARI-13382.patch
>
>
> spark thrift server cannot load the new configuration files added. 
> Why
> 1) "configuration" folder under common-services is wrongly spelled as 
> "configurations"
> 2) HDP stack definition doesn't have configuration folder in 2.3 stack and 
> ambari is not reading it from common-services folder 
> Fix is to correct the typo on folder name and copy the configuration folder 
> from common-services to HDP stack definition 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13382) spark thrift server cannot load the new configuration files added

2015-10-12 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13382:
---
Attachment: AMBARI-13382.patch

> spark thrift server cannot load the new configuration files added
> -
>
> Key: AMBARI-13382
> URL: https://issues.apache.org/jira/browse/AMBARI-13382
> Project: Ambari
>  Issue Type: Bug
>    Reporter: Judy Nash
>Assignee: Judy Nash
> Attachments: AMBARI-13382.patch
>
>
> spark thrift server cannot load the new configuration files added. 
> Why
> 1) "configuration" folder under common-services is wrongly spelled as 
> "configurations"
> 2) HDP stack definition doesn't have configuration folder in 2.3 stack and 
> ambari is not reading it from common-services folder 
> Fix is to correct the typo on folder name and copy the configuration folder 
> from common-services to HDP stack definition 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13382) spark thrift server cannot load the new configuration files added

2015-10-09 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13382:
---
Attachment: AMBARI-13382.patch

> spark thrift server cannot load the new configuration files added
> -
>
> Key: AMBARI-13382
> URL: https://issues.apache.org/jira/browse/AMBARI-13382
> Project: Ambari
>  Issue Type: Bug
>    Reporter: Judy Nash
>Assignee: Judy Nash
> Attachments: AMBARI-13382.patch
>
>
> spark thrift server cannot load the new configuration files added. 
> Why
> 1) "configuration" folder under common-services is wrongly spelled as 
> "configuration*s*"
> 2) HDP stack definition doesn't have configuration folder in 2.3 stack and 
> ambari is not reading it from common-services folder 
> Fix is to correct the typo on folder name and copy the configuration folder 
> from common-services to HDP stack definition 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13382) spark thrift server cannot load the new configuration files added

2015-10-09 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13382:
---
Description: 
spark thrift server cannot load the new configuration files added. 

Why
1) "configuration" folder under common-services is wrongly spelled as 
"configurations"
2) HDP stack definition doesn't have configuration folder in 2.3 stack and 
ambari is not reading it from common-services folder 

Fix is to correct the typo on folder name and copy the configuration folder 
from common-services to HDP stack definition 

  was:
spark thrift server cannot load the new configuration files added. 

Why
1) "configuration" folder under common-services is wrongly spelled as 
"configuration*s*"
2) HDP stack definition doesn't have configuration folder in 2.3 stack and 
ambari is not reading it from common-services folder 

Fix is to correct the typo on folder name and copy the configuration folder 
from common-services to HDP stack definition 


> spark thrift server cannot load the new configuration files added
> -
>
> Key: AMBARI-13382
> URL: https://issues.apache.org/jira/browse/AMBARI-13382
> Project: Ambari
>  Issue Type: Bug
>Reporter: Judy Nash
>Assignee: Judy Nash
> Attachments: AMBARI-13382.patch
>
>
> spark thrift server cannot load the new configuration files added. 
> Why
> 1) "configuration" folder under common-services is wrongly spelled as 
> "configurations"
> 2) HDP stack definition doesn't have configuration folder in 2.3 stack and 
> ambari is not reading it from common-services folder 
> Fix is to correct the typo on folder name and copy the configuration folder 
> from common-services to HDP stack definition 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13382) spark thrift server cannot load the new configuration files added

2015-10-09 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13382?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13382:
---
Summary: spark thrift server cannot load the new configuration files added  
(was: spark thrift server cannot load the new configuration files added. )

> spark thrift server cannot load the new configuration files added
> -
>
> Key: AMBARI-13382
> URL: https://issues.apache.org/jira/browse/AMBARI-13382
> Project: Ambari
>  Issue Type: Bug
>    Reporter: Judy Nash
>Assignee: Judy Nash
>
> spark thrift server cannot load the new configuration files added. 
> Why
> 1) "configuration" folder under common-services is wrongly spelled as 
> "configuration*s*"
> 2) HDP stack definition doesn't have configuration folder in 2.3 stack and 
> ambari is not reading it from common-services folder 
> Fix is to correct the typo on folder name and copy the configuration folder 
> from common-services to HDP stack definition 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (AMBARI-13382) spark thrift server cannot load the new configuration files added.

2015-10-09 Thread Judy Nash (JIRA)
Judy Nash created AMBARI-13382:
--

 Summary: spark thrift server cannot load the new configuration 
files added. 
 Key: AMBARI-13382
 URL: https://issues.apache.org/jira/browse/AMBARI-13382
 Project: Ambari
  Issue Type: Bug
Reporter: Judy Nash
Assignee: Judy Nash


The Spark Thrift Server cannot load the newly added configuration files. 

Why:
1) The "configuration" folder under common-services is wrongly spelled as 
"configuration*s*".
2) The HDP 2.3 stack definition doesn't have a configuration folder, and 
Ambari is not reading it from the common-services folder. 

The fix is to correct the typo in the folder name and copy the configuration 
folder from common-services to the HDP stack definition. 
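
A rough sketch of the described fix, assuming a local Ambari checkout with the 
file layout shown in review request 39200; the paths and the SPARK service 
version directory are assumptions for illustration:

# Hedged sketch of the two-step fix, run from the repository root.
import shutil
from pathlib import Path

spark_common = Path(
    "ambari-server/src/main/resources/common-services/SPARK/1.4.1.2.3"
)
hdp_spark = Path(
    "ambari-server/src/main/resources/stacks/HDP/2.3/services/SPARK"
)

# 1) correct the folder-name typo: "configurations" -> "configuration"
(spark_common / "configurations").rename(spark_common / "configuration")

# 2) copy the configuration folder into the HDP 2.3 stack definition
shutil.copytree(spark_common / "configuration", hdp_spark / "configuration")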



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-21 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13094:
---
Attachment: (was: AMBARI-13094.patch)

> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>Affects Versions: trunk
>Reporter: Judy Nash
>  Labels: patch
> Fix For: trunk
>
> Attachments: AMBARI-13094.patch, Ambari Service for Spark Thrift 
> Design Specification.docx
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 
> Instruction to add thrift server to an existing cluster:
> 1) If running on HDP distro, Update metainfo.xml @ 
> /var/lib/ambari-server/resources/stacks/HDP/2.3/services/SPARK/metainfo.xml. 
> Change true to false. 
> 2) Use Ambari UI if spark has not been installed before. It will show up as a 
> installable service alongside with spark history server. 
> 3) If spark component has been installed already, use API to add 
> thrift-server as a new service. Service name: SPARK_THRIFTSERVER
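
For step 3, a hedged sketch of what the API calls might look like. The cluster 
name, target host, and credentials are placeholders, and the endpoint paths 
follow common Ambari REST conventions rather than anything stated in this issue, 
so verify them against the Ambari version in use:

# Illustrative only: registers SPARK_THRIFTSERVER against an existing SPARK
# installation via the Ambari REST API. All names below are placeholders.
import requests

AMBARI = "http://ambari-host:8080/api/v1"
CLUSTER = "mycluster"          # placeholder
TARGET_HOST = "node1.example"  # placeholder

session = requests.Session()
session.auth = ("admin", "admin")
session.headers["X-Requested-By"] = "ambari"

# add the thrift server component to the existing SPARK service
session.post(
    f"{AMBARI}/clusters/{CLUSTER}/services/SPARK/components/SPARK_THRIFTSERVER"
)

# map the new component onto a host
session.post(
    f"{AMBARI}/clusters/{CLUSTER}/hosts/{TARGET_HOST}/host_components/SPARK_THRIFTSERVER"
)

# install it (a second PUT with state STARTED would then start it)
session.put(
    f"{AMBARI}/clusters/{CLUSTER}/hosts/{TARGET_HOST}/host_components/SPARK_THRIFTSERVER",
    json={"HostRoles": {"state": "INSTALLED"}},
)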



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Commented] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-21 Thread Judy Nash (JIRA)

[ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=14901444#comment-14901444
 ] 

Judy Nash commented on AMBARI-13094:


Error looks like a build issue. Retrying patch. 

> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>Affects Versions: trunk
>Reporter: Judy Nash
>  Labels: patch
> Fix For: trunk
>
> Attachments: AMBARI-13094.patch, Ambari Service for Spark Thrift 
> Design Specification.docx
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 
> Instruction to add thrift server to an existing cluster:
> 1) If running on HDP distro, Update metainfo.xml @ 
> /var/lib/ambari-server/resources/stacks/HDP/2.3/services/SPARK/metainfo.xml. 
> Change true to false. 
> 2) Use Ambari UI if spark has not been installed before. It will show up as a 
> installable service alongside with spark history server. 
> 3) If spark component has been installed already, use API to add 
> thrift-server as a new service. Service name: SPARK_THRIFTSERVER



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-21 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13094:
---
Attachment: AMBARI-13094.patch

> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>Affects Versions: trunk
>Reporter: Judy Nash
>  Labels: patch
> Fix For: trunk
>
> Attachments: AMBARI-13094.patch, Ambari Service for Spark Thrift 
> Design Specification.docx
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 
> Instruction to add thrift server to an existing cluster:
> 1) If running on HDP distro, Update metainfo.xml @ 
> /var/lib/ambari-server/resources/stacks/HDP/2.3/services/SPARK/metainfo.xml. 
> Change true to false. 
> 2) Use Ambari UI if spark has not been installed before. It will show up as a 
> installable service alongside with spark history server. 
> 3) If spark component has been installed already, use API to add 
> thrift-server as a new service. Service name: SPARK_THRIFTSERVER



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-18 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13094:
---
Description: 
New feature to add spark thrift server support on Ambari. 

Design specification attached. 

Instruction to add thrift server to an existing cluster:
1) If running on HDP distro, Update metainfo.xml @ 
/var/lib/ambari-server/resources/stacks/HDP/2.3/services/SPARK/metainfo.xml. 
Change true to false. 
2) Use Ambari UI if spark has not been installed before. It will show up as a 
installable service alongside with spark history server. 
3) If spark component has been installed already, use API to add thrift-server 
as a new service. Service name: SPARK_THRIFTSERVER




  was:
New feature to add spark thrift server support on Ambari. 

Design specification attached. 

Instruction to add thrift server to an existing cluster:
1) If running on HDP distro, Update metainfo.xml @ 
/var/lib/ambari-server/resources/stacks/HDP/2.3/services/SPARK/metainfo.xml. 
Change true to false. 
2) Use Ambari UI if spark has not been installed before. It will show up as a 
installable service alongside with spark history server. 
3) If spark component has been installed already, use API to add thrift-server 
as a component. Service name: SPARK_THRIFTSERVER





> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>Affects Versions: trunk
>Reporter: Judy Nash
>  Labels: patch
> Fix For: trunk
>
> Attachments: AMBARI-13094.patch, Ambari Service for Spark Thrift 
> Design Specification.docx
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 
> Instruction to add thrift server to an existing cluster:
> 1) If running on HDP distro, Update metainfo.xml @ 
> /var/lib/ambari-server/resources/stacks/HDP/2.3/services/SPARK/metainfo.xml. 
> Change true to false. 
> 2) Use Ambari UI if spark has not been installed before. It will show up as a 
> installable service alongside with spark history server. 
> 3) If spark component has been installed already, use API to add 
> thrift-server as a new service. Service name: SPARK_THRIFTSERVER



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-18 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13094:
---
Description: 
New feature to add spark thrift server support on Ambari. 

Design specification attached. 

Instruction to add thrift server to an existing cluster:
1) If running on HDP distro, Update metainfo.xml @ 
/var/lib/ambari-server/resources/stacks/HDP/2.3/services/SPARK/metainfo.xml. 
Change true to false. 
2) Use Ambari UI if spark has not been installed before. It will show up as a 
installable service alongside with spark history server. 
3) If spark component has been installed already, use API to add thrift-server 
as a component. Service name: SPARK_THRIFTSERVER




  was:
New feature to add spark thrift server support on Ambari. 

Design specification attached. 


> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>Affects Versions: trunk
>Reporter: Judy Nash
>  Labels: patch
> Fix For: trunk
>
> Attachments: AMBARI-13094.patch, Ambari Service for Spark Thrift 
> Design Specification.docx
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 
> Instruction to add thrift server to an existing cluster:
> 1) If running on HDP distro, Update metainfo.xml @ 
> /var/lib/ambari-server/resources/stacks/HDP/2.3/services/SPARK/metainfo.xml. 
> Change true to false. 
> 2) Use Ambari UI if spark has not been installed before. It will show up as a 
> installable service alongside with spark history server. 
> 3) If spark component has been installed already, use API to add 
> thrift-server as a component. Service name: SPARK_THRIFTSERVER



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-18 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13094:
---
Attachment: (was: AMBARI-13094.patch)

> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>Affects Versions: trunk
>Reporter: Judy Nash
>  Labels: patch
> Fix For: trunk
>
> Attachments: Ambari Service for Spark Thrift Design Specification.docx
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 
> Instruction to add thrift server to an existing cluster:
> 1) If running on HDP distro, Update metainfo.xml @ 
> /var/lib/ambari-server/resources/stacks/HDP/2.3/services/SPARK/metainfo.xml. 
> Change true to false. 
> 2) Use Ambari UI if spark has not been installed before. It will show up as a 
> installable service alongside with spark history server. 
> 3) If spark component has been installed already, use API to add 
> thrift-server as a new service. Service name: SPARK_THRIFTSERVER



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-18 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13094:
---
Attachment: AMBARI-13094.patch

> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>Affects Versions: trunk
>Reporter: Judy Nash
>  Labels: patch
> Fix For: trunk
>
> Attachments: AMBARI-13094.patch, Ambari Service for Spark Thrift 
> Design Specification.docx
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 
> Instruction to add thrift server to an existing cluster:
> 1) If running on HDP distro, Update metainfo.xml @ 
> /var/lib/ambari-server/resources/stacks/HDP/2.3/services/SPARK/metainfo.xml. 
> Change true to false. 
> 2) Use Ambari UI if spark has not been installed before. It will show up as a 
> installable service alongside with spark history server. 
> 3) If spark component has been installed already, use API to add 
> thrift-server as a new service. Service name: SPARK_THRIFTSERVER



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-16 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13094:
---
Attachment: AMBARI-13094.patch

> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>Affects Versions: trunk
>Reporter: Judy Nash
>  Labels: patch
> Fix For: trunk
>
> Attachments: AMBARI-13094.patch, Ambari Service for Spark Thrift 
> Design Specification.docx
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-16 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13094:
---
Attachment: (was: ambari-237.patch)

> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>Affects Versions: trunk
>Reporter: Judy Nash
>  Labels: patch
> Fix For: trunk
>
> Attachments: AMBARI-13094.patch, Ambari Service for Spark Thrift 
> Design Specification.docx
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-15 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13094:
---
Attachment: ambari-237.patch

> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>Affects Versions: trunk
>Reporter: Judy Nash
>  Labels: patch
> Fix For: trunk
>
> Attachments: Ambari Service for Spark Thrift Design 
> Specification.docx, ambari-237.patch
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-15 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13094:
---
Attachment: (was: AMBARI-13094.patch)

> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>Affects Versions: trunk
>Reporter: Judy Nash
>  Labels: patch
> Fix For: trunk
>
> Attachments: Ambari Service for Spark Thrift Design Specification.docx
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-15 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13094:
---
Attachment: AMBARI-13094.patch

> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>Affects Versions: trunk
>Reporter: Judy Nash
>  Labels: patch
> Fix For: trunk
>
> Attachments: AMBARI-13094.patch, Ambari Service for Spark Thrift 
> Design Specification.docx
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-15 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13094:
---
Attachment: (was: 
0001-SPARK-237-Add-Spark-Thrift-Ambari-Service.6.patch)

> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>Affects Versions: trunk
>Reporter: Judy Nash
>  Labels: patch
> Fix For: trunk
>
> Attachments: AMBARI-13094.patch, Ambari Service for Spark Thrift 
> Design Specification.docx
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-15 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13094:
---
Attachment: AMBARI-13094.patch

> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>Affects Versions: trunk
>Reporter: Judy Nash
>  Labels: patch
> Fix For: trunk
>
> Attachments: AMBARI-13094.patch, Ambari Service for Spark Thrift 
> Design Specification.docx
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-15 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13094:
---
Attachment: (was: AMBARI-13094.patch)

> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>Affects Versions: trunk
>Reporter: Judy Nash
>  Labels: patch
> Fix For: trunk
>
> Attachments: AMBARI-13094.patch, Ambari Service for Spark Thrift 
> Design Specification.docx
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Created] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-14 Thread Judy Nash (JIRA)
Judy Nash created AMBARI-13094:
--

 Summary: Add Spark Thrift Ambari Service
 Key: AMBARI-13094
 URL: https://issues.apache.org/jira/browse/AMBARI-13094
 Project: Ambari
  Issue Type: New Feature
Reporter: Judy Nash


New feature to add spark thrift server support on Ambari. 

Design specification attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-14 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13094:
---
Attachment: Ambari Service for Spark Thrift Design Specification.docx

> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>    Reporter: Judy Nash
> Attachments: Ambari Service for Spark Thrift Design Specification.docx
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-14 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13094:
---
Attachment: 0001-SPARK-237-Add-Spark-Thrift-Ambari-Service.4.patch

> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>    Reporter: Judy Nash
> Attachments: 0001-SPARK-237-Add-Spark-Thrift-Ambari-Service.4.patch, 
> Ambari Service for Spark Thrift Design Specification.docx
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-14 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13094:
---
Attachment: 0001-SPARK-237-Add-Spark-Thrift-Ambari-Service.6.patch

> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>Affects Versions: trunk
>Reporter: Judy Nash
>  Labels: patch
> Fix For: trunk
>
> Attachments: 0001-SPARK-237-Add-Spark-Thrift-Ambari-Service.6.patch, 
> Ambari Service for Spark Thrift Design Specification.docx
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


[jira] [Updated] (AMBARI-13094) Add Spark Thrift Ambari Service

2015-09-14 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/AMBARI-13094?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated AMBARI-13094:
---
Attachment: (was: 
0001-SPARK-237-Add-Spark-Thrift-Ambari-Service.4.patch)

> Add Spark Thrift Ambari Service
> ---
>
> Key: AMBARI-13094
> URL: https://issues.apache.org/jira/browse/AMBARI-13094
> Project: Ambari
>  Issue Type: New Feature
>Affects Versions: trunk
>Reporter: Judy Nash
>  Labels: patch
> Fix For: trunk
>
> Attachments: 0001-SPARK-237-Add-Spark-Thrift-Ambari-Service.6.patch, 
> Ambari Service for Spark Thrift Design Specification.docx
>
>
> New feature to add spark thrift server support on Ambari. 
> Design specification attached. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)


spark thrift server supports timeout?

2015-07-21 Thread Judy Nash
Hello everyone,

Does the Spark Thrift Server support timeouts?
Is there documentation I can reference for questions like these?

I know it supports cancel, but I'm not sure about timeouts.

Thanks,
Judy


thrift server reliability issue

2015-07-07 Thread Judy Nash
Hi everyone,

Found a thrift server reliability issue on Spark 1.3.1 that causes the thrift 
server to fail.

When the thrift server has too little memory allocated to the driver to process a 
request, its Spark SQL session exits with an OutOfMemoryError, causing the thrift 
server to stop working.

Is this a known issue?

Thanks,
Judy

--
Full stacktrace of out of memory exception:
2015-07-08 03:30:18,011 ERROR actor.ActorSystemImpl 
(Slf4jLogger.scala:apply$mcV$sp(66)) - Uncaught fatal error from thread 
[sparkDriver-akka.remote.default-remote-dispatcher-6] shutting down ActorSystem 
[sparkDriver]
java.lang.OutOfMemoryError: Java heap space
at 
org.spark_project.protobuf.ByteString.toByteArray(ByteString.java:515)
at 
akka.remote.serialization.MessageContainerSerializer.fromBinary(MessageContainerSerializer.scala:64)
at 
akka.serialization.Serialization$$anonfun$deserialize$1.apply(Serialization.scala:104)
at scala.util.Try$.apply(Try.scala:161)
at 
akka.serialization.Serialization.deserialize(Serialization.scala:98)
at 
akka.remote.MessageSerializer$.deserialize(MessageSerializer.scala:23)
at 
akka.remote.DefaultMessageDispatcher.payload$lzycompute$1(Endpoint.scala:58)
at 
akka.remote.DefaultMessageDispatcher.payload$1(Endpoint.scala:58)
at 
akka.remote.DefaultMessageDispatcher.dispatch(Endpoint.scala:76)
at 
akka.remote.EndpointReader$$anonfun$receive$2.applyOrElse(Endpoint.scala:937)
at akka.actor.Actor$class.aroundReceive(Actor.scala:465)
at akka.remote.EndpointActor.aroundReceive(Endpoint.scala:415)
at akka.actor.ActorCell.receiveMessage(ActorCell.scala:516)
at akka.actor.ActorCell.invoke(ActorCell.scala:487)
at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:238)
at akka.dispatch.Mailbox.run(Mailbox.scala:220)
at 
akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:393)
at 
scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
at 
scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
at 
scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
at 
scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)


[jira] [Created] (SPARK-7811) Fix typo on slf4j configuration on metrics.properties.template

2015-05-21 Thread Judy Nash (JIRA)
Judy Nash created SPARK-7811:


 Summary: Fix typo on slf4j configuration on 
metrics.properties.template
 Key: SPARK-7811
 URL: https://issues.apache.org/jira/browse/SPARK-7811
 Project: Spark
  Issue Type: Bug
Reporter: Judy Nash
Priority: Minor


There is a minor typo in the Slf4jSink configuration in 
metrics.properties.template. 

"slf4j" is misspelled as "sl4j" in 2 of the configuration entries. 

Correcting the typo so users' custom settings will be loaded correctly. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



Get a list of temporary RDD tables via Thrift

2015-05-11 Thread Judy Nash
Hi,

How can I get a list of temporary tables via Thrift?

I have used Thrift's startWithContext and registered a temp table, but I'm not seeing 
the temp table/RDD when running "show tables".


Thanks,
Judy


saveAsTable fails on Python with Unresolved plan found

2015-05-07 Thread Judy Nash
Hello,

I am following the tutorial code in the SQL programming guide 
(https://spark.apache.org/docs/1.2.1/sql-programming-guide.html#inferring-the-schema-using-reflection)
to try out Python on Spark 1.2.1.

The saveAsTable function works in Scala but fails in Python with "Unresolved plan 
found".

Broken Python code:

from pyspark.sql import SQLContext, Row

sqlContext = SQLContext(sc)
lines = sc.textFile("data.txt")
parts = lines.map(lambda l: l.split(","))
people = parts.map(lambda p: Row(id=p[0], name=p[1]))
schemaPeople = sqlContext.inferSchema(people)
schemaPeople.saveAsTable("peopletable")

saveAsTable fails with Unresolved plan found.
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved plan 
found, tree:
'CreateTableAsSelect None, pytable, false, None


This scala code works fine:

from pyspark.sql import SQLContext, Row

sqlContext = SQLContext(sc)

lines = sc.textFile("data.txt")

parts = lines.map(lambda l: l.split(","))

people = parts.map(lambda p: Row(id=p[0], name=p[1]))

schemaPeople = sqlContext.inferSchema(people)

schemaPeople.saveAsTable("peopletable")

Is this a known issue? Or am I not using Python correctly?

Thanks,
Judy


RE: saveAsTable fails on Python with Unresolved plan found

2015-05-07 Thread Judy Nash
SPARK-4825https://issues.apache.org/jira/browse/SPARK-4825 looks like the 
right bug, but it should've been fixed on 1.2.1.

Is a similar fix needed in Python?

From: Judy Nash
Sent: Thursday, May 7, 2015 7:26 AM
To: user@spark.apache.org
Subject: saveAsTable fails on Python with Unresolved plan found

Hello,

I am following the tutorial code in the sql programming guide 
(https://spark.apache.org/docs/1.2.1/sql-programming-guide.html#inferring-the-schema-using-reflection)
to try out Python on spark 1.2.1.

The saveAsTable function works in Scala but fails in python with Unresolved plan 
found.

Broken Python code:

from pyspark.sql import SQLContext, Row

sqlContext = SQLContext(sc)

lines = sc.textFile("data.txt")

parts = lines.map(lambda l: l.split(","))

people = parts.map(lambda p: Row(id=p[0], name=p[1]))

schemaPeople = sqlContext.inferSchema(people)

schemaPeople.saveAsTable("peopletable")

saveAsTable fails with Unresolved plan found.
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved plan 
found, tree:
'CreateTableAsSelect None, pytable, false, None


This scala code works fine:

from pyspark.sql import SQLContext, Row

sqlContext = SQLContext(sc)

lines = sc.textFile("data.txt")

parts = lines.map(lambda l: l.split(","))

people = parts.map(lambda p: Row(id=p[0], name=p[1]))

schemaPeople = sqlContext.inferSchema(people)

schemaPeople.saveAsTable("peopletable")


Is this a known issue? Or am I not using Python correctly?

Thanks,
Judy


RE: saveAsTable fails on Python with Unresolved plan found

2015-05-07 Thread Judy Nash
Figured it out. It was because I was using HiveContext instead of SQLContext.
FYI in case others saw the same issue.

From: Judy Nash
Sent: Thursday, May 7, 2015 7:38 AM
To: 'user@spark.apache.org'
Subject: RE: saveAsTable fails on Python with Unresolved plan found

SPARK-4825https://issues.apache.org/jira/browse/SPARK-4825 looks like the 
right bug, but it should've been fixed on 1.2.1.

Is a similar fix needed in Python?

From: Judy Nash
Sent: Thursday, May 7, 2015 7:26 AM
To: user@spark.apache.orgmailto:user@spark.apache.org
Subject: saveAsTable fails on Python with Unresolved plan found

Hello,

I am following the tutorial code in the sql programming guide 
(https://spark.apache.org/docs/1.2.1/sql-programming-guide.html#inferring-the-schema-using-reflection)
to try out Python on spark 1.2.1.

The saveAsTable function works in Scala but fails in python with Unresolved plan 
found.

Broken Python code:

from pyspark.sql import SQLContext, Row

sqlContext = SQLContext(sc)

lines = sc.textFile("data.txt")

parts = lines.map(lambda l: l.split(","))

people = parts.map(lambda p: Row(id=p[0], name=p[1]))

schemaPeople = sqlContext.inferSchema(people)

schemaPeople.saveAsTable("peopletable")

saveAsTable fails with Unresolved plan found.
org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved plan 
found, tree:
'CreateTableAsSelect None, pytable, false, None


This scala code works fine:

from pyspark.sql import SQLContext, Row

sqlContext = SQLContext(sc)

lines = sc.textFile("data.txt")

parts = lines.map(lambda l: l.split(","))

people = parts.map(lambda p: Row(id=p[0], name=p[1]))

schemaPeople = sqlContext.inferSchema(people)

schemaPeople.saveAsTable("peopletable")


Is this a known issue? Or am I not using Python correctly?

Thanks,
Judy


RE: Using 'fair' scheduler mode with thrift server

2015-04-01 Thread Judy Nash
The expensive query can take all executor slots, but no task occupies an 
executor permanently.
i.e. the second job could possibly take some resources to execute in between 
tasks of the expensive query.

Can the fair scheduler mode help in this case? Or is it possible to set up 
thrift such that no single query takes all resources?
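For reference, a minimal sketch of what fair scheduling with pools looks like from the driver side (the pool name and file path are illustrative; the pools themselves are defined in a fairscheduler.xml allocation file):

import org.apache.spark.{SparkConf, SparkContext}

// enable FAIR scheduling inside the application and point at an allocation file
val conf = new SparkConf()
  .set("spark.scheduler.mode", "FAIR")
  .set("spark.scheduler.allocation.file", "/path/to/fairscheduler.xml") // hypothetical path
val sc = new SparkContext(conf)

// jobs submitted from this thread go to the named pool, so an expensive query
// in another pool shares executor slots instead of monopolizing a FIFO queue
sc.setLocalProperty("spark.scheduler.pool", "adhoc")

Note that fair scheduling interleaves tasks between jobs; it still cannot preempt tasks that are already running.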

From: Sean Owen [mailto:so...@cloudera.com]
Sent: Wednesday, April 1, 2015 12:28 AM
To: Asad Khan
Cc: user@spark.apache.org
Subject: Re: Using 'fair' scheduler mode


Does the expensive query take all executor slots? Then there is nothing for any 
other job to use regardless of scheduling policy.
On Mar 31, 2015 9:20 PM, asadrao 
as...@microsoft.commailto:as...@microsoft.com wrote:
Hi, I am using the Spark ‘fair’ scheduler mode. I have noticed that if the
first query is a very expensive query (ex: ‘select *’ on a really big data
set) than any subsequent query seem to get blocked. I would have expected
the second query to run in parallel since I am using the ‘fair’ scheduler
mode not the ‘fifo’. I am submitting the query through thrift server.



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Using-fair-scheduler-mode-tp22328.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: 
user-unsubscr...@spark.apache.orgmailto:user-unsubscr...@spark.apache.org
For additional commands, e-mail: 
user-h...@spark.apache.orgmailto:user-h...@spark.apache.org


Spark SQL does not read from cached table if table is renamed

2015-04-01 Thread Judy Nash
Hi all,

Noticed a bug in my current version of Spark 1.2.1.

After a table is cached with the cache table <table> command, a query will not read 
from memory if the SQL query renames (aliases) the table.

This query reads from in memory table
i.e. select hivesampletable.country from default.hivesampletable  group by 
hivesampletable.country

This query with renamed table reads from hive
i.e. select table.country from default.hivesampletable table group by 
table.country


Is this a known bug?
Most BI tools rename tables to avoid table name collision.
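For reference, a minimal sketch of the two query shapes being compared, run through the same context that did the caching ("t" stands in for the alias a BI tool would generate):

sqlContext.sql("CACHE TABLE hivesampletable")

// reads from the in-memory columnar cache
sqlContext.sql("select hivesampletable.country from default.hivesampletable group by hivesampletable.country")

// same query with a table alias; this is the shape reported to fall back to reading from hive
sqlContext.sql("select t.country from default.hivesampletable t group by t.country")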

Thanks,
Judy



Matching Spark application metrics data to App Id

2015-03-20 Thread Judy Nash
Hi,

I want to get telemetry metrics on spark apps activities, such as run time and 
jvm activities.

Using Spark Metrics I am able to get the following sample data point for an 
app:
type=GAUGE, name=application.SparkSQL::headnode0.1426626495312.runtime_ms, 
value=414873

How can I match this datapoint to the AppId? (i.e. app-20150317210815-0001)
Spark App name is not a unique identifier.
1426626495312 appears to be unique, but I am unable to see how this is related 
to the AppId.
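One observation that may help: 1426626495312 looks like an epoch-millisecond timestamp. It decodes to 2015-03-17 21:08:15, which is the same instant encoded in app-20150317210815-0001, so the long number in the metric name appears to be the application's submission time. A small sketch of that check, plus reading the id on the driver side (assuming your Spark version exposes sc.applicationId):

import java.util.{Date, TimeZone}
import java.text.SimpleDateFormat

// decode the long component of the metric name
val fmt = new SimpleDateFormat("yyyyMMddHHmmss")
fmt.setTimeZone(TimeZone.getTimeZone("UTC"))
println(fmt.format(new Date(1426626495312L)))  // 20150317210815 -> matches the app id above

// on the driver, the cluster-assigned id can be read directly
println(sc.applicationId)                      // e.g. app-20150317210815-0001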

Thanks,
Judy


RE: configure number of cached partition in memory on SparkSQL

2015-03-19 Thread Judy Nash
Thanks Cheng for replying.

Meant to say changing the number of partitions of a cached table. It doesn’t need 
to be re-adjusted after caching.

To provide more context:
What I am seeing on my dataset is that we have a large number of tasks. Since 
it appears each task is mapped to a partition, I want to see if matching 
partitions to available core count will make it faster.

I’ll give your suggestion a try to see if it will help. Experiment is a great 
way to learn more about spark internals.

From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: Monday, March 16, 2015 5:41 AM
To: Judy Nash; user@spark.apache.org
Subject: Re: configure number of cached partition in memory on SparkSQL


Hi Judy,

In the case of HadoopRDD and NewHadoopRDD, partition number is actually decided 
by the InputFormat used. And spark.sql.inMemoryColumnarStorage.batchSize is not 
related to partition number, it controls the in-memory columnar batch size 
within a single partition.

Also, what do you mean by “change the number of partitions after caching the 
table”? Are you trying to re-cache an already cached table with a different 
partition number?

Currently, I don’t see a super intuitive pure SQL way to set the partition 
number in this case. Maybe you can try this (assuming table t has a column s 
which is expected to be sorted):

SET spark.sql.shuffle.partitions = 10;

CACHE TABLE cached_t AS SELECT * FROM t ORDER BY s;

In this way, we introduce a shuffle by sorting a column, and zoom in/out the 
partition number at the same time. This might not be the best way out there, 
but it’s the first one that jumped into my head.

Cheng

On 3/5/15 3:51 AM, Judy Nash wrote:
Hi,

I am tuning a hive dataset on Spark SQL deployed via thrift server.

How can I change the number of partitions created by caching the table on 
thrift server?

I have tried the following but still getting the same number of partitions 
after caching:
Spark.default.parallelism
spark.sql.inMemoryColumnarStorage.batchSize


Thanks,
Judy
​


RE: spark standalone with multiple executors in one work node

2015-03-05 Thread Judy Nash
I meant from one app, yes.

Was asking this because our previous tuning experiments show spark-on-yarn runs 
faster when overloading workers with executors (i.e. if a worker has 4 cores, 
creating 2 executors that each use 4 cores gives a speed boost over 1 executor 
with 4 cores).

I have found an equivalent solution for standalone that has given me a speed 
boost. Instead of adding more executors, I overloaded SPARK_WORKER_CORES to 2x 
the CPU cores on the worker. We are seeing better performance because the CPU 
now has consistent 100% utilization.
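For reference, the setting lives in conf/spark-env.sh on each worker; the numbers below only illustrate the 2x idea for a 4-core box:

# conf/spark-env.sh (illustrative values)
export SPARK_WORKER_CORES=8     # advertise 2x the 4 physical cores
export SPARK_WORKER_MEMORY=12g  # hypothetical; size this to the machine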

-Original Message-
From: Sean Owen [mailto:so...@cloudera.com] 
Sent: Thursday, February 26, 2015 2:11 AM
To: Judy Nash
Cc: user@spark.apache.org
Subject: Re: spark standalone with multiple executors in one work node

--num-executors is the total number of executors. In YARN there is not quite 
the same notion of a Spark worker. Of course, one worker has an executor for 
each running app, so yes, but you mean for one app? it's possible, though not 
usual, to run multiple executors for one app on one worker. This may be useful 
if your executor heap size is otherwise getting huge.

On Thu, Feb 26, 2015 at 1:58 AM, Judy Nash judyn...@exchange.microsoft.com 
wrote:
 Hello,



 Does spark standalone support running multiple executors in one worker node?



 It seems yarn has the parameter --num-executors  to set number of 
 executors to deploy, but I do not find the equivalent parameter in spark 
 standalone.





 Thanks,

 Judy


configure number of cached partition in memory on SparkSQL

2015-03-04 Thread Judy Nash
Hi,

I am tuning a hive dataset on Spark SQL deployed via thrift server.

How can I change the number of partitions after caching the table on thrift 
server?

I have tried the following but still getting the same number of partitions 
after caching:
Spark.default.parallelism
spark.sql.inMemoryColumnarStorage.batchSize


Thanks,
Judy


spark standalone with multiple executors in one work node

2015-02-25 Thread Judy Nash
Hello,

Does spark standalone support running multiple executors in one worker node?

It seems yarn has the parameter --num-executors  to set number of executors to 
deploy, but I do not find the equivalent parameter in spark standalone.


Thanks,
Judy


[jira] [Updated] (SPARK-5914) Enable spark-submit to run requiring only user permission on windows

2015-02-24 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated SPARK-5914:
-
Summary: Enable spark-submit to run requiring only user permission on 
windows  (was: Enable spark-submit to run with only user permission on windows)

 Enable spark-submit to run requiring only user permission on windows
 

 Key: SPARK-5914
 URL: https://issues.apache.org/jira/browse/SPARK-5914
 Project: Spark
  Issue Type: Bug
  Components: Spark Submit, Windows
 Environment: Windows
Reporter: Judy Nash
Priority: Minor

 On windows platform only. 
 If slave is executed with user permission, spark-submit fails with 
 java.lang.ClassNotFoundException when attempting to read the cached jar from 
 spark_home\work folder. 
 This is because the jars do not have read permission set by default on 
 windows. The fix is to add read permission explicitly for the owner of the file. 
 Having the service account run as admin (equivalent of sudo in Linux) is a 
 major security risk for production clusters. This makes it easy for hackers to 
 compromise the cluster by taking over the service account. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



RE: spark slave cannot execute without admin permission on windows

2015-02-24 Thread Judy Nash
Update to the thread.

Upon investigation, this is a bug on windows. Windows does not grant the user 
read permission to jar files by default.
Have created a pull request for 
SPARK-5914 (https://issues.apache.org/jira/browse/SPARK-5914) to grant read 
permission to the jar owner (the slave service account in this case). With this fix, 
the slave will be able to run without admin permission.
FYI: the master & thrift server work fine with only user permission, so no issue 
there.
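For anyone curious what granting read permission to the jar owner amounts to, a minimal sketch of the idea (not the actual patch):

import java.io.File

// after the application jar is copied into spark_home\work, explicitly give the
// owning account (the slave service account) read access in addition to the
// existing a+x, so the executor's class loader can open the jar
def fixCopiedJarPermissions(copiedJar: File): Unit = {
  copiedJar.setReadable(true, /* ownerOnly = */ true)
  copiedJar.setExecutable(true, /* ownerOnly = */ false)
}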

From: Judy Nash [mailto:judyn...@exchange.microsoft.com]
Sent: Thursday, February 19, 2015 12:26 AM
To: Akhil Das; dev@spark.apache.org
Cc: u...@spark.apache.org
Subject: RE: spark slave cannot execute without admin permission on windows

+ dev mailing list

If this is supposed to work, is there a regression then?

The spark core code shows the permission for the file copied to \work is set to a+x 
at Line 442 of Utils.scala 
(https://github.com/apache/spark/blob/b271c265b742fa6947522eda4592e9e6a7fd1f3a/core/src/main/scala/org/apache/spark/util/Utils.scala).
The example jar I used had all permissions including Read & Execute prior to 
spark-submit:
[screenshot: jar file permissions before spark-submit]
However, after being copied to the worker node’s \work folder, only limited permissions 
were left on the jar, with no execution right.
[screenshot: jar file permissions after the copy to the \work folder]

From: Akhil Das [mailto:ak...@sigmoidanalytics.com]
Sent: Wednesday, February 18, 2015 10:40 PM
To: Judy Nash
Cc: u...@spark.apache.orgmailto:u...@spark.apache.org
Subject: Re: spark slave cannot execute without admin permission on windows

You need not require admin permission, but just make sure all those jars has 
execute permission ( read/write access)

Thanks
Best Regards

On Thu, Feb 19, 2015 at 11:30 AM, Judy Nash 
judyn...@exchange.microsoft.commailto:judyn...@exchange.microsoft.com wrote:
Hi,

Is it possible to configure spark to run without admin permission on windows?

My current setup runs master & slave successfully with admin permission.
However, if I downgrade permission level from admin to user, SparkPi fails with 
the following exception on the slave node:
Exception in thread main org.apache.spark.SparkException: Job aborted due to 
stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task
0.3 in stage 0.0 (TID 9, workernode0.jnashsparkcurr2.d10.internal.cloudapp.net)
: java.lang.ClassNotFoundException: org.apache.spark.examples.SparkPi$$anonfun$1

at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)

Upon investigation, it appears that sparkPi jar under 
spark_home\worker\appname\*.jar does not have execute permission set, causing 
spark not able to find class.

Advice would be very much appreciated.

Thanks,
Judy




[jira] [Updated] (SPARK-5914) Enable spark-submit to run with only user permission on windows

2015-02-23 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-5914?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated SPARK-5914:
-
Summary: Enable spark-submit to run with only user permission on windows  
(was: Spark-submit cannot execute without machine admin permission on windows)

 Enable spark-submit to run with only user permission on windows
 ---

 Key: SPARK-5914
 URL: https://issues.apache.org/jira/browse/SPARK-5914
 Project: Spark
  Issue Type: Bug
  Components: Spark Submit, Windows
 Environment: Windows
Reporter: Judy Nash
Priority: Minor

 On windows platform only. 
 If slave is executed with user permission, spark-submit fails with 
 java.lang.ClassNotFoundException when attempting to read the cached jar from 
 spark_home\work folder. 
 This is because the jars do not have read permission set by default on 
 windows. The fix is to add read permission explicitly for the owner of the file. 
 Having the service account run as admin (equivalent of sudo in Linux) is a 
 major security risk for production clusters. This makes it easy for hackers to 
 compromise the cluster by taking over the service account. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



RE: spark slave cannot execute without admin permission on windows

2015-02-19 Thread Judy Nash
+ dev mailing list

If this is supposed to work, is there a regression then?

The spark core code shows the permission for the file copied to \work is set to a+x 
at Line 442 of Utils.scala 
(https://github.com/apache/spark/blob/b271c265b742fa6947522eda4592e9e6a7fd1f3a/core/src/main/scala/org/apache/spark/util/Utils.scala).
The example jar I used had all permissions including Read & Execute prior to 
spark-submit:
[screenshot: jar file permissions before spark-submit]
However, after being copied to the worker node’s \work folder, only limited permissions 
were left on the jar, with no execution right.
[screenshot: jar file permissions after the copy to the \work folder]

From: Akhil Das [mailto:ak...@sigmoidanalytics.com]
Sent: Wednesday, February 18, 2015 10:40 PM
To: Judy Nash
Cc: u...@spark.apache.org
Subject: Re: spark slave cannot execute without admin permission on windows

You need not require admin permission, but just make sure all those jars has 
execute permission ( read/write access)

Thanks
Best Regards

On Thu, Feb 19, 2015 at 11:30 AM, Judy Nash 
judyn...@exchange.microsoft.commailto:judyn...@exchange.microsoft.com wrote:
Hi,

Is it possible to configure spark to run without admin permission on windows?

My current setup runs master & slave successfully with admin permission.
However, if I downgrade permission level from admin to user, SparkPi fails with 
the following exception on the slave node:
Exception in thread main org.apache.spark.SparkException: Job aborted due to 
stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task
0.3 in stage 0.0 (TID 9, workernode0.jnashsparkcurr2.d10.internal.cloudapp.net)
: java.lang.ClassNotFoundException: org.apache.spark.examples.SparkPi$$anonfun$1

at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)

Upon investigation, it appears that sparkPi jar under 
spark_home\worker\appname\*.jar does not have execute permission set, causing 
spark not able to find class.

Advice would be very much appreciated.

Thanks,
Judy




[jira] [Created] (SPARK-5914) Spark-submit cannot execute without machine admin permission on windows

2015-02-19 Thread Judy Nash (JIRA)
Judy Nash created SPARK-5914:


 Summary: Spark-submit cannot execute without machine admin 
permission on windows
 Key: SPARK-5914
 URL: https://issues.apache.org/jira/browse/SPARK-5914
 Project: Spark
  Issue Type: Bug
  Components: Spark Core
 Environment: Windows
Reporter: Judy Nash
Priority: Minor


On windows platform only. 

If slave is executed with user permission, spark-submit fails with 
java.lang.ClassNotFoundException when attempting to read the cached jar from 
spark_home\work folder. 

This is because the jars do not have read permission set by default on windows. 
The fix is to add read permission explicitly for the owner of the file. 

Having the service account run as admin (equivalent of sudo in Linux) is a 
major security risk for production clusters. This makes it easy for hackers to 
compromise the cluster by taking over the service account. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



spark slave cannot execute without admin permission on windows

2015-02-18 Thread Judy Nash
Hi,

Is it possible to configure spark to run without admin permission on windows?

My current setup runs master & slave successfully with admin permission.
However, if I downgrade permission level from admin to user, SparkPi fails with 
the following exception on the slave node:
Exception in thread main org.apache.spark.SparkException: Job aborted due to 
stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task
0.3 in stage 0.0 (TID 9, workernode0.jnashsparkcurr2.d10.internal.cloudapp.net)
: java.lang.ClassNotFoundException: org.apache.spark.examples.SparkPi$$anonfun$1

at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
at java.lang.Class.forName(Class.java:270)

Upon investigation, it appears that sparkPi jar under 
spark_home\worker\appname\*.jar does not have execute permission set, causing 
spark not able to find class.

Advice would be very much appreciated.

Thanks,
Judy



RE: Is the Thrift server right for me?

2015-02-11 Thread Judy Nash
It should relay the queries to spark (i.e. you shouldn't see any MR job on 
Hadoop & you should see activity on the spark app in the headnode UI).

Check your hive-site.xml. Are you pointing at the hive server 2 port instead 
of the spark thrift port?
Their default ports are both 10000.

From: Andrew Lee [mailto:alee...@hotmail.com]
Sent: Wednesday, February 11, 2015 12:00 PM
To: sjbrunst; user@spark.apache.org
Subject: RE: Is the Thrift server right for me?

I have ThriftServer2 up and running, however, I notice that it relays the query 
to HiveServer2 when I pass the hive-site.xml to it.

I'm not sure if this is the expected behavior, but based on what I have up and 
running, the ThriftServer2 invokes HiveServer2 that results in MapReduce or Tez 
query. In this case, I could just connect directly to HiveServer2 if Hive is 
all you need.

If you are programmer and want to mash up data from Hive with other tables and 
data in Spark, then Spark ThriftServer2 seems to be a good integration point at 
some use case.

Please correct me if I misunderstood the purpose of Spark ThriftServer2.

 Date: Thu, 8 Jan 2015 14:49:00 -0700
 From: sjbru...@uwaterloo.camailto:sjbru...@uwaterloo.ca
 To: user@spark.apache.orgmailto:user@spark.apache.org
 Subject: Is the Thrift server right for me?

 I'm building a system that collects data using Spark Streaming, does some
 processing with it, then saves the data. I want the data to be queried by
 multiple applications, and it sounds like the Thrift JDBC/ODBC server might
 be the right tool to handle the queries. However, the documentation for the
 Thrift server
 http://spark.apache.org/docs/latest/sql-programming-guide.html#running-the-thrift-jdbcodbc-server
 seems to be written for Hive users who are moving to Spark. I never used
 Hive before I started using Spark, so it is not clear to me how best to use
 this.

 I've tried putting data into Hive, then serving it with the Thrift server.
 But I have not been able to update the data in Hive without first shutting
 down the server. This is a problem because new data is always being streamed
 in, and so the data must continuously be updated.

 The system I'm building is supposed to replace a system that stores the data
 in MongoDB. The dataset has now grown so large that the database index does
 not fit in memory, which causes major performance problems in MongoDB.

 If the Thrift server is the right tool for me, how can I set it up for my
 application? If it is not the right tool, what else can I use?



 --
 View this message in context: 
 http://apache-spark-user-list.1001560.n3.nabble.com/Is-the-Thrift-server-right-for-me-tp21044.html
 Sent from the Apache Spark User List mailing list archive at Nabble.com.

 -
 To unsubscribe, e-mail: 
 user-unsubscr...@spark.apache.orgmailto:user-unsubscr...@spark.apache.org
 For additional commands, e-mail: 
 user-h...@spark.apache.orgmailto:user-h...@spark.apache.org



[jira] [Created] (SPARK-5708) Add Slf4jSink to Spark Metrics Sink

2015-02-09 Thread Judy Nash (JIRA)
Judy Nash created SPARK-5708:


 Summary: Add Slf4jSink to Spark Metrics Sink
 Key: SPARK-5708
 URL: https://issues.apache.org/jira/browse/SPARK-5708
 Project: Spark
  Issue Type: Bug
Reporter: Judy Nash


Add Slf4jSink to the currently supported metric sinks.

This is convenient for those who want metrics data for telemetry purposes, but 
want to reuse the pre-setup log4j pipeline. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



New Metrics Sink class not packaged in spark-assembly jar

2015-02-09 Thread Judy Nash
Hello,

Working on SPARK-5708https://issues.apache.org/jira/browse/SPARK-5708 - Add 
Slf4jSink to Spark Metrics Sink.

Wrote a new Slf4jSink class (see patch attached), but the new class is not 
packaged as part of spark-assembly jar.

Do I need to update build config somewhere to have this packaged?

Current packaged class:
[cid:image001.png@01D044B4.1B17A1C0]

Thought I must have missed something basic but can't figure out why.

Thanks!
Judy

-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org

RE: New Metrics Sink class not packaged in spark-assembly jar

2015-02-09 Thread Judy Nash
Thanks Patrick! That was the issue.
Built the jars on windows env with mvn and forgot to run make-distributions.ps1 
 afterward, so was looking at old jars.

From: Patrick Wendell [mailto:pwend...@gmail.com]
Sent: Monday, February 9, 2015 10:43 PM
To: Judy Nash
Cc: dev@spark.apache.org
Subject: Re: New Metrics Sink class not packaged in spark-assembly jar

Actually, to correct myself, the assembly jar is in assembly/target/scala-2.11 
(I think).

On Mon, Feb 9, 2015 at 10:42 PM, Patrick Wendell 
pwend...@gmail.commailto:pwend...@gmail.com wrote:
Hi Judy,

If you have added source files in the sink/ source folder, they should appear 
in the assembly jar when you build. One thing I noticed is that you are looking 
inside the /dist folder. That only gets populated if you run 
make-distribution. The normal development process is just to do mvn package 
and then look at the assembly jar that is contained in core/target.

- Patrick

On Mon, Feb 9, 2015 at 10:02 PM, Judy Nash 
judyn...@exchange.microsoft.commailto:judyn...@exchange.microsoft.com wrote:
Hello,

Working on SPARK-5708https://issues.apache.org/jira/browse/SPARK-5708 - Add 
Slf4jSink to Spark Metrics Sink.

Wrote a new Slf4jSink class (see patch attached), but the new class is not 
packaged as part of spark-assembly jar.

Do I need to update build config somewhere to have this packaged?

Current packaged class:
[cid:image001.png@01D044BC.8FE515C0]

Thought I must have missed something basic but can't figure out why.

Thanks!
Judy

-
To unsubscribe, e-mail: 
dev-unsubscr...@spark.apache.orgmailto:dev-unsubscr...@spark.apache.org
For additional commands, e-mail: 
dev-h...@spark.apache.orgmailto:dev-h...@spark.apache.org




Spark Metrics Servlet for driver and executor

2015-02-05 Thread Judy Nash
Hi all,

Looking at spark metricsServlet.

What is the url exposing the driver & executor json response?

Found master and worker successfully, but can't find the url that returns json for 
the other 2 sources.
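In case it helps others, the URLs I would expect with the default MetricsServlet paths from metrics.properties.template and the default UI ports (hosts and ports are illustrative) are roughly:

http://<master-host>:8080/metrics/master/json         (master instance)
http://<master-host>:8080/metrics/applications/json   (per-application metrics on the master)
http://<worker-host>:8081/metrics/json                (worker instance)
http://<driver-host>:4040/metrics/json                (driver, served by the application UI while the app runs)

Executors do not run a web UI, so there is no per-executor servlet URL; executor metrics have to go through one of the other sinks.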


Thanks!
Judy


RE: spark 1.2 compatibility

2015-01-16 Thread Judy Nash
Yes. It's compatible with HDP 2.1 

-Original Message-
From: bhavyateja [mailto:bhavyateja.potin...@gmail.com] 
Sent: Friday, January 16, 2015 3:17 PM
To: user@spark.apache.org
Subject: spark 1.2 compatibility

Is spark 1.2 compatible with HDP 2.1?



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-2-compatibility-tp21197.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional 
commands, e-mail: user-h...@spark.apache.org


-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



RE: spark 1.2 compatibility

2015-01-16 Thread Judy Nash
Should clarify on this. I personally have used HDP 2.1 + Spark 1.2 and have not 
seen a problem. 

However officially HDP 2.1 + Spark 1.2 is not a supported scenario. 

-Original Message-
From: Judy Nash 
Sent: Friday, January 16, 2015 5:35 PM
To: 'bhavyateja'; user@spark.apache.org
Subject: RE: spark 1.2 compatibility

Yes. It's compatible with HDP 2.1 

-Original Message-
From: bhavyateja [mailto:bhavyateja.potin...@gmail.com] 
Sent: Friday, January 16, 2015 3:17 PM
To: user@spark.apache.org
Subject: spark 1.2 compatibility

Is spark 1.2 compatible with HDP 2.1?



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/spark-1-2-compatibility-tp21197.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional 
commands, e-mail: user-h...@spark.apache.org


-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



RE: Spark SQL API Doc IsCached as SQL command

2014-12-16 Thread Judy Nash
Thanks Cheng. Tried it out and saw the InMemoryColumnarTableScan word in the 
physical plan.

From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: Friday, December 12, 2014 11:37 PM
To: Judy Nash; user@spark.apache.org
Subject: Re: Spark SQL API Doc  IsCached as SQL command


There isn’t a SQL statement that directly maps to SQLContext.isCached, but you can 
use EXPLAIN EXTENDED to check whether the underlying physical plan is an 
InMemoryColumnarTableScan.
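For reference, a minimal sketch of both checks (the table name is illustrative):

// programmatic check
sqlContext.cacheTable("hivesampletable")
println(sqlContext.isCached("hivesampletable"))  // true once the table is cached

// SQL-side check: look for InMemoryColumnarTableScan in the physical plan
sqlContext.sql("EXPLAIN EXTENDED SELECT country FROM hivesampletable GROUP BY country")
  .collect()
  .foreach(println)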

On 12/13/14 7:14 AM, Judy Nash wrote:
Hello,

Few questions on Spark SQL:


1)  Does Spark SQL support equivalent SQL Query for Scala command: 
IsCached(table name) ?


2)  Is there a documentation spec I can reference for question like this?



Closest doc I can find is this one: 
https://spark.apache.org/docs/1.1.0/sql-programming-guide.html#deploying-in-existing-hive-warehouses


Thanks,
Judy
​


Spark SQL API Doc IsCached as SQL command

2014-12-12 Thread Judy Nash
Hello,

Few questions on Spark SQL:


1)  Does Spark SQL support equivalent SQL Query for Scala command: 
IsCached(table name) ?


2)  Is there a documentation spec I can reference for question like this?



Closest doc I can find is this one: 
https://spark.apache.org/docs/1.1.0/sql-programming-guide.html#deploying-in-existing-hive-warehouses


Thanks,
Judy


[jira] [Updated] (SPARK-4700) Add Http support to Spark Thrift server

2014-12-10 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated SPARK-4700:
-
Description: 
Currently thrift only supports TCP connection. 

The JIRA is to add HTTP support to spark thrift server in addition to the TCP 
protocol. Both TCP and HTTP are supported by Hive today. HTTP is more secure 
and used often in Windows. 

  was:
Currently thrift only supports TCP connection. 

The ask is to add HTTP connection as well. Both TCP and HTTP are supported by 
Hive today. 


 Add Http support to Spark Thrift server
 ---

 Key: SPARK-4700
 URL: https://issues.apache.org/jira/browse/SPARK-4700
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Affects Versions: 1.2.1
 Environment: Linux and Windows
Reporter: Judy Nash
   Original Estimate: 48h
  Remaining Estimate: 48h

 Currently thrift only supports TCP connection. 
 The JIRA is to add HTTP support to spark thrift server in addition to the TCP 
 protocol. Both TCP and HTTP are supported by Hive today. HTTP is more secure 
 and used often in Windows. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-4700) Add Http support to Spark Thrift server

2014-12-10 Thread Judy Nash (JIRA)

 [ 
https://issues.apache.org/jira/browse/SPARK-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Judy Nash updated SPARK-4700:
-
Affects Version/s: 1.3.0

 Add Http support to Spark Thrift server
 ---

 Key: SPARK-4700
 URL: https://issues.apache.org/jira/browse/SPARK-4700
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Affects Versions: 1.3.0, 1.2.1
 Environment: Linux and Windows
Reporter: Judy Nash
   Original Estimate: 48h
  Remaining Estimate: 48h

 Currently thrift only supports TCP connection. 
 The JIRA is to add HTTP support to spark thrift server in addition to the TCP 
 protocol. Both TCP and HTTP are supported by Hive today. HTTP is more secure 
 and used often in Windows. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



RE: Spark-SQL JDBC driver

2014-12-10 Thread Judy Nash
Looks like you are wondering why you cannot see the RDD table you have created 
via thrift?

Based on my own experience with spark 1.1, RDD created directly via Spark SQL 
(i.e. Spark Shell or Spark-SQL.sh) is not visible on thrift, since thrift has 
its own session containing its own RDD.
Spark SQL experts on the forum can confirm on this though.

From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: Tuesday, December 9, 2014 6:42 AM
To: Anas Mosaad
Cc: Judy Nash; user@spark.apache.org
Subject: Re: Spark-SQL JDBC driver

According to the stacktrace, you were still using SQLContext rather than 
HiveContext. To interact with Hive, HiveContext *must* be used.

Please refer to this page 
http://spark.apache.org/docs/latest/sql-programming-guide.html#hive-tables
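For reference, a minimal Scala sketch of the difference (the table names follow the countries example quoted below; the assumption is that the data is registered with the same context):

// a plain SQLContext cannot resolve the CreateTableAsSelect that saveAsTable emits,
// which is what produces the "Unresolved plan found" error below
val hiveCtx = new org.apache.spark.sql.hive.HiveContext(sc)

// ... load and register the countries data through hiveCtx ...
val tmp = hiveCtx.sql("select * from countries")
tmp.saveAsTable("countries_persisted")  // resolves because HiveContext has Hive support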

On 12/9/14 6:26 PM, Anas Mosaad wrote:
Back to the first question, this will mandate that hive is up and running?

When I try it, I get the following exception. The documentation says that this 
method works only on a SchemaRDD. I thought that countries.saveAsTable did not 
work for that reason, so I created a tmp that contains the results from the 
registered temp table, which I could validate is a SchemaRDD as shown 
below.


@Judy, I really appreciate your kind support and I want to understand, and 
of course don't want to waste your time. If you can direct me to the documentation 
describing these details, that would be great.


scala> val tmp = sqlContext.sql("select * from countries")

tmp: org.apache.spark.sql.SchemaRDD =

SchemaRDD[12] at RDD at SchemaRDD.scala:108

== Query Plan ==

== Physical Plan ==

PhysicalRDD 
[COUNTRY_ID#20,COUNTRY_ISO_CODE#21,COUNTRY_NAME#22,COUNTRY_SUBREGION#23,COUNTRY_SUBREGION_ID#24,COUNTRY_REGION#25,COUNTRY_REGION_ID#26,COUNTRY_TOTAL#27,COUNTRY_TOTAL_ID#28,COUNTRY_NAME_HIST#29],
 MapPartitionsRDD[9] at mapPartitions at ExistingRDD.scala:36



scala> tmp.saveAsTable("Countries")

org.apache.spark.sql.catalyst.errors.package$TreeNodeException: Unresolved plan 
found, tree:

'CreateTableAsSelect None, Countries, false, None

 Project 
[COUNTRY_ID#20,COUNTRY_ISO_CODE#21,COUNTRY_NAME#22,COUNTRY_SUBREGION#23,COUNTRY_SUBREGION_ID#24,COUNTRY_REGION#25,COUNTRY_REGION_ID#26,COUNTRY_TOTAL#27,COUNTRY_TOTAL_ID#28,COUNTRY_NAME_HIST#29]

  Subquery countries

   LogicalRDD 
[COUNTRY_ID#20,COUNTRY_ISO_CODE#21,COUNTRY_NAME#22,COUNTRY_SUBREGION#23,COUNTRY_SUBREGION_ID#24,COUNTRY_REGION#25,COUNTRY_REGION_ID#26,COUNTRY_TOTAL#27,COUNTRY_TOTAL_ID#28,COUNTRY_NAME_HIST#29],
 MapPartitionsRDD[9] at mapPartitions at ExistingRDD.scala:36



at 
org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1.applyOrElse(Analyzer.scala:83)

at 
org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$$anonfun$1.applyOrElse(Analyzer.scala:78)

at 
org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:144)

at org.apache.spark.sql.catalyst.trees.TreeNode.transform(TreeNode.scala:135)

at 
org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:78)

at 
org.apache.spark.sql.catalyst.analysis.Analyzer$CheckResolution$.apply(Analyzer.scala:76)

at 
org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:61)

at 
org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1$$anonfun$apply$2.apply(RuleExecutor.scala:59)

at 
scala.collection.IndexedSeqOptimized$class.foldl(IndexedSeqOptimized.scala:51)

at 
scala.collection.IndexedSeqOptimized$class.foldLeft(IndexedSeqOptimized.scala:60)

at scala.collection.mutable.WrappedArray.foldLeft(WrappedArray.scala:34)

at 
org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:59)

at 
org.apache.spark.sql.catalyst.rules.RuleExecutor$$anonfun$apply$1.apply(RuleExecutor.scala:51)

at scala.collection.immutable.List.foreach(List.scala:318)

at org.apache.spark.sql.catalyst.rules.RuleExecutor.apply(RuleExecutor.scala:51)

at 
org.apache.spark.sql.SQLContext$QueryExecution.analyzed$lzycompute(SQLContext.scala:411)

at org.apache.spark.sql.SQLContext$QueryExecution.analyzed(SQLContext.scala:411)

at 
org.apache.spark.sql.SQLContext$QueryExecution.withCachedData$lzycompute(SQLContext.scala:412)

at 
org.apache.spark.sql.SQLContext$QueryExecution.withCachedData(SQLContext.scala:412)

at 
org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan$lzycompute(SQLContext.scala:413)

at 
org.apache.spark.sql.SQLContext$QueryExecution.optimizedPlan(SQLContext.scala:413)

at 
org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan$lzycompute(SQLContext.scala:418)

at 
org.apache.spark.sql.SQLContext$QueryExecution.sparkPlan(SQLContext.scala:416)

at 
org.apache.spark.sql.SQLContext$QueryExecution.executedPlan$lzycompute(SQLContext.scala:422)

at 
org.apache.spark.sql.SQLContext$QueryExecution.executedPlan(SQLContext.scala:422)

at 
org.apache.spark.sql.SQLContext$QueryExecution.toRdd$lzycompute(SQLContext.scala:425)

at org.apache.spark.sql.SQLContext

RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava

2014-12-09 Thread Judy Nash
To report back how I ultimately solved this issue, so someone else can do the same:

1) Check each jar's classpath position and make sure the jars are listed in order of 
Guava class version (i.e. spark-assembly needs to be listed before Hadoop 2.4 
because spark-assembly has guava 14 and Hadoop 2.4 has guava 11). This may require 
updating compute-classpath.sh to get the ordering right. 

2) If the other jars use a higher version, bump the spark guava library to the higher 
version. Guava is supposed to be very backward compatible.  

Hope this helps. 
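As a concrete illustration of point 1 (paths and jar names are hypothetical; only the ordering matters):

# sketch: make sure the spark assembly (shaded Guava 14) precedes the custom
# Hadoop jars (Guava 11) when the classpath is assembled
CLASSPATH="$SPARK_HOME/lib/spark-assembly-*.jar"
CLASSPATH="$CLASSPATH:$HADOOP_HOME/share/hadoop/common/*"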

-Original Message-
From: Marcelo Vanzin [mailto:van...@cloudera.com] 
Sent: Tuesday, December 2, 2014 11:35 AM
To: Judy Nash
Cc: Patrick Wendell; Denny Lee; Cheng Lian; u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on 
Guava

On Tue, Dec 2, 2014 at 11:22 AM, Judy Nash judyn...@exchange.microsoft.com 
wrote:
 Any suggestion on how a user with a custom Hadoop jar can solve this issue?

You'll need to include all the dependencies for that custom Hadoop jar to the 
classpath. Those will include Guava (which is not included in its original form 
as part of the Spark dependencies).



 -Original Message-
 From: Patrick Wendell [mailto:pwend...@gmail.com]
 Sent: Sunday, November 30, 2014 11:06 PM
 To: Judy Nash
 Cc: Denny Lee; Cheng Lian; u...@spark.incubator.apache.org
 Subject: Re: latest Spark 1.2 thrift server fail with 
 NoClassDefFoundError on Guava

 Thanks Judy. While this is not directly caused by a Spark issue, it is likely 
 other users will run into this. This is an unfortunate consequence of the way 
 that we've shaded Guava in this release, we rely on byte code shading of 
 Hadoop itself as well. And if the user has their own Hadoop classes present 
 it can cause issues.

 On Sun, Nov 30, 2014 at 10:53 PM, Judy Nash judyn...@exchange.microsoft.com 
 wrote:
 Thanks Patrick and Cheng for the suggestions.

 The issue was Hadoop common jar was added to a classpath. After I removed 
 Hadoop common jar from both master and slave, I was able to bypass the error.
 This was caused by a local change, so no impact on the 1.2 release.
 -Original Message-
 From: Patrick Wendell [mailto:pwend...@gmail.com]
 Sent: Wednesday, November 26, 2014 8:17 AM
 To: Judy Nash
 Cc: Denny Lee; Cheng Lian; u...@spark.incubator.apache.org
 Subject: Re: latest Spark 1.2 thrift server fail with 
 NoClassDefFoundError on Guava

 Just to double check - I looked at our own assembly jar and I confirmed that 
 our Hadoop configuration class does use the correctly shaded version of 
 Guava. My best guess here is that somehow a separate Hadoop library is 
 ending up on the classpath, possible because Spark put it there somehow.

 tar xvzf spark-assembly-1.3.0-SNAPSHOT-hadoop2.4.0.jar
 cd org/apache/hadoop/
 javap -v Configuration | grep Precond

 Warning: Binary file Configuration contains 
 org.apache.hadoop.conf.Configuration

#497 = Utf8   
 org/spark-project/guava/common/base/Preconditions

#498 = Class  #497 //
 org/spark-project/guava/common/base/Preconditions

#502 = Methodref  #498.#501//
 org/spark-project/guava/common/base/Preconditions.checkArgument:(ZL
 j
 ava/lang/Object;)V

 12: invokestatic  #502// Method
 org/spark-project/guava/common/base/Preconitions.checkArgument:(ZLj
 a
 va/lang/Object;)V

 50: invokestatic  #502// Method
 org/spark-project/guava/common/base/Preconitions.checkArgument:(ZLj
 a
 va/lang/Object;)V

 On Wed, Nov 26, 2014 at 11:08 AM, Patrick Wendell pwend...@gmail.com wrote:
 Hi Judy,

 Are you somehow modifying Spark's classpath to include jars from 
 Hadoop and Hive that you have running on the machine? The issue 
 seems to be that you are somehow including a version of Hadoop that 
 references the original guava package. The Hadoop that is bundled in 
 the Spark jars should not do this.

 - Patrick

 On Wed, Nov 26, 2014 at 1:45 AM, Judy Nash 
 judyn...@exchange.microsoft.com wrote:
 Looks like a config issue. I ran spark-pi job and still failing 
 with the same guava error

 Command ran:

 .\bin\spark-class.cmd org.apache.spark.deploy.SparkSubmit --class 
 org.apache.spark.examples.SparkPi --master 
 spark://headnodehost:7077 --executor-memory 1G --num-executors 1 
 .\lib\spark-examples-1.2.1-SNAPSHOT-hadoop2.4.0.jar 100



 Had used the same build steps on spark 1.1 and had no issue.



 From: Denny Lee [mailto:denny.g@gmail.com]
 Sent: Tuesday, November 25, 2014 5:47 PM
 To: Judy Nash; Cheng Lian; u...@spark.incubator.apache.org


 Subject: Re: latest Spark 1.2 thrift server fail with 
 NoClassDefFoundError on Guava



 To determine if this is a Windows vs. other configuration, can you 
 just try to call the Spark-class.cmd SparkSubmit without actually 
 referencing the Hadoop or Thrift server classes?





 On Tue Nov 25 2014 at 5:42:09 PM Judy Nash 
 judyn...@exchange.microsoft.com
 wrote:

 I

RE: Spark-SQL JDBC driver

2014-12-08 Thread Judy Nash
You can use thrift server for this purpose then test it with beeline.

See doc:
https://spark.apache.org/docs/latest/sql-programming-guide.html#running-the-thrift-jdbc-server
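For a quick start (host and port are illustrative; these mirror the commands on that page):

./sbin/start-thriftserver.sh --master spark://headnodehost:7077
./bin/beeline -u jdbc:hive2://localhost:10000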


From: Anas Mosaad [mailto:anas.mos...@incorta.com]
Sent: Monday, December 8, 2014 11:01 AM
To: user@spark.apache.org
Subject: Spark-SQL JDBC driver

Hello Everyone,

I'm brand new to spark and was wondering if there's a JDBC driver to access 
spark-SQL directly. I'm running spark in standalone mode and don't have hadoop 
in this environment.

--

Best Regards/أطيب المنى,

Anas Mosaad



RE: build in IntelliJ IDEA

2014-12-07 Thread Judy Nash
Thanks Josh. That was the issue.

From: Josh Rosen [mailto:rosenvi...@gmail.com]
Sent: Friday, December 5, 2014 3:21 PM
To: Judy Nash; dev@spark.apache.org
Subject: Re: build in IntelliJ IDEA

If you go to “File - Project Structure” and click on “Project” under the 
“Project settings” heading, do you see an entry for “Project SDK?”  If not, you 
should click “New…” and configure a JDK; by default, I think IntelliJ should 
figure out a correct path to your system JDK, so you should just be able to hit 
“Ok” then rebuild your project.   For reference, here’s a screenshot showing 
what my version of that window looks like: http://i.imgur.com/hRfQjIi.png


On December 5, 2014 at 1:52:35 PM, Judy Nash 
(judyn...@exchange.microsoft.commailto:judyn...@exchange.microsoft.com) wrote:
Hi everyone,

Have a newbie question on using IntelliJ to build and debug.

I followed this wiki to setup IntelliJ:
https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools#UsefulDeveloperTools-BuildingSparkinIntelliJIDEA

Afterward I tried to build via Toolbar (Build > Rebuild Project).
The action fails with the error message:
Cannot start compiler: the SDK is not specified.

What SDK do I need to specify to get the build working?

Thanks,
Judy


monitoring for spark standalone

2014-12-07 Thread Judy Nash
Hello,

Are there ways we can programmatically get health status of master & slave 
nodes, similar to Hadoop Ambari?

Wiki seems to suggest there are only web UI or instrumentations 
(http://spark.apache.org/docs/latest/monitoring.html).

Thanks,
Judy



build in IntelliJ IDEA

2014-12-05 Thread Judy Nash
Hi everyone,

Have a newbie question on using IntelliJ to build and debug.

I followed this wiki to setup IntelliJ:
https://cwiki.apache.org/confluence/display/SPARK/Useful+Developer+Tools#UsefulDeveloperTools-BuildingSparkinIntelliJIDEA

Afterward I tried to build via Toolbar (Build > Rebuild Project).
The action fails with the error message:
Cannot start compiler: the SDK is not specified.

What SDK do I need to specify to get the build working?

Thanks,
Judy


[jira] [Created] (SPARK-4700) Add Http support to Spark Thrift server

2014-12-02 Thread Judy Nash (JIRA)
Judy Nash created SPARK-4700:


 Summary: Add Http support to Spark Thrift server
 Key: SPARK-4700
 URL: https://issues.apache.org/jira/browse/SPARK-4700
 Project: Spark
  Issue Type: New Feature
  Components: SQL
Affects Versions: 1.2.1
 Environment: Linux and Windows
Reporter: Judy Nash


Currently thrift only supports TCP connection. 

The ask is to add HTTP connection as well. Both TCP and HTTP are supported by 
Hive today. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava

2014-12-02 Thread Judy Nash
Any suggestion on how a user with a custom Hadoop jar can solve this issue? 

-Original Message-
From: Patrick Wendell [mailto:pwend...@gmail.com] 
Sent: Sunday, November 30, 2014 11:06 PM
To: Judy Nash
Cc: Denny Lee; Cheng Lian; u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on 
Guava

Thanks Judy. While this is not directly caused by a Spark issue, it is likely 
other users will run into this. This is an unfortunate consequence of the way 
that we've shaded Guava in this release, we rely on byte code shading of Hadoop 
itself as well. And if the user has their own Hadoop classes present it can 
cause issues.

On Sun, Nov 30, 2014 at 10:53 PM, Judy Nash judyn...@exchange.microsoft.com 
wrote:
 Thanks Patrick and Cheng for the suggestions.

 The issue was Hadoop common jar was added to a classpath. After I removed 
 Hadoop common jar from both master and slave, I was able to bypass the error.
 This was caused by a local change, so no impact on the 1.2 release.
 -Original Message-
 From: Patrick Wendell [mailto:pwend...@gmail.com]
 Sent: Wednesday, November 26, 2014 8:17 AM
 To: Judy Nash
 Cc: Denny Lee; Cheng Lian; u...@spark.incubator.apache.org
 Subject: Re: latest Spark 1.2 thrift server fail with 
 NoClassDefFoundError on Guava

 Just to double check - I looked at our own assembly jar and I confirmed that 
 our Hadoop configuration class does use the correctly shaded version of 
 Guava. My best guess here is that somehow a separate Hadoop library is ending 
 up on the classpath, possible because Spark put it there somehow.

 tar xvzf spark-assembly-1.3.0-SNAPSHOT-hadoop2.4.0.jar
 cd org/apache/hadoop/
 javap -v Configuration | grep Precond

 Warning: Binary file Configuration contains 
 org.apache.hadoop.conf.Configuration

#497 = Utf8   org/spark-project/guava/common/base/Preconditions

#498 = Class  #497 //
 org/spark-project/guava/common/base/Preconditions

#502 = Methodref  #498.#501//
 org/spark-project/guava/common/base/Preconditions.checkArgument:(ZLj
 ava/lang/Object;)V

 12: invokestatic  #502// Method
 org/spark-project/guava/common/base/Preconitions.checkArgument:(ZLja
 va/lang/Object;)V

 50: invokestatic  #502// Method
 org/spark-project/guava/common/base/Preconitions.checkArgument:(ZLja
 va/lang/Object;)V

 On Wed, Nov 26, 2014 at 11:08 AM, Patrick Wendell pwend...@gmail.com wrote:
 Hi Judy,

 Are you somehow modifying Spark's classpath to include jars from 
 Hadoop and Hive that you have running on the machine? The issue seems 
 to be that you are somehow including a version of Hadoop that 
 references the original guava package. The Hadoop that is bundled in 
 the Spark jars should not do this.

 - Patrick

 On Wed, Nov 26, 2014 at 1:45 AM, Judy Nash 
 judyn...@exchange.microsoft.com wrote:
 Looks like a config issue. I ran spark-pi job and still failing with 
 the same guava error

 Command ran:

 .\bin\spark-class.cmd org.apache.spark.deploy.SparkSubmit --class 
 org.apache.spark.examples.SparkPi --master spark://headnodehost:7077 
 --executor-memory 1G --num-executors 1 
 .\lib\spark-examples-1.2.1-SNAPSHOT-hadoop2.4.0.jar 100



 Had used the same build steps on spark 1.1 and had no issue.



 From: Denny Lee [mailto:denny.g@gmail.com]
 Sent: Tuesday, November 25, 2014 5:47 PM
 To: Judy Nash; Cheng Lian; u...@spark.incubator.apache.org


 Subject: Re: latest Spark 1.2 thrift server fail with 
 NoClassDefFoundError on Guava



 To determine if this is a Windows vs. other configuration, can you 
 just try to call the Spark-class.cmd SparkSubmit without actually 
 referencing the Hadoop or Thrift server classes?





 On Tue Nov 25 2014 at 5:42:09 PM Judy Nash 
 judyn...@exchange.microsoft.com
 wrote:

 I traced the code and used the following to call:

 Spark-class.cmd org.apache.spark.deploy.SparkSubmit --class
 org.apache.spark.sql.hive.thriftserver.HiveThriftServer2
 spark-internal --hiveconf hive.server2.thrift.port=1



 The issue ended up to be much more fundamental however. Spark 
 doesn't work at all in configuration below. When open spark-shell, 
 it fails with the same ClassNotFound error.

 Now I wonder if this is a windows-only issue or the hive/Hadoop 
 configuration that is having this problem.



 From: Cheng Lian [mailto:lian.cs@gmail.com]
 Sent: Tuesday, November 25, 2014 1:50 AM


 To: Judy Nash; u...@spark.incubator.apache.org
 Subject: Re: latest Spark 1.2 thrift server fail with 
 NoClassDefFoundError on Guava



 Oh so you're using Windows. What command are you using to start the 
 Thrift server then?

 On 11/25/14 4:25 PM, Judy Nash wrote:

 Made progress but still blocked.

 After recompiling the code on cmd instead of PowerShell, now I can 
 see all 5 classes as you mentioned.

 However I am still seeing the same error as before. Anything else I 
 can check for?



 From

RE: Unable to compile spark 1.1.0 on windows 8.1

2014-12-01 Thread Judy Nash
Have you checked out the wiki here? 
http://spark.apache.org/docs/latest/building-with-maven.html

A couple of things I did differently from you:
1) I got the bits directly from github (https://github.com/apache/spark/). Use 
branch 1.1 for spark 1.1.
2) Execute the maven command in cmd (powershell misses libraries sometimes). 
3) Increase maven memory as suggested by the building-with-maven wiki.

Hope this helps. 

-Original Message-
From: Ishwardeep Singh [mailto:ishwardeep.si...@impetus.co.in] 
Sent: Monday, December 1, 2014 1:50 AM
To: u...@spark.incubator.apache.org
Subject: RE: Unable to compile spark 1.1.0 on windows 8.1

Hi Judy,

Thank you for your response.

When I try to compile using maven mvn -Dhadoop.version=1.2.1 -DskipTests clean 
package I get an error Error: Could not find or load main class . 
I have maven 3.0.4.

And when I run command sbt package I get the same exception as earlier.

I have done the following steps:

1. Download spark-1.1.0.tgz from the spark site and unzip the compressed zip to 
a folder d:\myworkplace\software\spark-1.1.0
2. Then I downloaded sbt-0.13.7.zip and extract it to folder 
d:\myworkplace\software\sbt
3. Update the PATH environment variable to include 
d:\myworkplace\software\sbt\bin in the PATH.
4. Navigate to spark folder d:\myworkplace\software\spark-1.1.0
5. Run the command sbt assembly
6. As a side effect of this command a number of libraries are downloaded and I 
get an initial error that path 
C:\Users\ishwardeep.singh\.sbt\0.13\staging\ec3aa8f39111944cc5f2\sbt-pom-reader
does not exist. 
7. I manually create this subfolder ec3aa8f39111944cc5f2\sbt-pom-reader
and retry to get the next error as described in my initial error.

Is this the correct procedure to compile spark 1.1.0? Please let me know.

Hoping to hear from you soon.

Regards,
ishwardeep



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-compile-spark-1-1-0-on-windows-8-1-tp19996p20075.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org For additional 
commands, e-mail: user-h...@spark.apache.org


-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



RE: Unable to compile spark 1.1.0 on windows 8.1

2014-11-30 Thread Judy Nash
I have found the following to work for me on win 8.1:
1) run sbt assembly
2) Use Maven. You can find the maven commands for your build at : 
docs\building-spark.md


-Original Message-
From: Ishwardeep Singh [mailto:ishwardeep.si...@impetus.co.in] 
Sent: Thursday, November 27, 2014 11:31 PM
To: u...@spark.incubator.apache.org
Subject: Unable to compile spark 1.1.0 on windows 8.1

Hi,

I am trying to compile spark 1.1.0 on windows 8.1 but I get the following 
exception. 

[info] Compiling 3 Scala sources to
D:\myworkplace\software\spark-1.1.0\project\target\scala-2.10\sbt0.13\classes...
[error] D:\myworkplace\software\spark-1.1.0\project\SparkBuild.scala:26:
object sbt is not a member of package com.typesafe [error] import 
com.typesafe.sbt.pom.{PomBuild, SbtPomKeys}
[error] ^
[error] D:\myworkplace\software\spark-1.1.0\project\SparkBuild.scala:53: not
found: type PomBuild
[error] object SparkBuild extends PomBuild {
[error]   ^
[error] D:\myworkplace\software\spark-1.1.0\project\SparkBuild.scala:121:
not found: value SbtPomKeys
[error] otherResolvers = SbtPomKeys.mvnLocalRepository(dotM2 =
Seq(Resolver.file(dotM2, dotM2))),
[error]^
[error] D:\myworkplace\software\spark-1.1.0\project\SparkBuild.scala:165:
value projectDefinitions is not a member of AnyRef
[error] super.projectDefinitions(baseDirectory).map { x =
[error]   ^
[error] four errors found
[error] (plugins/compile:compile) Compilation failed

I have also set up Scala 2.10.

Need help to resolve this issue.

Regards,
Ishwardeep 



--
View this message in context: 
http://apache-spark-user-list.1001560.n3.nabble.com/Unable-to-compile-spark-1-1-0-on-windows-8-1-tp19996.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava

2014-11-30 Thread Judy Nash
Thanks Patrick and Cheng for the suggestions.

The issue was that the Hadoop common jar had been added to the classpath. After I removed the Hadoop common jar from both the master and the slave, I was able to get past the error. This was caused by a local change, so there is no impact on the 1.2 release.
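
In case it helps anyone hitting the same NoClassDefFoundError, a quick way to spot a stray Hadoop jar like this on a Windows install is something along these lines (file names and locations are only illustrative, not the exact ones from my cluster):

rem look for an un-shaded hadoop-common jar lying around or referenced from the env scripts
dir /s /b lib\hadoop-common*.jar
findstr /i "SPARK_CLASSPATH hadoop" conf\spark-env.cmd
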
-Original Message-
From: Patrick Wendell [mailto:pwend...@gmail.com] 
Sent: Wednesday, November 26, 2014 8:17 AM
To: Judy Nash
Cc: Denny Lee; Cheng Lian; u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on 
Guava

Just to double check - I looked at our own assembly jar and I confirmed that 
our Hadoop configuration class does use the correctly shaded version of Guava. 
My best guess here is that somehow a separate Hadoop library is ending up on
the classpath, possibly because Spark put it there somehow.

 tar xvzf spark-assembly-1.3.0-SNAPSHOT-hadoop2.4.0.jar
 cd org/apache/hadoop/
 javap -v Configuration | grep Precond

Warning: Binary file Configuration contains org.apache.hadoop.conf.Configuration

   #497 = Utf8   org/spark-project/guava/common/base/Preconditions

   #498 = Class  #497 //
org/spark-project/guava/common/base/Preconditions

   #502 = Methodref  #498.#501//
org/spark-project/guava/common/base/Preconditions.checkArgument:(ZLjava/lang/Object;)V

12: invokestatic  #502// Method
org/spark-project/guava/common/base/Preconditions.checkArgument:(ZLjava/lang/Object;)V

50: invokestatic  #502// Method
org/spark-project/guava/common/base/Preconditions.checkArgument:(ZLjava/lang/Object;)V

On Wed, Nov 26, 2014 at 11:08 AM, Patrick Wendell pwend...@gmail.com wrote:
 Hi Judy,

 Are you somehow modifying Spark's classpath to include jars from 
 Hadoop and Hive that you have running on the machine? The issue seems 
 to be that you are somehow including a version of Hadoop that 
 references the original guava package. The Hadoop that is bundled in 
 the Spark jars should not do this.

 - Patrick

 On Wed, Nov 26, 2014 at 1:45 AM, Judy Nash 
 judyn...@exchange.microsoft.com wrote:
 Looks like a config issue. I ran spark-pi job and still failing with 
 the same guava error

 Command ran:

 .\bin\spark-class.cmd org.apache.spark.deploy.SparkSubmit --class 
 org.apache.spark.examples.SparkPi --master spark://headnodehost:7077 
 --executor-memory 1G --num-executors 1 
 .\lib\spark-examples-1.2.1-SNAPSHOT-hadoop2.4.0.jar 100



 Had used the same build steps on spark 1.1 and had no issue.



 From: Denny Lee [mailto:denny.g@gmail.com]
 Sent: Tuesday, November 25, 2014 5:47 PM
 To: Judy Nash; Cheng Lian; u...@spark.incubator.apache.org


 Subject: Re: latest Spark 1.2 thrift server fail with 
 NoClassDefFoundError on Guava



 To determine if this is a Windows vs. other configuration, can you 
 just try to call the Spark-class.cmd SparkSubmit without actually 
 referencing the Hadoop or Thrift server classes?





 On Tue Nov 25 2014 at 5:42:09 PM Judy Nash 
 judyn...@exchange.microsoft.com
 wrote:

 I traced the code and used the following to call:

 Spark-class.cmd org.apache.spark.deploy.SparkSubmit --class
 org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 
 spark-internal --hiveconf hive.server2.thrift.port=1



 The issue ended up to be much more fundamental however. Spark doesn't 
 work at all in configuration below. When open spark-shell, it fails 
 with the same ClassNotFound error.

 Now I wonder if this is a windows-only issue or the hive/Hadoop 
 configuration that is having this problem.



 From: Cheng Lian [mailto:lian.cs@gmail.com]
 Sent: Tuesday, November 25, 2014 1:50 AM


 To: Judy Nash; u...@spark.incubator.apache.org
 Subject: Re: latest Spark 1.2 thrift server fail with 
 NoClassDefFoundError on Guava



 Oh so you're using Windows. What command are you using to start the 
 Thrift server then?

 On 11/25/14 4:25 PM, Judy Nash wrote:

 Made progress but still blocked.

 After recompiling the code on cmd instead of PowerShell, now I can 
 see all 5 classes as you mentioned.

 However I am still seeing the same error as before. Anything else I 
 can check for?



 From: Judy Nash [mailto:judyn...@exchange.microsoft.com]
 Sent: Monday, November 24, 2014 11:50 PM
 To: Cheng Lian; u...@spark.incubator.apache.org
 Subject: RE: latest Spark 1.2 thrift server fail with 
 NoClassDefFoundError on Guava



 This is what I got from jar tf:

 org/spark-project/guava/common/base/Preconditions.class

 org/spark-project/guava/common/math/MathPreconditions.class

 com/clearspring/analytics/util/Preconditions.class

 parquet/Preconditions.class



 I seem to have the line that reported missing, but I am missing this file:

 com/google/inject/internal/util/$Preconditions.class



 Any suggestion on how to fix this?

 Very much appreciate the help as I am very new to Spark and open 
 source technologies.



 From: Cheng Lian [mailto:lian.cs@gmail.com]
 Sent: Monday

RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava

2014-11-25 Thread Judy Nash
Made progress but still blocked.
After recompiling the code in cmd instead of PowerShell, I can now see all 5 classes you mentioned.

However I am still seeing the same error as before. Anything else I can check 
for?

From: Judy Nash [mailto:judyn...@exchange.microsoft.com]
Sent: Monday, November 24, 2014 11:50 PM
To: Cheng Lian; u...@spark.incubator.apache.org
Subject: RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on 
Guava

This is what I got from jar tf:
org/spark-project/guava/common/base/Preconditions.class
org/spark-project/guava/common/math/MathPreconditions.class
com/clearspring/analytics/util/Preconditions.class
parquet/Preconditions.class

I seem to have the line that reported missing, but I am missing this file:

com/google/inject/internal/util/$Preconditions.class

Any suggestion on how to fix this?
Very much appreciate the help as I am very new to Spark and open source 
technologies.

From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: Monday, November 24, 2014 8:24 PM
To: Judy Nash; 
u...@spark.incubator.apache.orgmailto:u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on 
Guava


Hm, I tried exactly the same commit and the build command locally, but couldn’t 
reproduce this.

Usually this kind of errors are caused by classpath misconfiguration. Could you 
please try this to ensure corresponding Guava classes are included in the 
assembly jar you built?

jar tf assembly/target/scala-2.10/spark-assembly-1.2.1-SNAPSHOT-hadoop2.4.0.jar 
| grep Preconditions

On my machine I got these lines (the first line is the one reported as missing 
in your case):

org/spark-project/guava/common/base/Preconditions.class

org/spark-project/guava/common/math/MathPreconditions.class

com/clearspring/analytics/util/Preconditions.class

parquet/Preconditions.class

com/google/inject/internal/util/$Preconditions.class

On 11/25/14 6:25 AM, Judy Nash wrote:
Thank you Cheng for responding.

Here is the commit SHA1 on the 1.2 branch I saw this failure in:
commit 6f70e0295572e3037660004797040e026e440dbd
Author: zsxwing zsxw...@gmail.commailto:zsxw...@gmail.com
Date:   Fri Nov 21 00:42:43 2014 -0800

[SPARK-4472][Shell] Print Spark context available as sc. only when 
SparkContext is created...

... successfully

It's weird that printing Spark context available as sc when creating 
SparkContext unsuccessfully.

Let me know if you need anything else.

From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: Friday, November 21, 2014 8:02 PM
To: Judy Nash; 
u...@spark.incubator.apache.orgmailto:u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on 
Guava

Hi Judy, could you please provide the commit SHA1 of the version you're using? 
Thanks!
On 11/22/14 11:05 AM, Judy Nash wrote:
Hi,

Thrift server is failing to start for me on latest spark 1.2 branch.

I got the error below when I start thrift server.
Exception in thread main java.lang.NoClassDefFoundError: com/google/common/bas
e/Preconditions
at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur
ation.java:314)….

Here is my setup:

1)  Latest spark 1.2 branch build

2)  Used build command:

mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -Phive-thriftserver 
-DskipTests clean package

3)  Added hive-site.xml to \conf

4)  Version on the box: Hive 0.13, Hadoop 2.4

Is this a real bug or am I doing something wrong?

---
Full Stacktrace:
Exception in thread main java.lang.NoClassDefFoundError: com/google/common/bas
e/Preconditions
at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur
ation.java:314)
at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur
ation.java:327)
at org.apache.hadoop.conf.Configuration.clinit(Configuration.java:409)

at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopU
til.scala:82)
at org.apache.spark.deploy.SparkHadoopUtil.init(SparkHadoopUtil.scala:
42)
at org.apache.spark.deploy.SparkHadoopUtil$.init(SparkHadoopUtil.scala
:202)
at org.apache.spark.deploy.SparkHadoopUtil$.clinit(SparkHadoopUtil.sca
la)
at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:1784)
at org.apache.spark.storage.BlockManager.init(BlockManager.scala:105)
at org.apache.spark.storage.BlockManager.init(BlockManager.scala:180)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:292)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:159)
at org.apache.spark.SparkContext.init(SparkContext.scala:230)
at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.
scala:38)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveTh
riftServer2.scala:56)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThr

RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava

2014-11-25 Thread Judy Nash
I traced the code and used the following to call:
Spark-class.cmd org.apache.spark.deploy.SparkSubmit --class 
org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 spark-internal 
--hiveconf hive.server2.thrift.port=1

The issue ended up being much more fundamental, however. Spark doesn’t work at all in the configuration below. When I open spark-shell, it fails with the same ClassNotFound error.
Now I wonder if this is a Windows-only issue or whether it is the Hive/Hadoop configuration that is causing the problem.

From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: Tuesday, November 25, 2014 1:50 AM
To: Judy Nash; u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on 
Guava

Oh so you're using Windows. What command are you using to start the Thrift 
server then?
On 11/25/14 4:25 PM, Judy Nash wrote:
Made progress but still blocked.
After recompiling the code on cmd instead of PowerShell, now I can see all 5 
classes as you mentioned.


However I am still seeing the same error as before. Anything else I can check 
for?

From: Judy Nash [mailto:judyn...@exchange.microsoft.com]
Sent: Monday, November 24, 2014 11:50 PM
To: Cheng Lian; 
u...@spark.incubator.apache.orgmailto:u...@spark.incubator.apache.org
Subject: RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on 
Guava

This is what I got from jar tf:
org/spark-project/guava/common/base/Preconditions.class
org/spark-project/guava/common/math/MathPreconditions.class
com/clearspring/analytics/util/Preconditions.class
parquet/Preconditions.class

I seem to have the line that reported missing, but I am missing this file:

com/google/inject/internal/util/$Preconditions.class

Any suggestion on how to fix this?
Very much appreciate the help as I am very new to Spark and open source 
technologies.

From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: Monday, November 24, 2014 8:24 PM
To: Judy Nash; 
u...@spark.incubator.apache.orgmailto:u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on 
Guava


Hm, I tried exactly the same commit and the build command locally, but couldn’t 
reproduce this.

Usually this kind of errors are caused by classpath misconfiguration. Could you 
please try this to ensure corresponding Guava classes are included in the 
assembly jar you built?

jar tf assembly/target/scala-2.10/spark-assembly-1.2.1-SNAPSHOT-hadoop2.4.0.jar 
| grep Preconditions

On my machine I got these lines (the first line is the one reported as missing 
in your case):

org/spark-project/guava/common/base/Preconditions.class

org/spark-project/guava/common/math/MathPreconditions.class

com/clearspring/analytics/util/Preconditions.class

parquet/Preconditions.class

com/google/inject/internal/util/$Preconditions.class

On 11/25/14 6:25 AM, Judy Nash wrote:
Thank you Cheng for responding.

Here is the commit SHA1 on the 1.2 branch I saw this failure in:
commit 6f70e0295572e3037660004797040e026e440dbd
Author: zsxwing zsxw...@gmail.commailto:zsxw...@gmail.com
Date:   Fri Nov 21 00:42:43 2014 -0800

[SPARK-4472][Shell] Print Spark context available as sc. only when 
SparkContext is created...

... successfully

It's weird that printing Spark context available as sc when creating 
SparkContext unsuccessfully.

Let me know if you need anything else.

From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: Friday, November 21, 2014 8:02 PM
To: Judy Nash; 
u...@spark.incubator.apache.orgmailto:u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on 
Guava

Hi Judy, could you please provide the commit SHA1 of the version you're using? 
Thanks!
On 11/22/14 11:05 AM, Judy Nash wrote:
Hi,

Thrift server is failing to start for me on latest spark 1.2 branch.

I got the error below when I start thrift server.
Exception in thread main java.lang.NoClassDefFoundError: com/google/common/bas
e/Preconditions
at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur
ation.java:314)….

Here is my setup:

1)  Latest spark 1.2 branch build

2)  Used build command:

mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -Phive-thriftserver 
-DskipTests clean package

3)  Added hive-site.xml to \conf

4)  Version on the box: Hive 0.13, Hadoop 2.4

Is this a real bug or am I doing something wrong?

---
Full Stacktrace:
Exception in thread main java.lang.NoClassDefFoundError: com/google/common/bas
e/Preconditions
at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur
ation.java:314)
at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur
ation.java:327)
at org.apache.hadoop.conf.Configuration.clinit(Configuration.java:409)

at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopU
til.scala:82)
at org.apache.spark.deploy.SparkHadoopUtil.init

RE: beeline via spark thrift doesn't retain cache

2014-11-25 Thread Judy Nash
Thanks Yanbo.
My issue was 1). I had the Spark Thrift server set up, but it was running against Hive instead of Spark SQL due to a local change.

After I fixed this, beeline automatically caches re-run queries and accepts cache table.
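
For anyone else who hits this: once beeline is really talking to the Spark SQL HiveServer2 (rather than a plain Hive one), the flow looks roughly like the sketch below. The port and table name are only illustrative (10000 is the usual default, hivesampletable is the table from my earlier error), so adjust them to your setup.

rem start the Spark SQL Thrift server (same command as elsewhere in this thread; the port is illustrative)
bin\spark-class.cmd org.apache.spark.deploy.SparkSubmit --class org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 spark-internal --hiveconf hive.server2.thrift.port=10000

rem connect with beeline and use Spark SQL's cache support
beeline -u jdbc:hive2://localhost:10000
0: jdbc:hive2://localhost:10000> cache table hivesampletable;
0: jdbc:hive2://localhost:10000> select count(*) from hivesampletable;  -- re-runs now hit the cache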

From: Yanbo Liang [mailto:yanboha...@gmail.com]
Sent: Friday, November 21, 2014 12:42 AM
To: Judy Nash
Cc: u...@spark.incubator.apache.org
Subject: Re: beeline via spark thrift doesn't retain cache

1) Make sure your beeline client is connected to the HiveServer2 of Spark SQL.
You can find the execution logs of HiveServer2 in the environment where start-thriftserver.sh was run.
2) What is the scale of your data? If you cache only a small amount of data, it can take more time to schedule the workload between the different executors.
Also look at the configuration of the Spark execution environment: check whether there is enough memory for RDD storage; if not, it will take some time to serialize/deserialize data between memory and disk.

2014-11-21 11:06 GMT+08:00 Judy Nash 
judyn...@exchange.microsoft.commailto:judyn...@exchange.microsoft.com:
Hi friends,

I have successfully setup thrift server and execute beeline on top.

Beeline can handle select queries just fine, but it cannot seem to do any kind 
of caching/RDD operations.

i.e.

1)  Command “cache table” doesn’t work. See error:

Error: Error while processing statement: FAILED: ParseException line 1:0 cannot

recognize input near 'cache' 'table' 'hivesampletable' (state=42000,code=4)



2)  Re-run SQL commands do not have any performance improvements.

By comparison, Spark-SQL shell can execute “cache table” command and rerunning 
SQL command has a huge performance boost.

Am I missing something or this is expected when execute through Spark thrift 
server?

Thanks!
Judy





RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava

2014-11-25 Thread Judy Nash
Looks like a config issue. I ran the SparkPi job and it is still failing with the same Guava error.
Command ran:
.\bin\spark-class.cmd org.apache.spark.deploy.SparkSubmit --class 
org.apache.spark.examples.SparkPi --master spark://headnodehost:7077 
--executor-memory 1G --num-executors 1 
.\lib\spark-examples-1.2.1-SNAPSHOT-hadoop2.4.0.jar 100

I had used the same build steps on Spark 1.1 and had no issue.

From: Denny Lee [mailto:denny.g@gmail.com]
Sent: Tuesday, November 25, 2014 5:47 PM
To: Judy Nash; Cheng Lian; u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on 
Guava

To determine if this is a Windows vs. other configuration, can you just try to 
call the Spark-class.cmd SparkSubmit without actually referencing the Hadoop or 
Thrift server classes?


On Tue Nov 25 2014 at 5:42:09 PM Judy Nash 
judyn...@exchange.microsoft.commailto:judyn...@exchange.microsoft.com wrote:
I traced the code and used the following to call:
Spark-class.cmd org.apache.spark.deploy.SparkSubmit --class 
org.apache.spark.sql.hive.thriftserver.HiveThriftServer2 spark-internal 
--hiveconf hive.server2.thrift.port=1

The issue ended up to be much more fundamental however. Spark doesn’t work at 
all in configuration below. When open spark-shell, it fails with the same 
ClassNotFound error.
Now I wonder if this is a windows-only issue or the hive/Hadoop configuration 
that is having this problem.

From: Cheng Lian [mailto:lian.cs@gmail.commailto:lian.cs@gmail.com]
Sent: Tuesday, November 25, 2014 1:50 AM

To: Judy Nash; 
u...@spark.incubator.apache.orgmailto:u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on 
Guava

Oh so you're using Windows. What command are you using to start the Thrift 
server then?
On 11/25/14 4:25 PM, Judy Nash wrote:
Made progress but still blocked.
After recompiling the code on cmd instead of PowerShell, now I can see all 5 
classes as you mentioned.

However I am still seeing the same error as before. Anything else I can check 
for?

From: Judy Nash [mailto:judyn...@exchange.microsoft.com]
Sent: Monday, November 24, 2014 11:50 PM
To: Cheng Lian; 
u...@spark.incubator.apache.orgmailto:u...@spark.incubator.apache.org
Subject: RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on 
Guava

This is what I got from jar tf:
org/spark-project/guava/common/base/Preconditions.class
org/spark-project/guava/common/math/MathPreconditions.class
com/clearspring/analytics/util/Preconditions.class
parquet/Preconditions.class

I seem to have the line that reported missing, but I am missing this file:

com/google/inject/internal/util/$Preconditions.class

Any suggestion on how to fix this?
Very much appreciate the help as I am very new to Spark and open source 
technologies.

From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: Monday, November 24, 2014 8:24 PM
To: Judy Nash; 
u...@spark.incubator.apache.orgmailto:u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on 
Guava


Hm, I tried exactly the same commit and the build command locally, but couldn’t 
reproduce this.

Usually this kind of errors are caused by classpath misconfiguration. Could you 
please try this to ensure corresponding Guava classes are included in the 
assembly jar you built?

jar tf assembly/target/scala-2.10/spark-assembly-1.2.1-SNAPSHOT-hadoop2.4.0.jar 
| grep Preconditions

On my machine I got these lines (the first line is the one reported as missing 
in your case):

org/spark-project/guava/common/base/Preconditions.class

org/spark-project/guava/common/math/MathPreconditions.class

com/clearspring/analytics/util/Preconditions.class

parquet/Preconditions.class

com/google/inject/internal/util/$Preconditions.class

On 11/25/14 6:25 AM, Judy Nash wrote:
Thank you Cheng for responding.

Here is the commit SHA1 on the 1.2 branch I saw this failure in:
commit 6f70e0295572e3037660004797040e026e440dbd
Author: zsxwing zsxw...@gmail.commailto:zsxw...@gmail.com
Date:   Fri Nov 21 00:42:43 2014 -0800

[SPARK-4472][Shell] Print Spark context available as sc. only when 
SparkContext is created...

... successfully

It's weird that printing Spark context available as sc when creating 
SparkContext unsuccessfully.

Let me know if you need anything else.

From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: Friday, November 21, 2014 8:02 PM
To: Judy Nash; 
u...@spark.incubator.apache.orgmailto:u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on 
Guava

Hi Judy, could you please provide the commit SHA1 of the version you're using? 
Thanks!
On 11/22/14 11:05 AM, Judy Nash wrote:
Hi,

Thrift server is failing to start for me on latest spark 1.2 branch.

I got the error below when I start thrift server.
Exception in thread main java.lang.NoClassDefFoundError: com/google/common/bas
e/Preconditions

RE: latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava

2014-11-24 Thread Judy Nash
This is what I got from jar tf:
org/spark-project/guava/common/base/Preconditions.class
org/spark-project/guava/common/math/MathPreconditions.class
com/clearspring/analytics/util/Preconditions.class
parquet/Preconditions.class

I seem to have the class that was reported as missing, but I am missing this file:

com/google/inject/internal/util/$Preconditions.class

Any suggestions on how to fix this?
I very much appreciate the help, as I am very new to Spark and open source technologies.

From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: Monday, November 24, 2014 8:24 PM
To: Judy Nash; u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on 
Guava


Hm, I tried exactly the same commit and the build command locally, but couldn’t 
reproduce this.

Usually this kind of errors are caused by classpath misconfiguration. Could you 
please try this to ensure corresponding Guava classes are included in the 
assembly jar you built?

jar tf assembly/target/scala-2.10/spark-assembly-1.2.1-SNAPSHOT-hadoop2.4.0.jar 
| grep Preconditions

On my machine I got these lines (the first line is the one reported as missing 
in your case):

org/spark-project/guava/common/base/Preconditions.class

org/spark-project/guava/common/math/MathPreconditions.class

com/clearspring/analytics/util/Preconditions.class

parquet/Preconditions.class

com/google/inject/internal/util/$Preconditions.class

On 11/25/14 6:25 AM, Judy Nash wrote:
Thank you Cheng for responding.

Here is the commit SHA1 on the 1.2 branch I saw this failure in:
commit 6f70e0295572e3037660004797040e026e440dbd
Author: zsxwing zsxw...@gmail.commailto:zsxw...@gmail.com
Date:   Fri Nov 21 00:42:43 2014 -0800

[SPARK-4472][Shell] Print Spark context available as sc. only when 
SparkContext is created...

... successfully

It's weird that printing Spark context available as sc when creating 
SparkContext unsuccessfully.

Let me know if you need anything else.

From: Cheng Lian [mailto:lian.cs@gmail.com]
Sent: Friday, November 21, 2014 8:02 PM
To: Judy Nash; 
u...@spark.incubator.apache.orgmailto:u...@spark.incubator.apache.org
Subject: Re: latest Spark 1.2 thrift server fail with NoClassDefFoundError on 
Guava

Hi Judy, could you please provide the commit SHA1 of the version you're using? 
Thanks!
On 11/22/14 11:05 AM, Judy Nash wrote:
Hi,

Thrift server is failing to start for me on latest spark 1.2 branch.

I got the error below when I start thrift server.
Exception in thread main java.lang.NoClassDefFoundError: com/google/common/bas
e/Preconditions
at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur
ation.java:314)….

Here is my setup:

1)  Latest spark 1.2 branch build

2)  Used build command:

mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -Phive-thriftserver 
-DskipTests clean package

3)  Added hive-site.xml to \conf

4)  Version on the box: Hive 0.13, Hadoop 2.4

Is this a real bug or am I doing something wrong?

---
Full Stacktrace:
Exception in thread main java.lang.NoClassDefFoundError: com/google/common/bas
e/Preconditions
at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur
ation.java:314)
at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur
ation.java:327)
at org.apache.hadoop.conf.Configuration.clinit(Configuration.java:409)

at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopU
til.scala:82)
at org.apache.spark.deploy.SparkHadoopUtil.init(SparkHadoopUtil.scala:
42)
at org.apache.spark.deploy.SparkHadoopUtil$.init(SparkHadoopUtil.scala
:202)
at org.apache.spark.deploy.SparkHadoopUtil$.clinit(SparkHadoopUtil.sca
la)
at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:1784)
at org.apache.spark.storage.BlockManager.init(BlockManager.scala:105)
at org.apache.spark.storage.BlockManager.init(BlockManager.scala:180)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:292)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:159)
at org.apache.spark.SparkContext.init(SparkContext.scala:230)
at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.
scala:38)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveTh
riftServer2.scala:56)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThr
iftServer2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.
java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
sorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:353)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75

latest Spark 1.2 thrift server fail with NoClassDefFoundError on Guava

2014-11-21 Thread Judy Nash
Hi,

The Thrift server is failing to start for me on the latest Spark 1.2 branch.

I get the error below when I start the Thrift server.
Exception in thread main java.lang.NoClassDefFoundError: com/google/common/bas
e/Preconditions
at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur
ation.java:314)

Here is my setup:

1)  Latest spark 1.2 branch build

2)  Used build command:

mvn -Pyarn -Phadoop-2.4 -Dhadoop.version=2.4.0 -Phive -Phive-thriftserver 
-DskipTests clean package

3)  Added hive-site.xml to \conf

4)  Version on the box: Hive 0.13, Hadoop 2.4

Is this a real bug or am I doing something wrong?

---
Full Stacktrace:
Exception in thread main java.lang.NoClassDefFoundError: com/google/common/bas
e/Preconditions
at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur
ation.java:314)
at org.apache.hadoop.conf.Configuration$DeprecationDelta.init(Configur
ation.java:327)
at org.apache.hadoop.conf.Configuration.clinit(Configuration.java:409)

at org.apache.spark.deploy.SparkHadoopUtil.newConfiguration(SparkHadoopU
til.scala:82)
at org.apache.spark.deploy.SparkHadoopUtil.init(SparkHadoopUtil.scala:
42)
at org.apache.spark.deploy.SparkHadoopUtil$.init(SparkHadoopUtil.scala
:202)
at org.apache.spark.deploy.SparkHadoopUtil$.clinit(SparkHadoopUtil.sca
la)
at org.apache.spark.util.Utils$.getSparkOrYarnConfig(Utils.scala:1784)
at org.apache.spark.storage.BlockManager.init(BlockManager.scala:105)
at org.apache.spark.storage.BlockManager.init(BlockManager.scala:180)
at org.apache.spark.SparkEnv$.create(SparkEnv.scala:292)
at org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:159)
at org.apache.spark.SparkContext.init(SparkContext.scala:230)
at org.apache.spark.sql.hive.thriftserver.SparkSQLEnv$.init(SparkSQLEnv.
scala:38)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2$.main(HiveTh
riftServer2.scala:56)
at org.apache.spark.sql.hive.thriftserver.HiveThriftServer2.main(HiveThr
iftServer2.scala)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.
java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAcces
sorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:606)
at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:353)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.ClassNotFoundException: com.google.common.base.Precondition
s
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:308)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)


beeline via spark thrift doesn't retain cache

2014-11-20 Thread Judy Nash
Hi friends,

I have successfully setup thrift server and execute beeline on top.

Beeline can handle select queries just fine, but it cannot seem to do any kind 
of caching/RDD operations.

i.e.

1)  Command cache table doesn't work. See error:

Error: Error while processing statement: FAILED: ParseException line 1:0 cannot

recognize input near 'cache' 'table' 'hivesampletable' (state=42000,code=4)



2)  Re-running SQL commands shows no performance improvement.

By comparison, the Spark SQL shell can execute the cache table command, and rerunning a SQL command gives a huge performance boost.
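
For concreteness, here is roughly what that looks like in the Spark SQL shell (the table name is just the sample table from above, and the speed-up is an impression rather than a measured number):

spark-sql> cache table hivesampletable;
spark-sql> select count(*) from hivesampletable;   -- slow the first time, much faster on re-run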

Am I missing something, or is this expected when executing through the Spark Thrift server?

Thanks!
Judy