[jira] [Commented] (SPARK-4854) Custom UDTF with Lateral View throws ClassNotFound exception in Spark SQL CLI
[ https://issues.apache.org/jira/browse/SPARK-4854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14247938#comment-14247938 ] Shenghua Wan commented on SPARK-4854: - I was trying to debug the Spark source code to fix this issue, and I found that function name resolution may be what breaks: in the normal case (SELECT + custom UDTF) the function name is resolved to the fully qualified class name, but in the SELECT + LATERAL VIEW + custom UDTF case it is resolved only to the alias given in the CREATE TEMPORARY FUNCTION clause, not to the fully qualified class name. In other words, the function-name-to-class-name translation fails in that situation. A workaround discovered while debugging the Spark source code is to remove the package declaration (e.g. org.xxx) from the Java code of your custom UDTF, so that its simple class name equals its fully qualified name, and then use that class name as the alias in the CREATE TEMPORARY FUNCTION clause. Even though Spark still uses the unresolved alias, the alias can then be resolved in the default namespace (see the sketch after the stack trace below). Custom UDTF with Lateral View throws ClassNotFound exception in Spark SQL CLI - Key: SPARK-4854 URL: https://issues.apache.org/jira/browse/SPARK-4854 Project: Spark Issue Type: Bug Affects Versions: 1.1.0, 1.1.1 Reporter: Shenghua Wan Hello, I met a problem when using the Spark SQL CLI. A custom UDTF with LATERAL VIEW throws a ClassNotFound exception. I did a couple of experiments in the same environment (Spark versions 1.1.0, 1.1.1): select + custom UDTF (passed); select + lateral view + custom UDTF (ClassNotFoundException); select + lateral view + built-in UDTF (passed). I have done some googling these days and found one related Spark issue ticket, https://issues.apache.org/jira/browse/SPARK-4811, which is about Custom UDTFs not working in Spark SQL. It would be helpful to put actual code here to reproduce the problem, but corporate regulations might prohibit this, so sorry about that. Directly using explode's source code in a jar will reproduce the problem anyway.
Here is a portion of stack print when exception, just in case: java.lang.ClassNotFoundException: XXX at java.net.URLClassLoader$1.run(URLClassLoader.java:366) at java.net.URLClassLoader$1.run(URLClassLoader.java:355) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:354) at java.lang.ClassLoader.loadClass(ClassLoader.java:425) at java.lang.ClassLoader.loadClass(ClassLoader.java:358) at org.apache.spark.sql.hive.HiveFunctionFactory$class.createFunction(hiveUdfs.scala:81) at org.apache.spark.sql.hive.HiveGenericUdtf.createFunction(hiveUdfs.scala:247) at org.apache.spark.sql.hive.HiveGenericUdtf.function$lzycompute(hiveUdfs.scala:254) at org.apache.spark.sql.hive.HiveGenericUdtf.function(hiveUdfs.scala:254) at org.apache.spark.sql.hive.HiveGenericUdtf.outputInspectors$lzycompute(hiveUdfs.scala:261) at org.apache.spark.sql.hive.HiveGenericUdtf.outputInspectors(hiveUdfs.scala:260) at org.apache.spark.sql.hive.HiveGenericUdtf.outputDataTypes$lzycompute(hiveUdfs.scala:265) at org.apache.spark.sql.hive.HiveGenericUdtf.outputDataTypes(hiveUdfs.scala:265) at org.apache.spark.sql.hive.HiveGenericUdtf.makeOutput(hiveUdfs.scala:269) at org.apache.spark.sql.catalyst.expressions.Generator.output(generators.scala:60) at org.apache.spark.sql.catalyst.plans.logical.Generate$$anonfun$1.apply(basicOperators.scala:50) at org.apache.spark.sql.catalyst.plans.logical.Generate$$anonfun$1.apply(basicOperators.scala:50) at scala.Option.map(Option.scala:145) at org.apache.spark.sql.catalyst.plans.logical.Generate.generatorOutput(basicOperators.scala:50) at org.apache.spark.sql.catalyst.plans.logical.Generate.output(basicOperators.scala:60) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveChildren$1.apply(LogicalPlan.scala:79) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan$$anonfun$resolveChildren$1.apply(LogicalPlan.scala:79) at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251) at scala.collection.TraversableLike$$anonfun$flatMap$1.apply(TraversableLike.scala:251) at scala.collection.immutable.List.foreach(List.scala:318) at scala.collection.TraversableLike$class.flatMap(TraversableLike.scala:251) at scala.collection.AbstractTraversable.flatMap(Traversable.scala:105) the rest is omitted.
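A minimal sketch of the workaround described in the comment above, assuming a hypothetical UDTF class MyExplode compiled without a package declaration into /path/to/my-udtf.jar; the class name, jar path, and the logs table are illustrative stand-ins, not code from the reporter.
{code}
// Hedged sketch only: MyExplode, the jar path, and the `logs` table are
// hypothetical stand-ins for the reporter's (unshared) custom UDTF setup.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.hive.HiveContext

object UdtfWorkaroundSketch {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("udtf-workaround"))
    val hive = new HiveContext(sc)

    hive.sql("ADD JAR /path/to/my-udtf.jar")
    // Because MyExplode lives in the default package, its simple name equals its
    // fully qualified class name, so the alias below is itself a loadable class name.
    hive.sql("CREATE TEMPORARY FUNCTION MyExplode AS 'MyExplode'")
    // LATERAL VIEW now works even though only the alias (not the FQCN) is kept.
    hive.sql("SELECT id, item FROM logs LATERAL VIEW MyExplode(items) t AS item")
      .collect()
      .foreach(println)
    sc.stop()
  }
}
{code}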
[jira] [Commented] (SPARK-4857) Add Executor Events to SparkListener
[ https://issues.apache.org/jira/browse/SPARK-4857?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14247955#comment-14247955 ] Apache Spark commented on SPARK-4857: - User 'ksakellis' has created a pull request for this issue: https://github.com/apache/spark/pull/3711 Add Executor Events to SparkListener Key: SPARK-4857 URL: https://issues.apache.org/jira/browse/SPARK-4857 Project: Spark Issue Type: Improvement Reporter: Kostas Sakellis We need to add events to the SparkListener to indicate an executor has been added or removed with corresponding information. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4341) Spark need to set num-executors automatically
[ https://issues.apache.org/jira/browse/SPARK-4341?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hong Shen updated SPARK-4341: - Attachment: SPARK-4341.diff Spark need to set num-executors automatically - Key: SPARK-4341 URL: https://issues.apache.org/jira/browse/SPARK-4341 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 1.1.0 Reporter: Hong Shen Attachments: SPARK-4341.diff A MapReduce job can set the number of map tasks automatically, but in Spark we have to set num-executors, executor memory, and cores. It's difficult for users to set these arguments, especially for users who want to use Spark SQL. So when the user hasn't set num-executors, Spark should set it automatically according to the input partitions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
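For context, a short sketch of the manual tuning this issue wants to automate. The spark-submit flags in the comments are the standard ones; the configuration keys shown are assumed YARN-mode properties, and the numbers are arbitrary, not values from the attached diff.
{code}
// Sketch only: today a YARN user picks these resource numbers by hand,
// e.g. spark-submit --num-executors 20 --executor-memory 4g --executor-cores 2.
import org.apache.spark.{SparkConf, SparkContext}

object ManualResourceTuning {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      .setAppName("manual-resource-tuning")
      .set("spark.executor.instances", "20") // assumed key for --num-executors
      .set("spark.executor.memory", "4g")    // --executor-memory
      .set("spark.executor.cores", "2")      // --executor-cores
    val sc = new SparkContext(conf)
    // The proposal: when the executor count is unset, derive it from the number
    // of input partitions instead of requiring the user to guess.
    sc.stop()
  }
}
{code}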
[jira] [Resolved] (SPARK-2988) Port repl to scala 2.11.
[ https://issues.apache.org/jira/browse/SPARK-2988?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Sharma resolved SPARK-2988. Resolution: Fixed Port repl to scala 2.11. Key: SPARK-2988 URL: https://issues.apache.org/jira/browse/SPARK-2988 Project: Spark Issue Type: Sub-task Components: Build, Spark Core Reporter: Prashant Sharma Assignee: Prashant Sharma -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-2987) Adjust build system to support building with scala 2.11 and fix tests.
[ https://issues.apache.org/jira/browse/SPARK-2987?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Sharma resolved SPARK-2987. Resolution: Fixed Adjust build system to support building with scala 2.11 and fix tests. -- Key: SPARK-2987 URL: https://issues.apache.org/jira/browse/SPARK-2987 Project: Spark Issue Type: Sub-task Components: Build, Spark Core Reporter: Prashant Sharma Assignee: Prashant Sharma -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-2709) Add a tool for certifying Spark API compatiblity
[ https://issues.apache.org/jira/browse/SPARK-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Sharma resolved SPARK-2709. Resolution: Fixed Add a tool for certifying Spark API compatiblity Key: SPARK-2709 URL: https://issues.apache.org/jira/browse/SPARK-2709 Project: Spark Issue Type: New Feature Components: Spark Core Reporter: Patrick Wendell Assignee: Prashant Sharma As Spark is packaged by more and more distributors, it would be good to have a tool that verifies the API compatibility of a provided Spark package. The tool would certify that a vendor distribution of Spark contains all of the APIs present in a particular upstream Spark version. This will help vendors make sure they remain API compliant when they make changes or backports to Spark. It will also discourage vendors from knowingly breaking APIs, because anyone can audit their distribution and see that they have removed support for certain APIs. I'm hoping a tool like this will avoid API fragmentation in the Spark community. One poor man's implementation of this is that a vendor can just run the binary compatibility checks in the Spark build against an upstream version of Spark. That's a pretty good start, but it means you can't come in as a third party and audit a distribution. Another approach would be to have something where anyone can come in and audit a distribution even if they don't have access to the packaging and source code. That would look something like this: 1. For each release we publish a manifest of all public APIs (we might borrow the MIMA string representation of byte code signatures) 2. We package an auditing tool as a jar file. 3. The user runs a tool with spark-submit that reflectively walks through all exposed Spark APIs and makes sure that everything on the manifest is encountered. From the implementation side, this is just brainstorming at this point. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
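A rough sketch of step 3 of the proposal above, under stated assumptions: the manifest format (one className#methodName entry per line) is invented here for illustration and is not part of the issue.
{code}
// Sketch of a reflective manifest check; run it with spark-submit so the
// distribution's own jars are on the classpath. The manifest format is assumed.
import scala.io.Source

object ApiManifestAudit {
  def main(args: Array[String]): Unit = {
    val manifest = Source.fromFile(args(0)).getLines().filter(_.nonEmpty).toSeq
    val missing = manifest.filterNot { entry =>
      val Array(className, methodName) = entry.split("#")
      // An entry passes if the class loads and exposes a public method with that name.
      try Class.forName(className).getMethods.exists(_.getName == methodName)
      catch { case _: Throwable => false }
    }
    if (missing.isEmpty) println(s"All ${manifest.size} manifest entries found")
    else missing.foreach(m => println(s"MISSING: $m"))
  }
}
{code}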
[jira] [Resolved] (SPARK-1338) Create Additional Style Rules
[ https://issues.apache.org/jira/browse/SPARK-1338?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prashant Sharma resolved SPARK-1338. Resolution: Fixed This is fixed by updating the scalastyle version. Applying these styles across the code base can be a different issue. Create Additional Style Rules - Key: SPARK-1338 URL: https://issues.apache.org/jira/browse/SPARK-1338 Project: Spark Issue Type: Improvement Components: Project Infra Reporter: Patrick Wendell Assignee: Prashant Sharma Fix For: 1.2.0 There are a few other rules that would be helpful to have. Also we should add tests for these rules because it's easy to get them wrong. I gave some example comparisons from a JavaScript style checker. Require spaces in type declarations: def foo:String = X // no def foo: String = XXX def x:Int = 100 // no val x: Int = 100 Require spaces after keywords: if(x - 3) // no if (x + 10) See: requireSpaceAfterKeywords from https://github.com/mdevils/node-jscs Disallow spaces inside of parentheses: val x = ( 3 + 5 ) // no val x = (3 + 5) See: disallowSpacesInsideParentheses from https://github.com/mdevils/node-jscs Require spaces before and after binary operators: See: requireSpaceBeforeBinaryOperators See: disallowSpaceAfterBinaryOperators from https://github.com/mdevils/node-jscs -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2709) Add a tool for certifying Spark API compatiblity
[ https://issues.apache.org/jira/browse/SPARK-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14248174#comment-14248174 ] Sean Owen commented on SPARK-2709: -- Was this sort of tool implemented? Add a tool for certifying Spark API compatiblity Key: SPARK-2709 URL: https://issues.apache.org/jira/browse/SPARK-2709 Project: Spark Issue Type: New Feature Components: Spark Core Reporter: Patrick Wendell Assignee: Prashant Sharma As Spark is packaged by more and more distributors, it would be good to have a tool that verifies the API compatibility of a provided Spark package. The tool would certify that a vendor distribution of Spark contains all of the APIs present in a particular upstream Spark version. This will help vendors make sure they remain API compliant when they make changes or backports to Spark. It will also discourage vendors from knowingly breaking APIs, because anyone can audit their distribution and see that they have removed support for certain APIs. I'm hoping a tool like this will avoid API fragmentation in the Spark community. One poor man's implementation of this is that a vendor can just run the binary compatibility checks in the Spark build against an upstream version of Spark. That's a pretty good start, but it means you can't come in as a third party and audit a distribution. Another approach would be to have something where anyone can come in and audit a distribution even if they don't have access to the packaging and source code. That would look something like this: 1. For each release we publish a manifest of all public APIs (we might borrow the MIMA string representation of byte code signatures) 2. We package an auditing tool as a jar file. 3. The user runs a tool with spark-submit that reflectively walks through all exposed Spark APIs and makes sure that everything on the manifest is encountered. From the implementation side, this is just brainstorming at this point. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-3955) Different versions between jackson-mapper-asl and jackson-core-asl
[ https://issues.apache.org/jira/browse/SPARK-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14248176#comment-14248176 ] Sean Owen commented on SPARK-3955: -- [~jongyoul] That PR was never merged though: https://github.com/apache/spark/pull/3379 You closed yours too: https://github.com/apache/spark/pull/2818 I don't think this is resolved. Different versions between jackson-mapper-asl and jackson-core-asl -- Key: SPARK-3955 URL: https://issues.apache.org/jira/browse/SPARK-3955 Project: Spark Issue Type: Bug Components: Build, Spark Core, SQL Affects Versions: 1.1.0 Reporter: Jongyoul Lee The parent pom.xml specifies a version of jackson-mapper-asl, which is used by sql/hive/pom.xml. When mvn assembly runs, however, the assembled jackson-core-asl version does not match the jackson-mapper-asl version, because other libraries depend on several versions of Jackson and a different jackson-core-asl gets pulled into the assembly. This problem can be fixed simply by giving pom.xml an explicit version for jackson-core-asl. If it's not set, version 1.9.11 is merged into assembly.jar and we cannot use the Jackson library properly. {code} [INFO] Including org.codehaus.jackson:jackson-mapper-asl:jar:1.8.8 in the shaded jar. [INFO] Including org.codehaus.jackson:jackson-core-asl:jar:1.9.11 in the shaded jar. {code} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4777) Some block memory after unrollSafely not count into used memory(memoryStore.entrys or unrollMemory)
[ https://issues.apache.org/jira/browse/SPARK-4777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14248177#comment-14248177 ] Sean Owen commented on SPARK-4777: -- [~SuYan] Given your comment in your PR, do you intend this to be considered a duplicate of SPARK-3000? Some block memory after unrollSafely not count into used memory(memoryStore.entrys or unrollMemory) --- Key: SPARK-4777 URL: https://issues.apache.org/jira/browse/SPARK-4777 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: SuYan Priority: Minor Some memory is not counted in the memory used by memoryStore or unrollMemory. After thread A unrolls a block safely, it releases 40MB of unrollMemory (which other threads can then use), and then waits to acquire accountingLock so it can tryToPut block A (30MB). Until thread A gets accountingLock, block A's size is not counted in either unrollMemory or memoryStore.currentMemory. IIUC, freeMemory should subtract that block's memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4856) Null empty string should not be considered as StringType at begining in Json schema inferring
[ https://issues.apache.org/jira/browse/SPARK-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-4856: - Description: We have data like: {quote} TestSQLContext.sparkContext.parallelize( {ip:27.31.100.29,headers:{Host:1.abc.com,Charset:UTF-8}} :: {ip:27.31.100.29,headers:{}} :: {ip:27.31.100.29,headers:} :: Nil) {quote} As empty string (the headers) will be considered as String in the beginning (in line 2 and 3), it ignores the real nested data type (struct type headers in line 1), and also take the line 1 (the headers) as String Type, which is not our expected. was: We have data like: {code:java} TestSQLContext.sparkContext.parallelize( {ip:27.31.100.29,headers:{Host:1.abc.com,Charset:UTF-8}} :: {ip:27.31.100.29,headers:{}} :: {ip:27.31.100.29,headers:} :: Nil) {code:xml} As empty string (the headers) will be considered as String in the beginning (in line 2 and 3), it ignores the real nested data type (struct type headers in line 1), and also take the line 1 (the headers) as String Type, which is not our expected. Null empty string should not be considered as StringType at begining in Json schema inferring --- Key: SPARK-4856 URL: https://issues.apache.org/jira/browse/SPARK-4856 Project: Spark Issue Type: Bug Components: SQL Reporter: Cheng Hao We have data like: {quote} TestSQLContext.sparkContext.parallelize( {ip:27.31.100.29,headers:{Host:1.abc.com,Charset:UTF-8}} :: {ip:27.31.100.29,headers:{}} :: {ip:27.31.100.29,headers:} :: Nil) {quote} As empty string (the headers) will be considered as String in the beginning (in line 2 and 3), it ignores the real nested data type (struct type headers in line 1), and also take the line 1 (the headers) as String Type, which is not our expected. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4856) Null empty string should not be considered as StringType at begining in Json schema inferring
[ https://issues.apache.org/jira/browse/SPARK-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-4856: - Description: We have data like: {panel} TestSQLContext.sparkContext.parallelize( {ip:27.31.100.29,headers:{Host:1.abc.com,Charset:UTF-8}} :: {ip:27.31.100.29,headers:{}} :: {ip:27.31.100.29,headers:} :: Nil) {panel} As empty string (the headers) will be considered as String in the beginning (in line 2 and 3), it ignores the real nested data type (struct type headers in line 1), and also take the line 1 (the headers) as String Type, which is not our expected. was: We have data like: {quote} TestSQLContext.sparkContext.parallelize( {ip:27.31.100.29,headers:{Host:1.abc.com,Charset:UTF-8}} :: {ip:27.31.100.29,headers:{}} :: {ip:27.31.100.29,headers:} :: Nil) {quote} As empty string (the headers) will be considered as String in the beginning (in line 2 and 3), it ignores the real nested data type (struct type headers in line 1), and also take the line 1 (the headers) as String Type, which is not our expected. Null empty string should not be considered as StringType at begining in Json schema inferring --- Key: SPARK-4856 URL: https://issues.apache.org/jira/browse/SPARK-4856 Project: Spark Issue Type: Bug Components: SQL Reporter: Cheng Hao We have data like: {panel} TestSQLContext.sparkContext.parallelize( {ip:27.31.100.29,headers:{Host:1.abc.com,Charset:UTF-8}} :: {ip:27.31.100.29,headers:{}} :: {ip:27.31.100.29,headers:} :: Nil) {panel} As empty string (the headers) will be considered as String in the beginning (in line 2 and 3), it ignores the real nested data type (struct type headers in line 1), and also take the line 1 (the headers) as String Type, which is not our expected. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4856) Null empty string should not be considered as StringType at begining in Json schema inferring
[ https://issues.apache.org/jira/browse/SPARK-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-4856: - Description: We have data like: {noformat} TestSQLContext.sparkContext.parallelize( {ip:27.31.100.29,headers:{Host:1.abc.com,Charset:UTF-8}} :: {ip:27.31.100.29,headers:{}} :: {ip:27.31.100.29,headers:} :: Nil) {noformat} As empty string (the headers) will be considered as String in the beginning (in line 2 and 3), it ignores the real nested data type (struct type headers in line 1), and also take the line 1 (the headers) as String Type, which is not our expected. was: dWe have data like: {noformat} TestSQLContext.sparkContext.parallelize( {ip:27.31.100.29,headers:{Host:1.abc.com,Charset:UTF-8}} :: {ip:27.31.100.29,headers:{}} :: {ip:27.31.100.29,headers:} :: Nil) {noformat} As empty string (the headers) will be considered as String in the beginning (in line 2 and 3), it ignores the real nested data type (struct type headers in line 1), and also take the line 1 (the headers) as String Type, which is not our expected. Null empty string should not be considered as StringType at begining in Json schema inferring --- Key: SPARK-4856 URL: https://issues.apache.org/jira/browse/SPARK-4856 Project: Spark Issue Type: Bug Components: SQL Reporter: Cheng Hao We have data like: {noformat} TestSQLContext.sparkContext.parallelize( {ip:27.31.100.29,headers:{Host:1.abc.com,Charset:UTF-8}} :: {ip:27.31.100.29,headers:{}} :: {ip:27.31.100.29,headers:} :: Nil) {noformat} As empty string (the headers) will be considered as String in the beginning (in line 2 and 3), it ignores the real nested data type (struct type headers in line 1), and also take the line 1 (the headers) as String Type, which is not our expected. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4856) Null empty string should not be considered as StringType at begining in Json schema inferring
[ https://issues.apache.org/jira/browse/SPARK-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-4856: - Description: dWe have data like: {noformat} TestSQLContext.sparkContext.parallelize( {ip:27.31.100.29,headers:{Host:1.abc.com,Charset:UTF-8}} :: {ip:27.31.100.29,headers:{}} :: {ip:27.31.100.29,headers:} :: Nil) {noformat} As empty string (the headers) will be considered as String in the beginning (in line 2 and 3), it ignores the real nested data type (struct type headers in line 1), and also take the line 1 (the headers) as String Type, which is not our expected. was: We have data like: {panel} TestSQLContext.sparkContext.parallelize( {ip:27.31.100.29,headers:{Host:1.abc.com,Charset:UTF-8}} :: {ip:27.31.100.29,headers:{}} :: {ip:27.31.100.29,headers:} :: Nil) {panel} As empty string (the headers) will be considered as String in the beginning (in line 2 and 3), it ignores the real nested data type (struct type headers in line 1), and also take the line 1 (the headers) as String Type, which is not our expected. Null empty string should not be considered as StringType at begining in Json schema inferring --- Key: SPARK-4856 URL: https://issues.apache.org/jira/browse/SPARK-4856 Project: Spark Issue Type: Bug Components: SQL Reporter: Cheng Hao dWe have data like: {noformat} TestSQLContext.sparkContext.parallelize( {ip:27.31.100.29,headers:{Host:1.abc.com,Charset:UTF-8}} :: {ip:27.31.100.29,headers:{}} :: {ip:27.31.100.29,headers:} :: Nil) {noformat} As empty string (the headers) will be considered as String in the beginning (in line 2 and 3), it ignores the real nested data type (struct type headers in line 1), and also take the line 1 (the headers) as String Type, which is not our expected. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4856) Null empty string should not be considered as StringType at begining in Json schema inferring
[ https://issues.apache.org/jira/browse/SPARK-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-4856: - Description: We have data like: {noformat} TestSQLContext.sparkContext.parallelize( {ip:27.31.100.29,headers:{Host:1.abc.com,Charset:UTF-8}} :: {ip:27.31.100.29,headers:{}} :: {ip:27.31.100.29,headers:} :: Nil) {noformat} As empty string (the headers) will be considered as String, and it ignores the real nested data type (struct type headers in line 1), and then we will get the headers as String Type, which is not our expectation. was: We have data like: {noformat} TestSQLContext.sparkContext.parallelize( {ip:27.31.100.29,headers:{Host:1.abc.com,Charset:UTF-8}} :: {ip:27.31.100.29,headers:{}} :: {ip:27.31.100.29,headers:} :: Nil) {noformat} As empty string (the headers) will be considered as String in the beginning (in line 2 and 3), it ignores the real nested data type (struct type headers in line 1), and also take the line 1 (the headers) as String Type, which is not our expected. Null empty string should not be considered as StringType at begining in Json schema inferring --- Key: SPARK-4856 URL: https://issues.apache.org/jira/browse/SPARK-4856 Project: Spark Issue Type: Bug Components: SQL Reporter: Cheng Hao We have data like: {noformat} TestSQLContext.sparkContext.parallelize( {ip:27.31.100.29,headers:{Host:1.abc.com,Charset:UTF-8}} :: {ip:27.31.100.29,headers:{}} :: {ip:27.31.100.29,headers:} :: Nil) {noformat} As empty string (the headers) will be considered as String, and it ignores the real nested data type (struct type headers in line 1), and then we will get the headers as String Type, which is not our expectation. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4856) Null empty string should not be considered as StringType at begining in Json schema inferring
[ https://issues.apache.org/jira/browse/SPARK-4856?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Cheng Hao updated SPARK-4856: - Description: We have data like: {noformat} TestSQLContext.sparkContext.parallelize( {ip:27.31.100.29,headers:{Host:1.abc.com,Charset:UTF-8}} :: {ip:27.31.100.29,headers:{}} :: {ip:27.31.100.29,headers:} :: Nil) {noformat} As empty string (the headers) will be considered as String, and it ignores the real nested data type (struct type headers in line 1), and then we will get the headers (in line 1) as String Type, which is not our expectation. was: We have data like: {noformat} TestSQLContext.sparkContext.parallelize( {ip:27.31.100.29,headers:{Host:1.abc.com,Charset:UTF-8}} :: {ip:27.31.100.29,headers:{}} :: {ip:27.31.100.29,headers:} :: Nil) {noformat} As empty string (the headers) will be considered as String, and it ignores the real nested data type (struct type headers in line 1), and then we will get the headers as String Type, which is not our expectation. Null empty string should not be considered as StringType at begining in Json schema inferring --- Key: SPARK-4856 URL: https://issues.apache.org/jira/browse/SPARK-4856 Project: Spark Issue Type: Bug Components: SQL Reporter: Cheng Hao We have data like: {noformat} TestSQLContext.sparkContext.parallelize( {ip:27.31.100.29,headers:{Host:1.abc.com,Charset:UTF-8}} :: {ip:27.31.100.29,headers:{}} :: {ip:27.31.100.29,headers:} :: Nil) {noformat} As empty string (the headers) will be considered as String, and it ignores the real nested data type (struct type headers in line 1), and then we will get the headers (in line 1) as String Type, which is not our expectation. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
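A self-contained sketch of the scenario in the description, with the JSON literals written out with explicit quoting; it assumes a local master and the SQLContext.jsonRDD API available in the 1.x line.
{code}
// Sketch reproducing the report: the empty-string "headers" in the third record
// makes the inferred type of "headers" degrade to StringType instead of a struct.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.sql.SQLContext

object JsonEmptyStringSchema {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("json-schema").setMaster("local[2]"))
    val sqlContext = new SQLContext(sc)

    val json = sc.parallelize(
      """{"ip":"27.31.100.29","headers":{"Host":"1.abc.com","Charset":"UTF-8"}}""" ::
      """{"ip":"27.31.100.29","headers":{}}""" ::
      """{"ip":"27.31.100.29","headers":""}""" :: Nil)

    // Expected: headers inferred as a struct; observed: headers inferred as a string.
    sqlContext.jsonRDD(json).printSchema()
    sc.stop()
  }
}
{code}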
[jira] [Commented] (SPARK-4732) All application progress on the standalone scheduler can be halted by one systematically faulty node
[ https://issues.apache.org/jira/browse/SPARK-4732?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14248367#comment-14248367 ] Harry Brundage commented on SPARK-4732: --- Seems like it would, feel free to mark as duplicate! All application progress on the standalone scheduler can be halted by one systematically faulty node Key: SPARK-4732 URL: https://issues.apache.org/jira/browse/SPARK-4732 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0, 1.2.0 Environment: - Spark Standalone scheduler Reporter: Harry Brundage We've experienced several cluster wide outages caused by unexpected system wide faults on one of our spark workers if that worker is failing systematically. By systematically, I mean that every executor launched by that worker will definitely fail due to some reason out of Spark's control like the log directory disk being completely out of space, or a permissions error for a file that's always read during executor launch. We screw up all the time on our team and cause stuff like this to happen, but because of the way the standalone scheduler allocates resources, our cluster doesn't recover gracefully from these failures. When there are more tasks to do than executors, I am pretty sure the way the scheduler works is that it just waits for more resource offers and then allocates tasks from the queue to those resources. If an executor dies immediately after starting, the worker monitor process will notice that it's dead. The master will allocate that worker's now free cores/memory to a currently running application that is below its spark.cores.max, which in our case I've observed as usually the app that just had the executor die. A new executor gets spawned on the same worker that the last one just died on, gets allocated that one task that failed, and then the whole process fails again for the same systematic reason, and lather rinse repeat. This happens 10 times or whatever the max task failure count is, and then the whole app is deemed a failure by the driver and shut down completely. This happens to us for all applications in the cluster as well. We usually run roughly as many cores as we have hadoop nodes. We also usually have many more input splits than we have tasks, which means the locality of the first few tasks which I believe determines where our executors run is well spread out over the cluster, and often covers 90-100% of nodes. This means the likelihood of any application getting an executor scheduled any broken node is quite high. After an old application goes through the above mentioned process and dies, the next application to start or not be at it's requested max capacity gets an executor scheduled on the broken node, and is promptly taken down as well. This happens over and over as well, to the point where none of our spark jobs are making any progress because of one tiny permissions mistake on one node. Now, I totally understand this is usually an error between keyboard and screen kind of situation where it is the responsibility of the people deploying spark to ensure it is deployed correctly. The systematic issues we've encountered are almost always of this nature: permissions errors, disk full errors, one node not getting a new spark jar from a configuration error, configurations being out of sync, etc. 
That said, disks are going to fail or half fail, fill up, node rot is going to ruin configurations, etc etc etc, and as Hadoop clusters scale in size this becomes more and more likely, so I think it's reasonable to ask that Spark be resilient to this kind of failure and keep on truckin'. I think a good simple fix would be to have applications, or the master, blacklist workers (not executors) at a failure count lower than the task failure count. This would also serve as a belt and suspenders fix for SPARK-4498. If the scheduler stopped trying to schedule on nodes that fail a lot, we could still make progress. These blacklist events are really important and I think would need to be well logged and surfaced in the UI, but I'd rather log and carry on than fail hard. I think the tradeoff here is that you risk blacklisting every worker as well if there is something systematically wrong with communication or whatever else I can't imagine. Please let me know if I've misunderstood how the scheduler works or you need more information or anything like that and I'll be happy to provide. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For
[jira] [Created] (SPARK-4861) Refactory command in spark sql
wangfei created SPARK-4861: -- Summary: Refactory command in spark sql Key: SPARK-4861 URL: https://issues.apache.org/jira/browse/SPARK-4861 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 1.1.1 Reporter: wangfei Fix For: 1.3.0 Fix a todo in spark sql: remove ```Command``` and use ```RunnableCommand``` instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4861) Refactory command in spark sql
[ https://issues.apache.org/jira/browse/SPARK-4861?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14248380#comment-14248380 ] Apache Spark commented on SPARK-4861: - User 'scwf' has created a pull request for this issue: https://github.com/apache/spark/pull/3712 Refactory command in spark sql -- Key: SPARK-4861 URL: https://issues.apache.org/jira/browse/SPARK-4861 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 1.1.1 Reporter: wangfei Fix For: 1.3.0 Fix a todo in spark sql: remove ```Command``` and use ```RunnableCommand``` instead. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-4862) Streaming | Setting checkpoint as a local directory results in Checkpoint RDD has different partitions error
Aniket Bhatnagar created SPARK-4862: --- Summary: Streaming | Setting checkpoint as a local directory results in Checkpoint RDD has different partitions error Key: SPARK-4862 URL: https://issues.apache.org/jira/browse/SPARK-4862 Project: Spark Issue Type: Bug Components: Streaming Affects Versions: 1.1.1, 1.1.0 Reporter: Aniket Bhatnagar Priority: Minor If the checkpoint directory is set to a local filesystem path, it results in weird error messages like the following: org.apache.spark.SparkException: Checkpoint RDD CheckpointRDD[467] at apply at List.scala:318(0) has different number of partitions than original RDD MapPartitionsRDD[461] at mapPartitions at StateDStream.scala:71(56) It would be great if Spark could output a better error message that hints at what could have gone wrong. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
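For readers hitting this, a hedged sketch of the configuration that triggers the confusing error versus the usual fix; the paths, port, and input source are hypothetical.
{code}
// Sketch only: a stateful streaming job whose checkpoint directory placement
// determines whether the "different number of partitions" error can appear.
import org.apache.spark.SparkConf
import org.apache.spark.streaming.{Seconds, StreamingContext}
import org.apache.spark.streaming.StreamingContext._

object CheckpointDirSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("checkpoint-dir-sketch")
    val ssc = new StreamingContext(conf, Seconds(10))

    // Problematic on a multi-node cluster: a plain local path is not shared across
    // executors, so recovered checkpoint RDDs may come back with fewer partitions.
    // ssc.checkpoint("/tmp/spark-checkpoints")

    // Safer: a shared, fault-tolerant filesystem such as HDFS.
    ssc.checkpoint("hdfs://namenode:8020/user/spark/checkpoints")

    val counts = ssc.socketTextStream("localhost", 9999)
      .flatMap(_.split(" "))
      .map(word => (word, 1))
      .updateStateByKey[Int]((values: Seq[Int], state: Option[Int]) =>
        Some(values.sum + state.getOrElse(0)))
    counts.print()

    ssc.start()
    ssc.awaitTermination()
  }
}
{code}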
[jira] [Created] (SPARK-4863) Suspicious exception handlers
Ding Yuan created SPARK-4863: Summary: Suspicious exception handlers Key: SPARK-4863 URL: https://issues.apache.org/jira/browse/SPARK-4863 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.1 Reporter: Ding Yuan Following up with the discussion in https://issues.apache.org/jira/browse/SPARK-1148, I am creating a new JIRA to report the suspicious exception handlers detected by our tool aspirator on spark-1.1.1. == WARNING: TODO; in handler. Line: 129, File: org/apache/thrift/transport/TNonblockingServerSocket.java 122: public void registerSelector(Selector selector) { 123:try { 124: // Register the server socket channel, indicating an interest in 125: // accepting new connections 126: serverSocketChannel.register(selector, SelectionKey.OP_ACCEPT); 127:} catch (ClosedChannelException e) { 128: // this shouldn't happen, ideally... 129: // TODO: decide what to do with this. 130:} 131: } == == WARNING: TODO; in handler. Line: 1583, File: org/apache/spark/SparkContext.scala 1578: val scheduler = try { 1579: val clazz = Class.forName(org.apache.spark.scheduler.cluster.YarnClusterScheduler) 1580: val cons = clazz.getConstructor(classOf[SparkContext]) 1581: cons.newInstance(sc).asInstanceOf[TaskSchedulerImpl] 1582: } catch { 1583: // TODO: Enumerate the exact reasons why it can fail 1584: // But irrespective of it, it means we cannot proceed ! 1585: case e: Exception = { 1586: throw new SparkException(YARN mode not available ?, e) 1587: } == == WARNING 1: empty handler for exception: java.lang.Exception THERE IS NO LOG MESSAGE!!! Line: 75, File: org/apache/spark/repl/ExecutorClassLoader.scala try { val pathInDirectory = name.replace('.', '/') + .class val inputStream = { if (fileSystem != null) { fileSystem.open(new Path(directory, pathInDirectory)) } else { if (SparkEnv.get.securityManager.isAuthenticationEnabled()) { val uri = new URI(classUri + / + urlEncode(pathInDirectory)) val newuri = Utils.constructURIForAuthentication(uri, SparkEnv.get.securityManager) newuri.toURL().openStream() } else { new URL(classUri + / + urlEncode(pathInDirectory)).openStream() } } } val bytes = readAndTransformClass(name, inputStream) inputStream.close() Some(defineClass(name, bytes, 0, bytes.length)) } catch { case e: Exception = None } == == WARNING 1: empty handler for exception: java.io.IOException THERE IS NO LOG MESSAGE!!! Line: 275, File: org/apache/spark/util/Utils.scala try { dir = new File(root, spark- + UUID.randomUUID.toString) if (dir.exists() || !dir.mkdirs()) { dir = null } } catch { case e: IOException = ; } == == WARNING 1: empty handler for exception: java.lang.InterruptedException THERE IS NO LOG MESSAGE!!! Line: 172, File: parquet/org/apache/thrift/server/TNonblockingServer.java protected void joinSelector() { // wait until the selector thread exits try { selectThread_.join(); } catch (InterruptedException e) { // for now, just silently ignore. technically this means we'll have less of // a graceful shutdown as a result. } } == == WARNING 2: empty handler for exception: java.net.SocketException There are log messages.. Line: 111, File: parquet/org/apache/thrift/transport/TNonblockingSocket.java public void setTimeout(int timeout) { try { socketChannel_.socket().setSoTimeout(timeout); } catch (SocketException sx) { LOGGER.warn(Could not set socket timeout., sx); } } == == WARNING 3: empty handler for exception: java.net.SocketException There are log messages.. 
Line: 103, File: parquet/org/apache/thrift/transport/TServerSocket.java public void listen() throws TTransportException { // Make sure not to block on accept if (serverSocket_ != null) { try { serverSocket_.setSoTimeout(0); } catch (SocketException sx) { LOGGER.error(Could not set socket timeout., sx); } } } ==
[jira] [Updated] (SPARK-4863) Suspicious exception handlers
[ https://issues.apache.org/jira/browse/SPARK-4863?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ding Yuan updated SPARK-4863: - Description: Following up with the discussion in https://issues.apache.org/jira/browse/SPARK-1148, I am creating a new JIRA to report the suspicious exception handlers detected by our tool aspirator on spark-1.1.1. {noformat} == WARNING: TODO; in handler. Line: 129, File: org/apache/thrift/transport/TNonblockingServerSocket.java 122: public void registerSelector(Selector selector) { 123:try { 124: // Register the server socket channel, indicating an interest in 125: // accepting new connections 126: serverSocketChannel.register(selector, SelectionKey.OP_ACCEPT); 127:} catch (ClosedChannelException e) { 128: // this shouldn't happen, ideally... 129: // TODO: decide what to do with this. 130:} 131: } == == WARNING: TODO; in handler. Line: 1583, File: org/apache/spark/SparkContext.scala 1578: val scheduler = try { 1579: val clazz = Class.forName(org.apache.spark.scheduler.cluster.YarnClusterScheduler) 1580: val cons = clazz.getConstructor(classOf[SparkContext]) 1581: cons.newInstance(sc).asInstanceOf[TaskSchedulerImpl] 1582: } catch { 1583: // TODO: Enumerate the exact reasons why it can fail 1584: // But irrespective of it, it means we cannot proceed ! 1585: case e: Exception = { 1586: throw new SparkException(YARN mode not available ?, e) 1587: } == == WARNING 1: empty handler for exception: java.lang.Exception THERE IS NO LOG MESSAGE!!! Line: 75, File: org/apache/spark/repl/ExecutorClassLoader.scala try { val pathInDirectory = name.replace('.', '/') + .class val inputStream = { if (fileSystem != null) { fileSystem.open(new Path(directory, pathInDirectory)) } else { if (SparkEnv.get.securityManager.isAuthenticationEnabled()) { val uri = new URI(classUri + / + urlEncode(pathInDirectory)) val newuri = Utils.constructURIForAuthentication(uri, SparkEnv.get.securityManager) newuri.toURL().openStream() } else { new URL(classUri + / + urlEncode(pathInDirectory)).openStream() } } } val bytes = readAndTransformClass(name, inputStream) inputStream.close() Some(defineClass(name, bytes, 0, bytes.length)) } catch { case e: Exception = None } == == WARNING 1: empty handler for exception: java.io.IOException THERE IS NO LOG MESSAGE!!! Line: 275, File: org/apache/spark/util/Utils.scala try { dir = new File(root, spark- + UUID.randomUUID.toString) if (dir.exists() || !dir.mkdirs()) { dir = null } } catch { case e: IOException = ; } == == WARNING 1: empty handler for exception: java.lang.InterruptedException THERE IS NO LOG MESSAGE!!! Line: 172, File: parquet/org/apache/thrift/server/TNonblockingServer.java protected void joinSelector() { // wait until the selector thread exits try { selectThread_.join(); } catch (InterruptedException e) { // for now, just silently ignore. technically this means we'll have less of // a graceful shutdown as a result. } } == == WARNING 2: empty handler for exception: java.net.SocketException There are log messages.. Line: 111, File: parquet/org/apache/thrift/transport/TNonblockingSocket.java public void setTimeout(int timeout) { try { socketChannel_.socket().setSoTimeout(timeout); } catch (SocketException sx) { LOGGER.warn(Could not set socket timeout., sx); } } == == WARNING 3: empty handler for exception: java.net.SocketException There are log messages.. 
Line: 103, File: parquet/org/apache/thrift/transport/TServerSocket.java public void listen() throws TTransportException { // Make sure not to block on accept if (serverSocket_ != null) { try { serverSocket_.setSoTimeout(0); } catch (SocketException sx) { LOGGER.error(Could not set socket timeout., sx); } } } == == WARNING 4: empty handler for exception: java.net.SocketException There are log messages.. Line: 70, File:
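As a hedged illustration of the kind of change these warnings point toward (not a patch to the files above), here is the Utils.scala temp-directory case reworked so the swallowed IOException is at least reported; the logging callback is a stand-in for a real logger.
{code}
// Sketch only: mirrors the Utils.scala handler quoted above.
import java.io.{File, IOException}
import java.util.UUID

object TempDirSketch {
  def createSparkTempDir(root: String, log: String => Unit): Option[File] =
    try {
      val dir = new File(root, "spark-" + UUID.randomUUID.toString)
      // Succeed only if the directory did not already exist and could be created.
      if (!dir.exists() && dir.mkdirs()) Some(dir) else None
    } catch {
      case e: IOException =>
        // The original `case e: IOException => ;` dropped the failure silently.
        log(s"Failed to create temp directory under $root: ${e.getMessage}")
        None
    }

  def main(args: Array[String]): Unit =
    println(createSparkTempDir(System.getProperty("java.io.tmpdir"), (s: String) => println(s)))
}
{code}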
[jira] [Commented] (SPARK-1148) Suggestions for exception handling (avoid potential bugs)
[ https://issues.apache.org/jira/browse/SPARK-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14248707#comment-14248707 ] Ding Yuan commented on SPARK-1148: -- Glad to know these cases are addressed! As for version 0.9.0, I have reported all the cases (only three of them). I have opened another JIRA: https://issues.apache.org/jira/browse/SPARK-4863 to report the warnings on spark-1.1.1. Suggestions for exception handling (avoid potential bugs) - Key: SPARK-1148 URL: https://issues.apache.org/jira/browse/SPARK-1148 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 0.9.0 Reporter: Ding Yuan Hi Spark developers, We are a group of researchers on software reliability. Recently we did a study and found that majority of the most severe failures in data-analytic systems are caused by bugs in exception handlers – that it is hard to anticipate all the possible real-world error scenarios. Therefore we built a simple checking tool that automatically detects some bug patterns that have caused some very severe real-world failures. I am reporting a few cases here. Any feedback is much appreciated! Ding = Case 1: Line: 1249, File: org/apache/spark/SparkContext.scala {noformat} 1244: val scheduler = try { 1245: val clazz = Class.forName(org.apache.spark.scheduler.cluster.YarnClusterScheduler) 1246: val cons = clazz.getConstructor(classOf[SparkContext]) 1247: cons.newInstance(sc).asInstanceOf[TaskSchedulerImpl] 1248: } catch { 1249: // TODO: Enumerate the exact reasons why it can fail 1250: // But irrespective of it, it means we cannot proceed ! 1251: case th: Throwable = { 1252: throw new SparkException(YARN mode not available ?, th) 1253: } {noformat} The comment suggests the specific exceptions should be enumerated here. The try block could throw the following exceptions: ClassNotFoundException NegativeArraySizeException NoSuchMethodException SecurityException InstantiationException IllegalAccessException IllegalArgumentException InvocationTargetException ClassCastException == = Case 2: Line: 282, File: org/apache/spark/executor/Executor.scala {noformat} 265: case t: Throwable = { 266: val serviceTime = (System.currentTimeMillis() - taskStart).toInt 267: val metrics = attemptedTask.flatMap(t = t.metrics) 268: for (m - metrics) { 269: m.executorRunTime = serviceTime 270: m.jvmGCTime = gcTime - startGCTime 271: } 272: val reason = ExceptionFailure(t.getClass.getName, t.toString, t.getStackTrace, metrics) 273: execBackend.statusUpdate(taskId, TaskState.FAILED, ser.serialize(reason)) 274: 275: // TODO: Should we exit the whole executor here? On the one hand, the failed task may 276: // have left some weird state around depending on when the exception was thrown, but on 277: // the other hand, maybe we could detect that when future tasks fail and exit then. 278: logError(Exception in task ID + taskId, t) 279: //System.exit(1) 280: } 281: } finally { 282: // TODO: Unregister shuffle memory only for ResultTask 283: val shuffleMemoryMap = env.shuffleMemoryMap 284: shuffleMemoryMap.synchronized { 285: shuffleMemoryMap.remove(Thread.currentThread().getId) 286: } 287: runningTasks.remove(taskId) 288: } {noformat} From the comment in this Throwable exception handler it seems to suggest that the system should just exit? 
== = Case 3: Line: 70, File: org/apache/spark/network/netty/FileServerHandler.java {noformat} 66: try { 67: ctx.write(new DefaultFileRegion(new FileInputStream(file) 68: .getChannel(), fileSegment.offset(), fileSegment.length())); 69: } catch (Exception e) { 70: LOG.error(Exception: , e); 71: } {noformat} Exception is too general. The try block only throws FileNotFoundException. Although there is nothing wrong with it now, but later if code evolves this might cause some other exceptions to be swallowed. == -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
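A small sketch of the narrower-catch suggestion in Case 3, written as standalone Scala rather than the actual Netty handler; the file path and the println stand-in for LOG.error are placeholders.
{code}
// Sketch: catch only what the body can actually throw, so that future, unforeseen
// exceptions are not silently handled by an over-broad catch-all.
import java.io.{FileInputStream, FileNotFoundException}
import java.nio.channels.FileChannel

object NarrowCatchSketch {
  def openChannel(path: String): Option[FileChannel] =
    try {
      Some(new FileInputStream(path).getChannel)
    } catch {
      case e: FileNotFoundException =>
        println(s"File not found: $path ($e)") // stand-in for LOG.error
        None
    }

  def main(args: Array[String]): Unit =
    println(openChannel("/no/such/file"))
}
{code}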
[jira] [Resolved] (SPARK-4437) Docs for difference between WholeTextFileRecordReader and WholeCombineFileRecordReader
[ https://issues.apache.org/jira/browse/SPARK-4437?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-4437. --- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 3301 [https://github.com/apache/spark/pull/3301] Docs for difference between WholeTextFileRecordReader and WholeCombineFileRecordReader -- Key: SPARK-4437 URL: https://issues.apache.org/jira/browse/SPARK-4437 Project: Spark Issue Type: Documentation Components: Documentation Reporter: Andrew Ash Assignee: Davies Liu Fix For: 1.3.0 Tracking per this dev@ thread: {quote} On Sun, Nov 16, 2014 at 4:49 PM, Reynold Xin r...@databricks.com wrote: I don't think the code is immediately obvious. Davies - I think you added the code, and Josh reviewed it. Can you guys explain and maybe submit a patch to add more documentation on the whole thing? Thanks. On Sun, Nov 16, 2014 at 3:22 AM, Vibhanshu Prasad vibhanshugs...@gmail.com wrote: Hello Everyone, I am going through the source code of rdd and Record readers There are found 2 classes 1. WholeTextFileRecordReader 2. WholeCombineFileRecordReader ( extends CombineFileRecordReader ) The description of both the classes is perfectly similar. I am not able to understand why we have 2 classes. Is CombineFileRecordReader providing some extra advantage? Regards Vibhanshu {quote} -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-4855) Python tests for hypothesis testing
[ https://issues.apache.org/jira/browse/SPARK-4855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xiangrui Meng resolved SPARK-4855. -- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 3679 [https://github.com/apache/spark/pull/3679] Python tests for hypothesis testing --- Key: SPARK-4855 URL: https://issues.apache.org/jira/browse/SPARK-4855 Project: Spark Issue Type: Test Components: MLlib, PySpark Affects Versions: 1.2.0 Reporter: Joseph K. Bradley Assignee: Ben Cook Priority: Minor Fix For: 1.3.0 Add Python unit tests for Chi-Squared hypothesis testing -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Closed] (SPARK-4839) Adding documentations about dynamic resource allocation
[ https://issues.apache.org/jira/browse/SPARK-4839?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrew Or closed SPARK-4839. Resolution: Duplicate Closing this as a duplicate Adding documentations about dynamic resource allocation --- Key: SPARK-4839 URL: https://issues.apache.org/jira/browse/SPARK-4839 Project: Spark Issue Type: Sub-task Components: Documentation, Spark Core, YARN Affects Versions: 1.2.0 Reporter: Tsuyoshi OZAWA Fix For: 1.2.0 There are no docs about dynamicAllocation. We should add them. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4846) When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit
[ https://issues.apache.org/jira/browse/SPARK-4846?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14248802#comment-14248802 ] Joseph K. Bradley commented on SPARK-4846: -- Changing vectorSize sounds too aggressive to me. I'd vote for either the simple solution (throw a nice error), or an efficient method which chooses minCount automatically. For the latter, this might work: 1. Try the current method. Catch errors during the collect() for vocab and during the array allocations. If there is no error, skip step 2. 2. If there is an error, do 1 pass over the data to collect stats (e.g., a histogram). Use those stats to choose a reasonable minCount. Choose the vocab again, etc. 3. After the big array allocations, the algorithm can continue as before. When the vocabulary size is large, Word2Vec may yield OutOfMemoryError: Requested array size exceeds VM limit --- Key: SPARK-4846 URL: https://issues.apache.org/jira/browse/SPARK-4846 Project: Spark Issue Type: Bug Components: MLlib Affects Versions: 1.1.0 Environment: Use Word2Vec to process a corpus(sized 3.5G) with one partition. The corpus contains about 300 million words and its vocabulary size is about 10 million. Reporter: Joseph Tang Priority: Critical Exception in thread Driver java.lang.reflect.InvocationTargetException at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.deploy.yarn.ApplicationMaster$$anon$2.run(ApplicationMaster.scala:162) Caused by: java.lang.OutOfMemoryError: Requested array size exceeds VM limit at java.util.Arrays.copyOf(Arrays.java:2271) at java.io.ByteArrayOutputStream.grow(ByteArrayOutputStream.java:113) at java.io.ByteArrayOutputStream.ensureCapacity(ByteArrayOutputStream.java:93) at java.io.ByteArrayOutputStream.write(ByteArrayOutputStream.java:140) at java.io.ObjectOutputStream$BlockDataOutputStream.drain(ObjectOutputStream.java:1870) at java.io.ObjectOutputStream$BlockDataOutputStream.setBlockDataMode(ObjectOutputStream.java:1779) at java.io.ObjectOutputStream.writeObject0(ObjectOutputStream.java:1186) at java.io.ObjectOutputStream.writeObject(ObjectOutputStream.java:347) at org.apache.spark.serializer.JavaSerializationStream.writeObject(JavaSerializer.scala:42) at org.apache.spark.serializer.JavaSerializerInstance.serialize(JavaSerializer.scala:73) at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:164) at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:158) at org.apache.spark.SparkContext.clean(SparkContext.scala:1242) at org.apache.spark.rdd.RDD.mapPartitionsWithIndex(RDD.scala:610) at org.apache.spark.mllib.feature.Word2Vec$$anonfun$fit$1.apply$mcVI$sp(Word2Vec.scala:291) at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:141) at org.apache.spark.mllib.feature.Word2Vec.fit(Word2Vec.scala:290) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
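A rough sketch of step 2 of the comment above, under stated assumptions: the corpus is an RDD[Seq[String]] of tokenized sentences and maxVocabSize is a caller-chosen budget; neither appears in the issue itself.
{code}
// Sketch only: one extra pass to build a word-frequency histogram and derive a
// minCount that keeps the vocabulary roughly within a size budget.
import org.apache.spark.SparkContext._
import org.apache.spark.rdd.RDD

object ChooseMinCountSketch {
  def chooseMinCount(corpus: RDD[Seq[String]], maxVocabSize: Long): Int = {
    // For each frequency value, how many distinct words occur exactly that often.
    val freqHistogram = corpus
      .flatMap(identity)
      .map(word => (word, 1L))
      .reduceByKey(_ + _)
      .map { case (_, freq) => (freq, 1L) }
      .reduceByKey(_ + _)
      .collect()
      .sortBy(-_._1) // most frequent buckets first

    // Walk down from the most frequent words until the budget is spent; the
    // frequency where we stop becomes minCount (intentionally approximate).
    var kept = 0L
    var minCount = 1L
    for ((freq, numWords) <- freqHistogram if kept < maxVocabSize) {
      kept += numWords
      minCount = freq
    }
    minCount.toInt
  }
}
{code}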
[jira] [Created] (SPARK-4864) Add documentation to Netty-based configs
Aaron Davidson created SPARK-4864: - Summary: Add documentation to Netty-based configs Key: SPARK-4864 URL: https://issues.apache.org/jira/browse/SPARK-4864 Project: Spark Issue Type: Bug Components: Documentation Reporter: Aaron Davidson Assignee: Aaron Davidson Currently there is no public documentation for the NettyBlockTransferService or various configuration options of the network package. We should add some. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4864) Add documentation to Netty-based configs
[ https://issues.apache.org/jira/browse/SPARK-4864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14248820#comment-14248820 ] Apache Spark commented on SPARK-4864: - User 'aarondav' has created a pull request for this issue: https://github.com/apache/spark/pull/3713 Add documentation to Netty-based configs Key: SPARK-4864 URL: https://issues.apache.org/jira/browse/SPARK-4864 Project: Spark Issue Type: Bug Components: Documentation Reporter: Aaron Davidson Assignee: Aaron Davidson Currently there is no public documentation for the NettyBlockTransferService or various configuration options of the network package. We should add some. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-3405) EC2 cluster creation on VPC
[ https://issues.apache.org/jira/browse/SPARK-3405?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-3405. --- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 2872 [https://github.com/apache/spark/pull/2872] EC2 cluster creation on VPC --- Key: SPARK-3405 URL: https://issues.apache.org/jira/browse/SPARK-3405 Project: Spark Issue Type: New Feature Components: EC2 Affects Versions: 1.0.2 Environment: Ubuntu 12.04 Reporter: Dawson Reid Priority: Minor Fix For: 1.3.0 It would be very useful to be able to specify the EC2 VPC in which the Spark cluster should be created. When creating a Spark cluster on AWS via the spark-ec2 script there is no way to specify a VPC id of the VPC you would like the cluster to be created in. The script always creates the cluster in the default VPC. In my case I have deleted the default VPC and the spark-ec2 script errors out with the following : Setting up security groups... Creating security group test-master ERROR:boto:400 Bad Request ERROR:boto:?xml version=1.0 encoding=UTF-8? ResponseErrorsErrorCodeVPCIdNotSpecified/CodeMessageNo default VPC for this user/Message/Error/ErrorsRequestID312a2281-81a1-4d3c-ba10-0593a886779d/RequestID/Response Traceback (most recent call last): File ./spark_ec2.py, line 860, in module main() File ./spark_ec2.py, line 852, in main real_main() File ./spark_ec2.py, line 735, in real_main conn, opts, cluster_name) File ./spark_ec2.py, line 247, in launch_cluster master_group = get_or_make_group(conn, cluster_name + -master) File ./spark_ec2.py, line 143, in get_or_make_group return conn.create_security_group(name, Spark EC2 group) File /home/dawson/Develop/spark-1.0.2/ec2/third_party/boto-2.4.1.zip/boto-2.4.1/boto/ec2/connection.py, line 2011, in create_security_group File /home/dawson/Develop/spark-1.0.2/ec2/third_party/boto-2.4.1.zip/boto-2.4.1/boto/connection.py, line 925, in get_object boto.exception.EC2ResponseError: EC2ResponseError: 400 Bad Request ?xml version=1.0 encoding=UTF-8? ResponseErrorsErrorCodeVPCIdNotSpecified/CodeMessageNo default VPC for this user/Message/Error/ErrorsRequestID312a2281-81a1-4d3c-ba10-0593a886779d/RequestID/Response -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2611) VPC Issue while creating an ec2 cluster
[ https://issues.apache.org/jira/browse/SPARK-2611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14248836#comment-14248836 ] Josh Rosen commented on SPARK-2611: --- For the folks watching this issue: SPARK-3405 has now been merged into {{master}}! VPC Issue while creating an ec2 cluster --- Key: SPARK-2611 URL: https://issues.apache.org/jira/browse/SPARK-2611 Project: Spark Issue Type: Bug Components: EC2, PySpark Affects Versions: 1.0.0 Environment: Debian wheezy Reporter: Anass BENSRHIR I'm having a critical issue while creating an ec2 cluster, here is the input command and output I got:
./spark-ec2 -i ~/amazonhdp.pem -k amazonhdp -s 4 -t m1.small launch hadoopi
Setting up security groups...
Creating security group hadoopi-master
Creating security group hadoopi-slaves
ERROR:boto:400 Bad Request
ERROR:boto:<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>InvalidGroup.NotFound</Code><Message>The security group 'sg-2ec1b84b' does not exist</Message></Error></Errors><RequestID>6554c7a8-f68a-4032-ad63-65106e2de9b3</RequestID></Response>
Traceback (most recent call last):
  File "./spark_ec2.py", line 856, in <module>
    main()
  File "./spark_ec2.py", line 848, in main
    real_main()
  File "./spark_ec2.py", line 731, in real_main
    conn, opts, cluster_name)
  File "./spark_ec2.py", line 252, in launch_cluster
    master_group.authorize('tcp', 50030, 50030, '0.0.0.0/0')
  File "/opt/spark-1.0.1-bin-hadoop2/ec2/third_party/boto-2.4.1.zip/boto-2.4.1/boto/ec2/securitygroup.py", line 184, in authorize
  File "/opt/spark-1.0.1-bin-hadoop2/ec2/third_party/boto-2.4.1.zip/boto-2.4.1/boto/ec2/connection.py", line 2181, in authorize_security_group
  File "/opt/spark-1.0.1-bin-hadoop2/ec2/third_party/boto-2.4.1.zip/boto-2.4.1/boto/connection.py", line 944, in get_status
boto.exception.EC2ResponseError: EC2ResponseError: 400 Bad Request
<?xml version="1.0" encoding="UTF-8"?>
<Response><Errors><Error><Code>InvalidGroup.NotFound</Code><Message>The security group 'sg-2ec1b84b' does not exist</Message></Error></Errors><RequestID>6554c7a8-f68a-4032-ad63-65106e2de9b3</RequestID></Response>
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-4865) rdds exposed to sql context via registerTempTable are not listed via thrift jdbc show tables
Misha Chernetsov created SPARK-4865: --- Summary: rdds exposed to sql context via registerTempTable are not listed via thrift jdbc show tables Key: SPARK-4865 URL: https://issues.apache.org/jira/browse/SPARK-4865 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.2.0 Reporter: Misha Chernetsov -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
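A minimal sketch of the reported scenario, under assumptions (the exact way the Thrift server shares the context is not described in the ticket): a temp table registered in the driver's context is expected to show up in SHOW TABLES issued from a JDBC client, but reportedly does not.
{code}
import org.apache.spark.sql.hive.HiveContext

case class Person(name: String, age: Int)

val hiveContext = new HiveContext(sc) // sc: existing SparkContext, e.g. in spark-shell
import hiveContext._

// Register an RDD as a temporary table in this context.
sc.parallelize(Seq(Person("a", 1), Person("b", 2))).registerTempTable("people")

// Expectation in the ticket: a beeline/JDBC session against a Thrift server backed by
// the same context would list "people" when running SHOW TABLES, but it does not.
{code}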
[jira] [Resolved] (SPARK-4700) Add Http support to Spark Thrift server
[ https://issues.apache.org/jira/browse/SPARK-4700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4700. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 3672 [https://github.com/apache/spark/pull/3672] Add Http support to Spark Thrift server --- Key: SPARK-4700 URL: https://issues.apache.org/jira/browse/SPARK-4700 Project: Spark Issue Type: New Feature Components: SQL Affects Versions: 1.3.0, 1.2.1 Environment: Linux and Windows Reporter: Judy Nash Fix For: 1.3.0 Original Estimate: 48h Remaining Estimate: 48h Currently thrift only supports TCP connection. The JIRA is to add HTTP support to spark thrift server in addition to the TCP protocol. Both TCP and HTTP are supported by Hive today. HTTP is more secure and used often in Windows. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-1148) Suggestions for exception handling (avoid potential bugs)
[ https://issues.apache.org/jira/browse/SPARK-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14248941#comment-14248941 ] Sean Owen commented on SPARK-1148: -- [~d.yuan] I don't know that these have been addressed; some issues like it have been, if I recall correctly. Whatever the status, I think it's useful to focus on master and/or later branches. Would it be OK to close this in favor of SPARK-4863? Suggestions for exception handling (avoid potential bugs) - Key: SPARK-1148 URL: https://issues.apache.org/jira/browse/SPARK-1148 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 0.9.0 Reporter: Ding Yuan Hi Spark developers, We are a group of researchers on software reliability. Recently we did a study and found that majority of the most severe failures in data-analytic systems are caused by bugs in exception handlers – that it is hard to anticipate all the possible real-world error scenarios. Therefore we built a simple checking tool that automatically detects some bug patterns that have caused some very severe real-world failures. I am reporting a few cases here. Any feedback is much appreciated! Ding = Case 1: Line: 1249, File: org/apache/spark/SparkContext.scala {noformat} 1244: val scheduler = try { 1245: val clazz = Class.forName(org.apache.spark.scheduler.cluster.YarnClusterScheduler) 1246: val cons = clazz.getConstructor(classOf[SparkContext]) 1247: cons.newInstance(sc).asInstanceOf[TaskSchedulerImpl] 1248: } catch { 1249: // TODO: Enumerate the exact reasons why it can fail 1250: // But irrespective of it, it means we cannot proceed ! 1251: case th: Throwable = { 1252: throw new SparkException(YARN mode not available ?, th) 1253: } {noformat} The comment suggests the specific exceptions should be enumerated here. The try block could throw the following exceptions: ClassNotFoundException NegativeArraySizeException NoSuchMethodException SecurityException InstantiationException IllegalAccessException IllegalArgumentException InvocationTargetException ClassCastException == = Case 2: Line: 282, File: org/apache/spark/executor/Executor.scala {noformat} 265: case t: Throwable = { 266: val serviceTime = (System.currentTimeMillis() - taskStart).toInt 267: val metrics = attemptedTask.flatMap(t = t.metrics) 268: for (m - metrics) { 269: m.executorRunTime = serviceTime 270: m.jvmGCTime = gcTime - startGCTime 271: } 272: val reason = ExceptionFailure(t.getClass.getName, t.toString, t.getStackTrace, metrics) 273: execBackend.statusUpdate(taskId, TaskState.FAILED, ser.serialize(reason)) 274: 275: // TODO: Should we exit the whole executor here? On the one hand, the failed task may 276: // have left some weird state around depending on when the exception was thrown, but on 277: // the other hand, maybe we could detect that when future tasks fail and exit then. 278: logError(Exception in task ID + taskId, t) 279: //System.exit(1) 280: } 281: } finally { 282: // TODO: Unregister shuffle memory only for ResultTask 283: val shuffleMemoryMap = env.shuffleMemoryMap 284: shuffleMemoryMap.synchronized { 285: shuffleMemoryMap.remove(Thread.currentThread().getId) 286: } 287: runningTasks.remove(taskId) 288: } {noformat} From the comment in this Throwable exception handler it seems to suggest that the system should just exit? 
== = Case 3: Line: 70, File: org/apache/spark/network/netty/FileServerHandler.java {noformat} 66: try { 67: ctx.write(new DefaultFileRegion(new FileInputStream(file) 68: .getChannel(), fileSegment.offset(), fileSegment.length())); 69: } catch (Exception e) { 70: LOG.error(Exception: , e); 71: } {noformat} Exception is too general. The try block only throws FileNotFoundException. Although there is nothing wrong with it now, but later if code evolves this might cause some other exceptions to be swallowed. == -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
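For Case 1, the suggestion amounts to replacing the blanket {{Throwable}} handler with the specific reflection-related exceptions the block can throw. A hedged sketch of that shape (mirroring the quoted snippet, not the actual SparkContext code, and assuming it lives inside Spark where SparkContext, TaskSchedulerImpl, SparkException and sc are in scope):
{code}
import java.lang.reflect.InvocationTargetException

val scheduler =
  try {
    val clazz = Class.forName("org.apache.spark.scheduler.cluster.YarnClusterScheduler")
    val cons = clazz.getConstructor(classOf[SparkContext])
    cons.newInstance(sc).asInstanceOf[TaskSchedulerImpl]
  } catch {
    // Enumerate the reflective failures instead of swallowing every Throwable.
    case e @ (_: ClassNotFoundException |
              _: NoSuchMethodException |
              _: InstantiationException |
              _: IllegalAccessException |
              _: InvocationTargetException |
              _: ClassCastException) =>
      throw new SparkException("YARN mode not available ?", e)
  }
{code}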
[jira] [Created] (SPARK-4866) Support StructType as key in MapType
Davies Liu created SPARK-4866: - Summary: Support StructType as key in MapType Key: SPARK-4866 URL: https://issues.apache.org/jira/browse/SPARK-4866 Project: Spark Issue Type: Bug Components: PySpark, SQL Reporter: Davies Liu http://apache-spark-user-list.1001560.n3.nabble.com/Error-when-Applying-schema-to-a-dictionary-with-a-Tuple-as-key-td20716.html Hi Guys, Im running a spark cluster in AWS with Spark 1.1.0 in EC2 I am trying to convert a an RDD with tuple (u'string', int , {(int, int): int, (int, int): int}) to a schema rdd using the schema: {code} fields = [StructField('field1',StringType(),True), StructField('field2',IntegerType(),True), StructField('field3',MapType(StructType([StructField('field31',IntegerType(),True), StructField('field32',IntegerType(),True)]),IntegerType(),True),True) ] schema = StructType(fields) # generate the schemaRDD with the defined schema schemaRDD = sqc.applySchema(RDD, schema) {code} But when I add field3 to the schema, it throws an execption: {code} Traceback (most recent call last): File stdin, line 1, in module File /root/spark/python/pyspark/rdd.py, line 1153, in take res = self.context.runJob(self, takeUpToNumLeft, p, True) File /root/spark/python/pyspark/context.py, line 770, in runJob it = self._jvm.PythonRDD.runJob(self._jsc.sc(), mappedRDD._jrdd, javaPartitions, allowLocal) File /root/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py, line 538, in __call__ File /root/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py, line 300, in get_return_value py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob. : org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 28.0 failed 4 times, most recent failure: Lost task 0.3 in stage 28.0 (TID 710, ip-172-31-29-120.ec2.internal): net.razorvine.pickle.PickleException: couldn't introspect javabean: java.lang.IllegalArgumentException: wrong number of arguments net.razorvine.pickle.Pickler.put_javabean(Pickler.java:603) net.razorvine.pickle.Pickler.dispatch(Pickler.java:299) net.razorvine.pickle.Pickler.save(Pickler.java:125) net.razorvine.pickle.Pickler.put_map(Pickler.java:321) net.razorvine.pickle.Pickler.dispatch(Pickler.java:286) net.razorvine.pickle.Pickler.save(Pickler.java:125) net.razorvine.pickle.Pickler.put_arrayOfObjects(Pickler.java:412) net.razorvine.pickle.Pickler.dispatch(Pickler.java:195) net.razorvine.pickle.Pickler.save(Pickler.java:125) net.razorvine.pickle.Pickler.put_arrayOfObjects(Pickler.java:412) net.razorvine.pickle.Pickler.dispatch(Pickler.java:195) net.razorvine.pickle.Pickler.save(Pickler.java:125) net.razorvine.pickle.Pickler.dump(Pickler.java:95) net.razorvine.pickle.Pickler.dumps(Pickler.java:80) org.apache.spark.sql.SchemaRDD$$anonfun$javaToPython$1$$anonfun$apply$2.apply(SchemaRDD.scala:417) org.apache.spark.sql.SchemaRDD$$anonfun$javaToPython$1$$anonfun$apply$2.apply(SchemaRDD.scala:417) scala.collection.Iterator$$anon$11.next(Iterator.scala:328) org.apache.spark.api.python.PythonRDD$.writeIteratorToStream(PythonRDD.scala:331) org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$1.apply$mcV$sp(PythonRDD.scala:209) org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$1.apply(PythonRDD.scala:184) org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$1.apply(PythonRDD.scala:184) org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1311) org.apache.spark.api.python.PythonRDD$WriterThread.run(PythonRDD.scala:183) Driver stacktrace: at 
org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1173) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688) at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:688) at scala.Option.foreach(Option.scala:236) at
[jira] [Resolved] (SPARK-1148) Suggestions for exception handling (avoid potential bugs)
[ https://issues.apache.org/jira/browse/SPARK-1148?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Sean Owen resolved SPARK-1148. -- Resolution: Duplicate Suggestions for exception handling (avoid potential bugs) - Key: SPARK-1148 URL: https://issues.apache.org/jira/browse/SPARK-1148 Project: Spark Issue Type: Improvement Components: Spark Core Affects Versions: 0.9.0 Reporter: Ding Yuan Hi Spark developers, We are a group of researchers on software reliability. Recently we did a study and found that majority of the most severe failures in data-analytic systems are caused by bugs in exception handlers – that it is hard to anticipate all the possible real-world error scenarios. Therefore we built a simple checking tool that automatically detects some bug patterns that have caused some very severe real-world failures. I am reporting a few cases here. Any feedback is much appreciated! Ding = Case 1: Line: 1249, File: org/apache/spark/SparkContext.scala {noformat} 1244: val scheduler = try { 1245: val clazz = Class.forName(org.apache.spark.scheduler.cluster.YarnClusterScheduler) 1246: val cons = clazz.getConstructor(classOf[SparkContext]) 1247: cons.newInstance(sc).asInstanceOf[TaskSchedulerImpl] 1248: } catch { 1249: // TODO: Enumerate the exact reasons why it can fail 1250: // But irrespective of it, it means we cannot proceed ! 1251: case th: Throwable = { 1252: throw new SparkException(YARN mode not available ?, th) 1253: } {noformat} The comment suggests the specific exceptions should be enumerated here. The try block could throw the following exceptions: ClassNotFoundException NegativeArraySizeException NoSuchMethodException SecurityException InstantiationException IllegalAccessException IllegalArgumentException InvocationTargetException ClassCastException == = Case 2: Line: 282, File: org/apache/spark/executor/Executor.scala {noformat} 265: case t: Throwable = { 266: val serviceTime = (System.currentTimeMillis() - taskStart).toInt 267: val metrics = attemptedTask.flatMap(t = t.metrics) 268: for (m - metrics) { 269: m.executorRunTime = serviceTime 270: m.jvmGCTime = gcTime - startGCTime 271: } 272: val reason = ExceptionFailure(t.getClass.getName, t.toString, t.getStackTrace, metrics) 273: execBackend.statusUpdate(taskId, TaskState.FAILED, ser.serialize(reason)) 274: 275: // TODO: Should we exit the whole executor here? On the one hand, the failed task may 276: // have left some weird state around depending on when the exception was thrown, but on 277: // the other hand, maybe we could detect that when future tasks fail and exit then. 278: logError(Exception in task ID + taskId, t) 279: //System.exit(1) 280: } 281: } finally { 282: // TODO: Unregister shuffle memory only for ResultTask 283: val shuffleMemoryMap = env.shuffleMemoryMap 284: shuffleMemoryMap.synchronized { 285: shuffleMemoryMap.remove(Thread.currentThread().getId) 286: } 287: runningTasks.remove(taskId) 288: } {noformat} From the comment in this Throwable exception handler it seems to suggest that the system should just exit? == = Case 3: Line: 70, File: org/apache/spark/network/netty/FileServerHandler.java {noformat} 66: try { 67: ctx.write(new DefaultFileRegion(new FileInputStream(file) 68: .getChannel(), fileSegment.offset(), fileSegment.length())); 69: } catch (Exception e) { 70: LOG.error(Exception: , e); 71: } {noformat} Exception is too general. The try block only throws FileNotFoundException. 
Although there is nothing wrong with it now, but later if code evolves this might cause some other exceptions to be swallowed. == -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-4847) extraStrategies cannot take effect in SQLContext
[ https://issues.apache.org/jira/browse/SPARK-4847?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4847. - Resolution: Fixed Fix Version/s: 1.2.1 Issue resolved by pull request 3698 [https://github.com/apache/spark/pull/3698] extraStrategies cannot take effect in SQLContext Key: SPARK-4847 URL: https://issues.apache.org/jira/browse/SPARK-4847 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.2.0 Reporter: Saisai Shao Fix For: 1.2.1 Because strategies is initialized when SparkPlanner is created, extraStrategies added later cannot be added into strategies. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
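The failure mode is an initialization-order problem: the planner captures the strategies sequence eagerly when it is constructed, so mutations to extraStrategies made afterwards are invisible. A simplified, self-contained illustration (assumed names, not the actual SQLContext/SparkPlanner code):
{code}
// Captures `extra` eagerly at construction time.
class Planner(extra: Seq[String]) {
  val strategies: Seq[String] = extra ++ Seq("default1", "default2")
}

class Context {
  var extraStrategies: Seq[String] = Nil
  val planner = new Planner(extraStrategies) // extraStrategies is still Nil here
}

val ctx = new Context
ctx.extraStrategies = Seq("MyStrategy")
println(ctx.planner.strategies) // List(default1, default2) -- "MyStrategy" never takes effect
{code}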
[jira] [Resolved] (SPARK-4812) SparkPlan.codegenEnabled may be initialized to a wrong value
[ https://issues.apache.org/jira/browse/SPARK-4812?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4812. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 3660 [https://github.com/apache/spark/pull/3660] SparkPlan.codegenEnabled may be initialized to a wrong value Key: SPARK-4812 URL: https://issues.apache.org/jira/browse/SPARK-4812 Project: Spark Issue Type: Bug Components: SQL Reporter: Shixiong Zhu Assignee: Shixiong Zhu Fix For: 1.3.0 The problem is that `codegenEnabled` is a `val`, but it uses a `val` `sqlContext`, which can be overridden by subclasses. Here is a simple example to show this issue.
{code}
scala> :paste
// Entering paste mode (ctrl-D to finish)

abstract class Foo {
  protected val sqlContext = "Foo"

  val codegenEnabled: Boolean = {
    println(sqlContext) // it will call the subclass's `sqlContext`, which has not yet been initialized.
    if (sqlContext != null) {
      true
    } else {
      false
    }
  }
}

class Bar extends Foo {
  override val sqlContext = "Bar"
}

println(new Bar().codegenEnabled)

// Exiting paste mode, now interpreting.

null
false
defined class Foo
defined class Bar

scala>
{code}
We should make `sqlContext` `final` to prevent subclasses from overriding it incorrectly. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
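A small sketch of the suggested direction: marking the member {{final}} rules out the override that triggers the surprise above (illustrative code, not the actual patch):
{code}
abstract class Foo2 {
  protected final val sqlContext = "Foo"

  // Always sees an initialized value now, since no subclass can override it.
  val codegenEnabled: Boolean = sqlContext != null
}

class Bar2 extends Foo2 {
  // override val sqlContext = "Bar"  // would no longer compile
}

println((new Bar2).codegenEnabled) // true
{code}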
[jira] [Resolved] (SPARK-4767) Add support for launching in a specified placement group to spark ec2 scripts.
[ https://issues.apache.org/jira/browse/SPARK-4767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-4767. --- Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 3623 [https://github.com/apache/spark/pull/3623] Add support for launching in a specified placement group to spark ec2 scripts. -- Key: SPARK-4767 URL: https://issues.apache.org/jira/browse/SPARK-4767 Project: Spark Issue Type: Improvement Components: EC2 Reporter: holdenk Priority: Trivial Fix For: 1.3.0 Original Estimate: 0.5h Remaining Estimate: 0.5h The Spark EC2 scripts don't currently allow users to specify a placement group. We should add this. EC2 placement groups are described in http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/placement-groups.html This should not require updating the mesos/spark repo since its self contained in spark_ec2.py. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4767) Add support for launching in a specified placement group to spark ec2 scripts.
[ https://issues.apache.org/jira/browse/SPARK-4767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-4767: -- Assignee: Holden Karau Add support for launching in a specified placement group to spark ec2 scripts. -- Key: SPARK-4767 URL: https://issues.apache.org/jira/browse/SPARK-4767 Project: Spark Issue Type: Improvement Components: EC2 Reporter: holdenk Assignee: Holden Karau Priority: Trivial Fix For: 1.3.0 Original Estimate: 0.5h Remaining Estimate: 0.5h The Spark EC2 scripts don't currently allow users to specify a placement group. We should add this. EC2 placement groups are described in http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/placement-groups.html This should not require updating the mesos/spark repo since its self contained in spark_ec2.py. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Closed] (SPARK-4815) ThriftServer use only one SessionState to run sql using hive
[ https://issues.apache.org/jira/browse/SPARK-4815?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust closed SPARK-4815. --- Resolution: Duplicate ThriftServer use only one SessionState to run sql using hive - Key: SPARK-4815 URL: https://issues.apache.org/jira/browse/SPARK-4815 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.3.0 Reporter: guowei The ThriftServer uses only one SessionState to run SQL through Hive, even though requests come from different Hive sessions. This causes mistakes: for example, when one user runs `use database` in one beeline client, the current database changes in the other beeline clients too. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-4527) Add BroadcastNestedLoopJoin operator selection testsuite
[ https://issues.apache.org/jira/browse/SPARK-4527?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4527. - Resolution: Fixed Fix Version/s: (was: 1.1.0) 1.3.0 Issue resolved by pull request 3395 [https://github.com/apache/spark/pull/3395] Add BroadcastNestedLoopJoin operator selection testsuite Key: SPARK-4527 URL: https://issues.apache.org/jira/browse/SPARK-4527 Project: Spark Issue Type: Test Components: SQL Affects Versions: 1.1.0 Reporter: XiaoJing wang Priority: Minor Fix For: 1.3.0 Original Estimate: 0.05h Remaining Estimate: 0.05h In `JoinSuite` add `BroadcastNestedLoopJoin` operator selection -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4837) NettyBlockTransferService does not abide by spark.blockManager.port config option
[ https://issues.apache.org/jira/browse/SPARK-4837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin updated SPARK-4837: --- Target Version/s: 1.3.0, 1.2.1 (was: 1.2.1) NettyBlockTransferService does not abide by spark.blockManager.port config option - Key: SPARK-4837 URL: https://issues.apache.org/jira/browse/SPARK-4837 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.2.0 Reporter: Aaron Davidson Assignee: Aaron Davidson Priority: Blocker The NettyBlockTransferService always binds to a random port, and does not use the spark.blockManager.port config as specified. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-4867) UDF clean up
Michael Armbrust created SPARK-4867: --- Summary: UDF clean up Key: SPARK-4867 URL: https://issues.apache.org/jira/browse/SPARK-4867 Project: Spark Issue Type: Bug Components: SQL Reporter: Michael Armbrust Priority: Blocker Right now our support for and internal implementation of many functions have a few issues. Specifically:
- UDFs don't know their input types and thus don't do type coercion.
- We hard-code a bunch of built-in functions into the parser. This is bad because in SQL it creates new reserved words for things that aren't actually keywords. It also means that for each function we need to add support to both SQLContext and HiveContext separately.
For this JIRA I propose we do the following:
- Change the interfaces for registerFunction and ScalaUdf to include types for the input arguments as well as the output type.
- Add a rule to analysis that does type coercion for UDFs.
- Add a parse rule for functions to SQLParser.
- Rewrite all the UDFs that are currently hacked into the various parsers using this new functionality.
Depending on how big this refactoring becomes, we could split parts 1 and 2 from part 3 above. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
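As a rough illustration of the first proposed change, a typed registration could carry the argument and return types alongside the function so the analyzer can insert coercions. All names and signatures below are hypothetical sketches of the proposal, not Spark's actual API:
{code}
// Stand-in type tags; the real implementation would use Spark SQL's DataType hierarchy.
sealed trait SqlType
case object StringType extends SqlType
case object IntegerType extends SqlType

// A UDF that knows its input types (enabling coercion) and its output type.
case class TypedScalaUdf(name: String,
                         inputTypes: Seq[SqlType],
                         returnType: SqlType,
                         func: AnyRef)

def registerFunction(name: String,
                     inputTypes: Seq[SqlType],
                     returnType: SqlType,
                     func: AnyRef): TypedScalaUdf =
  TypedScalaUdf(name, inputTypes, returnType, func)

// e.g. a string-length UDF whose argument the analyzer could coerce to StringType
val strlen = registerFunction("strlen", Seq(StringType), IntegerType, (s: String) => s.length)
{code}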
[jira] [Resolved] (SPARK-4483) Optimization about reduce memory costs during the HashOuterJoin
[ https://issues.apache.org/jira/browse/SPARK-4483?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4483. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 3375 [https://github.com/apache/spark/pull/3375] Optimization about reduce memory costs during the HashOuterJoin --- Key: SPARK-4483 URL: https://issues.apache.org/jira/browse/SPARK-4483 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 1.1.0 Reporter: Yi Tian Priority: Minor Fix For: 1.3.0 In {{HashOuterJoin.scala}}, Spark reads data from both sides of the join operation before zipping them together, which wastes memory. We are trying to read data from only one side, put it into a hash map, and then generate the {{JoinedRow}}s with data from the other side one row at a time. Currently, we can only do this optimization for {{left outer join}} and {{right outer join}}. For {{full outer join}}, we will do something in another issue. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
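Conceptually, for a left outer join the optimization hashes only the right (build) side and streams the left (probe) side, emitting joined rows one at a time instead of buffering both inputs. A hedged sketch using plain Scala collections in place of Spark's internal row iterators:
{code}
def leftOuterJoin[K, L, R](left: Iterator[(K, L)],
                           right: Iterator[(K, R)]): Iterator[(K, (L, Option[R]))] = {
  // Build phase: materialize only one side into a hash map.
  val buildSide: Map[K, Seq[R]] =
    right.toSeq.groupBy(_._1).mapValues(_.map(_._2)).toMap

  // Probe phase: stream the other side and generate joined rows one by one.
  left.flatMap { case (k, l) =>
    buildSide.get(k) match {
      case Some(rs) => rs.iterator.map(r => (k, (l, Option(r))))
      case None     => Iterator((k, (l, Option.empty[R])))
    }
  }
}

// leftOuterJoin(Iterator(1 -> "a", 2 -> "b"), Iterator(1 -> "x")).foreach(println)
// (1,(a,Some(x)))
// (2,(b,None))
{code}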
[jira] [Resolved] (SPARK-4827) Max iterations (100) reached for batch Resolution with deeply nested projects and project *s
[ https://issues.apache.org/jira/browse/SPARK-4827?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4827. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 3674 [https://github.com/apache/spark/pull/3674] Max iterations (100) reached for batch Resolution with deeply nested projects and project *s Key: SPARK-4827 URL: https://issues.apache.org/jira/browse/SPARK-4827 Project: Spark Issue Type: Bug Components: SQL Reporter: Michael Armbrust Assignee: Michael Armbrust Fix For: 1.3.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-4269) Make wait time in BroadcastHashJoin configurable
[ https://issues.apache.org/jira/browse/SPARK-4269?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4269. - Resolution: Fixed Fix Version/s: (was: 1.2.0) 1.3.0 Issue resolved by pull request 3133 [https://github.com/apache/spark/pull/3133] Make wait time in BroadcastHashJoin configurable Key: SPARK-4269 URL: https://issues.apache.org/jira/browse/SPARK-4269 Project: Spark Issue Type: Bug Components: SQL Affects Versions: 1.1.0 Reporter: Jacky Li Fix For: 1.3.0 In BroadcastHashJoin, currently it is using a hard coded value (5 minutes) to wait for the execution and broadcast of the small table. In my opinion, it should be a configurable value since broadcast may exceed 5 minutes in some case, like in a busy/congested network environment. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
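The change would amount to reading the wait time from configuration instead of hard-coding five minutes. A hedged sketch of the idea; the key name and the surrounding names (sqlContext, broadcastFuture) are assumptions about the operator's internals, not the merged code:
{code}
import scala.concurrent.Await
import scala.concurrent.duration._

// Default stays at 5 minutes, but users can override it, e.g. on congested networks.
val timeoutSeconds = sqlContext.sparkContext.getConf
  .getInt("spark.sql.broadcastTimeout", 5 * 60) // assumed key name

// broadcastFuture: the existing future that broadcasts the small table.
val broadcastRelation = Await.result(broadcastFuture, timeoutSeconds.seconds)
{code}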
[jira] [Commented] (SPARK-4777) Some block memory after unrollSafely not count into used memory(memoryStore.entrys or unrollMemory)
[ https://issues.apache.org/jira/browse/SPARK-4777?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249330#comment-14249330 ] SuYan commented on SPARK-4777: -- Sean Owen, hi. I intended to close that patch, but after discussing with the SPARK-3000 author, he says the current SPARK-3000 work has not been merged into Spark yet, so the problem is still present in the current code; let's keep my patch open for now. If there is no need to keep it open, tell me and I will close it, thanks. Some block memory after unrollSafely not count into used memory(memoryStore.entrys or unrollMemory) --- Key: SPARK-4777 URL: https://issues.apache.org/jira/browse/SPARK-4777 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.0 Reporter: SuYan Priority: Minor Some memory is not counted into the memory used by memoryStore or unrollMemory. After thread A unrolls a block safely, it releases 40MB of unrollMemory (which can then be used by other threads). Thread A then waits to acquire accountingLock in order to tryToPut blockA (30MB). Before thread A acquires accountingLock, blockA's memory size is not counted into unrollMemory or memoryStore.currentMemory. IIUC, freeMemory should subtract that block's memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-3955) Different versions between jackson-mapper-asl and jackson-core-asl
[ https://issues.apache.org/jira/browse/SPARK-3955?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249369#comment-14249369 ] Apache Spark commented on SPARK-3955: - User 'jongyoul' has created a pull request for this issue: https://github.com/apache/spark/pull/3716 Different versions between jackson-mapper-asl and jackson-core-asl -- Key: SPARK-3955 URL: https://issues.apache.org/jira/browse/SPARK-3955 Project: Spark Issue Type: Bug Components: Build, Spark Core, SQL Affects Versions: 1.1.0 Reporter: Jongyoul Lee The parent pom.xml specifies a version of jackson-mapper-asl, which is used by sql/hive/pom.xml. When mvn assembly runs, however, the jackson-mapper-asl version does not match the jackson-core-asl version. This is because other libraries use several versions of jackson, so a different version of jackson-core-asl ends up in the assembly. The problem can simply be fixed by giving pom.xml specific version information for jackson-core-asl as well. If it's not set, version 1.9.11 is merged into assembly.jar and we cannot use the jackson library properly.
{code}
[INFO] Including org.codehaus.jackson:jackson-mapper-asl:jar:1.8.8 in the shaded jar.
[INFO] Including org.codehaus.jackson:jackson-core-asl:jar:1.9.11 in the shaded jar.
{code}
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4105) FAILED_TO_UNCOMPRESS(5) errors when fetching shuffle data with sort-based shuffle
[ https://issues.apache.org/jira/browse/SPARK-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249381#comment-14249381 ] Victor Tso commented on SPARK-4105: --- Similarly, I hit this in 1.1.1: Job aborted due to stage failure: Task 0 in stage 3919.0 failed 4 times, most recent failure: Lost task 0.3 in stage 3919.0 (TID 2467, preprod-e1-sw22s.paxata.com): java.io.IOException: failed to uncompress the chunk: FAILED_TO_UNCOMPRESS(5) org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:362) org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:384) java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2293) java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2586) java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2596) java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1318) java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706) java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344) java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62) org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:133) org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71) org.apache.spark.storage.BlockManager$LazyProxyIterator$1.hasNext(BlockManager.scala:1171) scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371) org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:30) org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39) org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:144) org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:58) org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:48) org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:92) org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) org.apache.spark.rdd.RDD.iterator(RDD.scala:229) FAILED_TO_UNCOMPRESS(5) errors when fetching shuffle data with sort-based shuffle - Key: SPARK-4105 URL: https://issues.apache.org/jira/browse/SPARK-4105 Project: Spark Issue Type: Bug Components: Shuffle, Spark Core Affects Versions: 1.2.0 Reporter: Josh Rosen Assignee: Josh Rosen Priority: Blocker We have seen non-deterministic {{FAILED_TO_UNCOMPRESS(5)}} errors during shuffle read. 
Here's a sample stacktrace from an executor: {code} 14/10/23 18:34:11 ERROR Executor: Exception in task 1747.3 in stage 11.0 (TID 33053) java.io.IOException: FAILED_TO_UNCOMPRESS(5) at org.xerial.snappy.SnappyNative.throw_error(SnappyNative.java:78) at org.xerial.snappy.SnappyNative.rawUncompress(Native Method) at org.xerial.snappy.Snappy.rawUncompress(Snappy.java:391) at org.xerial.snappy.Snappy.uncompress(Snappy.java:427) at org.xerial.snappy.SnappyInputStream.readFully(SnappyInputStream.java:127) at org.xerial.snappy.SnappyInputStream.readHeader(SnappyInputStream.java:88) at org.xerial.snappy.SnappyInputStream.init(SnappyInputStream.java:58) at org.apache.spark.io.SnappyCompressionCodec.compressedInputStream(CompressionCodec.scala:128) at org.apache.spark.storage.BlockManager.wrapForCompression(BlockManager.scala:1090) at org.apache.spark.storage.ShuffleBlockFetcherIterator$$anon$1$$anonfun$onBlockFetchSuccess$1.apply(ShuffleBlockFetcherIterator.scala:116) at org.apache.spark.storage.ShuffleBlockFetcherIterator$$anon$1$$anonfun$onBlockFetchSuccess$1.apply(ShuffleBlockFetcherIterator.scala:115) at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:243) at org.apache.spark.storage.ShuffleBlockFetcherIterator.next(ShuffleBlockFetcherIterator.scala:52) at
[jira] [Comment Edited] (SPARK-4105) FAILED_TO_UNCOMPRESS(5) errors when fetching shuffle data with sort-based shuffle
[ https://issues.apache.org/jira/browse/SPARK-4105?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249381#comment-14249381 ] Victor Tso edited comment on SPARK-4105 at 12/17/14 3:11 AM: - Similarly, I hit this in 1.1.1: {code} Job aborted due to stage failure: Task 0 in stage 3919.0 failed 4 times, most recent failure: Lost task 0.3 in stage 3919.0 (TID 2467, ...): java.io.IOException: failed to uncompress the chunk: FAILED_TO_UNCOMPRESS(5) org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:362) org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:384) java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2293) java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2586) java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2596) java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1318) java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706) java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344) java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62) org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:133) org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71) org.apache.spark.storage.BlockManager$LazyProxyIterator$1.hasNext(BlockManager.scala:1171) scala.collection.Iterator$$anon$13.hasNext(Iterator.scala:371) org.apache.spark.util.CompletionIterator.hasNext(CompletionIterator.scala:30) org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:39) org.apache.spark.util.collection.ExternalAppendOnlyMap.insertAll(ExternalAppendOnlyMap.scala:144) org.apache.spark.Aggregator.combineValuesByKey(Aggregator.scala:58) org.apache.spark.shuffle.hash.HashShuffleReader.read(HashShuffleReader.scala:48) org.apache.spark.rdd.ShuffledRDD.compute(ShuffledRDD.scala:92) org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:262) org.apache.spark.rdd.RDD.iterator(RDD.scala:229) {code} was (Author: silvermast): Similarly, I hit this in 1.1.1: Job aborted due to stage failure: Task 0 in stage 3919.0 failed 4 times, most recent failure: Lost task 0.3 in stage 3919.0 (TID 2467, preprod-e1-sw22s.paxata.com): java.io.IOException: failed to uncompress the chunk: FAILED_TO_UNCOMPRESS(5) org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:362) org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:384) java.io.ObjectInputStream$PeekInputStream.peek(ObjectInputStream.java:2293) java.io.ObjectInputStream$BlockDataInputStream.peek(ObjectInputStream.java:2586) java.io.ObjectInputStream$BlockDataInputStream.peekByte(ObjectInputStream.java:2596) java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1318) java.io.ObjectInputStream.readArray(ObjectInputStream.java:1706) 
java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1344) java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1990) java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1915) java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1798) java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1350) java.io.ObjectInputStream.readObject(ObjectInputStream.java:370) org.apache.spark.serializer.JavaDeserializationStream.readObject(JavaSerializer.scala:62) org.apache.spark.serializer.DeserializationStream$$anon$1.getNext(Serializer.scala:133) org.apache.spark.util.NextIterator.hasNext(NextIterator.scala:71)
[jira] [Created] (SPARK-4868) Twitter DStream.map() throws Task not serializable
Nicholas Chammas created SPARK-4868: --- Summary: Twitter DStream.map() throws Task not serializable Key: SPARK-4868 URL: https://issues.apache.org/jira/browse/SPARK-4868 Project: Spark Issue Type: Bug Components: Spark Shell, Streaming Affects Versions: 1.1.1 Environment: * Spark 1.1.1 * EC2 cluster with 1 slave spun up using {{spark-ec2}} * twitter4j 3.0.3 * {{spark-shell}} called with {{--jars}} argument to load {{spark-streaming-twitter_2.10-1.0.0.jar}} as well as all the twitter4j jars. Reporter: Nicholas Chammas Priority: Minor _(Continuing the discussion [started here on the Spark user list|http://apache-spark-user-list.1001560.n3.nabble.com/NotSerializableException-in-Spark-Streaming-td5725.html].)_ The following Spark Streaming code throws a serialization exception I do not understand. {code} import twitter4j.auth.{Authorization, OAuthAuthorization} import twitter4j.conf.ConfigurationBuilder import org.apache.spark.streaming.{Seconds, Minutes, StreamingContext} import org.apache.spark.streaming.twitter.TwitterUtils def getAuth(): Option[Authorization] = { System.setProperty(twitter4j.oauth.consumerKey, consumerKey) System.setProperty(twitter4j.oauth.consumerSecret, consumerSecret) System.setProperty(twitter4j.oauth.accessToken, accessToken) System.setProperty(twitter4j.oauth.accessTokenSecret, accessTokenSecret) Some(new OAuthAuthorization(new ConfigurationBuilder().build())) } def noop(a: Any): Any = { a } val ssc = new StreamingContext(sc, Seconds(5)) val liveTweetObjects = TwitterUtils.createStream(ssc, getAuth()) val liveTweets = liveTweetObjects.map(_.getText) liveTweets.map(t = noop(t)).print() // exception here ssc.start() {code} So before I even start the StreamingContext, I get the following stack trace: {code} scala liveTweets.map(t = noop(t)).print() org.apache.spark.SparkException: Task not serializable at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:166) at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:158) at org.apache.spark.SparkContext.clean(SparkContext.scala:1264) at org.apache.spark.streaming.dstream.DStream.map(DStream.scala:438) at $iwC$$iwC$$iwC$$iwC.init(console:27) at $iwC$$iwC$$iwC.init(console:32) at $iwC$$iwC.init(console:34) at $iwC.init(console:36) at init(console:38) at .init(console:42) at .clinit(console) at .init(console:7) at .clinit(console) at $print(console) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:789) at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1062) at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:615) at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:646) at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:610) at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:823) at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:868) at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:780) at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:625) at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:633) at org.apache.spark.repl.SparkILoop.loop(SparkILoop.scala:638) at 
org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:963) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:911) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:911) at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:911) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1006) at org.apache.spark.repl.Main$.main(Main.scala:31) at org.apache.spark.repl.Main.main(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:329) at
[jira] [Commented] (SPARK-4868) Twitter DStream.map() throws Task not serializable
[ https://issues.apache.org/jira/browse/SPARK-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249387#comment-14249387 ] Nicholas Chammas commented on SPARK-4868: - cc [~tdas], [~adav] Twitter DStream.map() throws Task not serializable Key: SPARK-4868 URL: https://issues.apache.org/jira/browse/SPARK-4868 Project: Spark Issue Type: Bug Components: Spark Shell, Streaming Affects Versions: 1.1.1 Environment: * Spark 1.1.1 * EC2 cluster with 1 slave spun up using {{spark-ec2}} * twitter4j 3.0.3 * {{spark-shell}} called with {{--jars}} argument to load {{spark-streaming-twitter_2.10-1.0.0.jar}} as well as all the twitter4j jars. Reporter: Nicholas Chammas Priority: Minor _(Continuing the discussion [started here on the Spark user list|http://apache-spark-user-list.1001560.n3.nabble.com/NotSerializableException-in-Spark-Streaming-td5725.html].)_ The following Spark Streaming code throws a serialization exception I do not understand. {code} import twitter4j.auth.{Authorization, OAuthAuthorization} import twitter4j.conf.ConfigurationBuilder import org.apache.spark.streaming.{Seconds, Minutes, StreamingContext} import org.apache.spark.streaming.twitter.TwitterUtils def getAuth(): Option[Authorization] = { System.setProperty(twitter4j.oauth.consumerKey, consumerKey) System.setProperty(twitter4j.oauth.consumerSecret, consumerSecret) System.setProperty(twitter4j.oauth.accessToken, accessToken) System.setProperty(twitter4j.oauth.accessTokenSecret, accessTokenSecret) Some(new OAuthAuthorization(new ConfigurationBuilder().build())) } def noop(a: Any): Any = { a } val ssc = new StreamingContext(sc, Seconds(5)) val liveTweetObjects = TwitterUtils.createStream(ssc, getAuth()) val liveTweets = liveTweetObjects.map(_.getText) liveTweets.map(t = noop(t)).print() // exception here ssc.start() {code} So before I even start the StreamingContext, I get the following stack trace: {code} scala liveTweets.map(t = noop(t)).print() org.apache.spark.SparkException: Task not serializable at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:166) at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:158) at org.apache.spark.SparkContext.clean(SparkContext.scala:1264) at org.apache.spark.streaming.dstream.DStream.map(DStream.scala:438) at $iwC$$iwC$$iwC$$iwC.init(console:27) at $iwC$$iwC$$iwC.init(console:32) at $iwC$$iwC.init(console:34) at $iwC.init(console:36) at init(console:38) at .init(console:42) at .clinit(console) at .init(console:7) at .clinit(console) at $print(console) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:789) at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1062) at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:615) at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:646) at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:610) at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:823) at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:868) at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:780) at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:625) at 
org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:633) at org.apache.spark.repl.SparkILoop.loop(SparkILoop.scala:638) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:963) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:911) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:911) at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:911) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1006) at org.apache.spark.repl.Main$.main(Main.scala:31) at org.apache.spark.repl.Main.main(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at
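A common workaround for REPL closure-capture failures of this shape (offered here as a hedged suggestion, not necessarily the eventual resolution of this ticket) is to define the helper in a top-level serializable object, so the closure does not drag in the enclosing interpreter line object:
{code}
object TweetFunctions extends Serializable {
  def noop(a: Any): Any = a
}

val liveTweets = liveTweetObjects.map(_.getText)
liveTweets.map(TweetFunctions.noop).print() // no REPL wrapper classes captured in the closure
{code}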
[jira] [Created] (SPARK-4869) The variable names in IF statement of SQL doesn't resolve to its value.
Ajay created SPARK-4869: --- Summary: The variable names in IF statement of SQL doesn't resolve to its value. Key: SPARK-4869 URL: https://issues.apache.org/jira/browse/SPARK-4869 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.1 Reporter: Ajay Priority: Blocker We got stuck with “IF-THEN” statement in Spark SQL. As per our usecase, we have to have nested “if” statements. But, spark sql is not able to resolve the variable names in final evaluation but the literal values are working. Please fix this bug. This works: sqlSC.sql(SELECT DISTINCT UNIT, PAST_DUE ,IF( PAST_DUE = 'CURRENT_MONTH', 0,1) as ROLL_BACKWARD FROM OUTER_RDD) This doesn’t : sqlSC.sql(SELECT DISTINCT UNIT, PAST_DUE ,IF( PAST_DUE = 'CURRENT_MONTH', 0,DAYS_30) as ROLL_BACKWARD FROM OUTER_RDD) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4869) The variable names in IF statement of Spark SQL doesn't resolve to its value.
[ https://issues.apache.org/jira/browse/SPARK-4869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay updated SPARK-4869: Summary: The variable names in IF statement of Spark SQL doesn't resolve to its value. (was: The variable names in IF statement of SQL doesn't resolve to its value. ) The variable names in IF statement of Spark SQL doesn't resolve to its value. -- Key: SPARK-4869 URL: https://issues.apache.org/jira/browse/SPARK-4869 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.1 Reporter: Ajay Priority: Blocker We got stuck with “IF-THEN” statement in Spark SQL. As per our usecase, we have to have nested “if” statements. But, spark sql is not able to resolve the variable names in final evaluation but the literal values are working. Please fix this bug. This works: sqlSC.sql(SELECT DISTINCT UNIT, PAST_DUE ,IF( PAST_DUE = 'CURRENT_MONTH', 0,1) as ROLL_BACKWARD FROM OUTER_RDD) This doesn’t : sqlSC.sql(SELECT DISTINCT UNIT, PAST_DUE ,IF( PAST_DUE = 'CURRENT_MONTH', 0,DAYS_30) as ROLL_BACKWARD FROM OUTER_RDD) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Updated] (SPARK-4869) The variable names in IF statement of Spark SQL doesn't resolve to its value.
[ https://issues.apache.org/jira/browse/SPARK-4869?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Ajay updated SPARK-4869: Description: We got stuck with “IF-THEN” statement in Spark SQL. As per our usecase, we have to have nested “if” statements. But, spark sql is not able to resolve the variable names in final evaluation but the literal values are working. An Unresolved Attributes error is being thrown. Please fix this bug. This works: sqlSC.sql(SELECT DISTINCT UNIT, PAST_DUE ,IF( PAST_DUE = 'CURRENT_MONTH', 0,1) as ROLL_BACKWARD FROM OUTER_RDD) This doesn’t : sqlSC.sql(SELECT DISTINCT UNIT, PAST_DUE ,IF( PAST_DUE = 'CURRENT_MONTH', 0,DAYS_30) as ROLL_BACKWARD FROM OUTER_RDD) was: We got stuck with “IF-THEN” statement in Spark SQL. As per our usecase, we have to have nested “if” statements. But, spark sql is not able to resolve the variable names in final evaluation but the literal values are working. Please fix this bug. This works: sqlSC.sql(SELECT DISTINCT UNIT, PAST_DUE ,IF( PAST_DUE = 'CURRENT_MONTH', 0,1) as ROLL_BACKWARD FROM OUTER_RDD) This doesn’t : sqlSC.sql(SELECT DISTINCT UNIT, PAST_DUE ,IF( PAST_DUE = 'CURRENT_MONTH', 0,DAYS_30) as ROLL_BACKWARD FROM OUTER_RDD) The variable names in IF statement of Spark SQL doesn't resolve to its value. -- Key: SPARK-4869 URL: https://issues.apache.org/jira/browse/SPARK-4869 Project: Spark Issue Type: Bug Components: Spark Core Affects Versions: 1.1.1 Reporter: Ajay Priority: Blocker We got stuck with “IF-THEN” statement in Spark SQL. As per our usecase, we have to have nested “if” statements. But, spark sql is not able to resolve the variable names in final evaluation but the literal values are working. An Unresolved Attributes error is being thrown. Please fix this bug. This works: sqlSC.sql(SELECT DISTINCT UNIT, PAST_DUE ,IF( PAST_DUE = 'CURRENT_MONTH', 0,1) as ROLL_BACKWARD FROM OUTER_RDD) This doesn’t : sqlSC.sql(SELECT DISTINCT UNIT, PAST_DUE ,IF( PAST_DUE = 'CURRENT_MONTH', 0,DAYS_30) as ROLL_BACKWARD FROM OUTER_RDD) -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-4744) Short Circuit evaluation for AND OR in code gen
[ https://issues.apache.org/jira/browse/SPARK-4744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4744. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 3606 [https://github.com/apache/spark/pull/3606] Short Circuit evaluation for AND OR in code gen - Key: SPARK-4744 URL: https://issues.apache.org/jira/browse/SPARK-4744 Project: Spark Issue Type: Improvement Components: SQL Reporter: Cheng Hao Priority: Minor Fix For: 1.3.0 -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-4798) Refactor Parquet test suites
[ https://issues.apache.org/jira/browse/SPARK-4798?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4798. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 3644 [https://github.com/apache/spark/pull/3644] Refactor Parquet test suites Key: SPARK-4798 URL: https://issues.apache.org/jira/browse/SPARK-4798 Project: Spark Issue Type: Test Components: SQL Affects Versions: 1.1.1, 1.2.0 Reporter: Cheng Lian Fix For: 1.3.0 Current {{ParquetQuerySuite}} implementation is too verbose and is hard to add new test cases. Would be good to refactor it to enable faster Parquet support iteration. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Resolved] (SPARK-4720) Remainder should also return null if the divider is 0.
[ https://issues.apache.org/jira/browse/SPARK-4720?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4720. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 3581 [https://github.com/apache/spark/pull/3581] Remainder should also return null if the divider is 0. -- Key: SPARK-4720 URL: https://issues.apache.org/jira/browse/SPARK-4720 Project: Spark Issue Type: Bug Components: SQL Reporter: Takuya Ueshin Fix For: 1.3.0 This is a follow-up of SPARK-4593. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
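A quick illustration of the intended semantics (assuming a registered table {{t}} with an integer column {{x}}): the remainder expression should evaluate to NULL when the divisor is 0, mirroring the divide-by-zero handling from SPARK-4593, rather than failing at runtime.
{code}
sqlContext.sql("SELECT x % 0 FROM t").collect().foreach(println)
// each row prints as [null] instead of raising an arithmetic error
{code}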
[jira] [Resolved] (SPARK-4866) Support StructType as key in MapType
[ https://issues.apache.org/jira/browse/SPARK-4866?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4866. - Resolution: Fixed Fix Version/s: 1.3.0 Issue resolved by pull request 3714 [https://github.com/apache/spark/pull/3714] Support StructType as key in MapType Key: SPARK-4866 URL: https://issues.apache.org/jira/browse/SPARK-4866 Project: Spark Issue Type: Bug Components: PySpark, SQL Reporter: Davies Liu Fix For: 1.3.0 http://apache-spark-user-list.1001560.n3.nabble.com/Error-when-Applying-schema-to-a-dictionary-with-a-Tuple-as-key-td20716.html Hi Guys, Im running a spark cluster in AWS with Spark 1.1.0 in EC2 I am trying to convert a an RDD with tuple (u'string', int , {(int, int): int, (int, int): int}) to a schema rdd using the schema: {code} fields = [StructField('field1',StringType(),True), StructField('field2',IntegerType(),True), StructField('field3',MapType(StructType([StructField('field31',IntegerType(),True), StructField('field32',IntegerType(),True)]),IntegerType(),True),True) ] schema = StructType(fields) # generate the schemaRDD with the defined schema schemaRDD = sqc.applySchema(RDD, schema) {code} But when I add field3 to the schema, it throws an execption: {code} Traceback (most recent call last): File stdin, line 1, in module File /root/spark/python/pyspark/rdd.py, line 1153, in take res = self.context.runJob(self, takeUpToNumLeft, p, True) File /root/spark/python/pyspark/context.py, line 770, in runJob it = self._jvm.PythonRDD.runJob(self._jsc.sc(), mappedRDD._jrdd, javaPartitions, allowLocal) File /root/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/java_gateway.py, line 538, in __call__ File /root/spark/python/lib/py4j-0.8.2.1-src.zip/py4j/protocol.py, line 300, in get_return_value py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.runJob. 
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 28.0 failed 4 times, most recent failure: Lost task 0.3 in stage 28.0 (TID 710, ip-172-31-29-120.ec2.internal): net.razorvine.pickle.PickleException: couldn't introspect javabean: java.lang.IllegalArgumentException: wrong number of arguments net.razorvine.pickle.Pickler.put_javabean(Pickler.java:603) net.razorvine.pickle.Pickler.dispatch(Pickler.java:299) net.razorvine.pickle.Pickler.save(Pickler.java:125) net.razorvine.pickle.Pickler.put_map(Pickler.java:321) net.razorvine.pickle.Pickler.dispatch(Pickler.java:286) net.razorvine.pickle.Pickler.save(Pickler.java:125) net.razorvine.pickle.Pickler.put_arrayOfObjects(Pickler.java:412) net.razorvine.pickle.Pickler.dispatch(Pickler.java:195) net.razorvine.pickle.Pickler.save(Pickler.java:125) net.razorvine.pickle.Pickler.put_arrayOfObjects(Pickler.java:412) net.razorvine.pickle.Pickler.dispatch(Pickler.java:195) net.razorvine.pickle.Pickler.save(Pickler.java:125) net.razorvine.pickle.Pickler.dump(Pickler.java:95) net.razorvine.pickle.Pickler.dumps(Pickler.java:80) org.apache.spark.sql.SchemaRDD$$anonfun$javaToPython$1$$anonfun$apply$2.apply(SchemaRDD.scala:417) org.apache.spark.sql.SchemaRDD$$anonfun$javaToPython$1$$anonfun$apply$2.apply(SchemaRDD.scala:417) scala.collection.Iterator$$anon$11.next(Iterator.scala:328) org.apache.spark.api.python.PythonRDD$.writeIteratorToStream(PythonRDD.scala:331) org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$1.apply$mcV$sp(PythonRDD.scala:209) org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$1.apply(PythonRDD.scala:184) org.apache.spark.api.python.PythonRDD$WriterThread$$anonfun$run$1.apply(PythonRDD.scala:184) org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:1311) org.apache.spark.api.python.PythonRDD$WriterThread.run(PythonRDD.scala:183) Driver stacktrace: at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1185) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1174) at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1173) at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59) at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:47) at
[jira] [Commented] (SPARK-4140) Document the dynamic allocation feature
[ https://issues.apache.org/jira/browse/SPARK-4140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249499#comment-14249499 ] Tsuyoshi OZAWA commented on SPARK-4140: --- How about converting this issue into a subtask of SPARK-3174? That would make it easier to track. Document the dynamic allocation feature --- Key: SPARK-4140 URL: https://issues.apache.org/jira/browse/SPARK-4140 Project: Spark Issue Type: Improvement Components: Spark Core, YARN Affects Versions: 1.2.0 Reporter: Andrew Or Assignee: Andrew Or This blocks on SPARK-3795 and SPARK-3822. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
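Independent of where the ticket ends up, here is a hedged sketch of the kind of settings such documentation would need to cover. The property names below are assumed from the SPARK-3795/SPARK-3822 work on YARN dynamic allocation; treat them as an assumption, not as the finished docs.
{code}
// Hedged configuration sketch for dynamic executor allocation on YARN.
// Property names are assumptions; verify against the documentation once it is written.
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("dynamic-allocation-example")
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.dynamicAllocation.minExecutors", "2")
  .set("spark.dynamicAllocation.maxExecutors", "20")
  // the external shuffle service keeps shuffle files available when executors are removed
  .set("spark.shuffle.service.enabled", "true")
{code}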
[jira] [Resolved] (SPARK-4618) Make foreign DDL commands options case-insensitive
[ https://issues.apache.org/jira/browse/SPARK-4618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-4618. - Resolution: Fixed Make foreign DDL commands options case-insensitive -- Key: SPARK-4618 URL: https://issues.apache.org/jira/browse/SPARK-4618 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 1.1.0 Reporter: wangfei Fix For: 1.3.0 Make foreign DDL command options case-insensitive, so that the following command works:
```
create temporary table normal_parquet
USING org.apache.spark.sql.parquet
OPTIONS (
  PATH '/xxx/data'
)
```
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
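A minimal sketch of the idea behind case-insensitive options, assuming (hypothetically) that the option map is normalized before lookup; this is illustrative only and not claimed to be the actual implementation in the linked change.
{code}
// Hedged sketch: lower-case DDL option keys once, so PATH, Path and path all resolve the same.
def normalizeOptions(options: Map[String, String]): Map[String, String] =
  options.map { case (key, value) => key.toLowerCase -> value }

val opts = normalizeOptions(Map("PATH" -> "/xxx/data"))
assert(opts.get("path") == Some("/xxx/data"))   // mixed-case OPTIONS now behave identically
{code}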
[jira] [Updated] (SPARK-4618) Make foreign DDL commands options case-insensitive
[ https://issues.apache.org/jira/browse/SPARK-4618?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-4618: Assignee: wangfei Make foreign DDL commands options case-insensitive -- Key: SPARK-4618 URL: https://issues.apache.org/jira/browse/SPARK-4618 Project: Spark Issue Type: Improvement Components: SQL Affects Versions: 1.1.0 Reporter: wangfei Assignee: wangfei Fix For: 1.3.0 Make foreign DDL command options case-insensitive, so that the following command works:
```
create temporary table normal_parquet
USING org.apache.spark.sql.parquet
OPTIONS (
  PATH '/xxx/data'
)
```
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Created] (SPARK-4870) Add version information to driver log
Zhang, Liye created SPARK-4870: -- Summary: Add version information to driver log Key: SPARK-4870 URL: https://issues.apache.org/jira/browse/SPARK-4870 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: Zhang, Liye The driver log doesn't include the Spark version; version information is important when testing different Spark versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-4870) Add version information to driver log
[ https://issues.apache.org/jira/browse/SPARK-4870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249519#comment-14249519 ] Apache Spark commented on SPARK-4870: - User 'liyezhang556520' has created a pull request for this issue: https://github.com/apache/spark/pull/3717 Add version information to driver log - Key: SPARK-4870 URL: https://issues.apache.org/jira/browse/SPARK-4870 Project: Spark Issue Type: Improvement Components: Spark Core Reporter: Zhang, Liye The driver log doesn't include the Spark version; version information is important when testing different Spark versions. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
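A hedged sketch of what logging the version at driver startup could look like, assuming a version constant such as {{org.apache.spark.SPARK_VERSION}} is available on the classpath; the actual change is in the pull request linked above, and this snippet makes no claim to match it.
{code}
// Illustrative only: emit the Spark version once when the driver starts.
// SPARK_VERSION is assumed to be exposed by the Spark build; adjust if the constant differs.
import org.apache.spark.SPARK_VERSION
import org.slf4j.LoggerFactory

object VersionBanner {
  private val log = LoggerFactory.getLogger(getClass)

  def logVersion(): Unit = log.info(s"Running Spark version $SPARK_VERSION")
}
{code}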
[jira] [Commented] (SPARK-4868) Twitter DStream.map() throws Task not serializable
[ https://issues.apache.org/jira/browse/SPARK-4868?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249582#comment-14249582 ] Nicholas Chammas commented on SPARK-4868: - Replacing {{noop()}} with this
{code}
object Test {
  def noop(a: Any): Any = { a }
}
{code}
and calling {{liveTweets.map(t => Test.noop(t)).print()}} yields a similar stack trace. Creating the streaming context as follows
{code}
@transient val ssc = new StreamingContext(sc, Seconds(5))
{code}
and trying either the original {{noop()}} or {{Test.noop()}} (with REPL restarts in between) yields a slightly more interesting trace:
{code}
scala> liveTweets.map(t => Test.noop(t)).print() // exception here
org.apache.spark.SparkException: Task not serializable at org.apache.spark.util.ClosureCleaner$.ensureSerializable(ClosureCleaner.scala:166) at org.apache.spark.util.ClosureCleaner$.clean(ClosureCleaner.scala:158) at org.apache.spark.SparkContext.clean(SparkContext.scala:1264) at org.apache.spark.streaming.dstream.DStream.map(DStream.scala:438) at $iwC$$iwC$$iwC$$iwC.<init>(<console>:27) at $iwC$$iwC$$iwC.<init>(<console>:32) at $iwC$$iwC.<init>(<console>:34) at $iwC.<init>(<console>:36) at <init>(<console>:38) at .<init>(<console>:42) at .<clinit>(<console>) at .<init>(<console>:7) at .<clinit>(<console>) at $print(<console>) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:789) at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1062) at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:615) at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:646) at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:610) at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:823) at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:868) at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:780) at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:625) at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:633) at org.apache.spark.repl.SparkILoop.loop(SparkILoop.scala:638) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:963) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:911) at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:911) at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:911) at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:1006) at org.apache.spark.repl.Main$.main(Main.scala:31) at org.apache.spark.repl.Main.main(Main.scala) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:329) at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75) at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala) Caused by: java.io.NotSerializableException: Object of org.apache.spark.streaming.twitter.TwitterInputDStream 
is being serialized possibly as a part of closure of an RDD operation. This is because the DStream object is being referred to from within the closure. Please rewrite the RDD operation inside this DStream to avoid this. This has been enforced to avoid bloating of Spark tasks with unnecessary objects. at org.apache.spark.streaming.dstream.DStream$$anonfun$writeObject$1.apply$mcV$sp(DStream.scala:416) at org.apache.spark.util.Utils$.tryOrIOException(Utils.scala:927) at org.apache.spark.streaming.dstream.DStream.writeObject(DStream.scala:403) at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57) at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43) at java.lang.reflect.Method.invoke(Method.java:606) at java.io.ObjectStreamClass.invokeWriteObject(ObjectStreamClass.java:988) at
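For readers hitting the same error, here is a hedged sketch of the rewrite the exception message asks for; the names follow the snippets in the comment above and are illustrative, not a confirmed fix. The idea is to keep the function in a top-level object, so the closure does not drag in the REPL line object that holds the DStream, and to do the per-batch work on the RDDs handed out by {{foreachRDD}}.
{code}
import org.apache.spark.streaming.dstream.DStream
import twitter4j.Status

object TweetFunctions extends Serializable {
  def noop(a: Any): Any = a
}

// Operate on each batch RDD instead of mapping the DStream from the REPL line object.
def printTweets(liveTweets: DStream[Status]): Unit =
  liveTweets.foreachRDD { rdd =>
    rdd.map(TweetFunctions.noop).take(10).foreach(println)
  }
{code}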
[jira] [Created] (SPARK-4872) Provide sample format of training/test data in MLlib programming guide
zhang jun wei created SPARK-4872: Summary: Provide sample format of training/test data in MLlib programming guide Key: SPARK-4872 URL: https://issues.apache.org/jira/browse/SPARK-4872 Project: Spark Issue Type: Improvement Components: Documentation Affects Versions: 1.1.1 Reporter: zhang jun wei I suggest that the examples in the online MLlib programming guide use real-life data and list the translated data format for the model to consume. The problem blocking me is how to translate real-life data into a format which MLlib can understand correctly. Here is one sample: I want to use NaiveBayes to train and predict a tennis-play decision, and the original data is:

Weather | Temperature | Humidity | Wind => Decision to play tennis
Sunny   | Hot         | High     | No   => No
Sunny   | Hot         | High     | Yes  => No
Cloudy  | Normal      | Normal   | No   => Yes
Rainy   | Cold        | Normal   | Yes  => No

Now, from my understanding, one potential translation is:
1) put every feature value word into a line: Sunny Cloudy Rainy Hot Normal Cold High Normal Yes No
2) map them to numbers: 1 2 3 4 5 6 7 8 9 10
3) map decision labels to numbers: 0 - No, 1 - Yes
4) set the value to 1 if it appears, or 0 if not; for the above example, here is the data format for MLUtils.loadLibSVMFile to use:
0 1:1 2:0 3:0 4:1 5:0 6:0 7:1 8:0 9:0 10:1
0 1:1 2:0 3:0 4:1 5:0 6:0 7:1 8:0 9:1 10:0
1 1:0 2:1 3:0 4:0 5:1 6:0 7:0 8:1 9:0 10:1
0 1:0 2:0 3:1 4:0 5:0 6:1 7:0 8:1 9:1 10:0
Is this a correct understanding?

And another way I can imagine is:
1) put every feature name into a line: Weather Temperature Humidity Wind
2) map them to numbers: 1 2 3 4
3) map decision labels to numbers: 0 - No, 1 - Yes
4) map each value of each feature to a number (e.g. Sunny to 1, Cloudy to 2, Rainy to 3; Hot to 1, Normal to 2, Cold to 3; High to 1, Normal to 2; Yes to 1, No to 2); for the above example, here is the data format for MLUtils.loadLibSVMFile to use:
0 1:1 2:1 3:1 4:2
0 1:1 2:1 3:1 4:1
1 1:2 2:2 3:2 4:2
0 1:3 2:3 3:2 4:1
But when I read the source code in NaiveBayes.scala, this does not seem correct, though I am not sure... So which data format translation is correct? -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
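For what it is worth, here is a hedged sketch of the first (one-of-k) encoding described above, written directly against the MLlib API instead of a LIBSVM text file. It assumes a SparkContext named {{sc}} as in spark-shell, uses 0-based indices (Vectors.sparse is 0-based, while the LIBSVM text format shown above is 1-based), and is only an illustration, not an answer from the MLlib maintainers.
{code}
// Feature slots (0-based): 0=Sunny 1=Cloudy 2=Rainy 3=Hot 4=Normal(temp) 5=Cold
//                          6=High 7=Normal(humidity) 8=Wind:Yes 9=Wind:No
import org.apache.spark.mllib.classification.NaiveBayes
import org.apache.spark.mllib.linalg.Vectors
import org.apache.spark.mllib.regression.LabeledPoint

val training = sc.parallelize(Seq(
  LabeledPoint(0.0, Vectors.sparse(10, Array(0, 3, 6, 9), Array(1.0, 1.0, 1.0, 1.0))), // Sunny, Hot, High, no wind => No
  LabeledPoint(0.0, Vectors.sparse(10, Array(0, 3, 6, 8), Array(1.0, 1.0, 1.0, 1.0))), // Sunny, Hot, High, wind    => No
  LabeledPoint(1.0, Vectors.sparse(10, Array(1, 4, 7, 9), Array(1.0, 1.0, 1.0, 1.0))), // Cloudy, Normal, Normal    => Yes
  LabeledPoint(0.0, Vectors.sparse(10, Array(2, 5, 7, 8), Array(1.0, 1.0, 1.0, 1.0)))  // Rainy, Cold, Normal, wind => No
))

val model = NaiveBayes.train(training)
// Predict for: Sunny, Hot, High, no wind
println(model.predict(Vectors.sparse(10, Array(0, 3, 6, 9), Array(1.0, 1.0, 1.0, 1.0))))
{code}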
[jira] [Commented] (SPARK-3541) Improve ALS internal storage
[ https://issues.apache.org/jira/browse/SPARK-3541?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249578#comment-14249578 ] Apache Spark commented on SPARK-3541: - User 'mengxr' has created a pull request for this issue: https://github.com/apache/spark/pull/3720 Improve ALS internal storage Key: SPARK-3541 URL: https://issues.apache.org/jira/browse/SPARK-3541 Project: Spark Issue Type: Improvement Components: MLlib Reporter: Xiangrui Meng Assignee: Xiangrui Meng Original Estimate: 96h Remaining Estimate: 96h The internal storage of ALS uses many small objects, which increases GC pressure and makes ALS difficult to scale to very large datasets, e.g., 50 billion ratings. In such cases, a full GC may take more than 10 minutes to finish. That is longer than the default heartbeat timeout, and hence executors will be removed under the default settings. We can use primitive arrays to reduce the number of objects significantly. This requires a big change to the ALS implementation. -- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org For additional commands, e-mail: issues-h...@spark.apache.org
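To make the object-count argument concrete, here is a hedged sketch of the storage idea (illustrative only, not the actual ALS code): replacing one small object per rating with parallel primitive arrays per block removes almost all of the per-entry objects the garbage collector has to trace.
{code}
// Object-per-entry layout: one JVM object (plus header and boxing) for each of billions of ratings.
case class RatingEntry(user: Int, product: Int, rating: Double)

// Columnar, primitive layout: three flat arrays per block, no per-entry objects.
class RatingBlock(val users: Array[Int],
                  val products: Array[Int],
                  val ratings: Array[Double]) {
  require(users.length == products.length && users.length == ratings.length)
  def size: Int = users.length
}

def toBlock(entries: Seq[RatingEntry]): RatingBlock =
  new RatingBlock(entries.map(_.user).toArray,
                  entries.map(_.product).toArray,
                  entries.map(_.rating).toArray)
{code}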