Re: enum-like types in Spark

2015-07-02 Thread Imran Rashid
Hi Stephen,

I'm not sure which link you are referring to for the example code -- but
yes, the recommendation is that you create the enum in Java, e.g. see

https://github.com/apache/spark/blob/v1.4.0/core/src/main/java/org/apache/spark/status/api/v1/StageStatus.java

Then nothing special is required to use it in Scala.  This method both uses
the overall type of the enum as its return value and uses specific enum
values in its body:

https://github.com/apache/spark/blob/v1.4.0/core/src/main/scala/org/apache/spark/status/api/v1/AllStagesResource.scala#L114

(I did delete the branches containing the code that is *not* recommended anymore.)

Imran
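
To make the interop concrete, here is a minimal sketch -- using the JDK enum
java.util.concurrent.TimeUnit as a stand-in for Spark's StageStatus, since any
Java enum behaves the same way from Scala:

import java.util.concurrent.TimeUnit

object JavaEnumFromScala {
  // The Java enum type itself can be used as a Scala return type.
  def parse(name: String): TimeUnit = TimeUnit.valueOf(name.toUpperCase)

  // Specific enum values work as patterns in a Scala match (note that
  // scalac cannot check exhaustiveness over a Java enum's constants).
  def describe(unit: TimeUnit): String = unit match {
    case TimeUnit.SECONDS => "seconds"
    case TimeUnit.MILLISECONDS => "milliseconds"
    case other => other.name().toLowerCase
  }

  def main(args: Array[String]): Unit =
    TimeUnit.values().foreach(u => println(describe(u)))
}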



Re: enum-like types in Spark

2015-07-01 Thread Stephen Boesch
I am reviving an old thread here. The link to the example code for the
Java-enum-based solution is now dead: would someone please post an updated
link showing the proper interop?

Specifically: it is my understanding that Java enums may not be created
within Scala.  So does the proposed solution require dropping into Java
to create the enums?

2015-04-09 17:16 GMT-07:00 Xiangrui Meng men...@gmail.com:

 Using Java enums sounds good. We can list the values in the JavaDoc and
 hope Scala will be able to correctly generate docs for Java enums in
 the future. -Xiangrui





Re: enum-like types in Spark

2015-04-09 Thread Imran Rashid
any update here?  This is relevant for a currently open PR of mine -- I've
got a bunch of new public constants defined w/ format #4, but I'd gladly
switch to Java enums.  (Even if we are just going to postpone this
decision, I'm still inclined to switch to Java enums ...)

just to be clear about the existing problem with enums and scaladoc: right
now, scaladoc knows about the enum class and generates a page for it,
but it does not display the enum constants.  It is at least labeled as a
Java enum, though, so a savvy user could switch to the javadocs to see the
constants.






Re: enum-like types in Spark

2015-03-23 Thread Sean Owen
Yeah, the fully realized #4, which gets back the ability to use it in
switch statements (in Scala but not Java?), does end up being kind of
huge.

I confess I'm swayed a bit back to Java enums, seeing what it
involves. The hashCode() issue can be 'solved' with the hash of the
String representation.

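(A sketch of that workaround: Enum.hashCode() is identity-based and final, so
a stable hash has to be derived outside the enum, from the constant's name.
TimeUnit below is just a stand-in JDK enum:)

import java.util.concurrent.TimeUnit

object StableEnumHash {
  // String.hashCode is specified by the JLS, so this value is identical in
  // every JVM; the enum constant's own hashCode() (identity-based) is not.
  def stableHash(e: Enum[_]): Int = e.name.hashCode

  def main(args: Array[String]): Unit =
    println(stableHash(TimeUnit.SECONDS))
}
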
On Mon, Mar 23, 2015 at 8:33 PM, Imran Rashid iras...@cloudera.com wrote:
 I've just switched some of my code over to the new format, and I just want
 to make sure everyone realizes what we are getting into.  I went from 10
 lines as Java enums

 https://github.com/squito/spark/blob/fef66058612ebf225e58dd5f5fea6bae1afd5b31/core/src/main/java/org/apache/spark/status/api/StageStatus.java#L20

 to 30 lines with the new format:

 https://github.com/squito/spark/blob/SPARK-3454_w_jersey/core/src/main/scala/org/apache/spark/status/api/v1/api.scala#L250

 It's not just that it's verbose: each name has to be repeated 4 times, with
 potential typos in some locations that won't be caught by the compiler.
 Also, you have to manually maintain the values as you update the set of
 enums; the compiler won't do it for you.

 The only downside I've heard for Java enums is Enum.hashCode().  OTOH, the
 downsides for this version are: maintainability / verbosity, no values(),
 more cumbersome to use from Java, no EnumMap / EnumSet.

 I did put together a little util to at least get back the equivalent of
 Enum.valueOf() with this format:

 https://github.com/squito/spark/blob/SPARK-3454_w_jersey/core/src/main/scala/org/apache/spark/util/SparkEnum.scala

 I'm not trying to prevent us from moving forward on this; it's fine if this
 is still what everyone wants, but I feel pretty strongly Java enums make
 more sense.

 thanks,
 Imran

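To make the trade-off above concrete, here is a minimal sketch of format #4
with the hand-maintained plumbing Imran describes; the Status names and the
fromString helper are illustrative, not the actual SparkEnum util:

sealed abstract class Status

object Status {
  case object Active extends Status
  case object Complete extends Status
  case object Failed extends Status

  // Hand-maintained: the compiler will not complain if a newly added
  // value is forgotten here, unlike Java's generated values().
  val values: Seq[Status] = Seq(Active, Complete, Failed)

  // Hand-rolled stand-in for Java's Enum.valueOf().
  def fromString(s: String): Status =
    values.find(_.toString.equalsIgnoreCase(s)).getOrElse(
      throw new IllegalArgumentException(s"unknown status: $s"))
}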



Re: enum-like types in Spark

2015-03-23 Thread Reynold Xin
If scaladoc can show the Java enum types, then I do think the best
approach is just Java enum types.





Re: enum-like types in Spark

2015-03-23 Thread Imran Rashid
well, perhaps I overstated things a little; I wouldn't call it the
official solution, just a recommendation in the never-ending debate (and
the recommendation from folks with their hands on Scala itself).

Even if we do get this fixed in scaladoc eventually -- as it's not in the
current versions, where does that leave this proposal?  personally I'd
*still* prefer Java enums, even if it doesn't get into scaladoc.  btw, even
with sealed traits, the scaladoc still isn't great -- you don't see the
values from the class, you only see them listed from the companion object.
(though, that is somewhat standard for scaladoc, so maybe I'm reaching a
little)






Re: enum-like types in Spark

2015-03-23 Thread Aaron Davidson
The only issue I knew of with Java enums was that they do not appear in the
Scala documentation.




Re: enum-like types in Spark

2015-03-23 Thread Patrick Wendell
If the official solution from the Scala community is to use Java
enums, then it seems strange that they aren't generated in scaladoc. Maybe
we can just fix that w/ Typesafe's help, and then we can use them.




Re: enum-like types in Spark

2015-03-17 Thread Xiangrui Meng



Re: enum-like types in Spark

2015-03-16 Thread Patrick Wendell
Hey Xiangrui,

Do you want to write up a straw man proposal based on this line of discussion?

- Patrick

On Mon, Mar 16, 2015 at 12:12 PM, Kevin Markey kevin.mar...@oracle.com wrote:
 In some applications, I have rather heavy use of Java enums, which are
 needed for related Java APIs that the application uses.  And unfortunately,
 they are also used as keys.  As such, using the native hashcodes makes any
 function over keys unstable and unpredictable, so we now use Enum.name() as
 the key instead.  Oh well.  But it works and seems to work well.

 Kevin


 On 03/05/2015 09:49 PM, Mridul Muralidharan wrote:

 I have a strong dislike for Java enums due to the fact that they
 are not stable across JVMs -- if they undergo serde, you end up with
 unpredictable results at times [1].
 That is one of the reasons why we prevent enums from being keys: it is
 highly possible users might depend on them internally and shoot
 themselves in the foot.

 It would be better to keep away from them in general and use something
 more stable.

 Regards,
 Mridul

 [1] Having had to debug this issue for 2 weeks -- I really, really hate it.

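A sketch of Kevin's workaround above -- keying by Enum.name() rather than by
the enum value itself, so hashes survive serde across JVMs (the Event type
here is hypothetical):

import java.util.concurrent.TimeUnit

case class Event(unit: TimeUnit, count: Long) // hypothetical record

object KeyByEnumName {
  def main(args: Array[String]): Unit = {
    val events = Seq(Event(TimeUnit.SECONDS, 3L), Event(TimeUnit.MINUTES, 1L))
    // name() is a plain String whose hashCode is stable across JVMs,
    // unlike the enum constant's identity-based hashCode.
    val byUnit: Map[String, Seq[Event]] = events.groupBy(_.unit.name())
    println(byUnit.keySet)
  }
}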


Re: enum-like types in Spark

2015-03-16 Thread Aaron Davidson
It's unrelated to the proposal, but Enum#ordinal() should be much faster,
assuming it's not serialized to JVMs with different versions of the enum :)

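For context, a sketch of the trade-off with a JDK enum: ordinal() is the
constant's declaration index, so it is cheap and compact, but it silently
changes if constants are reordered or inserted, whereas name() does not:

import java.util.concurrent.TimeUnit

object OrdinalVsName {
  def main(args: Array[String]): Unit = {
    // An int fixed by declaration order: fast, but only safe if every
    // JVM involved sees the same version of the enum.
    val byOrdinal: Map[Int, TimeUnit] =
      TimeUnit.values().map(u => u.ordinal() -> u).toMap

    // Slower to hash, but stable even if the source is reordered.
    val byName: Map[String, TimeUnit] =
      TimeUnit.values().map(u => u.name() -> u).toMap

    println(byOrdinal.size == byName.size)
  }
}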

Re: enum-like types in Spark

2015-03-16 Thread Xiangrui Meng



Re: enum-like types in Spark

2015-03-11 Thread RJ Nowling




Re: enum-like types in Spark

2015-03-09 Thread Imran Rashid




Re: enum-like types in Spark

2015-03-06 Thread Sean Owen
This has some disadvantages for Java, I think. You can't switch on an
object defined like this, but you can with an enum. And although the
Scala compiler understands that the set of values is fixed because of
'sealed', and so can warn about missing cases, the JVM won't know this
and can't do the same.

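(What that compiler help looks like -- a small sketch with hypothetical
values; if a case is deleted below, scalac emits a "match may not be
exhaustive" warning, which javac cannot do for a switch over such objects:)

sealed trait Mode
object Mode {
  case object Local extends Mode
  case object Cluster extends Mode
}

object SealedMatch {
  // Because Mode is sealed, scalac knows these two cases are exhaustive.
  def describe(m: Mode): String = m match {
    case Mode.Local   => "single JVM"
    case Mode.Cluster => "distributed"
  }
}
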
On Fri, Mar 6, 2015 at 3:58 AM, Xiangrui Meng men...@gmail.com wrote:
 For #4, my previous proposal may confuse the IDEs with additional
 types generated by the case objects, and their toString contains the
 underscore. The following works better:

 sealed abstract class StorageLevel

 object StorageLevel {
   final val MemoryOnly: StorageLevel = {
     case object MemoryOnly extends StorageLevel
     MemoryOnly
   }

   final val DiskOnly: StorageLevel = {
     case object DiskOnly extends StorageLevel
     DiskOnly
   }
 }

 MemoryOnly and DiskOnly can be used in pattern matching. If people are
 okay with this approach, I can add it to the code style guide.

 Imran, this is not just for internal APIs, which are relatively more
 flexible. It is good to use the same approach to implement public
 enum-like types from now on.

 Best,
 Xiangrui

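(A sketch of the pattern matching mentioned above, against the definitions
just quoted: the final vals are stable identifiers, so they are legal
patterns, though a default case is needed because the hidden case objects
keep scalac from proving exhaustiveness:)

object StorageLevelMatch {
  import StorageLevel._ // the object from the snippet above

  def describe(level: StorageLevel): String = level match {
    case MemoryOnly => "in memory"
    case DiskOnly   => "on disk"
    case _          => "unknown" // the compiler cannot rule this out
  }
}
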
 On Thu, Mar 5, 2015 at 1:08 PM, Imran Rashid iras...@cloudera.com wrote:
 I have a very strong dislike for #1 (Scala enumerations).   I'm ok with #4
 (with Xiangrui's final suggestion, especially making it sealed and available
 in Java), but I really think #2, Java enums, are the best option.

 Java enums actually have some very real advantages over the other
 approaches -- you get values(), valueOf(), EnumSet, and EnumMap.  There has
 been endless debate in the Scala community about the problems with the
 approaches in Scala.  Very smart, level-headed Scala gurus have complained
 about their shortcomings (Rex Kerr's name is coming to mind, though I'm
 not positive about that); there have been numerous well-thought-out
 proposals to give Scala a better enum.  But the powers-that-be in Scala
 always reject them.  IIRC the explanation for rejecting them is basically
 that (a) enums aren't important enough to introduce some new special
 feature -- Scala's got bigger things to work on -- and (b) if you really
 need a good enum, just use Java's enum.

 I doubt it really matters that much for Spark internals, which is why I
 think #4 is fine.  But I figured I'd give my spiel, because every developer
 loves language wars :)

 Imran

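(The advantages Imran lists, sketched against a JDK enum for concreteness:)

import java.util.concurrent.TimeUnit
import java.util.{EnumMap, EnumSet}

object JavaEnumGoodies {
  def main(args: Array[String]): Unit = {
    // values() and valueOf() are generated for every Java enum.
    val all: Array[TimeUnit] = TimeUnit.values()
    val parsed: TimeUnit = TimeUnit.valueOf("SECONDS")

    // EnumSet / EnumMap: compact, ordinal-indexed collections.
    val coarse = EnumSet.of(TimeUnit.MINUTES, TimeUnit.HOURS)
    val labels = new EnumMap[TimeUnit, String](classOf[TimeUnit])
    labels.put(TimeUnit.SECONDS, "s")

    println((all.length, parsed, coarse, labels))
  }
}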



Re: enum-like types in Spark

2015-03-05 Thread Patrick Wendell
Yes -- only new or internal APIs. I doubt we'd break any exposed APIs for
the purpose of cleanup.

Patrick


Re: enum-like types in Spark

2015-03-05 Thread Mridul Muralidharan
While I don't have any strong opinions about how we handle enums
either way in Spark, I assume the discussion is targeted at (new) APIs
being designed in Spark.
Rewiring what we already have exposed will lead to incompatible API
changes (StorageLevel, for example, is in 1.0).

Regards,
Mridul

On Wed, Mar 4, 2015 at 11:45 PM, Aaron Davidson ilike...@gmail.com wrote:
 That's kinda annoying, but it's just a little extra boilerplate. Can you
 call it as StorageLevel.DiskOnly() from Java? Would it also work if they
 were case classes with empty constructors, without the field?

 On Wed, Mar 4, 2015 at 11:35 PM, Xiangrui Meng men...@gmail.com wrote:

 `case object` inside an `object` doesn't show up in Java. This is the
 minimal code I found to make everything show up correctly in both
 Scala and Java:

 sealed abstract class StorageLevel // cannot be a trait

 object StorageLevel {
   private[this] case object _MemoryOnly extends StorageLevel
   final val MemoryOnly: StorageLevel = _MemoryOnly

   private[this] case object _DiskOnly extends StorageLevel
   final val DiskOnly: StorageLevel = _DiskOnly
 }
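
A minimal usage sketch of the pattern above (the describe helper is
hypothetical, not Spark code). The public vals are stable identifiers, so
Scala pattern matches work against them, but since they are vals rather
than case objects the compiler cannot check exhaustiveness -- hence the
catch-all:

def describe(level: StorageLevel): String = level match {
  case StorageLevel.MemoryOnly => "kept deserialized in memory"
  case StorageLevel.DiskOnly   => "kept serialized on disk"
  case _                       => "unknown"  // exhaustiveness is not checked for vals
}

describe(StorageLevel.DiskOnly)  // "kept serialized on disk"

From Java, the same values should be reachable as StorageLevel.MemoryOnly()
via the generated static forwarders, which is where option #4's "()" comes from.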



Re: enum-like types in Spark

2015-03-05 Thread Mridul Muralidharan
I have a strong dislike for Java enums due to the fact that they
are not stable across JVMs -- if one undergoes serde, you can end up with
unpredictable results at times [1]. That is one of the reasons why we
prevent enums from being keys, though it is highly possible users depend
on them internally and shoot themselves in the foot.

It would be better to keep away from them in general and use something more stable.

Regards,
Mridul

[1] Having had to debug this issue for 2 weeks - I really really hate it.
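
One concrete instance of the instability (a sketch in Scala, using the real
Java enum java.util.concurrent.TimeUnit as a stand-in; partitionFor is a
hypothetical helper): Enum.hashCode() is final and identity-based, so it
generally differs across JVMs, while name() is part of an enum's specified,
serde-stable identity.

import java.util.concurrent.TimeUnit

// Identity hash: stable within one JVM, but generally different across
// JVMs and runs, so hashing enum keys on one machine and routing records
// on another can silently disagree.
val unstable: Int = TimeUnit.SECONDS.hashCode()

// name() survives serde unchanged; a name-based hash is the same everywhere.
def partitionFor(u: TimeUnit, numPartitions: Int): Int =
  (u.name.hashCode & Int.MaxValue) % numPartitions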


On Thu, Mar 5, 2015 at 1:08 PM, Imran Rashid iras...@cloudera.com wrote:
 I have a very strong dislike for #1 (Scala Enumerations). I'm ok with #4
 (with Xiangrui's final suggestion, especially making it sealed & available
 in Java), but I really think #2, Java enums, are the best option.

 Java enums actually have some very real advantages over the other
 approaches -- you get values(), valueOf(), EnumSet, and EnumMap. There has
 been endless debate in the Scala community about the problems with the
 approaches in Scala. Very smart, level-headed Scala gurus have complained
 about their shortcomings (Rex Kerr's name is coming to mind, though I'm
 not positive about that); there have been numerous well-thought-out
 proposals to give Scala a better enum, but the powers-that-be in Scala
 always reject them. IIRC the explanation for rejecting them is basically
 that (a) enums aren't important enough to introduce a new special feature
 -- Scala has bigger things to work on -- and (b) if you really need a good
 enum, just use Java's.
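
All four of those utilities are usable from Scala as well; a quick sketch
against java.util.concurrent.TimeUnit, standing in for a hypothetical Spark
enum:

import java.util.concurrent.TimeUnit
import java.util.{EnumMap, EnumSet}

val all    = TimeUnit.values()            // every constant, in declaration order
val parsed = TimeUnit.valueOf("SECONDS")  // lookup by name (throws on unknown names)
val coarse = EnumSet.of(TimeUnit.HOURS, TimeUnit.DAYS)         // compact bit-vector set
val labels = new EnumMap[TimeUnit, String](classOf[TimeUnit])  // array-backed map
labels.put(TimeUnit.SECONDS, "s")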

 I doubt it really matters that much for Spark internals, which is why I
 think #4 is fine.  But I figured I'd give my spiel, because every developer
 loves language wars :)

 Imran




Re: enum-like types in Spark

2015-03-04 Thread Michael Armbrust
#4 with a preference for CamelCaseEnums



enum-like types in Spark

2015-03-04 Thread Xiangrui Meng
Hi all,

There are many places where we use enum-like types in Spark, but in
different ways. Every approach has both pros and cons. I wonder
whether there should be an “official” approach for enum-like types in
Spark.

1. Scala’s Enumeration (e.g., SchedulingMode, WorkerState, etc)

* All types show up as Enumeration.Value in Java.
http://spark.apache.org/docs/latest/api/java/org/apache/spark/scheduler/SchedulingMode.html

2. Java’s Enum (e.g., SaveMode, IOMode)

* Implementation must be in a Java file.
* Values don’t show up in the ScalaDoc:
http://spark.apache.org/docs/latest/api/scala/#org.apache.spark.network.util.IOMode

3. Static fields in Java (e.g., TripletFields)

* Implementation must be in a Java file.
* Doesn’t need “()” in Java code.
* Values don't show up in the ScalaDoc:
http://spark.apache.org/docs/latest/api/scala/#org.apache.spark.graphx.TripletFields

4. Objects in Scala (e.g., StorageLevel)

* Needs “()” in Java code.
* Values show up in both ScalaDoc and JavaDoc:
http://spark.apache.org/docs/latest/api/scala/#org.apache.spark.storage.StorageLevel$
http://spark.apache.org/docs/latest/api/java/org/apache/spark/storage/StorageLevel.html
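
For comparison with the StorageLevel code shown elsewhere in the thread,
a minimal sketch of approach 1 (approximately how Spark defines
SchedulingMode; treat the details as illustrative):

object SchedulingMode extends Enumeration {
  type SchedulingMode = Value
  val FAIR, FIFO, NONE = Value  // each value is typed as Enumeration.Value from Java's view
}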

It would be great if we have an “official” approach for this as well
as the naming convention for enum-like values (“MEMORY_ONLY” or
“MemoryOnly”). Personally, I like 4) with “MEMORY_ONLY”. Any thoughts?

Best,
Xiangrui




Re: enum-like types in Spark

2015-03-04 Thread Joseph Bradley
another vote for #4
People are already used to adding () in Java.




Re: enum-like types in Spark

2015-03-04 Thread Aaron Davidson
I'm cool with #4 as well, but make sure we dictate that the values should
be defined within an object with the same name as the enumeration (like we
do for StorageLevel). Otherwise we may pollute a higher namespace.

e.g. we SHOULD do:

trait StorageLevel
object StorageLevel {
  case object MemoryOnly extends StorageLevel
  case object DiskOnly extends StorageLevel
}
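
For contrast, a sketch of the pollution being warned against (the package
name is illustrative); without the wrapping object, every value escapes
into the enclosing scope:

package org.example

trait StorageLevel
case object MemoryOnly extends StorageLevel  // becomes org.example.MemoryOnly
case object DiskOnly extends StorageLevel    // becomes org.example.DiskOnly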



Re: enum-like types in Spark

2015-03-04 Thread Stephen Boesch
#4 but with MemoryOnly (more scala-like)

http://docs.scala-lang.org/style/naming-conventions.html

Constants, Values, Variable and Methods

Constant names should be in upper camel case. That is, if the member is
final, immutable and it belongs to a package object or an object, it may be
considered a constant (similar to Java’s static final members):

object Container {
  val MyConstant = ...
}





Re: enum-like types in Spark

2015-03-04 Thread Patrick Wendell
I like #4 as well and agree with Aaron's suggestion.

- Patrick


-
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org