[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2016-05-05 Thread aa8y
Github user aa8y closed the pull request at:

https://github.com/apache/spark/pull/10247


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2016-05-05 Thread aa8y
Github user aa8y commented on the pull request:

https://github.com/apache/spark/pull/10247#issuecomment-217194679
  
Won't do.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2016-05-05 Thread srowen
Github user srowen commented on the pull request:

https://github.com/apache/spark/pull/10247#issuecomment-217117009
  
I believe this PR should be closed


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2015-12-14 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/10247#discussion_r47579246
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/Row.scala ---
@@ -316,6 +317,21 @@ trait Row extends Serializable {
   }
 
   /**
+   * Returns the Optional value at position i.
+   * If the value is null or if it cannot be cast to the requested type, 
None is returned.
+   */
+  def getAsOpt[T](i: Int): Option[T] = {
+val value = {
+  if (i < length) {
+if (isNullAt(i)) None
+else catching(classOf[ClassCastException]) opt getAs[T](i)
--- End diff --

We typically avoid infix notation and this style of `if`.

https://cwiki.apache.org/confluence/display/SPARK/Spark+Code+Style+Guide

Also, why do we want to hide `ClassCastException` from the user, this seems 
more confusing than helpful.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2015-12-14 Thread aa8y
Github user aa8y commented on a diff in the pull request:

https://github.com/apache/spark/pull/10247#discussion_r47579991
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/Row.scala ---
@@ -325,6 +341,14 @@ trait Row extends Serializable {
   def getAs[T](i: Int): T = get(i).asInstanceOf[T]
 
   /**
+   * Returns the Optional value of a given fieldName.
+   * If the value is null or if it cannot be cast to the requested type, 
None is returned. None is
+   * also returned if either the schema is not defined or else the given 
fieldName does not exist
+   * in the schema.
+   */
+  def getAsOpt[T](fieldName: String): Option[T] = None
--- End diff --

I considered returning a `Try`. But there are a lot of cases, personally, 
where I don't care about the error since I know the kind of bad data I can 
expect. But that's just me.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2015-12-14 Thread aa8y
Github user aa8y commented on the pull request:

https://github.com/apache/spark/pull/10247#issuecomment-164600717
  
@marmbrus I agree. I wasn't a fan of `getAsOpt` but couldn't think of a 
better name then. I've updated them all to `getOption`. So does this look 
better now?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2015-12-14 Thread aa8y
Github user aa8y commented on the pull request:

https://github.com/apache/spark/pull/10247#issuecomment-164604835
  
Isn't the whole point of `Option` to not throw exceptions? Actually it's to 
not return `null`s I guess. But throwing an exception would defeat the purpose 
of returning an `Option`. If a person really wants that information, (s)he can 
still use the existing `getAs` methods.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2015-12-14 Thread aa8y
Github user aa8y commented on the pull request:

https://github.com/apache/spark/pull/10247#issuecomment-164605606
  
What would you propose as a solution instead?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2015-12-14 Thread aa8y
Github user aa8y commented on the pull request:

https://github.com/apache/spark/pull/10247#issuecomment-164594278
  
@jodersky: I incorporated your recommendations and also updated my branch 
with the current master. Can you or one of the admins please ask Jenkins to 
test this build. Thanks!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2015-12-14 Thread jodersky
Github user jodersky commented on the pull request:

https://github.com/apache/spark/pull/10247#issuecomment-164596063
  
Great! Unfortunately I can't help with the tests though.
Check out this 
[page](https://cwiki.apache.org/confluence/display/SPARK/Committers#Committers-ReviewProcessandMaintainers)
 to get a list of maintainers and their components. In this case I think @rxin 
and @marmbrus should be the ones to check your PR


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2015-12-14 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/10247#issuecomment-164605446
  
I think the point of `Option` is typesafe null handling, not hiding 
structural problems with your code (i.e. using the wrong type for a column).  
If this fails then you are going to get `None` always, not just for bad data 
(because the type of a column will always be the same.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2015-12-14 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/10247#discussion_r47579389
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/Row.scala ---
@@ -325,6 +341,14 @@ trait Row extends Serializable {
   def getAs[T](i: Int): T = get(i).asInstanceOf[T]
 
   /**
+   * Returns the Optional value of a given fieldName.
+   * If the value is null or if it cannot be cast to the requested type, 
None is returned. None is
+   * also returned if either the schema is not defined or else the given 
fieldName does not exist
+   * in the schema.
+   */
+  def getAsOpt[T](fieldName: String): Option[T] = None
--- End diff --

Again, I'm not sure we want to hide errors from the user by returning 
`None`.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2015-12-14 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/10247#issuecomment-164606013
  
Don't catch the `ClassCastException` or throw 
`UnsupportedOperationException` when the schema is not present (which will 
happen if you use `fieldIndexes` anyway).  Just check out the structure of the 
other get by field name methods.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2015-12-14 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/10247#discussion_r47579021
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/Row.scala ---
@@ -316,6 +317,21 @@ trait Row extends Serializable {
   }
 
   /**
+   * Returns the Optional value at position i.
+   * If the value is null or if it cannot be cast to the requested type, 
None is returned.
+   */
+  def getAsOpt[T](i: Int): Option[T] = {
--- End diff --

`getOption` would be more inline with similar functions in the Scala 
collections API (i.e. `headOption` `reduceOption`).  We also typically avoid 
abbreviation unless its very common.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2015-12-14 Thread marmbrus
Github user marmbrus commented on the pull request:

https://github.com/apache/spark/pull/10247#issuecomment-164601273
  
I still don't think we should swallow class cast exceptions.  It means the 
person doesn't know the schema of their data (and they probably want to know 
they are wrong).  Similarly I'd throw `UnsupportedOperationException` as we do 
elsewhere when the schema is not available.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2015-12-14 Thread marmbrus
Github user marmbrus commented on a diff in the pull request:

https://github.com/apache/spark/pull/10247#discussion_r47580230
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/Row.scala ---
@@ -325,6 +341,14 @@ trait Row extends Serializable {
   def getAs[T](i: Int): T = get(i).asInstanceOf[T]
 
   /**
+   * Returns the Optional value of a given fieldName.
+   * If the value is null or if it cannot be cast to the requested type, 
None is returned. None is
+   * also returned if either the schema is not defined or else the given 
fieldName does not exist
+   * in the schema.
+   */
+  def getAsOpt[T](fieldName: String): Option[T] = None
--- End diff --

Yeah, but this is data coming out of Spark SQL, so the type will always be 
correct.  Bad data would not make it this far.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2015-12-11 Thread jodersky
Github user jodersky commented on a diff in the pull request:

https://github.com/apache/spark/pull/10247#discussion_r47417301
  
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/Row.scala ---
@@ -325,6 +341,14 @@ trait Row extends Serializable {
   def getAs[T](i: Int): T = get(i).asInstanceOf[T]
 
   /**
+   * Returns the Optional value of a given fieldName.
+   * If the value is null or if it cannot be cast to the requested type, 
None is returned. None is
+   * also returned if either the schema is not defined or else the given 
fieldName does not exist
+   * in the schema.
+   */
+  def getAsOpt[T](fieldName: String): Option[T] = 
None.asInstanceOf[Option[T]]
--- End diff --

Nitpick: `asInstanceOf` is not required here


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2015-12-11 Thread jodersky
Github user jodersky commented on a diff in the pull request:

https://github.com/apache/spark/pull/10247#discussion_r47417343
  
--- Diff: 
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/rows.scala
 ---
@@ -211,6 +211,18 @@ class GenericRowWithSchema(values: Array[Any], 
override val schema: StructType)
   protected def this() = this(null, null)
 
   override def fieldIndex(name: String): Int = schema.fieldIndex(name)
+
+  override def getAsOpt[T](fieldName: String): Option[T] = {
+val index = schema match {
+  case null => -1
+  case _ =>
+val fieldNames = schema.fieldNames
+if (fieldNames.contains(fieldName)) fieldNames.indexOf(fieldName) 
else -1
+}
+val value = if (index < 0) None else getAsOpt[T](index)
+
+value.asInstanceOf[Option[T]]
--- End diff --

Nitpick: `asInstanceOf` is not required here


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2015-12-09 Thread aa8y
GitHub user aa8y opened a pull request:

https://github.com/apache/spark/pull/10247

[SPARK-11962] Added getAsOpt functions to Row and tests for it.

getAsOpt[T] functions have been added to Row and GeneicRowWithSchema to get 
the values present in a row object optionally. Corresponding tests have also 
been added. Filed a pull request again after fixing style issues.

This is a new pull request created to replace #10028.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/aa8y/spark master

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/10247.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #10247


commit 7e187055deea5feef2669b18a906b49dc4d24443
Author: Arun Allamsetty 
Date:   2015-12-10T06:02:31Z

Added getAsOpt methods and tests for it.




---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request: [SPARK-11962] Added getAsOpt functions to Row ...

2015-12-09 Thread AmplabJenkins
Github user AmplabJenkins commented on the pull request:

https://github.com/apache/spark/pull/10247#issuecomment-163522426
  
Can one of the admins verify this patch?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org