[GitHub] spark pull request #22249: [SPARK-16281][SQL][FOLLOW-UP] Add parse_url to fu...

2018-12-04 Thread TomaszGaweda
Github user TomaszGaweda closed the pull request at:

https://github.com/apache/spark/pull/22249


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22249: [SPARK-16281][SQL][FOLLOW-UP] Add parse_url to fu...

2018-08-28 Thread TomaszGaweda
Github user TomaszGaweda commented on a diff in the pull request:

https://github.com/apache/spark/pull/22249#discussion_r213232338
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2459,6 +2459,26 @@ object functions {
 StringTrimLeft(e.expr, Literal(trimString))
   }
 
+  /**
+* Extracts a part from a URL.
+*
+* @group string_funcs
+* @since 2.4.0
+*/
+  def parse_url(url: Column, partToExtract: String): Column = withExpr {
--- End diff --

@HyukjinKwon I see now. Yeah, wrapping in  the `Column` will be necessary, 
at least no string concatenation will be required


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22249: [SPARK-16281][SQL][FOLLOW-UP] Add parse_url to fu...

2018-08-28 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22249#discussion_r213227841
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2459,6 +2459,26 @@ object functions {
 StringTrimLeft(e.expr, Literal(trimString))
   }
 
+  /**
+* Extracts a part from a URL.
+*
+* @group string_funcs
+* @since 2.4.0
+*/
+  def parse_url(url: Column, partToExtract: String): Column = withExpr {
--- End diff --

I mean, 

> I would suggest method that return handler for any registered function. 
So that you can write: SqlFunction something = 
spark.(...?).getFunction("parse_url")

Can this support strongly typed one?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22249: [SPARK-16281][SQL][FOLLOW-UP] Add parse_url to fu...

2018-08-28 Thread TomaszGaweda
Github user TomaszGaweda commented on a diff in the pull request:

https://github.com/apache/spark/pull/22249#discussion_r213213173
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2459,6 +2459,26 @@ object functions {
 StringTrimLeft(e.expr, Literal(trimString))
   }
 
+  /**
+* Extracts a part from a URL.
+*
+* @group string_funcs
+* @since 2.4.0
+*/
+  def parse_url(url: Column, partToExtract: String): Column = withExpr {
--- End diff --

@HyukjinKwon Thanks for the suggestion, however now users are complaining 
about stringly-typed system in Spark, there are libs like Frameless from 
Typelevel to archieve a bit more type safety. `expr` is springly-typed, while 
functions in `functions` object or accessed via `UserDefinedFunction` are a bit 
more type safe.


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22249: [SPARK-16281][SQL][FOLLOW-UP] Add parse_url to fu...

2018-08-27 Thread HyukjinKwon
Github user HyukjinKwon commented on a diff in the pull request:

https://github.com/apache/spark/pull/22249#discussion_r213176616
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2459,6 +2459,26 @@ object functions {
 StringTrimLeft(e.expr, Literal(trimString))
   }
 
+  /**
+* Extracts a part from a URL.
+*
+* @group string_funcs
+* @since 2.4.0
+*/
+  def parse_url(url: Column, partToExtract: String): Column = withExpr {
--- End diff --

Can't we just use `expr` instead?


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22249: [SPARK-16281][SQL][FOLLOW-UP] Add parse_url to fu...

2018-08-27 Thread TomaszGaweda
Github user TomaszGaweda commented on a diff in the pull request:

https://github.com/apache/spark/pull/22249#discussion_r213126158
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2459,6 +2459,26 @@ object functions {
 StringTrimLeft(e.expr, Literal(trimString))
   }
 
+  /**
+* Extracts a part from a URL.
+*
+* @group string_funcs
+* @since 2.4.0
+*/
+  def parse_url(url: Column, partToExtract: String): Column = withExpr {
--- End diff --

Ok, tomorrow I will create a Jira and start working on it. Thanks for your 
comments! :)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22249: [SPARK-16281][SQL][FOLLOW-UP] Add parse_url to fu...

2018-08-27 Thread mgaido91
Github user mgaido91 commented on a diff in the pull request:

https://github.com/apache/spark/pull/22249#discussion_r213121794
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2459,6 +2459,26 @@ object functions {
 StringTrimLeft(e.expr, Literal(trimString))
   }
 
+  /**
+* Extracts a part from a URL.
+*
+* @group string_funcs
+* @since 2.4.0
+*/
+  def parse_url(url: Column, partToExtract: String): Column = withExpr {
--- End diff --

I like this idea too


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22249: [SPARK-16281][SQL][FOLLOW-UP] Add parse_url to fu...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22249#discussion_r213120096
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2459,6 +2459,26 @@ object functions {
 StringTrimLeft(e.expr, Literal(trimString))
   }
 
+  /**
+* Extracts a part from a URL.
+*
+* @group string_funcs
+* @since 2.4.0
+*/
+  def parse_url(url: Column, partToExtract: String): Column = withExpr {
--- End diff --

@TomaszGaweda This sounds a good idea by returning a handler for built-in 
functions. cc @rxin 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22249: [SPARK-16281][SQL][FOLLOW-UP] Add parse_url to fu...

2018-08-27 Thread TomaszGaweda
Github user TomaszGaweda commented on a diff in the pull request:

https://github.com/apache/spark/pull/22249#discussion_r213112726
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2459,6 +2459,26 @@ object functions {
 StringTrimLeft(e.expr, Literal(trimString))
   }
 
+  /**
+* Extracts a part from a URL.
+*
+* @group string_funcs
+* @since 2.4.0
+*/
+  def parse_url(url: Column, partToExtract: String): Column = withExpr {
--- End diff --

In long term, I would suggest method that return handler for any registered 
function. So that you can write: SqlFunction something = 
spark.(...?).getFunction("parse_url"). Now `spark.udf.register` returns a 
handler for UDF, something similar for getting any kind of registered function 
may be helpful. However, it's a lot more work, so now I've just proposed to add 
this function :)


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22249: [SPARK-16281][SQL][FOLLOW-UP] Add parse_url to fu...

2018-08-27 Thread TomaszGaweda
Github user TomaszGaweda commented on a diff in the pull request:

https://github.com/apache/spark/pull/22249#discussion_r213110408
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2459,6 +2459,26 @@ object functions {
 StringTrimLeft(e.expr, Literal(trimString))
   }
 
+  /**
+* Extracts a part from a URL.
+*
+* @group string_funcs
+* @since 2.4.0
+*/
+  def parse_url(url: Column, partToExtract: String): Column = withExpr {
--- End diff --

This PR is created after this StackOverflow question: 
https://stackoverflow.com/questions/52041342/how-to-parse-url-in-spark-sqlscala/52043771

I'm not sure how often it is used, however most of functions are available 
in functions object to make Scala and SQL interfaces similar. If you think it's 
useless - please let me know, I'll just close this PR


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22249: [SPARK-16281][SQL][FOLLOW-UP] Add parse_url to fu...

2018-08-27 Thread gatorsmile
Github user gatorsmile commented on a diff in the pull request:

https://github.com/apache/spark/pull/22249#discussion_r213109101
  
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/functions.scala ---
@@ -2459,6 +2459,26 @@ object functions {
 StringTrimLeft(e.expr, Literal(trimString))
   }
 
+  /**
+* Extracts a part from a URL.
+*
+* @group string_funcs
+* @since 2.4.0
+*/
+  def parse_url(url: Column, partToExtract: String): Column = withExpr {
--- End diff --

When adding the functions into `object functions`, we need to check how 
useful they are. 


---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org



[GitHub] spark pull request #22249: [SPARK-16281][SQL][FOLLOW-UP] Add parse_url to fu...

2018-08-27 Thread TomaszGaweda
GitHub user TomaszGaweda opened a pull request:

https://github.com/apache/spark/pull/22249

[SPARK-16281][SQL][FOLLOW-UP] Add parse_url to functions.scala

## What changes were proposed in this pull request?

Add `parse_url` function to `functions.scala`. This will allow users to use 
this functions without calling `selectExpr` or `spark.sql`

## How was this patch tested?

`testUrl` function was changed to test also this change

Please review http://spark.apache.org/contributing.html before opening a 
pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/TomaszGaweda/spark parseUrl

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/22249.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #22249


commit 4002cb9c48b608123845b89af9f77e137626f83f
Author: Tomasz Gaweda 
Date:   2018-08-27T20:05:50Z

SPARK-16281 Follow-up: parse_url in functions.scala




---

-
To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org
For additional commands, e-mail: reviews-h...@spark.apache.org