Hi Jacek,
Thanks for your reply. The example you provided is actually the same as the
second option in my original email. What I'm wondering about is the difference
between those two options in terms of query performance or efficiency.
On Tue, May 14, 2019 at 3:51 PM Jacek Laskowski wrote:
Hi,
For this particular case I'd use Column.substr (
http://spark.apache.org/docs/latest/api/scala/index.html#org.apache.spark.sql.Column),
e.g.
scala> val ns = Seq(("hello world", 1, 5)).toDF("w", "b", "e")
scala> ns.select($"w".substr($"b", $"e" - $"b" + 1) as "demo").show
+-----+
| demo|
+-----+
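
Applied to the URL, START and END columns from the original question, the
same approach might look like the sketch below. It assumes START and END are
1-based positions with END inclusive; the sample row, the object name
SubstrSketch and the output column name PART are made up for illustration.
The explain() call at the end prints the physical plan, which is one way to
compare this against another variant when performance is the concern.

```scala
import org.apache.spark.sql.SparkSession

object SubstrSketch {
  def main(args: Array[String]): Unit = {
    // Local session for experimenting; in spark-shell `spark` already exists.
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("substr-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample row; START and END are assumed 1-based, END inclusive.
    val urls = Seq(("http://spark.apache.org/docs", 8, 12))
      .toDF("URL", "START", "END")

    // Column.substr(startPos: Column, len: Column) takes a length,
    // so the length is derived as END - START + 1.
    val result = urls.select(
      $"URL".substr($"START", $"END" - $"START" + 1) as "PART"
    )

    result.show()
    // Print the physical plan to see how the expression is evaluated.
    result.explain()

    spark.stop()
  }
}
```

Running explain() on each candidate query and comparing the printed plans
is usually a reasonable first check before measuring anything.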
For example, I have a dataframe with 3 columns: URL, START, END. For each
url from URL column, I want to fetch a substring of it starting from START
and ending at END.
+---+------+----+
|URL|START |END |
+---+------+----+