://www.linkedin.com/in/salihoztop
--
*From:* Suraj Shetiya surajshet...@gmail.com
*To:* Michael Armbrust mich...@databricks.com
*Cc:* Salih Oztop soz...@yahoo.com; user@spark.apache.org;
megha.sridh...@cynepia.com
*Sent:* Thursday, July 2, 2015
Date: Jul 2, 2015 12:49 AM
Subject: Re: Spark Dataframe 1.4 (GroupBy partial match)
To: Suraj Shetiya surajshet...@gmail.com
Cc: Salih Oztop soz...@yahoo.com, user@spark.apache.org
You should probably write a UDF that uses a regular expression or other
string munging.
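A minimal sketch of the kind of string munging suggested here, written in plain Python against the sample rows from the original question. The helper name `canonical_subject` and the set of prefixes it strips ("Re:", "Fwd:", "Fw:") are assumptions for illustration, not something taken from the thread. In Spark, a function like this could be wrapped as a UDF to derive a column to group on.

```python
import re
from collections import Counter

# Assumption: reply/forward markers are a leading run of "Re:", "Fwd:" or "Fw:".
_PREFIX = re.compile(r"^(?:\s*(?:re|fwd?)\s*:\s*)+", re.IGNORECASE)


def canonical_subject(subject):
    """Strip leading reply/forward prefixes so related mails share one key."""
    return _PREFIX.sub("", subject).strip()


# Sample rows (Date, Subject) copied from the question.
rows = [
    ("2015-01-14", "SEC Inquiry"),
    ("2014-02-12", "Happy birthday"),
    ("2014-02-13", "Re: Happy birthday"),
    ("2015-01-16", "Re: SEC Inquiry"),
    ("2015-01-18", "Fwd: Re: SEC Inquiry"),
]

# Group by the canonical subject and count the rows in each group.
counts = Counter(canonical_subject(subject) for _, subject in rows)

for subject, n in counts.items():
    print(subject, n)
```

Grouping on the derived key rather than on the raw Subject is what makes the partial match work: every variant of a thread collapses to the same string before the group-by runs.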
If you want to count the 2015 records, then it is possible.
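As a rough illustration of counting the 2015 records, here is a plain-Python sketch over the sample rows from the original question. The helper name `year_of` is hypothetical, and the sketch assumes the Date values are ISO-style `YYYY-MM-DD` strings as shown in the sample. An equivalent filter on the Date column would be needed in a DataFrame.

```python
from datetime import datetime

# Sample rows (Date, Subject) copied from the question.
rows = [
    ("2015-01-14", "SEC Inquiry"),
    ("2014-02-12", "Happy birthday"),
    ("2014-02-13", "Re: Happy birthday"),
    ("2015-01-16", "Re: SEC Inquiry"),
    ("2015-01-18", "Fwd: Re: SEC Inquiry"),
]


def year_of(date_string):
    """Parse a YYYY-MM-DD string and return its year as an int."""
    return datetime.strptime(date_string, "%Y-%m-%d").year


# Keep only rows dated 2015, then count them.
count_2015 = sum(1 for date_string, _ in rows if year_of(date_string) == 2015)
print(count_2015)
```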
Kind Regards
Salih Oztop
--
*From:* Suraj Shetiya surajshet...@gmail.com
*To:* user@spark.apache.org
*Sent:* Tuesday, June 30, 2015 3:05 PM
*Subject:* Spark Dataframe 1.4 (GroupBy partial match)
I have a dataset (trimmed and simplified) with 2 columns as below.
Date        Subject
2015-01-14  SEC Inquiry
2014-02-12  Happy birthday
2014-02-13  Re: Happy birthday
2015-01-16  Re: SEC Inquiry
2015-01-18  Fwd: Re: SEC Inquiry
I have imported the same in a
Hi,
I want to obtain a grouped frame from a DataFrame.
A snippet of the column on which I need to perform groupby is below.
df.select("To").show()
To
ArrayBuffer(vance...
ArrayBuffer(vance...
ArrayBuffer(rober...
ArrayBuffer(richa...
ArrayBuffer(guill...
ArrayBuffer(m..pr...
Hi,
I have come across ways of building input/transform/output pipelines
with Java (Google Dataflow, Spark, etc.). I also understand that
Spark itself provides ways of creating a pipeline within MLlib for
ML transforms (primarily fit). Both of the above are available in Java/Scala