Once you have converted your data to a DataFrame (take a look at spark-csv), try
df.write.partitionBy("", "mm").save("...").
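
In case it helps to picture the result: partitionBy writes one directory level per partition column, named column=value, and leaves those columns out of the data files themselves. Below is a rough plain-Python sketch (standard library only, not Spark) that mimics that layout on the local filesystem using the two sample rows from the question. The column name "yyyy", the helper write_partitioned and the file name part-00000.csv are assumptions made for illustration only; real Spark output uses its own part-file names and writes Parquet by default.

```python
import csv
import io
import os
import tempfile
from collections import defaultdict

# Sample rows from the question. The name of the fourth column is missing
# in the message, so "yyyy" is an assumption made for this sketch.
SAMPLE = """a,b,c,yyyy,mm
a1,b1,c1,2015,09
a2,b2,c2,2014,08
"""


def write_partitioned(rows, base_dir, partition_cols):
    """Hypothetical helper: group rows by the partition columns and write one
    CSV per group under base_dir/col1=val1/col2=val2/, dropping the partition
    columns from the file contents."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple((col, row[col]) for col in partition_cols)
        data = {k: v for k, v in row.items() if k not in partition_cols}
        groups[key].append(data)

    written = []
    for key, members in groups.items():
        # One directory level per partition column, named column=value.
        out_dir = os.path.join(base_dir, *[f"{col}={val}" for col, val in key])
        os.makedirs(out_dir, exist_ok=True)
        # Placeholder file name; Spark generates its own part-file names.
        out_path = os.path.join(out_dir, "part-00000.csv")
        with open(out_path, "w", newline="") as fh:
            writer = csv.DictWriter(fh, fieldnames=list(members[0].keys()))
            writer.writeheader()
            writer.writerows(members)
        written.append(os.path.relpath(out_path, base_dir))
    return sorted(written)


def demo():
    """Write the sample rows into a fresh temp directory, partitioned by yyyy and mm."""
    rows = list(csv.DictReader(io.StringIO(SAMPLE)))
    base_dir = tempfile.mkdtemp()
    return base_dir, write_partitioned(rows, base_dir, ["yyyy", "mm"])


if __name__ == "__main__":
    base_dir, paths = demo()
    for path in paths:
        print(path)
```

Note that the directories come out as yyyy=2015/mm=09 rather than a bare 2015/09, which is also what lets Spark recover the partition columns when the data is read back.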
On Thu, Oct 1, 2015 at 4:11 PM, haridass saisriram <
haridass.saisri...@gmail.com> wrote:
> Hi,
>
> I am trying to find a simple example to read a data file on HDFS. The
> file
Hi,
I am trying to find a simple example to read a data file on HDFS. The
file has the following format
a , b , c ,,mm
a1,b1,c1,2015,09
a2,b2,c2,2014,08
I would like to read this file and store it in HDFS partitioned by year and
month. Something like this
/path/to/hdfs//mm
I want to