
Re: Explode row with start and end dates into row for each date

2016-06-22 Thread Saurabh Sardeshpande

I don't think there would be any issues, since MLlib is part of Spark rather than an external package. Most of the problems I've had to deal with were caused by both versions of Python existing on the same system, not by Python 3 itself. On Wed, Jun 22, 2016 at 3:51 PM, John Aherne

Re: Explode row with start and end dates into row for each date

2016-06-22 Thread John Aherne

Thanks Saurabh! That explode function looks like exactly what I need. We will be using MLlib quite a lot. Do I have to worry about Python versions for that? John On Wed, Jun 22, 2016 at 4:34 PM, Saurabh Sardeshpande wrote: > Hi John, > > If you can do it in Hive,

Re: Explode row with start and end dates into row for each date

2016-06-22 Thread Saurabh Sardeshpande

Hi John, If you can do it in Hive, you should be able to do it in Spark. Just make sure you import HiveContext instead of SQLContext. If your intent is to explore rather than get stuff done, I'm not aware of any RDD operations that do this for you, but there is a DataFrame operation called

Explode row with start and end dates into row for each date

2016-06-22 Thread John Aherne

Hi Everyone, I am pretty new to Spark (and the mailing list), so forgive me if the answer is obvious. I have a dataset, and each row contains a start date and end date. I would like to explode each row so that each day between the start and end dates becomes its own row. e.g. row1 2015-01-01
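
For illustration, the transformation being asked about can be sketched in plain Python, independent of any Spark API. This is only a sketch under stated assumptions: the function name explode_by_day, the (row_id, start, end) row layout, the inclusive end date, and the sample end date of 2015-01-03 are all hypothetical and do not come from the thread.

```python
from datetime import date, timedelta


def explode_by_day(row_id, start, end):
    """Yield one (row_id, day) tuple per calendar day from start to end.

    Assumption: both endpoints are included. If end is before start,
    nothing is yielded.
    """
    n_days = (end - start).days
    for offset in range(n_days + 1):
        yield (row_id, start + timedelta(days=offset))


# Hypothetical sample data: one row spanning three days.
rows = [("row1", date(2015, 1, 1), date(2015, 1, 3))]

# Flatten: every input row contributes one output row per day.
exploded = [out for row in rows for out in explode_by_day(*row)]

for out in exploded:
    print(out[0], out[1].isoformat())
```

A per-row generator like this is the shape of function a flat-map style operation expects, so in Spark it could plausibly be passed to something like rdd.flatMap(lambda r: explode_by_day(*r)); that wiring is an assumption and is not tested here.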