+1
On Wed, Mar 8, 2023 at 10:40 PM Winston Lai wrote:
> +1, any webinar on Spark related topic is appreciated
>
> Thank You & Best Regards
> Winston Lai
> --
> *From:* asma zgolli
> *Sent:* Thursday, March 9, 2023 5:43:06 AM
> *To:* karan alang
> *Cc:* Mich
Hey Mich,

My 2 cents on top of Jerry's: for reusable @pytest.fixture functions shared
across your tests, I'd leverage conftest.py and put all of them there - if
the number is not too big. Otherwise, as you say, you can create a
tests/fixtures package and place all of them there.

In terms of extractHiveData as a @fixture, it is
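A minimal conftest.py sketch of the idea above. The SparkSession settings and the body of the extractHiveData-style fixture are assumptions (the table name is a placeholder), so adapt them to your project:

```python
# conftest.py - a minimal sketch; the SparkSession settings and the
# extract_hive_data body are assumptions, adapt them to your project.
import pytest


@pytest.fixture(scope="session")
def spark():
    # Lazy import so this file parses even where pyspark is absent;
    # in a real project a top-level import is fine.
    from pyspark.sql import SparkSession

    session = (
        SparkSession.builder
        .master("local[2]")
        .appName("pytest-spark")
        .getOrCreate()
    )
    yield session
    session.stop()


@pytest.fixture
def extract_hive_data(spark):
    # Hypothetical fixture wrapping the Hive extraction step;
    # replace the placeholder table name with your own.
    return spark.sql("SELECT * FROM my_db.my_table")
```

Any test that names `spark` or `extract_hive_data` as a parameter then gets the fixture injected automatically, with the session-scoped SparkSession created once per test run.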
Hello,

My 2 cents: that will be an integration test writing to a 'dev' database
(which you might pre-populate and clean up after your runs, so you can have
repeatable data). Then either you:

1 - use normal SQL and assert that the values you store in your dataframe
are the same as what you get
> Mich
>
> *Disclaimer:* Use it at your own risk. Any and all responsibility for
> any loss, damage or destruction of data or any other property which may
> arise from relying on this email's technical content is explicitly
> disclaimed. The author will in no case be liable for any monetary damages
> arising from such loss, damage or destruction.
>> end = start + numRows - 1
>>
>> print("starting at ID = ", start, ", ending on = ", end)
>>
>> Range = range(start, end + 1)
>>
>> ## This traverses through the Range and increments "x" by one unit each
>> time, and that x value i
Copying and pasting your code in a Jupyter notebook works fine - that is,
using my own version of Range, which is simply a list of numbers.

How about this - does it work fine?

list(map(lambda x: (x, clustered(x, numRows)), [1, 2, 3, 4]))

If it does, I'd look at what's inside your Range and what you
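A self-contained version of that one-liner. The real `clustered()` comes from the original script; the stand-in below is only there to show that `map` plus a lambda over a plain list behaves as expected:

```python
# Stand-in for the real clustered() from the thread's script - replace
# with your own definition; it is only here to make the snippet runnable.
numRows = 4


def clustered(x, rows):
    return (x - 1) / rows


# Same shape as the one-liner from the email: pair each x with
# clustered(x, numRows).
pairs = list(map(lambda x: (x, clustered(x, numRows)), [1, 2, 3, 4]))
print(pairs)  # [(1, 0.0), (2, 0.25), (3, 0.5), (4, 0.75)]
```

If this works but the original loop does not, the problem is likely in how `Range` is built rather than in the mapping itself.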
Hey,

They are good libraries to get you started; I have used both of them.
Unfortunately - as far as I saw when I started to use them - only a few
people maintain them. But you can get pointers out of them for writing
tests. The code below can get you started.

What you'll need is:
- a method to
Hey,

My 2 cents on CI/CD for PySpark. You can leverage pytest + Holden Karau's
Spark testing libs for CI, thus giving you `almost` the same functionality
as Scala - I say almost, as in Scala you have nice and descriptive FunSpecs.

For me the choice is based on expertise. Having worked with teams which