potiuk commented on issue #27674: URL: https://github.com/apache/airflow/issues/27674#issuecomment-1320993162
Hello @alexott - I have always valued your work, and please don't take this personally. This comment is not a personal attack on you in any way, but your comment came to me as a bit of a shock. I presume you are speaking in the name of Databricks as an organisation here; if not, I am even more shocked.

Are you seriously saying that the work of volunteers who prepare and release the ASF software, mostly in their free time (or time sponsored by some responsible players), should be delayed because a commercial company - Databricks - interested in their (Databricks) integration could not afford any of their employees to spend any time on it during their day job? Do I interpret your words correctly? Are you seriously saying that Databricks expects their integration with Airflow to work and get tested, but is not able to allocate time for the employees working on it to verify it during their day job, and instead expects the Airflow PMC to change its process and thereby impact (and delay) the general release process of 70+ providers? If that's the case, I am literally shocked - even stunned. It shows a great misunderstanding of the role and the contributions a company can make to a project it deeply cares about. (Again - I have nothing against you personally; it's a comment to your organisation.)

> The problem here was that voting was open only during the work week, not over the weekend as usual. And I have time only over weekend. I agree about having test coverage, but it work in progress.

There is no "usual" here. I checked, and in the past voting sometimes started on Saturday, sometimes on Monday, and sometimes on Thursday. The 72 hours that the ASF requires is there for a reason.
It should give not only a lot of time for a reaction, but should also account for geographical locations: https://www.apache.org/foundation/voting.html

> Voting periods should generally run for at least 72 hours to provide an opportunity for all concerned persons to participate, regardless of their geographic location.

Yes, there is "at least", but there is nothing about "make sure it goes through a weekend". Surely companies like Databricks have an on-call rotation with "hours" SLAs for their critical problems, and a 72-hour notice is quite a lot of time for the company to be able to react. Do I understand the situation correctly?

Just another comment here - if your organisation really wants to have control over the release process and testing, it's perfectly possible. We could actually let Databricks release their own provider (not as a community one). Databricks is free to do that - we can move "apache-airflow-providers-databricks" out and let Databricks release their own "databricks-airflow-provider" (same as Great Expectations did). Then you will have full control over releases, testing and changes. You (Databricks) can do it very easily.

Being part of the community and of an ASF-ruled project means that you take both the benefits and the limitations that come with that. We do all the release process, and we have a number of people (like @kazanzhy) who contribute and improve the stuff there. This all comes as a benefit for Databricks, as they don't necessarily have to put their own effort into improving and maintaining the libraries. And it comes with a cost. The cost is that the ASF release process has to be followed. The cost is that you should make sure the code is sufficiently covered by unit tests. Finally, the cost is that when we put the provider up for release, Databricks has 72 hours to see if they are ok with the release - with all the prior notifications, warnings, and even a detailed list of changes so that Databricks can focus just on this.
And even more - with AIP-47, you have an opportunity (as Databricks) to create, maintain, develop and RUN system tests for your provider. We have a very well-defined way of building and adding those tests, and we EXPECT Databricks (similarly to Amazon and Google, who paved the way) to eventually run and report the status of the end-to-end integration of their provider. That would have caught the problem way earlier. You are absolutely free to invest your time and effort to make those examples/system tests fully automatically runnable, to have an account to run them, and to actually run and report their results at whatever frequency you think is appropriate. This is something Databricks could do today; it just needs the investment. And this is the way the "testability" of the Databricks provider can be improved - not by shifting voting to weekends. I'd really love for Databricks to make that investment and improve the test quality. If they care about the quality of their integration, AIP-47 is precisely about that, and the best way of doing it.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
