potiuk commented on issue #27674: URL: https://github.com/apache/airflow/issues/27674#issuecomment-1320993162
Hello @alexott - I have always valued your work, and please don't take this personally. This comment is not a personal attack on you in any way, but your comment came to me as a bit of a shock. I presume you are speaking in the name of Databricks as an organisation here; if not, I am even more shocked.

Are you seriously saying that the work of volunteers who prepare and release the ASF software, mostly in their free time (or time sponsored by some responsible players), should be delayed because a commercial company - Databricks - interested in their (Databricks) integration could not afford any of their employees to spend any time on it during their day job? Do I interpret your words correctly? Are you seriously saying that Databricks expects their integration with Airflow to work and get tested, but is not able to allocate time for the employees working on it to verify it during their day job, and instead expects the Airflow PMC to change its process and thereby impact (and delay) the general release process of 70+ providers? If that's the case, I am literally shocked - even stunned. It shows a great misunderstanding of the role and the contributions a company can make to a project it deeply cares about. (Again - I have nothing against you personally; it's a comment to your organisation.)

> The problem here was that voting was open only during the work week, not over the weekend as usual. And I have time only over weekend. I agree about having test coverage, but it work in progress.

There is no "usual" here. I checked, and in the past voting sometimes started on Saturday, sometimes on Monday, and sometimes on Thursday. The 72 hours that the ASF requires is there for a reason.
It should give not only a lot of time for a reaction, but should also account for geographical locations: https://www.apache.org/foundation/voting.html

> Voting periods should generally run for at least 72 hours to provide an opportunity for all concerned persons to participate, regardless of their geographic location.

Yes, there is "at least", but there is nothing about "make sure it goes through a weekend". Surely companies like Databricks have an on-call rotation with "hours" SLAs for their critical problems, and a 72-hour notice is quite a lot of time for the company to be able to react. Do I understand the situation correctly?

Just another comment here - if your organisation really wants to have control over the release process and testing, it's perfectly possible. We could actually let Databricks release their own provider (not as a community one). Databricks is free to do that - we can move "apache-airflow-providers-databricks" out and let Databricks release their own "databricks-airflow-provider" (same as Great Expectations did). Then you will have full control over releases, testing and changes. You (Databricks) can do it very easily.

Being part of the community and of an ASF-ruled project means that you take both the benefits and the limitations that come with that. We do all the release process, and we have a number of people (like @kazanzhy) who contribute and improve the stuff there. This all comes as a benefit for Databricks, as they don't necessarily have to put their own effort into improving and maintaining the libraries. And it comes with a cost. The cost is that the ASF release process has to be followed. The cost is that you should make sure the code is sufficiently covered by unit tests. Finally, the cost is that when we put the provider up for release, Databricks has 72 hours to see if they are ok with the release - with all the prior notifications, warnings, and even a detailed list of changes so that Databricks can focus just on this.
And even more - with AIP-47, you have an opportunity (as Databricks) to create, maintain, develop and RUN system tests for your provider. We have a very well-defined way of building and adding those tests, and we EXPECT Databricks (similarly to Amazon and Google, who paved the way) to eventually run and report the status of the end-to-end integration of their provider. That would have caught the problem way earlier. You are absolutely free to invest your time and effort to make those examples/system tests fully automatically runnable, to have an account to run them, and to actually run and report their results at whatever frequency you think is appropriate. This is something Databricks could do today; it just needs the investment. And this is the way the "testability" of the Databricks provider can be improved - not by shifting voting to weekends. I'd really love for Databricks to make that investment and improve the test quality. If they care about the quality of their integration, AIP-47 is precisely about that, and the best way of doing it.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at: [email protected]
