I think repeatedly calling the contributors on this list a “cartel” is not 
conducive to a calm and amicable resolution.

You may have some history built up that led you to use that word, but to the 
rest of us it comes out of nowhere; you in fact opened this thread with that 
attack. If you keep making your case in this manner, you will just turn 
everyone against you.

If there is a history of what you feel is others stealing your work, please 
link to a few examples so we can see what you are seeing. If you can’t do that, 
then just focus on this current example. And try to refrain from calling people 
names unless your goal is just to have a fight, as opposed to resolving the 
problematic behavior so you can continue to contribute.

I am not a committer and don’t have any special role in this community. I am 
speaking just as an observer and regular contributor to the project.

> I have experienced this before, as recent as couple of months back ( 
> https://issues.apache.org/jira/browse/SPARK-54386)

For others following along, I took a look at this ticket and the associated 
PRs: #53261 <https://github.com/apache/spark/pull/53261> / #53100 
<https://github.com/apache/spark/pull/53100>

It looks like Asif is upset that he submitted a fix for the same issue a week 
or so prior to the fix that eventually got merged. But the fixes are different, 
and the one that got merged is a lot shorter, though they are both simple. The 
PR that got merged was submitted by someone who appears to be employed by 
Databricks; perhaps this is part of the “cartel” accusation. The two PRs were 
reviewed by different committers, however, and the one that got merged was 
merged in by someone who does _not_ work for Databricks.

I don’t see anything here other than the normal dynamic of a large and busy 
open source project. Committer attention is limited; things fall through the 
cracks; different contributors may occasionally work on the same thing without 
knowing about each other. One small improvement for this specific problem would 
be some way of automatically linking issues that appear to be about the same 
thing.

Nick


> On May 28, 2026, at 11:33 AM, Asif Shahid <[email protected]> wrote:
> 
> Hi Peter, 
> Please see my comments and replies inline.
> 
> On Thu, May 28, 2026 at 6:11 AM Peter Toth <[email protected]> wrote:
>> Hey Asif,
>> 
>> Are you referring to https://github.com/apache/spark/pull/49154/changes vs. 
>> https://github.com/apache/spark/pull/55644/changes? Those are definitely 
>> solving the same issue but I can assure you I wouldn't take any code from 
>> your PR without consulting with you first.
> Yes, indeed, Peter, I am referring to those.
> The fix itself is not indicative of anything, as it is a one-liner, but the 
> test has an uncanny resemblance.
>  
>> As far as I remember, I opened SPARK-56694 / 
>> https://github.com/apache/spark/pull/55644 because I ran into that minor bug 
>> during the implementation of https://github.com/apache/spark/pull/55298.
>  
>> Sorry, I didn't check whether a ticket or PR already existed.
> 
> The following is addressed to the whole cartel:
> I have experienced this before, as recent as couple of months back ( 
> https://issues.apache.org/jira/browse/SPARK-54386)
> I have experienced my personal effort (going into weeks) to debug and 
> reliably reproduce an issue being hijacked by members who opened new PRs 
> without even discussing the proposed fix. (If interested, I can provide 
> details of the PRs / issues I am talking about.)
> I have seen a perfectly valid PR being nixed , by following comment which 
> essentially said
> "  my code of making the cache lookup more effective , would result in 
> greater chances of stale cache being picked,  which already spark suffers 
> from."
> Now the PR was related to collapsing the projects in analysis phase, and side 
> effect was cache pick up being more sensitive. 
> So this is such a frivolous reason to nix the PR , because "staleness" is an 
> underlying existing issue which had nothing to do with my PR. And its more 
> amusing , that if a DB is giving even one wrong result in millions, that 
> makes all the results a suspect in any case. It does not matter at what 
> frequency this occurs. To me the real reason was code complexity ( & more 
> likely  the loss of control of the code to the outsider).
> 
> The reason I call this open source community as cartel, is because, I have 
> seen the way it works pretty closely and have experienced it in the email 
> exchanges which happen on this group.
> For the same PR , same issue,  if advertently or inadvertently , other person 
> ( especially a member) gets his changes pushed, by the virtue of his 
> standing/position and the "for profit" company the person works, how would 
> you give the credit to the original person who discovered the issue first / 
> provided the fix?  
> Why are issues filed by some immediately worked upon by members (some of 
> whom claim to be working full time on Spark)? Is it because certain 
> companies / groups (for-profit companies, mind you) exert undue control, 
> or because the petty newbie has to be in the good books of members (with the 
> hope that at some point they will also reach that position of power)?
> 
> Given the AI advent and such occurrences,  how will you give due credit to 
> the original creators and how do you plan to prevent some member for taking 
> up idea of any old open PR ( which for reasons of complexity and non 
> technical reasons) ,  polishing it up and pushing it as their own?
> 
> I am also curious , am I the only one who is troubled by all this, or there 
> are others who have experienced it?
> 
> Regards
> Asif
>  
>> If you have further improvements please feel free to open a PR.
>> 
>> Best,
>> Peter
>> 
>> On Thu, May 28, 2026 at 8:20 AM Asif Shahid <[email protected]> wrote:
>>> Hi,
>>> I had filed a bug
>>>  https://issues.apache.org/jira/browse/SPARK-45866
>>> 
>>> I had also opened a PR for the same.
>>> 
>>> Now I see that the ticket I  filed is still open, but the issue has been 
>>> fixed using a new ticket 
>>> https://issues.apache.org/jira/browse/SPARK-56694
>>> 
>>> and on top of that the bug test and of course the fix (which in any case 
>>> would be the same) have been taken from my PR:
>>> https://github.com/apache/spark/pull/49154/changes#diff-137d880ff73623bf7a452bb84f9c3dbbb27ba929e7f5e070c6bff68cfc8ec71f
>>> 
>>> To me this is clear unethical conduct of cartel member, unless I am missing 
>>> some valid reason.
>>> 
>>> And the irony is that the fix is still incomplete, as I just found and 
>>> filed a new ticket
>>> https://issues.apache.org/jira/browse/SPARK-57126
>>> 
>>> I know that at least some cartel members are insecure and think of OSS as 
>>> their fiefdom, but I never expected this sort of behaviour.
>>> Regards
>>> Asif
