Github user henryr commented on the issue:

    https://github.com/apache/spark/pull/19683
  
    My guess is that it's safe to do so in our case because of the immediate 
projection that happens. In general, emitting JoinedRows where the RHS row is 
shared between all JoinedRows could be a problem if some operator mutates that 
RHS row. There may be other issues that I'm not aware of (perhaps memory 
management concerns about transferring resources between operators).
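
    To make the aliasing concern concrete, here is a toy sketch in Python. It is 
not Spark code: `JoinedRow` and `project` below are hypothetical stand-ins that 
only mimic the relevant behaviour (a joined row holds references to its two 
sides; a projection copies values out). It assumes the hazard is simply that a 
later mutation of the shared RHS row is visible through every JoinedRow that was 
not projected first.

```python
# Toy model only: hypothetical stand-ins, not Spark's actual classes.
class JoinedRow:
    """A view over two rows; it keeps references and copies nothing."""

    def __init__(self, left, right):
        self.left = left
        self.right = right

    def get(self, i):
        # Ordinals below len(left) come from the left row, the rest from the right.
        if i < len(self.left):
            return self.left[i]
        return self.right[i - len(self.left)]


def project(row, ordinals):
    """Copy the selected fields into a fresh tuple that shares no state."""
    return tuple(row.get(i) for i in ordinals)


shared_rhs = ["payload"]          # one RHS row reused for every output row
left_rows = [[1], [2], [3]]

# Case 1: JoinedRows are buffered without being projected.
buffered = [JoinedRow(left, shared_rhs) for left in left_rows]

# Case 2: each JoinedRow is projected immediately, so values are copied out.
projected = [project(JoinedRow(left, shared_rhs), [0, 1]) for left in left_rows]

# Some later operator mutates the shared RHS row in place.
shared_rhs[0] = "overwritten"

# Every buffered JoinedRow now observes the mutation; the projected rows do not.
aliased = [row.get(1) for row in buffered]
```

    In this sketch the projected rows are unaffected by the mutation, which is 
the property the immediate downstream projection would have to provide.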
    
    @davies @cloud-fan you collaborated on SPARK-13476. Do you have any 
input on whether it's safe for us to skip the projection inside the generator 
if we know the next operator will immediately do a projection? (For context: 
projecting every JoinedRow into an UnsafeRow inside a generator is very slow 
when the input row is huge (e.g. contains a large array), and that is usually 
wasted work because most of the input row gets projected out immediately 
anyway.)

