[GitHub] spark pull request: [SPARK-4999][Streaming] Change storeInBlockMan...

tdas Tue, 06 Jan 2015 01:25:03 -0800

Github user tdas commented on the pull request:

    https://github.com/apache/spark/pull/3906#issuecomment-68843161
  
    If you are using window operations, then data from previous batches may 
need to be accessed multiple times. If we don't put the data in the WAL back in 
memory, the system will have to read the data multiple times from the WAL. 
That's going to be very slow, isn't it?
    
    A smarter thing to do is to figure out (based on the transformations) 
whether the data is going to be required multiple times or not, and store the 
data in the BM accordingly. Just blindly setting it to false will cause a 
performance regression.
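
    To make the window argument concrete, here is a toy model in plain Python 
(not Spark code). It counts how many windows touch each batch for a given 
window length and slide interval, both measured in batches. The function names 
and the "window longer than slide" rule are assumptions made for illustration, 
not anything in the PR.

```python
def reads_per_batch(num_batches, window_len, slide):
    """Count how many windows include each batch.

    A window ending at batch index `end` covers the batches
    (end - window_len + 1) .. end, and a new window is emitted
    every `slide` batches.
    """
    reads = [0] * num_batches
    for end in range(window_len - 1, num_batches, slide):
        for batch in range(end - window_len + 1, end + 1):
            reads[batch] += 1
    return reads


def needs_in_memory_copy(window_len, slide):
    """Hypothetical heuristic: windows overlap when they are longer
    than the slide interval, so some batches are read more than once."""
    return window_len > slide


# Overlapping windows: interior batches are read three times each.
print(reads_per_batch(6, 3, 1))   # [1, 2, 3, 3, 2, 1]
# Non-overlapping (tumbling) windows: every batch is read once.
print(reads_per_batch(6, 3, 3))   # [1, 1, 1, 1, 1, 1]
```

    In this model, every read beyond the first would go back to the WAL if the 
data were not also kept in memory, which is the slowdown described above.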


