Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1415#issuecomment-48996149
I looked into the event logger code, and it appears that the codec change
should be fine. It figures out the codec for old data automatically anyway.
---
If your project is set
Github user andrewor14 commented on the pull request:
https://github.com/apache/spark/pull/1415#issuecomment-48996256
Yes, we log the codec used in a separate file so we don't lock ourselves
out of our old event logs. This change seems fine.
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1415#issuecomment-48996763
@andrewor14 do we also log the block size, etc. of the codec used?
If yes, then at least for event data we should be fine.
IIRC we use the codec to compress
Github user SparkQA commented on the pull request:
https://github.com/apache/spark/pull/1415#issuecomment-49001592
QA results for PR 1415:
- This patch FAILED unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1415#issuecomment-49005728
Weird test failures, unrelated to this change.
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1415#issuecomment-49005818
Ah yes, the block size is only used at compression time and is inferred from
the stream during decompression.
Then the class name alone should be sufficient.
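
For illustration only, here is a small Python sketch of the same property using bzip2 from the standard library. bzip2 is an assumption chosen for convenience and is not one of the codecs discussed in this pull request; the sample data and variable names are hypothetical. It shows a block size being chosen by the writer, stored in the stream header, and never passed to the reader.

```python
import bz2

# Hypothetical sample data standing in for an event log.
data = b"spark event log line\n" * 10_000

# For bzip2, compresslevel 1..9 selects a block size of 100k..900k bytes.
small_blocks = bz2.compress(data, compresslevel=1)
large_blocks = bz2.compress(data, compresslevel=9)

# The block size is recorded in the 4-byte stream header: b"BZh" + a digit.
header_small = small_blocks[:4]
header_large = large_blocks[:4]

# Decompression takes no block-size argument; it reads it from the stream.
restored_small = bz2.decompress(small_blocks)
restored_large = bz2.decompress(large_blocks)
```

Under that assumption, a reader only needs to know which codec produced the stream, not how the codec was configured.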
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1415#issuecomment-49005883
Yea the test failure isn't related.
If there is no objection, I'm going to merge this tomorrow. I will file a
jira ticket so we can prepend compression codec
Github user mridulm commented on the pull request:
https://github.com/apache/spark/pull/1415#issuecomment-49006312
Can't comment on Tachyon since we don't use it and unfortunately have no
experience with it.
I am fine with this change for the rest.
Github user pwendell commented on the pull request:
https://github.com/apache/spark/pull/1415#issuecomment-49090832
@rxin IIRC at one point we changed this before and it caused a performance
regression for our perf suite so we reverted it. At the time I think we were
running on
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1415#issuecomment-49099980
Yea - stability seems much more important than a small performance gain
Github user vanzin commented on the pull request:
https://github.com/apache/spark/pull/1415#issuecomment-49100370
Only the codec names are stored in the event logs; no other information is
currently recorded. But this change isn't really breaking anything in that
area. (And, by
Github user rxin commented on the pull request:
https://github.com/apache/spark/pull/1415#issuecomment-49100573
FYI filed JIRA: https://issues.apache.org/jira/browse/SPARK-2496
Compression streams should write its codec info to the stream
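
A rough sketch of what "write codec info to the stream" could look like, in Python rather than Spark's Scala. Everything here is an assumption made for illustration: the framing (a 2-byte length followed by the codec name), the `CODECS` registry, and the function names `write_with_codec` and `read_with_codec` are hypothetical and do not describe Spark's implementation or the eventual fix for SPARK-2496.

```python
import bz2
import lzma
import struct
import zlib
from typing import Tuple

# Hypothetical registry mapping a codec name to (compress, decompress).
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


def write_with_codec(codec_name: str, payload: bytes) -> bytes:
    """Prefix the compressed payload with the name of the codec used."""
    compress, _ = CODECS[codec_name]
    name = codec_name.encode("utf-8")
    # Assumed framing: 2-byte big-endian name length, name, compressed body.
    return struct.pack(">H", len(name)) + name + compress(payload)


def read_with_codec(blob: bytes) -> Tuple[str, bytes]:
    """Read the codec name from the header, then decompress the body."""
    (name_len,) = struct.unpack_from(">H", blob, 0)
    name = blob[2:2 + name_len].decode("utf-8")
    _, decompress = CODECS[name]
    return name, decompress(blob[2 + name_len:])
```

With a header like this, changing the default codec would not affect the ability to read older data, because each stream names its own codec.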
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/1415