[GitHub] spark pull request: [SPARK-9643] Upgrade pyrolite to 4.9

2015-10-19 Thread angelini
Github user angelini commented on the pull request: https://github.com/apache/spark/pull/7950#issuecomment-149241211 The timezone issue was due to our code, and we fixed it out of band. We've been running with 4.9 for a couple of months with no problems. :+1: to bumping it.

[GitHub] spark pull request: [SPARK-9643] Upgrade pyrolite to 4.9

2015-08-12 Thread angelini
Github user angelini commented on the pull request: https://github.com/apache/spark/pull/7950#issuecomment-130485312 For the first part: if there were a flag to say "Do not verify schema", it would avoid the following block: https://github.com/apache/spark/blob/mas
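The block being discussed verifies each row against the schema before converting it, and the proposed flag would simply skip that verification pass. A minimal sketch of the verify-then-convert pattern with an opt-out flag (plain Python, not the actual PySpark code; `prepare_rows` is a hypothetical name):

```python
def prepare_rows(rows, field_types, verify=True):
    """Sketch: optionally verify each row against expected types, then convert."""
    if verify:
        for row in rows:
            for value, expected in zip(row, field_types):
                if not isinstance(value, expected):
                    raise TypeError(f"{value!r} is not a {expected.__name__}")
    # the conversion step (e.g. schema.toInternal on each row) would follow here
    return [tuple(row) for row in rows]

print(prepare_rows([(1, "a"), (2, "b")], (int, str)))        # verified path
print(prepare_rows([("x", "a")], (int, str), verify=False))  # verification skipped
```

Later PySpark releases did grow a flag along these lines (`verifySchema` on `createDataFrame`), though that postdates this thread.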

[GitHub] spark pull request: [SPARK-9643] Upgrade pyrolite to 4.9

2015-08-12 Thread angelini
Github user angelini commented on the pull request: https://github.com/apache/spark/pull/7950#issuecomment-130473017 So I did some more digging into this issue and found that the updated Pyrolite version was only necessary because we were skipping the `map(schema.toInternal)` found
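For context, `toInternal` maps a Python value to Spark's internal representation; for timestamps that means microseconds since the Unix epoch, which is why skipping it left raw `datetime` objects in the pickled stream for Pyrolite to cope with. A rough approximation of what a timestamp `toInternal` does (a sketch, not the actual PySpark source; real PySpark treats naive datetimes as local time, UTC is assumed here):

```python
import calendar
from datetime import datetime, timezone

def timestamp_to_internal(dt):
    # Normalize tz-aware datetimes to naive UTC, then express as
    # microseconds since the epoch (Spark's internal timestamp form).
    if dt.tzinfo is not None:
        dt = dt.astimezone(timezone.utc).replace(tzinfo=None)
    seconds = calendar.timegm(dt.utctimetuple())
    return seconds * 1_000_000 + dt.microsecond

print(timestamp_to_internal(datetime(1970, 1, 1, tzinfo=timezone.utc)))  # 0
```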

[GitHub] spark pull request: [SPARK-9643] Upgrade pyrolite to 4.9

2015-08-11 Thread angelini
Github user angelini commented on the pull request: https://github.com/apache/spark/pull/7950#issuecomment-129914362 I should be able to work on the regression test this week. But I do agree that upgrading makes sense if we trust Pyrolite's tests. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well.

[GitHub] spark pull request: [SPARK-9643] Upgrade pyrolite to 4.9

2015-08-05 Thread angelini
Github user angelini commented on the pull request: https://github.com/apache/spark/pull/7950#issuecomment-128102504 @srowen :ok_hand:, I'll re-purpose it.

[GitHub] spark pull request: [SPARK-9643] Upgrade pyrolite to 4.9

2015-08-05 Thread angelini
Github user angelini commented on the pull request: https://github.com/apache/spark/pull/7950#issuecomment-128101264 Should I create another JIRA? I've already created one for this issue, linked in the description.

[GitHub] spark pull request: [SPARK-9643] Upgrade pyrolite to 4.9

2015-08-05 Thread angelini
Github user angelini commented on the pull request: https://github.com/apache/spark/pull/7950#issuecomment-128094701 @davies Using a build from the latest master, I still got the above error. `net.razorvine.pickle.PickleException: invalid pickle data for datetime; expected 1

[GitHub] spark pull request: [SPARK-9643] Upgrade pyrolite to 4.9

2015-08-05 Thread angelini
Github user angelini commented on the pull request: https://github.com/apache/spark/pull/7950#issuecomment-128085463 @davies sure thing. I'll try that and report back. But does it make sense to fix that bug in Spark? If it's fixed upstream aren't we best to use

[GitHub] spark pull request: Upgrade pyrolite to 4.9

2015-08-05 Thread angelini
Github user angelini commented on the pull request: https://github.com/apache/spark/pull/7950#issuecomment-128067957 @mengxr @davies I needed 4.9 to be able to serialize datetimes with tzinfos. If not, `df.write.parquet('...')` would throw an "
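The failure comes from how Pyrolite reconstructs pickled `datetime` objects on the JVM side: a tz-aware datetime pickles with an extra `tzinfo` argument, which Pyrolite only learned to accept in 4.9 (via irmen/Pyrolite#23). The Python half of that round trip, shown with the standard `pickle` module:

```python
import pickle
from datetime import datetime, timezone, timedelta

tz = timezone(timedelta(hours=-4))
dt = datetime(2015, 8, 5, 12, 30, tzinfo=tz)

# A tz-aware datetime pickles with its tzinfo attached; pre-4.9 Pyrolite
# rejected that extra argument with "invalid pickle data for datetime".
restored = pickle.loads(pickle.dumps(dt))
assert restored == dt and restored.tzinfo is not None
```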

[GitHub] spark pull request: Upgrade pyrolite to 4.9

2015-08-04 Thread angelini
GitHub user angelini opened a pull request: https://github.com/apache/spark/pull/7950 Upgrade pyrolite to 4.9 Includes: https://github.com/irmen/Pyrolite/pull/23 which fixes datetimes with timezones. @JoshRosen You can merge this pull request into a Git repository by

[GitHub] spark pull request: Fix reference to self.names in StructType

2015-07-29 Thread angelini
GitHub user angelini opened a pull request: https://github.com/apache/spark/pull/7766 Fix reference to self.names in StructType `names` is not defined in this context, I think you meant `self.names`. @davies
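The bug is a plain Python scoping mistake: inside a method, a bare `names` is looked up as a local or global, not as the instance attribute. A minimal illustration (not the actual `StructType` source):

```python
class StructType:
    def __init__(self, names):
        self.names = names

    def field_index_buggy(self, name):
        return names.index(name)       # NameError: bare `names` is undefined here

    def field_index_fixed(self, name):
        return self.names.index(name)  # correct: qualify with self

st = StructType(["id", "value"])
print(st.field_index_fixed("value"))  # 1
```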

[GitHub] spark pull request: [SPARK-8202] [PYSPARK] fix infinite loop durin...

2015-06-09 Thread angelini
Github user angelini commented on a diff in the pull request: https://github.com/apache/spark/pull/6714#discussion_r32007325
--- Diff: python/pyspark/shuffle.py ---
@@ -512,9 +512,6 @@
    def load(f):
        f.close()
        chunks.append(load(open(path