Re: Parquet vectorized reader DELTA_BYTE_ARRAY

2017-05-25 Thread andreiL
I took a closer look and, yes, the files were written with Parquet v2. For some reason Parquet v2 was set as the default; I set it back to Parquet v1. Thanks Michael and Ryan for the info. Andrei.

Re: SQL TIMESTAMP semantics vs. SPARK-18350

2017-05-25 Thread Ofir Manor
Reynold, my point is that Spark should aim to follow the SQL standard instead of rolling its own type system. If I understand correctly, the existing implementation is similar to the TIMESTAMP WITH LOCAL TIME ZONE data type in Oracle. In addition, there are the standard TIMESTAMP and TIMESTAMP WITH
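To make the distinction in this thread concrete, here is a minimal sketch of the two behaviours under discussion. It is plain Python, not Spark or Oracle code. The helper names `render_instant` and `render_wall_clock`, the fixed-offset zones, and the sample date are all assumptions made for illustration. "Instant" semantics (roughly what Oracle calls TIMESTAMP WITH LOCAL TIME ZONE) store a point in time and display it in the reader's session zone. "Wall-clock" semantics (timestamp without time zone) store calendar fields and display them unchanged.

```python
from datetime import datetime, timedelta, timezone

# Fixed-offset zones (hypothetical choice) keep the sketch independent of any tz database.
UTC = timezone.utc
PST = timezone(timedelta(hours=-8), "PST")
CET = timezone(timedelta(hours=1), "CET")

FMT = "%Y-%m-%d %H:%M:%S"


def render_instant(instant, session_tz):
    """Instant semantics: the value is a point in time, shown in the reader's zone."""
    return instant.astimezone(session_tz).strftime(FMT)


def render_wall_clock(naive):
    """Wall-clock semantics: the value is a set of calendar fields, shown as written."""
    return naive.strftime(FMT)


# A writer in PST records "2017-05-25 10:28:00".
instant = datetime(2017, 5, 25, 10, 28, 0, tzinfo=PST).astimezone(UTC)
wall_clock = datetime(2017, 5, 25, 10, 28, 0)

# Instant semantics: readers in different session zones see different text.
as_seen_in_pst = render_instant(instant, PST)
as_seen_in_cet = render_instant(instant, CET)

# Wall-clock semantics: every reader sees the text exactly as it was written.
as_written = render_wall_clock(wall_clock)
```

Under these assumptions a reader whose session zone differs from the writer's sees shifted text with instant semantics, which may be the kind of adjustment that has to be undone when wall-clock behaviour was expected.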

Re: [VOTE] Apache Spark 2.2.0 (RC2)

2017-05-25 Thread Michael Allman
PR is here: https://github.com/apache/spark/pull/18112
> On May 25, 2017, at 10:28 AM, Michael Allman wrote:
>
> Michael,
>
> If you haven't started cutting the new RC, I'm working on a documentation PR
> right now I'm

Re: [VOTE] Apache Spark 2.2.0 (RC2)

2017-05-25 Thread Michael Allman
Michael, If you haven't started cutting the new RC, I'm working on a documentation PR right now that I'm hoping we can get into Spark 2.2 as a migration note, even if it's just a mention: https://issues.apache.org/jira/browse/SPARK-20888 . Michael

FYI - Kafka's built-in performance test tool

2017-05-25 Thread Ofir Manor
comes with source code. Some basic results from the VM: - Write: 50K-60K messages per second, each 1KB (50MB-60MB per second in total) - Read: more than 200K messages per second, each 1KB. May help in assessing whether any Kafka-related slowness is a Kafka limitation or our implementation.
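As a quick check on the figures above, the message rates convert to byte throughput as sketched below. The function name `throughput_mb_per_s` is hypothetical, and the sketch assumes decimal units (1KB = 1000 bytes, 1MB = 10^6 bytes), which is what makes 50K messages of 1KB come out to 50MB.

```python
def throughput_mb_per_s(messages_per_s, message_bytes=1000):
    """Convert a message rate into MB/s, assuming decimal units (1 MB = 10**6 bytes)."""
    return messages_per_s * message_bytes / 1_000_000


# Write side: 50K-60K messages per second, 1KB each.
write_low = throughput_mb_per_s(50_000)
write_high = throughput_mb_per_s(60_000)

# Read side: the post reports more than 200K messages per second, 1KB each,
# so this value is a lower bound on read throughput.
read_floor = throughput_mb_per_s(200_000)
```

With binary units (1KB = 1024 bytes, 1MB = 2^20 bytes) the results would be about 2% lower, so the choice of unit does not change the overall picture.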

Re: SQL TIMESTAMP semantics vs. SPARK-18350

2017-05-25 Thread Zoltan Ivanfi
Hi, Ofir, thanks for your support. My understanding is that many users have the same problem as you do. Reynold, thanks for your reply and sorry for the confusion. My personal e-mail was specifically about your concerns regarding SPARK-12297 and I started this separate thread because this is

unsubscribe

2017-05-25 Thread Amit Rana

unsubscribe

2017-05-25 Thread 信息安全部

Re: SQL TIMESTAMP semantics vs. SPARK-18350

2017-05-25 Thread Reynold Xin
Zoltan, Thanks for raising this again, although I'm a bit confused since I've communicated with you a few times on JIRA and in private emails to explain that you have some misunderstanding of the timestamp type in Spark and some of your statements are wrong (e.g. the except text file part). Not

Re: SQL TIMESTAMP semantics vs. SPARK-18350

2017-05-25 Thread Ofir Manor
Hi Zoltan, thanks for bringing this up, this is really important to me! Personally, as a user developing apps on top of Spark and other tools, the current timestamp semantics have been a source of some pain - needing to undo Spark's "auto-correcting" of timestamps. It would be really great if we