Re: Block Transfer Service encryption support

2015-03-16 Thread Aaron Davidson
Out of curiosity, why could we not use Netty's SslHandler injected into the TransportContext pipeline? On Mon, Mar 16, 2015 at 7:56 PM, turp1twin turp1t...@gmail.com wrote: Hey Patrick, Sorry for the delay, I was at Elastic{ON} last week and well, my day job has been keeping me busy... I

Re: Block Transfer Service encryption support

2015-03-16 Thread turp1twin
Hey Patrick, Sorry for the delay, I was at Elastic{ON} last week and well, my day job has been keeping me busy... I went ahead and opened a Jira feature request, https://issues.apache.org/jira/browse/SPARK-6373. In it I reference a commit I made in my fork which is a rough implementation,

Re: Block Transfer Service encryption support

2015-03-16 Thread turp1twin
Hey Aaron, That is what I do, except I add the Netty SslHandler in the TransportServer and the TransportClientFactory. I do this because the Server pipeline is a bit different, as I have to add a Netty ChunkedWriteHandler... Again, this is a rough prototype, just to get something working...

Typo in 1.3.0 release notes: s/extended renamed/renamed/

2015-03-16 Thread Joe Halliwell
Cheers, Joe Best regards, Joe

Re: Typo in 1.3.0 release notes: s/extended renamed/renamed/

2015-03-16 Thread Sean Owen
Here's the sentence: As part of stabilizing the Spark SQL API, the SchemaRDD class has been extended renamed to DataFrame. Yes, I can remove the word 'extended'. On Mon, Mar 16, 2015 at 1:18 PM, Joe Halliwell joe.halliw...@gmail.com wrote: Cheers, Joe Best regards, Joe

Re: SparkSQL 1.3.0 cannot read parquet files from different file system

2015-03-16 Thread Cheng Lian
Oh sorry, I misread your question. I thought you were trying something like `parquetFile("s3n://file1,hdfs://file2")`. Yeah, it's a valid bug. Thanks for opening the JIRA ticket and the PR! Cheng On 3/16/15 6:39 PM, Cheng Lian wrote: Hi Pei-Lun, We intentionally disallowed passing

Re: SparkSQL 1.3.0 cannot read parquet files from different file system

2015-03-16 Thread Cheng Lian
Hi Pei-Lun, We intentionally disallowed passing multiple comma-separated paths in 1.3.0. One of the reasons is that users reported that this fails when a file path contains an actual comma. In your case, you may do something like this: |val s3nDF = parquetFile(s3n://...) val hdfsDF =
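
To illustrate the comma problem described above, here is a minimal sketch in plain Java. It does not use Spark; the class name, the helper `splitOnCommas`, and the example paths are all hypothetical. It only shows why treating one string as a comma-separated list breaks when a file name itself contains a comma.

```java
import java.util.Arrays;
import java.util.List;

final class CommaPathDemo {
    // Naive parsing (hypothetical helper): treat one string as a
    // comma-separated list of paths.
    static List<String> splitOnCommas(String paths) {
        return Arrays.asList(paths.split(","));
    }

    public static void main(String[] args) {
        // Two paths joined by a comma: splitting recovers both of them.
        String two = "s3n://bucket/a.parquet,hdfs://namenode/b.parquet";
        System.out.println(splitOnCommas(two));

        // One path whose file name contains a literal comma (made-up example):
        // splitting wrongly yields two fragments, neither of which is a real file.
        String one = "hdfs://namenode/data/part,1.parquet";
        System.out.println(splitOnCommas(one));
    }
}
```

Passing each path as its own argument, as the reply suggests, avoids the ambiguity because no string ever has to be split.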

problems with Parquet in Spark 1.3.0

2015-03-16 Thread Gil Vernik
Hi, I am storing Parquet files in OpenStack Swift and accessing those files from Spark. This worked perfectly in Spark prior to 1.3.0, but in 1.3.0 I am getting this error: Is there some configuration I missed? I am not sure where this error comes from; does Spark 1.3.0 require Parquet files to

Re: problems with Parquet in Spark 1.3.0

2015-03-16 Thread Gil Vernik
I just noticed this one: https://issues.apache.org/jira/browse/SPARK-6351 https://github.com/apache/spark/pull/5039 I verified it, and it resolves my issues with Parquet and the swift:// namespace. From: Gil Vernik/Haifa/IBM@IBMIL To: dev dev@spark.apache.org Date: 16/03/2015

Re: extended jenkins downtime monday, march 16th, plus some hints at the future

2015-03-16 Thread shane knapp
ok, we're back up and building. upgrading the github plugin (and possibly EnvInject) caused the stacktraces, so i've kept those at the old versions that were working before. jenkins and the rest of the plugins are updated and we're g2g. i'll be, of course, keeping an eye on things today and

Re: extended jenkins downtime monday, march 16th, plus some hints at the future

2015-03-16 Thread shane knapp
this is starting now. On Fri, Mar 13, 2015 at 10:12 AM, shane knapp skn...@berkeley.edu wrote: i'll be taking jenkins down for some much-needed plugin updates, as well as potentially upgrading jenkins itself. this will start at 730am PDT, and i'm hoping to have everything up by noon. the

Re: enum-like types in Spark

2015-03-16 Thread Patrick Wendell
Hey Xiangrui, Do you want to write up a straw man proposal based on this line of discussion? - Patrick On Mon, Mar 16, 2015 at 12:12 PM, Kevin Markey kevin.mar...@oracle.com wrote: In some applications, I have rather heavy use of Java enums which are needed for related Java APIs that the

Re: enum-like types in Spark

2015-03-16 Thread Aaron Davidson
It's unrelated to the proposal, but Enum#ordinal() should be much faster, assuming it's not serialized to JVMs with different versions of the enum :) On Mon, Mar 16, 2015 at 12:12 PM, Kevin Markey kevin.mar...@oracle.com wrote: In some applications, I have rather heavy use of Java enums which
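
The caveat about ordinals and differing enum versions can be sketched as follows. This is a minimal illustration, not code from the thread: `ColorV1` and `ColorV2` are hypothetical stand-ins for two versions of the same enum deployed on a sender and a receiver, where the newer version inserted a constant in the middle.

```java
final class OrdinalDemo {
    // Hypothetical "old" version of an enum, as seen by the sender.
    enum ColorV1 { RED, BLUE }

    // Hypothetical "new" version on the receiver: GREEN was inserted before BLUE.
    enum ColorV2 { RED, GREEN, BLUE }

    // Sender encodes the value as its ordinal (a small int, cheap to write).
    static int encode(ColorV1 c) {
        return c.ordinal();
    }

    // Receiver decodes the ordinal against its own version of the enum.
    static ColorV2 decodeByOrdinal(int ordinal) {
        return ColorV2.values()[ordinal];
    }

    // Decoding by name is slower but survives reordering and insertion.
    static ColorV2 decodeByName(String name) {
        return ColorV2.valueOf(name);
    }

    public static void main(String[] args) {
        int wire = encode(ColorV1.BLUE);                        // ordinal 1 in V1
        System.out.println(decodeByOrdinal(wire));              // index 1 in V2 is a different constant
        System.out.println(decodeByName(ColorV1.BLUE.name()));  // name lookup still finds BLUE
    }
}
```

The ordinal round trip silently maps BLUE to GREEN here, which is the failure mode the message warns about.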

Re: enum-like types in Spark

2015-03-16 Thread Xiangrui Meng
In MLlib, we use strings for enum-like types in Python APIs, which is quite common in Python and easy for py4j. On the JVM side, we implement `fromString` to convert them back to enums. -Xiangrui On Wed, Mar 11, 2015 at 12:56 PM, RJ Nowling rnowl...@gmail.com wrote: How do these proposals affect
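
A `fromString` conversion of the kind mentioned above might look roughly like this. This is only a sketch under assumptions: the `InitMode` enum and its constants are hypothetical, and MLlib's actual implementation may differ (for example, in how it handles case or unknown values).

```java
import java.util.Locale;

final class FromStringDemo {
    // Hypothetical enum-like option that a Python caller would pass as a string.
    enum InitMode {
        RANDOM, PARALLEL;

        // Convert a user-supplied string back to the enum, ignoring case and
        // surrounding whitespace, and fail with a readable message otherwise.
        static InitMode fromString(String name) {
            try {
                return valueOf(name.trim().toUpperCase(Locale.ROOT));
            } catch (IllegalArgumentException e) {
                throw new IllegalArgumentException("Unknown init mode: " + name, e);
            }
        }
    }

    public static void main(String[] args) {
        System.out.println(InitMode.fromString("random"));
        System.out.println(InitMode.fromString(" Parallel "));
    }
}
```

Keeping the string form at the language boundary means the Python side never needs to know about JVM enum classes; only the JVM side validates the value.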

SparkSQL 1.3.0 cannot read parquet files from different file system

2015-03-16 Thread Pei-Lun Lee
Hi, I am using Spark 1.3.0, where I cannot load parquet files from more than one file system, say one s3n://... and another hdfs://..., which worked in older versions, or if I set spark.sql.parquet.useDataSourceApi=false in 1.3. One way to fix this is instead of getting a single FileSystem from

Re: Which OutputCommitter to use for S3?

2015-03-16 Thread Pei-Lun Lee
Hi, I created a JIRA and PR for supporting an S3-friendly output committer for saveAsParquetFile: https://issues.apache.org/jira/browse/SPARK-6352 https://github.com/apache/spark/pull/5042 My approach is to add a DirectParquetOutputCommitter class in the spark-sql package and use a boolean config