Re: Expression/LogicalPlan dichotomy in Spark SQL Catalyst

2015-12-23 Thread Roland Reumerman
Thanks for the informative reply Michael. The things I'm trying to accomplish with Catalyst are certain external domain model resolving and security-related constraint-handling transformations that depend more on the syntactic (nested TreeNode) structure of the query than on the actual

Re: A proposal for Spark 2.0

2015-12-23 Thread Sean Owen
I think this will be hard to maintain; we already have JIRA as the de facto central place to store discussions and prioritize work, and the 2.x stuff is already a JIRA. The wiki doesn't really hurt, just probably will never be looked at again. Let's point people in all cases to JIRA. On Tue, Dec

Re: [VOTE] Release Apache Spark 1.6.0 (RC4)

2015-12-23 Thread Iulian Dragoș
+1 (non-binding) Tested Mesos deployments (client and cluster-mode, fine-grained and coarse-grained). Things look good. iulian On Wed, Dec 23, 2015 at 2:35 PM, Sean Owen wrote: > Docker integration

value of sc.defaultParallelism

2015-12-23 Thread Chang Ya-Hsuan
python version: 2.7.9 os: ubuntu 14.04 spark: 1.5.2 I run a standalone spark on localhost, and use the following code to access sc.defaultParallelism # a.py import pyspark sc = pyspark.SparkContext() print(sc.defaultParallelism) and use the following command to submit $ spark-submit --master

Re: [VOTE] Release Apache Spark 1.6.0 (RC4)

2015-12-23 Thread Allen Zhang
+1 (non-binding) I have just tarballed a new binary and tested am.nodelabelexpression and executor.nodelabelexpression manually; the result is as expected. At 2015-12-23 21:44:08, "Iulian Dragoș" wrote: +1 (non-binding) Tested Mesos deployments (client and

Re: A proposal for Spark 2.0

2015-12-23 Thread Nicholas Chammas
Yeah, I'd also favor maintaining docs with strictly temporary relevance on JIRA when possible. The wiki is like this weird backwater I only rarely visit. Don't we typically do this kind of stuff with an umbrella issue on JIRA? Tom, wouldn't that work well for you? Nick On Wed, Dec 23, 2015 at

Re: [Spark SQL] SQLContext getOrCreate incorrect behaviour

2015-12-23 Thread Jerry Lam
Hi Kostas, Thank you for the references to the 2 tickets. They help me understand why I have had some weird experiences lately. Best Regards, Jerry On Wed, Dec 23, 2015 at 2:32 AM, kostas papageorgopoylos wrote: > Hi > > Fyi > The following 2 tickets are blocking currently

confused behavior about pyspark.sql, Row, schema, and createDataFrame

2015-12-23 Thread Chang Ya-Hsuan
python version: 2.7.9 os: ubuntu 14.04 spark: 1.5.2 ``` import pyspark from pyspark.sql import Row from pyspark.sql.types import StructType, IntegerType sc = pyspark.SparkContext() sqlc = pyspark.SQLContext(sc) schema1 = StructType() \ .add('a', IntegerType()) \ .add('b', IntegerType())

Kafka consumer: Upgrading to use the new Java Consumer

2015-12-23 Thread eugene miretsky
Hi, The Kafka connector currently uses the older Kafka Scala consumer. Kafka 0.9 came out with a new Java Kafka consumer. One of the main differences is that the Scala consumer uses a Decoder (kafka.serializer.Decoder) trait to decode keys/values while the Java consumer uses the Deserializer

Re: Downloading Hadoop from s3://spark-related-packages/

2015-12-23 Thread Nicholas Chammas
FYI: I opened an INFRA ticket with questions about how best to use the Apache mirror network. https://issues.apache.org/jira/browse/INFRA-10999 Nick On Mon, Nov 2, 2015 at 8:00 AM Luciano Resende wrote: > I am getting the same results using closer.lua versus closer.cgi,

Re: [VOTE] Release Apache Spark 1.6.0 (RC4)

2015-12-23 Thread Sean Owen
Docker integration tests still fail for Mark and me, and should probably be disabled: https://issues.apache.org/jira/browse/SPARK-12426 ... but if anyone else successfully runs these (and I assume Jenkins does) then not a blocker. I'm having intermittent trouble with other tests passing, but

Re: [VOTE] Release Apache Spark 1.6.0 (RC4)

2015-12-23 Thread Zsolt Tóth
+1 (non binding) (Pyspark K-Means still shows the numeric diff, of course.) 2015-12-23 9:33 GMT+01:00 Kousuke Saruta : > +1 > > > On 2015/12/23 16:14, Jean-Baptiste Onofré wrote: > >> +1 (non binding) >> >> Tested with samples on standalone and yarn. >> >> Regards >>

Re: [VOTE] Release Apache Spark 1.6.0 (RC4)

2015-12-23 Thread Kousuke Saruta
+1 On 2015/12/23 16:14, Jean-Baptiste Onofré wrote: +1 (non binding) Tested with samples on standalone and yarn. Regards JB On 12/22/2015 09:10 PM, Michael Armbrust wrote: Please vote on releasing the following candidate as Apache Spark version 1.6.0! The vote is open until Friday,