Re: If I pass raw SQL string to dataframe do I still get the Spark SQL optimizations?

2017-07-06 Thread ayan guha
I kind of think "That's the whole point" :) Sorry it is Friday here :) :) On Fri, Jul 7, 2017 at 1:09 PM, Michael Armbrust wrote: > It goes through the same optimization pipeline. More in this video > . > > On Thu, Jul 6, 2017 at 5:28

Re: If I pass raw SQL string to dataframe do I still get the Spark SQL optimizations?

2017-07-06 Thread Michael Armbrust
It goes through the same optimization pipeline. More in this video. On Thu, Jul 6, 2017 at 5:28 PM, kant kodali wrote: > Hi all, > > I am wondering: if I pass a raw SQL string to a DataFrame, do I still get the > Spark SQL optimizations? why

RE: Do we anything for Deep Learning in Spark?

2017-07-06 Thread hosur narahari
Thank you. Best Regards, Hari On 7 Jul 2017 3:59 a.m., "Roope Astala" wrote: > You can use an attached GPU VM for DNN training, and do other processing > on regular CPU nodes. You can even deallocate the GPU VM to save costs when > not using it. The GPU branch has

If I pass raw SQL string to dataframe do I still get the Spark SQL optimizations?

2017-07-06 Thread kant kodali
Hi all, I am wondering: if I pass a raw SQL string to a DataFrame, do I still get the Spark SQL optimizations? Why or why not? Thanks!

GraphQL to Spark SQL

2017-07-06 Thread kant kodali
Hi all, I wonder if anyone has experience exposing a Spark SQL interface through GraphQL. The main benefit I see is that we could send Spark SQL queries through REST, so clients can express their own transformations over REST. I understand the final outcome is probably the same as what one would

Spark 2.0.2 - JdbcRelationProvider does not allow create table as select

2017-07-06 Thread Kanagha Kumar
Hi, I'm running Spark 2.0.2 and I'm noticing an issue with DataFrameWriter.save(). Code: ds.write().format("jdbc").mode("overwrite").options(ImmutableMap.of( "driver", "org.apache.phoenix.jdbc.PhoenixDriver", "url", urlWithTenant,

RE: Do we anything for Deep Learning in Spark?

2017-07-06 Thread Roope Astala
You can use an attached GPU VM for DNN training, and do other processing on regular CPU nodes. You can even deallocate the GPU VM to save costs when not using it. The GPU branch has instructions on how to set up such a compute environment: https://github.com/Azure/mmlspark/tree/gpu#gpu-vm-setup

Re: Is there "EXCEPT ALL" in Spark SQL?

2017-07-06 Thread jeff saremi
EXCEPT is not the same as EXCEPT ALL. Had they implemented EXCEPT ALL in Spark SQL, one could have easily obtained EXCEPT by adding a distinct() to the results. From: hareesh makam Sent: Thursday, July 6, 2017 12:48:18 PM To: jeff saremi
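To make the distinction in this thread concrete, here is a minimal sketch in plain Python (not Spark code) of the standard SQL semantics as I understand them: EXCEPT returns the distinct left-hand rows that are absent from the right-hand side, while EXCEPT ALL is a multiset difference. The function names except_distinct and except_all and the sample tables are hypothetical, made up purely for illustration.

```python
from collections import Counter


def except_distinct(left, right):
    """EXCEPT: distinct rows of `left` that never appear in `right`."""
    right_rows = set(right)
    seen = set()
    result = []
    for row in left:
        if row not in right_rows and row not in seen:
            seen.add(row)
            result.append(row)
    return result


def except_all(left, right):
    """EXCEPT ALL: each row of `right` cancels at most one matching row of `left`."""
    remaining = Counter(right)  # how many cancellations are still available per row
    result = []
    for row in left:
        if remaining[row] > 0:
            remaining[row] -= 1  # this left row is cancelled by a right row
        else:
            result.append(row)   # no cancellation left, so the duplicate survives
    return result


# Hypothetical sample data: rows are tuples, duplicates are meaningful.
table1 = [("a",), ("a",), ("a",), ("b",), ("c",), ("c",)]
table2 = [("a",), ("c",), ("c",)]

print(except_all(table1, table2))       # [('a',), ('a',), ('b',)]
print(except_distinct(table1, table2))  # [('b',)]
```

With these sample rows, de-duplicating the EXCEPT ALL result would still keep ("a",), whereas EXCEPT drops it. So, under these semantics, distinct() on top of EXCEPT ALL matches EXCEPT only when no row present on the right occurs more often on the left.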

Logging in Spark streaming application

2017-07-06 Thread anna stax
Do I need to include the log4j dependencies in the pom.xml of my Spark Streaming application, or are they already included in the Spark libraries? I am running Spark in standalone mode on AWS EC2. Thanks

Re: Is there "EXCEPT ALL" in Spark SQL?

2017-07-06 Thread hareesh makam
There is except in the DataFrame API: df1.except(df2). The same can be used in SQL as well. public DataFrame except(DataFrame other)

Re: Is there "EXCEPT ALL" in Spark SQL?

2017-07-06 Thread upendra 1991
To add to it, is there any specific documentation or reference where we could check which SQL functions and features are available in Spark SQL for a specific Spark SQL version? Thanks, Upendra On Thu, Jul 6, 2017 at 2:22 PM, jeff saremi wrote: I tried this

Structured Streaming: consumerGroupId

2017-07-06 Thread aravias
Hi, Is there a way to get the *consumerGroupId* assigned to a structured streaming application when it's consuming from Kafka? Regards, Aravind

Is there "EXCEPT ALL" in Spark SQL?

2017-07-06 Thread jeff saremi
I tried this query in 1.6 and it failed: SELECT * FROM Table1 EXCEPT ALL SELECT * FROM Table2 Exception in thread "main" java.lang.RuntimeException: [1.32] failure: ``('' expected but `all' found thanks Jeff

Unsubscribe

2017-07-06 Thread Kun Liu
Unsubscribe

Partitions cached by updateStateByKey do not seem to be getting evicted forever

2017-07-06 Thread SRK
Hi, We use updateStateByKey in our Spark Streaming application. The partitions cached by updateStateByKey do not seem to be getting evicted. They were getting evicted fine with spark.cleaner.ttl in 1.5.1. I am facing issues with partitions not getting evicted with Stateful Streaming after Spark

Re: Spark | Window Function |

2017-07-06 Thread Julien CHAMP
Thanks a lot for your answer, Radhwane :) I have many use cases that need a Long in a window function. As said in the bug report, I can store events in ms in a DataFrame, and want to count the number of events in the past 10 years (requiring a Long value) -> *Let's imagine that this window

Re: Spark, S3A, and 503 SlowDown / rate limit issues

2017-07-06 Thread Steve Loughran
On 5 Jul 2017, at 14:40, Vadim Semenov wrote: Are you sure that you use S3A? Because EMR says that they do not support S3A https://aws.amazon.com/premiumsupport/knowledge-center/emr-file-system-s3/ > Amazon EMR does not

Re: Collecting matrix's entries raises an error only when run inside a test

2017-07-06 Thread Yanbo Liang
Hi Simone, Would you mind sharing minimal code to reproduce this issue? Yanbo On Wed, Jul 5, 2017 at 10:52 PM, Simone Robutti wrote: > Hello, I have this problem and Google is not helping. Instead, it looks > like an unreported bug and there are no hints to

Re: UDAFs for sketching Dataset columns with T-Digests

2017-07-06 Thread Sam Bessalah
This is interesting and very useful. Thanks. On Thu, Jul 6, 2017 at 2:33 AM, Erik Erlandson wrote: > After my talk on T-Digests in Spark at Spark Summit East, there were some > requests for a UDAF-based interface for working with Datasets. I'm > pleased to announce that I