Is continuous learning possible for a Spark ML model?

2016-09-01 Thread 김태준
Hi all, I have been studying and working with Spark ML 2.0 for a few days. Anyway, I have a few questions about the Spark ML model process. Q1. Can a model keep learning continuously (incrementally)? The code below is an example scenario for this question: === // training model var
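
For what it's worth, continued (online) learning is available in the RDD-based spark.mllib API for a few algorithms. A minimal sketch with StreamingLinearRegressionWithSGD, assuming spark-shell (the source directory and feature count are illustrative):

    import org.apache.spark.mllib.linalg.Vectors
    import org.apache.spark.mllib.regression.{LabeledPoint, StreamingLinearRegressionWithSGD}
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val ssc = new StreamingContext(sc, Seconds(10))
    // each line formatted as "(label,[f1,f2,f3])" is parsed into a LabeledPoint
    val training = ssc.textFileStream("/tmp/training").map(LabeledPoint.parse)

    val model = new StreamingLinearRegressionWithSGD()
      .setInitialWeights(Vectors.zeros(3))
    model.trainOn(training)   // weights are updated as each new batch arrives
    ssc.start()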

PySpark: preference for Python 2.7 or Python 3.5?

2016-09-01 Thread Ian Stokes Rees
I have the option of running PySpark with Python 2.7 or Python 3.5. I am fairly expert with Python and know the Python-side history of the differences. All else being the same, I have a preference for Python 3.5. I'm using CDH 5.8 and I'm wondering if that biases whether I should proceed

Re: Scala Vs Python

2016-09-01 Thread ayan guha
Hi, that is great to know, Jakob, thanks. So can I safely say that even if future features are built on top of the Dataset APIs, the same functionality will eventually be available in the Python APIs? Going back to the older question: if the above statement is true, I do not see a major feature parity

MLlib: Non-Linear Optimization

2016-09-01 Thread nsareen
I'm part of a Predictive Analytics marketing platform. We do a lot of (non-linear) optimizations, currently using SAS / Lindo routines. I was going through Spark's MLlib documentation and found it supports linear optimization; I was wondering if it also supports non-linear optimization and, if not, are

Spark 2.0.0 - SQL - Running query with outer join from 1.6 fails

2016-09-01 Thread Don Drake
So I was able to reproduce, in a simple case, the issue I'm seeing with a query that ran fine on Spark 1.6.2 but is no longer working on Spark 2.0. Example code: https://gist.github.com/dondrake/c136d61503b819f0643f8c02854a9cdf Here's the code for Spark 2.0 that doesn't run (this runs

Re: Hi, guys, does anyone use Spark in finance market?

2016-09-01 Thread Taotao.Li
Hi, Adam, many thanks for your detailed reply; the three videos are a very useful reference for me. Actually, the app submitted to the IBM Spark Contest is a very small demo. I'll do much more work to enhance that model, and recently we just started a new project which aims to build a platform that makes

Re: Possible Code Generation Bug: Can Spark 2.0 Datasets handle Scala Value Classes?

2016-09-01 Thread Jakob Odersky
I'm not sure how the shepherd thing works, but just FYI Michael Armbrust originally wrote Catalyst, the engine behind Datasets. You can find a list of all committers here https://cwiki.apache.org/confluence/display/SPARK/Committers. Another good resource is to check https://spark-prs.appspot.com/

Re: Possible Code Generation Bug: Can Spark 2.0 Datasets handle Scala Value Classes?

2016-09-01 Thread Aris
On a more serious note -- yes, Datasets break with Scala value classes in Spark 2.0.0 and Spark 1.6.1. I wrote up a JIRA bug and I hope some more knowledgeable people can look at it. Sean Owen has commented on other code-generation errors before, so I put him as shepherd in JIRA. Michael Armbrust has

Re: Scala Vs Python

2016-09-01 Thread Jakob Odersky
Hi Mich, the functional difference between Datasets and DataFrames is virtually non-existent in Spark 2.0. Historically, DataFrames were the first implementation of a collection to use Catalyst, Spark SQL's query optimizer. While bringing lots of performance benefits, DataFrames came at the

Re: Expected benefit of parquet filter pushdown?

2016-09-01 Thread Christon DeWan
Thanks for the references, that explains a great deal. I can verify that using integer keys in this use case does work as expected w/r/t run time and bytes read. Hopefully this all works in the next spark release! Thanks, Xton > On Aug 31, 2016, at 3:41 PM, Robert Kruszewski

Re: Scala Vs Python

2016-09-01 Thread Mich Talebzadeh
Hi, thanks, I have already seen that link. We were discussing this topic on another thread today: "Difference between Data set and Data Frame in Spark 2".

Re: Scala Vs Python

2016-09-01 Thread darren
This topic is a concern for us as well. In the data science world no one uses native Scala or Java by choice. It's R and Python, and Python is growing. Yet in Spark, Python is third in line for feature support, if at all. This is why we have decoupled from Spark in our project. It's really

Re: Scala Vs Python

2016-09-01 Thread Peyman Mohajerian
https://databricks.com/blog/2016/07/14/a-tale-of-three-apache-spark-apis-rdds-dataframes-and-datasets.html On Thu, Sep 1, 2016 at 3:01 PM, Mich Talebzadeh wrote: > Hi Jacob. > > My understanding of Dataset is that it is basically an RDD with some > optimization gone

Re: Possible Code Generation Bug: Can Spark 2.0 Datasets handle Scala Value Classes?

2016-09-01 Thread Aris
Thank you, Jakob, on two counts. 1. Yes, thanks for pointing out that spark-shell cannot take value classes; that was an additional source of confusion for me! 2. We have a Spark 2.0 project which is definitely breaking at runtime with a Dataset of value classes. I am not sure if this is also the case in

Re: Scala Vs Python

2016-09-01 Thread Mich Talebzadeh
Hi Jakob. My understanding of a Dataset is that it is basically an RDD with some optimization gone into it, whereas an RDD is meant to deal with unstructured data? Now, a DataFrame is the tabular format of an RDD, designed for tabular work: CSV, SQL stuff, etc. When you mention that DataFrame is just an alias for

Re: Scala Vs Python

2016-09-01 Thread Jakob Odersky
> However, what really worries me is not having Dataset APIs at all in Python. I think that's a deal breaker. What is the functionality you are missing? In Spark 2.0 a DataFrame is just an alias for Dataset[Row] ("type DataFrame = Dataset[Row]" in core/.../o/a/s/sql/package.scala). Since Python is
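
The alias means the two types are interchangeable in Scala; a quick sketch (assumes a SparkSession named spark):

    import org.apache.spark.sql.{DataFrame, Dataset, Row}

    val df: DataFrame = spark.range(3).toDF("id")
    val ds: Dataset[Row] = df   // compiles as-is: DataFrame is just Dataset[Row]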

Re: Possible Code Generation Bug: Can Spark 2.0 Datasets handle Scala Value Classes?

2016-09-01 Thread Jakob Odersky
Hi Aris, thanks for sharing this issue. I can confirm that value classes currently don't work; however, I can't think of a reason why they shouldn't be supported. I would therefore recommend that you report this as a bug. (Btw, value classes also currently aren't definable in the REPL. See

Re: Scala Vs Python

2016-09-01 Thread ayan guha
Thanks all for your replies. Feature parity: MLlib, RDD, and DataFrame features are totally comparable. Streaming is now at par in functionality too, I believe. However, what really worries me is not having the Dataset APIs at all in Python. I think that's a deal breaker. Performance: I do get this

Possible Code Generation Bug: Can Spark 2.0 Datasets handle Scala Value Classes?

2016-09-01 Thread Aris
Hello Spark community - Do Spark 2.0 Datasets *not support* Scala value classes (basically "extends AnyVal" with a bunch of limitations)? I am trying to do something like this:

    case class FeatureId(value: Int) extends AnyVal
    val seq = Seq(FeatureId(1), FeatureId(2), FeatureId(3))
    import
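
A self-contained sketch of the failing pattern (the snippet above is cut off at the import; the toDS() call is an assumption of the usual path):

    // value classes must be defined at the top level, not in the spark-shell REPL
    case class FeatureId(value: Int) extends AnyVal

    // in a compiled app with a SparkSession named spark:
    import spark.implicits._
    val seq = Seq(FeatureId(1), FeatureId(2), FeatureId(3))
    val ds = seq.toDS()   // reported to fail at runtime for value classes
    ds.show()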

Re: Fwd: Need some help

2016-09-01 Thread Shashank Mandil
Hi Aakash, I think what it generally means is that you have to use the general Spark DataFrame APIs to bring in the data and crunch the numbers; however, you cannot use the KMeans clustering algorithm which is already present in the MLlib Spark library. I think a good place to start would be

Re: Fwd: Need some help

2016-09-01 Thread Aakash Basu
Hey Siva, It needs to be done with Spark, without the use of any Spark libraries. Need some help in this. Thanks, Aakash. On Fri, Sep 2, 2016 at 1:25 AM, Sivakumaran S wrote: > If you are to do it without Spark, you are asking at the wrong place. Try > Python +

Re: Difference between Data set and Data Frame in Spark 2

2016-09-01 Thread Mich Talebzadeh
Yes, I tested that. Sounds like RDD is faster. Having said that, I think there are advantages within DS over RDD. Will RDD be phased out? Thanks

Fwd: Need some help

2016-09-01 Thread Aakash Basu
-- Forwarded message -- From: Aakash Basu Date: Thu, Aug 25, 2016 at 10:06 PM Subject: Need some help To: user@spark.apache.org Hi all, Aakash here, need a little help in KMeans clustering. This is needed to be done: "Implement Kmeans Clustering

Re: Spark scheduling mode

2016-09-01 Thread Mark Hamstra
Spark's FairSchedulingAlgorithm is not round robin: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/SchedulingAlgorithm.scala#L43 When at the scope of fair scheduling Jobs within a single Pool, the Schedulable entities being handled (s1 and s2) are

Re: Spark scheduling mode

2016-09-01 Thread enrico d'urso
I tried it before, but still I am not able to see a proper round robin across the jobs I submit. Given this fairscheduler.xml:

    <pool name="production">
      <schedulingMode>FAIR</schedulingMode>
      <weight>1</weight>
      <minShare>2</minShare>
    </pool>

Each job inside the production pool should be scheduled in a round-robin way, am I right? From: Mark Hamstra

Re: Spark scheduling mode

2016-09-01 Thread Mark Hamstra
The default pool (``) can be configured like any other pool: https://spark.apache.org/docs/latest/job-scheduling.html#configuring-pool-properties On Thu, Sep 1, 2016 at 11:11 AM, enrico d'urso wrote: > Is there a way to force scheduling to be fair *inside* the default pool? >

Re: Spark scheduling mode

2016-09-01 Thread enrico d'urso
Is there a way to force scheduling to be fair inside the default pool? I mean, round robin for the jobs that belong to the default pool. Cheers, From: Mark Hamstra Sent: Thursday, September 1, 2016 7:24:54 PM To: enrico d'urso Cc:

Re: Difference between Data set and Data Frame in Spark 2

2016-09-01 Thread Maciej Bryński
I think there could be a performance reason. RDDs can be faster than Datasets. For example, check the query plan for this code: spark.range(100).map(_ * 2).filter(_ < 100).map(_ * 2).collect() There are two serialize/deserialize pairs. And then compare with the RDD equivalent: sc.parallelize(1 to
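
For reference, the RDD side of the comparison would be something like this (the exact range in the thread is cut off, so it is assumed here; both lines run in spark-shell):

    // Dataset version: each typed map/filter inserts a deserialize/serialize
    // step around the lambda in the generated plan
    spark.range(100).map(_ * 2).filter(_ < 100).map(_ * 2).collect()

    // RDD version: plain JVM closures, no encoder round trips
    sc.parallelize(0L until 100L).map(_ * 2).filter(_ < 100).map(_ * 2).collect()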

What's the best way to detect and remove outliers in a table?

2016-09-01 Thread Mobius ReX
Given a table with hundreds of columns mixed with both categorical and numerical attributes, and the distribution of values is unknown, what's the best way to detect outliers? For example, given a table:

    Category  Price
    A         1
    A         1.3
    A         100
    C
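
For the numeric columns, one common approach (a sketch of a standard technique, not from the thread) is to flag values outside 1.5x the interquartile range, using approxQuantile from the DataFrame stat functions:

    import org.apache.spark.sql.DataFrame

    // returns rows of df whose col value lies outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR]
    def iqrOutliers(df: DataFrame, col: String): DataFrame = {
      val Array(q1, q3) = df.stat.approxQuantile(col, Array(0.25, 0.75), 0.01)
      val iqr = q3 - q1
      df.filter(df(col) < q1 - 1.5 * iqr || df(col) > q3 + 1.5 * iqr)
    }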

Dataset Filter performance - trying to understand

2016-09-01 Thread Darin McBeath
I've been trying to understand the performance of Datasets (and filters) in Spark 2.0. I have a Dataset which I've read from a parquet file and cached into memory (deser). This is spread across 8 partitions and consumes a total of 826MB of memory on my cluster. I verified that the dataset

Re: Spark scheduling mode

2016-09-01 Thread Mark Hamstra
Just because you've flipped spark.scheduler.mode to FAIR, that doesn't mean that Spark can magically configure and start multiple scheduling pools for you, nor can it know to which pools you want jobs assigned. Without doing any setup of additional scheduling pools or assigning of jobs to pools,

[HELP] Force stop a Spark Streaming application running on EMR

2016-09-01 Thread Rajkiran Rajkumar
Hi Spark community, I have a Spark streaming application that reads from a Kinesis stream and processes data. It calls some services which can experience transient failures. When such a transient failure happens, a retry mechanism kicks into action. For the shutdown use case, I have a separate

Re: Spark 2.0: SQL runs 5x times slower when adding 29th field to aggregation.

2016-09-01 Thread Mich Talebzadeh
What happens if you run the following query on its own? How long does it take? SELECT field, SUM(x29) FROM parquet_table WHERE partition = 1 GROUP BY field Have stats been updated for all columns in Hive? And what is the type of the x29 field? HTH

Re: Difference between Data set and Data Frame in Spark 2

2016-09-01 Thread Sean Owen
On Thu, Sep 1, 2016 at 4:56 PM, Mich Talebzadeh wrote: > Data Frame built on top of RDD to create as tabular format that we all love > to make the original build easily usable (say SQL like queries, column > headings etc). The drawback is it restricts you with what you

Re: Difference between Data set and Data Frame in Spark 2

2016-09-01 Thread Mich Talebzadeh
Hi, this is my understanding of these three. RDD is the basic construct for spreading data across the nodes, in any form and any shape, structured or unstructured. It is the building block of Spark, if I may call it that. DataFrame is built on top of RDD to create a tabular format that we all love

Spark 2.0: SQL runs 5x times slower when adding 29th field to aggregation.

2016-09-01 Thread Сергей Романов
Hi, When I run a query like "SELECT field, SUM(x1), SUM(x2)... SUM(x28) FROM parquet_table WHERE partition = 1 GROUP BY field" it runs in under 2 seconds, but when I add just one more aggregate field to the query "SELECT field, SUM(x1), SUM(x2)... SUM(x28), SUM(x29) FROM parquet_table WHERE

Error creating dataframe from schema with nested using case class

2016-09-01 Thread Corentin Kerisit
Hi all, after migrating to Spark 2.0.0, one of my programs now throws the following runtime exception: java.lang.RuntimeException: conversions.ProtoTCConversion$Timestamp is not a valid external type for schema of struct, although Timestamp is defined as follows: case

Re: Difference between Data set and Data Frame in Spark 2

2016-09-01 Thread Ovidiu-Cristian MARCU
Thank you! The talk is indeed very good. Best, Ovidiu > On 01 Sep 2016, at 16:47, Jules Damji wrote: > > Sean put it succinctly the nuanced differences and the evolution of Datasets. > Simply put, structure, to some extent, limits you—and that's what the > DataFrames &

Re: Spark 2.0: SQL runs 5x times slower when adding 29th field to aggregation.

2016-09-01 Thread Romanov
Can this be related to SPARK-17115?

Re: how should I compose keyStore and trustStore if Spark needs to talk to Kafka & Cassandra ?

2016-09-01 Thread Cody Koeninger
Why not just use different files for Kafka? Nothing else in Spark should be using those Kafka configuration parameters. On Thu, Sep 1, 2016 at 3:26 AM, Eric Ho wrote: > I'm interested in what I should put into the trustStore file, not just for > Spark but also for Kafka

Re: Scala Vs Python

2016-09-01 Thread kant kodali
C'mon, man, this is a no-brainer. Dynamically typed languages for large code bases or large-scale distributed systems make absolutely no sense. I could write a 10-page essay on why that wouldn't work so well. You might be wondering why Spark would have it then? Well, probably because of its ease of use for

Re: Difference between Data set and Data Frame in Spark 2

2016-09-01 Thread Ovidiu-Cristian MARCU
Thank you, I like and agree with your point. RDD evolved to Datasets by means of an optimizer. I just wonder what are the use cases for RDDs (other than current version of GraphX leveraging RDDs)? Best, Ovidiu > On 01 Sep 2016, at 16:26, Sean Owen wrote: > > Here's my

Re: Difference between Data set and Data Frame in Spark 2

2016-09-01 Thread Sean Owen
Here's my paraphrase: Datasets are really the new RDDs. They have a similar nature (container of strongly-typed objects) but bring some optimizations via Encoders for common types. DataFrames are different from RDDs and Datasets and do not replace and are not replaced by them. They're
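
A small illustration of that relationship (names illustrative; assumes a SparkSession named spark and a top-level case class):

    import org.apache.spark.sql.{DataFrame, Dataset}
    import spark.implicits._

    case class Person(name: String, age: Int)

    val ds: Dataset[Person] = Seq(Person("ann", 34), Person("bob", 27)).toDS()
    ds.filter(_.age > 30)           // typed, RDD-like, but optimized via Encoders
    val df: DataFrame = ds.toDF()   // the untyped view: DataFrame = Dataset[Row]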

Re: Hi, guys, does anyone use Spark in finance market?

2016-09-01 Thread Adam Roberts
Hi, yes, there's definitely a market for Apache Spark and financial institutions, I can't provide specific details but to answer your survey: "yes" and "more than a few GB!" Here are a couple of examples showing Spark with financial data, full disclosure that I work for IBM, I'm sure there are

Difference between Data set and Data Frame in Spark 2

2016-09-01 Thread Ashok Kumar
Hi, what are the practical differences between the new Dataset in Spark 2 and the existing DataFrame? Has the Dataset replaced the DataFrame, and what advantages does it have if I use a Dataset instead of a DataFrame? Thanks

Spark scheduling mode

2016-09-01 Thread enrico d'urso
I am building a Spark app in which I submit several jobs (PySpark). I am using threads to run them in parallel, and I am also setting: conf.set("spark.scheduler.mode", "FAIR") Still, I see the jobs run serially, in FIFO order. Am I missing something? Cheers, Enrico
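
A minimal sketch (pool name and job body illustrative, not from the thread) of the missing piece the later replies point at: each thread must assign its jobs to a scheduler pool via a thread-local property:

    import org.apache.spark.SparkContext

    def runInPool(sc: SparkContext, pool: String): Thread = {
      val t = new Thread {
        override def run(): Unit = {
          sc.setLocalProperty("spark.scheduler.pool", pool)   // per-thread setting
          sc.parallelize(1 to 1000000).count()                // some job
        }
      }
      t.start()
      t
    }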

Re: Spark 2.0.0 - has anyone used spark ML to do predictions under 20ms?

2016-09-01 Thread Sean Owen
Yeah there's a method to predict one Vector in the .mllib API but not the newer one. You could possibly hack your way into calling it anyway, or just clone the logic. On Thu, Sep 1, 2016 at 2:37 PM, Nick Pentreath wrote: > Right now you are correct that Spark ML APIs do

Re: Spark 2.0.0 - has anyone used spark ML to do predictions under 20ms?

2016-09-01 Thread Nick Pentreath
I should also point out that right now your only option is to code up your own export functionality (or be able to read Spark's format in your serving system), and translate that into the correct format for some other linear algebra or ML library, and use that for serving. On Thu, 1 Sep 2016 at

Re: Spark 2.0.0 - has anyone used spark ML to do predictions under 20ms?

2016-09-01 Thread Nick Pentreath
Right now you are correct that Spark ML APIs do not support predicting on a single instance (whether Vector for the models or a Row for a pipeline). See https://issues.apache.org/jira/browse/SPARK-10413 and https://issues.apache.org/jira/browse/SPARK-16431 (duplicate) for some discussion. There

Re: Spark 2.0.0 - Java vs Scala performance difference

2016-09-01 Thread Adam Roberts
On Java vs Scala: Sean's right that behind the scenes you'll be calling JVM based APIs anyway (e.g. sun.misc.unsafe for Tungsten) and that the vast majority of Apache Spark's important logic is written in Scala. Would be an interesting experiment to write the same functioning program using the

Re: Spark 2.0.0 - has anyone used spark ML to do predictions under 20ms?

2016-09-01 Thread Aseem Bansal
I understand from a theoretical perspective that the model itself is not distributed, and thus it can be used for making predictions for a vector or an RDD. But speaking in terms of the APIs provided by Spark 2.0.0, when I create a model from large data the recommended way is to use the ml library for

Re: Spark 2.0.0 - has anyone used spark ML to do predictions under 20ms?

2016-09-01 Thread Sean Owen
How the model is built isn't that related to how it scores things. Here we're just talking about scoring. NaiveBayesModel can score Vector which is not a distributed entity. That's what you want to use. You do not want to use a whole distributed operation to score one record. This isn't related to
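
A minimal sketch of that local single-record scoring with the RDD-based (spark.mllib) API, assuming training data already exists as an RDD[LabeledPoint] named trainingRdd:

    import org.apache.spark.mllib.classification.NaiveBayes
    import org.apache.spark.mllib.linalg.Vectors

    val model = NaiveBayes.train(trainingRdd)
    // a plain local method call: no Spark job is launched to score one record
    val label = model.predict(Vectors.dense(0.1, 0.2, 0.3))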

Re: Spark 2.0.0 - has anyone used spark ML to do predictions under 20ms?

2016-09-01 Thread Aseem Bansal
I understand your point. Is there something like a bridge? Is it possible to convert a model trained using a Dataset (i.e. the distributed one) to one which uses vectors? In Spark 1.6 the mllib packages had everything in terms of vectors, and that should be faster as per my understanding. But in

using multiple worker instances in spark standalone

2016-09-01 Thread AssafMendelson
Hi, I have a machine with lots of memory. Since I understand all executors in a single worker run on the same JVM, I do not want to use just one worker for the whole memory. Instead I want to define multiple workers each with less than 30GB memory. Looking at the documentation I see this would
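
For reference, standalone mode does allow multiple worker daemons per machine via spark-env.sh; a sketch with illustrative values:

    # conf/spark-env.sh on each machine (standalone mode)
    export SPARK_WORKER_INSTANCES=4    # four worker daemons per machine
    export SPARK_WORKER_MEMORY=28g     # memory each worker can hand to executors
    export SPARK_WORKER_CORES=8        # cores each worker can hand to executors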

Re: Spark 2.0.0 - has anyone used spark ML to do predictions under 20ms?

2016-09-01 Thread Sean Owen
If you're trying to score a single example by way of an RDD or Dataset, then no it will never be that fast. It's a whole distributed operation, and while you might manage low latency for one job at a time, consider what will happen when hundreds of them are running at once. It's just huge overkill

Spark 2.0.0 - has anyone used spark ML to do predictions under 20ms?

2016-09-01 Thread Aseem Bansal
Hi, currently I am trying to use NaiveBayes to make predictions, but I am facing the issue that doing the predictions takes on the order of a few seconds. I tried the other model examples shipped with Spark, but they also ran in a minimum of 500 ms when I used the Scala API. Has anyone used Spark ML to do predictions

Re: Why there is no top method in dataset api

2016-09-01 Thread Sean Owen
You can always call .rdd.top(n), of course. Although it's slightly clunky, you can also .orderBy($"value".desc).take(n). Maybe there's an easier way. I don't think there's a strong reason other than it wasn't worth it to write this and many other utility wrappers that a) already exist on the
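
Both alternatives in a sketch (assumes spark-shell, where a Dataset of Ints exposes a single column named "value"):

    val ds = (1 to 1000).toDS()

    val a = ds.rdd.top(3)                       // drops to the RDD API
    val b = ds.orderBy($"value".desc).take(3)   // stays in the Dataset API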

Re: Spark 2.0.0 - Java vs Scala performance difference

2016-09-01 Thread Aseem Bansal
There is already a mail thread for Scala vs Python; check the archives. On Thu, Sep 1, 2016 at 5:18 PM, ayan guha wrote: > How about Scala vs Python? > > On Thu, Sep 1, 2016 at 7:27 PM, Sean Owen wrote: > >> I can't think of a situation where it would be

Why there is no top method in dataset api

2016-09-01 Thread Jakub Dubovsky
Hey all, in the RDD API there is a very useful method called top. It finds the top n records according to a certain ordering, without sorting all records. Very useful! There is no top method, nor similar functionality, in the Dataset API. Has anybody any clue why? Is there any specific reason for this? Any

Re: Spark 2.0.0 - Java vs Scala performance difference

2016-09-01 Thread ayan guha
How about Scala vs Python? On Thu, Sep 1, 2016 at 7:27 PM, Sean Owen wrote: > I can't think of a situation where it would be materially different. > Both are using the JVM-based APIs directly. Here and there there's a > tiny bit of overhead in using the Java APIs because

Re: Does a driver jvm houses some rdd partitions?

2016-09-01 Thread Jakub Dubovsky
Hey Mich, the question was not about one particular job but rather about the general way Spark functions. If I call persist on an RDD, then the executor which computed a partition of the RDD will try to save that partition in the memory that executor has reserved for caching. So my question is

java.lang.OutOfMemoryError Spark MLlib ALS matrix factorization

2016-09-01 Thread ANDREA SPINA
Hello everyone. I'm running the Apache Spark MLlib ALS matrix factorization and I ran into the following exceptions: *The following exception is periodic; it starts on the first iteration with the OOM error and is followed by a long line of FNF exceptions during stage resubmissions (according to the UI,

Re: Spark 2.0.0 - Java vs Scala performance difference

2016-09-01 Thread Sean Owen
I can't think of a situation where it would be materially different. Both are using the JVM-based APIs directly. Here and there there's a tiny bit of overhead in using the Java APIs because something is translated from a Java-style object to a Scala-style object, but this is generally trivial. On

Re: difference between package and jar Option in Spark

2016-09-01 Thread Sean Owen
--jars includes local JAR files in the application's classpath. --packages references the Maven coordinates of a dependency, retrieves all of its JAR files (transitive dependencies included), and includes them in the app classpath. On Thu, Sep 1, 2016 at 10:24 AM, Divya Gehlot wrote: > Hi, >
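
For example (coordinates and paths illustrative):

    # ship local jars as-is:
    spark-submit --jars /path/to/a.jar,/path/to/b.jar my-app.jar
    # resolve from Maven, transitive dependencies included:
    spark-submit --packages org.apache.commons:commons-csv:1.4 my-app.jar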

difference between package and jar Option in Spark

2016-09-01 Thread Divya Gehlot
Hi, I would like to know the difference between the --packages and --jars options in Spark. Thanks, Divya

Spark 2.0.0 - Java vs Scala performance difference

2016-09-01 Thread Aseem Bansal
Hi, would there be any significant performance difference when using the Java vs. the Scala API?

Re: [Error:]while read s3 buckets in Spark 1.6 in spark -submit

2016-09-01 Thread Steve Loughran
On 1 Sep 2016, at 03:45, Divya Gehlot wrote: Hi, I am using Spark 1.6.1 on an EMR machine. I am trying to read S3 buckets in my Spark job. When I read it through the Spark shell I am able to read it, but when I try to package the job and

spark 2.0.0 - code generation inputadapter_value is not rvalue

2016-09-01 Thread Aseem Bansal
Hi, does Spark do some code generation? I am trying to use map on a Java RDD and am getting a huge generated file with 17406 lines in my terminal, and then a stacktrace. 16/09/01 13:57:36 INFO FileOutputCommitter: File Output Committer Algorithm version is 1 16/09/01 13:57:36 INFO

Re: how should I compose keyStore and trustStore if Spark needs to talk to Kafka & Cassandra ?

2016-09-01 Thread Eric Ho
I'm interested in what I should put into the trustStore file, not just for Spark but also for the Kafka and Cassandra sides. The way I generated self-signed certs for the Kafka and Cassandra sides is slightly different... On Thu, Sep 1, 2016 at 1:09 AM, Eric Ho wrote: > A

how should I compose keyStore and trustStore if Spark needs to talk to Kafka & Cassandra ?

2016-09-01 Thread Eric Ho
A working example would be great... Thx -- -eric ho

RE: Scala Vs Python

2016-09-01 Thread AssafMendelson
I believe this would greatly depend on your use case and your familiarity with the languages. In general, Scala will have much better performance than Python, and not all interfaces are available in Python. That said, if you are planning to use DataFrames without any UDFs then the performance

RE: Window Functions with SQLContext

2016-09-01 Thread Saurabh Dubey
Hi Divya, then how can https://issues.apache.org/jira/browse/SPARK-11001 be resolved? Thanks, Saurabh From: Divya Gehlot [mailto:divya.htco...@gmail.com] Sent: 01 September 2016 11:33 To: saurabh3d Cc: user @spark Subject: Re: Window Functions with SQLContext Hi Saurabh,

Re: Window Functions with SQLContext

2016-09-01 Thread Divya Gehlot
Hi Saurabh, I am also using a Spark 1.6+ version, and when I didn't create a HiveContext it threw the same error. So you have to create a HiveContext to access window functions. Thanks, Divya On 1 September 2016 at 13:16, saurabh3d wrote: > Hi All, > > As per SPARK-11001
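
A minimal sketch of the Spark 1.6 pattern (column names and data are illustrative; assumes a SparkContext named sc):

    import org.apache.spark.sql.expressions.Window
    import org.apache.spark.sql.functions._
    import org.apache.spark.sql.hive.HiveContext

    val hiveContext = new HiveContext(sc)   // in 1.6, window functions need HiveContext
    import hiveContext.implicits._

    val df = Seq(("a", 1), ("a", 2), ("b", 3)).toDF("k", "v")
    val w = Window.partitionBy("k").orderBy(df("v").desc)
    df.withColumn("rn", row_number().over(w)).show()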