Re: Where does the Driver run?

2019-03-24 Thread Arko Provo Mukherjee
Hello,

Is spark.driver.memory per job or shared across jobs? Should you do load
testing before setting this value?

Thanks & regards
Arko
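
For reference, spark.driver.memory sizes the driver JVM of a single
application, so every submitted application gets its own driver with that
much memory; it is not a pool shared across jobs. A minimal Scala sketch,
with hypothetical sizes, assuming a standalone cluster:

    import org.apache.spark.{SparkConf, SparkContext}

    // per-application sizing; the values are placeholders
    val conf = new SparkConf()
      .setAppName("memory-sizing-sketch")
      .setMaster("spark://master-address:7077")
      .set("spark.executor.memory", "8g")   // heap of each executor of this application
    // spark.driver.memory also applies only to this application's driver, but it must be
    // supplied before the driver JVM starts (spark-submit --driver-memory 8g, or
    // spark-defaults.conf); setting it here, inside the already-running driver, has no effect
    val sc = new SparkContext(conf)

Load testing is still a reasonable way to pick the numbers, but the setting
itself is scoped to one application.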


On Sun, Mar 24, 2019 at 3:09 PM Pat Ferrel  wrote:

>
> 2 Slaves, one of which is also Master.
>
> Node 1 & 2 are slaves. Node 1 is where I run start-all.sh.
>
> The machines both have 60g of free memory (leaving about 4g for the master
> process on Node 1). The only constraint to the Driver and Executors is
> spark.driver.memory = spark.executor.memory = 60g
>
> BTW I would expect this to create one Executor, one Driver, and the Master
> on 2 Workers.
>
>
>
>
> From: Andrew Melo  
> Reply: Andrew Melo  
> Date: March 24, 2019 at 12:46:35 PM
> To: Pat Ferrel  
> Cc: Akhil Das  , user
>  
> Subject:  Re: Where does the Driver run?
>
> Hi Pat,
>
> On Sun, Mar 24, 2019 at 1:03 PM Pat Ferrel  wrote:
>
>> Thanks, I have seen this many times in my research. Paraphrasing docs:
>> “in deployMode ‘cluster’ the Driver runs on a Worker in the cluster”
>>
>> When I look at logs I see 2 executors on the 2 slaves (executor 0 and 1
>> with addresses that match slaves). When I look at memory usage while the
>> job runs I see virtually identical usage on the 2 Workers. This would
>> support your claim and contradict Spark docs for deployMode = cluster.
>>
>> The evidence seems to contradict the docs. I am now beginning to wonder
>> if the Driver only runs in the cluster if we use spark-submit.
>>
>
> Where/how are you starting "./sbin/start-master.sh"?
>
> Cheers
> Andrew
>
>
>>
>>
>>
>> From: Akhil Das  
>> Reply: Akhil Das  
>> Date: March 23, 2019 at 9:26:50 PM
>> To: Pat Ferrel  
>> Cc: user  
>> Subject:  Re: Where does the Driver run?
>>
>> If you are starting your "my-app" on your local machine, that's where the
>> driver is running.
>>
>> [image: image.png]
>>
>> Hope this helps.
>> 
>>
>> On Sun, Mar 24, 2019 at 4:13 AM Pat Ferrel  wrote:
>>
>>> I have researched this for a significant amount of time and find answers
>>> that seem to be for a slightly different question than mine.
>>>
>>> The Spark 2.3.3 cluster is running fine. I see the GUI on “
>>> http://master-address:8080”, there are 2 idle workers, as configured.
>>>
>>> I have a Scala application that creates a context and starts execution
>>> of a Job. I *do not use spark-submit*, I start the Job programmatically and
>>> this is where many explanations fork from my question.
>>>
>>> In "my-app" I create a new SparkConf, with the following code (slightly
>>> abbreviated):
>>>
>>>   conf.setAppName("my-job")
>>>   conf.setMaster("spark://master-address:7077")
>>>   conf.set("deployMode", "cluster")
>>>   // other settings like driver and executor memory requests
>>>   // the driver and executor memory requests are for all mem on the slaves,
>>>   // more than mem available on the launching machine with "my-app"
>>>   val jars = listJars("/path/to/lib")
>>>   conf.setJars(jars)
>>>   …
>>>
>>> When I launch the job I see 2 executors running on the 2 workers/slaves.
>>> Everything seems to run fine and sometimes completes successfully. Frequent
>>> failures are the reason for this question.
>>>
>>> Where is the Driver running? I don’t see it in the GUI, I see 2
>>> Executors taking all cluster resources. With a Yarn cluster I would expect
>>> the “Driver” to run on/in the YARN Master but I am using the Spark
>>> Standalone Master, where is the Driver part of the Job running?
>>>
>>> If it is running in the Master, we are in trouble because I start the
>>> Master on one of my 2 Workers, sharing resources with one of the Executors.
>>> Executor mem + driver mem is > available mem on a Worker. I can change this
>>> but need to understand where the Driver part of the Spark Job runs. Is it
>>> in the Spark Master, or inside an Executor, or ???
>>>
>>> The “Driver” creates and broadcasts some large data structures so the
>>> need for an answer is more critical than with more typical tiny Drivers.
>>>
>>> Thanks for your help!
>>>
>>
>>
>> --
>> Cheers!
>>
>>
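
For what it's worth, a standalone Master honors cluster deploy mode only when
the application is handed to it through spark-submit or the SparkLauncher API;
a program that builds its SparkContext directly, as "my-app" does above, keeps
the driver in that same JVM (client mode) no matter what "deployMode" is set to
in the SparkConf (the documented key is spark.submit.deployMode, and even that
is ignored once the context is created in-process). A minimal Scala sketch of
the launcher route; the jar path, main class and memory sizes are placeholders:

    import org.apache.spark.launcher.SparkLauncher

    val handle = new SparkLauncher()
      .setAppResource("/path/to/my-app-assembly.jar")
      .setMainClass("com.example.MyJob")
      .setMaster("spark://master-address:7077")
      .setDeployMode("cluster")                        // the Master picks a Worker to host the driver
      .setConf(SparkLauncher.DRIVER_MEMORY, "8g")
      .setConf(SparkLauncher.EXECUTOR_MEMORY, "48g")
      .startApplication()                              // returns a SparkAppHandle for monitoring

Submitted this way, the driver gets its own entry in the Master UI, separate
from the executors, and its memory has to fit on whichever Worker is chosen,
alongside any executor running there.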




Re: Encoder for JValue

2018-09-19 Thread Arko Provo Mukherjee
Hello Muthu,

Many thanks for your reply. That is what we are currently doing.

However, we finally load the data somewhere and we need to have JSON
objects rather than serialized strings.

Hence I was wondering if there are encoders out there for JObject and if I
can somehow pass that information to Spark.

Thanks & regards
Arko


On Tue, Sep 18, 2018 at 11:39 PM Muthu Jayakumar  wrote:

> A naive workaround may be to transform the json4s JValue to String (using
> something like compact()) and process it as String? Once you are done with
> the last action, you could write it back as JValue (using something like
> parse())
>
> Thanks,
> Muthu
>
> On Wed, Sep 19, 2018 at 6:35 AM Arko Provo Mukherjee <
> arkoprovomukher...@gmail.com> wrote:
>
>> Hello Spark Gurus,
>>
>> I am running into an issue with Encoding and wanted your help.
>>
>> I have a case class with a JObject in it. Ex:
>> *case class SomeClass(a: String, b: JObject)*
>>
>> I also have an encoder for this case class:
>> *val encoder = Encoders.product[**SomeClass**]*
>>
>> Now I am creating a DataFrame with the tuple (a, b) from my
>> transformations and converting into a DataSet:
>> *df.as[SomeClass](encoder)*
>>
>> When I do this, I get the following error:
>> *java.lang.UnsupportedOperationException: No Encoder found for
>> org.json4s.JsonAST.JValue*
>>
>> Appreciate any help regarding this issue.
>>
>> Many thanks in advance!
>> Warm regards
>> Arko
>>
>>
>>
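
A concrete version of the workaround discussed above, as a sketch: the JObject
travels through the Dataset as a plain String column, which Encoders.product
handles, and it is parsed back into a JValue only after the rows have left
Spark, at the point where they are loaded into the destination. Spark 2.x with
json4s-jackson on the classpath is assumed, and SomeClassRaw is a hypothetical
carrier class:

    import org.apache.spark.sql.SparkSession
    import org.json4s._
    import org.json4s.jackson.JsonMethods.{compact, render, parse}

    // the JObject is carried as its compact JSON text
    case class SomeClassRaw(a: String, b: String)

    val spark = SparkSession.builder.getOrCreate()
    import spark.implicits._

    // going in: flatten each JObject to a String before the Dataset is built
    def toRaw(a: String, b: JObject): SomeClassRaw = SomeClassRaw(a, compact(render(b)))

    val sample = JObject(List("k" -> JString("v")))                  // toy input
    val ds = spark.createDataset(Seq(toRaw("id-1", sample)))

    // ...transformations on Dataset[SomeClassRaw] stay on String columns...

    // coming out: rebuild JValues only outside the Dataset, e.g. in the loader
    val loaded = ds.collect().map(r => (r.a, parse(r.b)))

Encoders.kryo[SomeClass] is the other option; it accepts the JObject directly
but stores the whole object as one opaque binary column, so it mainly helps
when the Dataset is built from objects rather than from an existing
DataFrame's columns.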


Encoder for JValue

2018-09-18 Thread Arko Provo Mukherjee
Hello Spark Gurus,

I am running into an issue with Encoding and wanted your help.

I have a case class with a JObject in it. Ex:
*case class SomeClass(a: String, b: JObject)*

I also have an encoder for this case class:
*val encoder = Encoders.product[**SomeClass**]*

Now I am creating a DataFrame with the tuple (a, b) from my transformations
and converting into a DataSet:
*df.as[SomeClass](encoder)*

When I do this, I get the following error:
*java.lang.UnsupportedOperationException: No Encoder found for
org.json4s.JsonAST.JValue*

Appreciate any help regarding this issue.

Many thanks in advance!
Warm regards
Arko


Re: SparkMaster IP

2016-02-22 Thread Arko Provo Mukherjee
Passing --host localhost solved the issue, thanks!
Warm regards
Arko
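
For reference, the full invocation would look roughly like this (7077 is the
default master port, and SPARK_LOCAL_IP is the environment-variable
alternative Jakob mentions below):

    .\spark-class.cmd org.apache.spark.deploy.master.Master --host localhost --port 7077

    rem or, before starting the master:
    rem set SPARK_LOCAL_IP=127.0.0.1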


On Mon, Feb 22, 2016 at 5:44 PM, Jakob Odersky <ja...@odersky.com> wrote:
> Spark master by default binds to whatever ip address your current host
> resolves to. You have a few options to change that:
> - override the ip by setting the environment variable SPARK_LOCAL_IP
> - change the ip in your local "hosts" file (/etc/hosts on linux, not
> sure on windows)
> - specify a different hostname such as "localhost" when starting spark
> master by passing the "--host HOSTNAME" command-line parameter (the ip
> address will be resolved from the supplied HOSTNAME)
>
> best,
> --Jakob
>
> On Mon, Feb 22, 2016 at 5:09 PM, Arko Provo Mukherjee
> <arkoprovomukher...@gmail.com> wrote:
>> Hello,
>>
>> I am running Spark on Windows.
>>
>> I start up master as follows:
>> .\spark-class.cmd org.apache.spark.deploy.master.Master
>>
>> I see that the SparkMaster doesn't start on 127.0.0.1 but starts on my
>> "actual" IP. This is troublesome for me as I use it in my code and
>> need to change it every time I restart.
>>
>> Is there a way to make SparkMaster listen to 127.0.0.1:7077?
>>
>> Thanks much in advance!
>> Warm regards
>> Arko
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



SparkMaster IP

2016-02-22 Thread Arko Provo Mukherjee
Hello,

I am running Spark on Windows.

I start up master as follows:
.\spark-class.cmd org.apache.spark.deploy.master.Master

I see that the SparkMaster doesn't start on 127.0.0.1 but starts on my
"actual" IP. This is troublesome for me as I use it in my code and
need to change it every time I restart.

Is there a way to make SparkMaster listen to 127.0.0.1:7077?

Thanks much in advance!
Warm regards
Arko

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Submitting Jobs Programmatically

2016-02-19 Thread Arko Provo Mukherjee
Hello,

Thanks much. I could start the service.

When I run my program, the launcher is not able to find the app class:

java.lang.ClassNotFoundException: SparkSubmitter
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
at java.lang.Class.forName0(Native Method)
Spark job complete. Exit code:101
at java.lang.Class.forName(Class.java:274)
at org.apache.spark.util.Utils$.classForName(Utils.scala:173)
at 
org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:639)
at 
org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:120)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

My launch code is as follows:
val spark = new SparkLauncher()
.setSparkHome("C:\\spark-1.5.1-bin-hadoop2.6")

.setAppResource("C:\\SparkService\\Scala\\RequestSubmitter\\target\\scala-2.10\\spark-submitter_2.10-0.0.1.jar")
.setMainClass("SparkSubmitter")
.addAppArgs(inputQuery)
.setMaster("spark://157.54.189.70:7077")
.launch()
spark.waitFor()

I added the spark-submitter_2.10-0.0.1.jar in the classpath as well
but that didn't help.

Thanks & regards
Arko
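
One common cause of this particular ClassNotFoundException (not the only one)
is that the object is declared inside a package, in which case setMainClass
needs the fully qualified name, e.g. setMainClass("some.pkg.SparkSubmitter")
with whatever package actually applies; the package here is a guess. With a
JDK on the PATH, the jar's real contents can be checked with something like:

    jar tf C:\SparkService\Scala\RequestSubmitter\target\scala-2.10\spark-submitter_2.10-0.0.1.jar | findstr SparkSubmitter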

On Fri, Feb 19, 2016 at 6:49 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> Cycling old bits:
>
> http://search-hadoop.com/m/q3RTtHrxMj2abwOk2
>
> On Fri, Feb 19, 2016 at 6:40 PM, Arko Provo Mukherjee
> <arkoprovomukher...@gmail.com> wrote:
>>
>> Hi,
>>
>> Thanks for your response. Is there a similar link for Windows? I am
>> not sure the .sh scripts would run on windows.
>>
>> By default, start-all.sh doesn't work and I don't see anything at
>> localhost:8080
>>
>> I will do some more investigation and come back.
>>
>> Thanks again for all your help!
>>
>> Thanks & regards
>> Arko
>>
>>
>> On Fri, Feb 19, 2016 at 6:35 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>> > Please see https://spark.apache.org/docs/latest/spark-standalone.html
>> >
>> > On Fri, Feb 19, 2016 at 6:27 PM, Arko Provo Mukherjee
>> > <arkoprovomukher...@gmail.com> wrote:
>> >>
>> >> Hi,
>> >>
>> >> Thanks for your response, that really helped.
>> >>
>> >> However, I don't believe the job is being submitted. When I run spark
>> >> from the shell, I don't need to start it up explicitly. Do I need to
>> >> start up Spark on my machine before running this program?
>> >>
>> >> I see the following in the SPARK_HOME\bin directory:
>> >> Name
>> >> 
>> >> beeline.cmd
>> >> load-spark-env.cmd
>> >> pyspark.cmd
>> >> pyspark2.cmd
>> >> run-example.cmd
>> >> run-example2.cmd
>> >> spark-class.cmd
>> >> spark-class2.cmd
>> >> spark-shell.cmd
>> >> spark-shell2.cmd
>> >> spark-submit.cmd
>> >> spark-submit2.cmd
>> >> sparkR.cmd
>> >> sparkR2.cmd
>> >>
>> >> Do I need to run any one of them before submitting the job via the
>> >> program?
>> >>
>> >> Thanks & regards
>> >> Arko
>> >>
>> >> On Fri, Feb 19, 2016 at 6:01 PM, Holden Karau <hol...@pigscanfly.ca>
>> >> wrote:
>> >> > How are you trying to launch your application? Do you have the Spark
>> >> > jars on
>> >> > your class path?
>> >> >
>> >> >
>> >> > On Friday, February 19, 2016, Arko Provo Mukherjee
>> >> > <arkoprovomukher...@gmail.com> wrote:
>> >> >>
>> >> >> Hello,
>> >> >>
>> >> >> I am trying to submit a spark job via a program.
>> >> >>
>> >> >> When I run it, I receive the following error:
>> >> >> Exception in thread "Thread-1" java.lang.NoClassDefFoundError:
>> >> >> org/a

Re: Submitting Jobs Programmatically

2016-02-19 Thread Arko Provo Mukherjee
Hi,

Thanks for your response. Is there a similar link for Windows? I am
not sure the .sh scripts would run on windows.

By default, start-all.sh doesn't work and I don't see anything at
localhost:8080

I will do some more investigation and come back.

Thanks again for all your help!

Thanks & regards
Arko


On Fri, Feb 19, 2016 at 6:35 PM, Ted Yu <yuzhih...@gmail.com> wrote:
> Please see https://spark.apache.org/docs/latest/spark-standalone.html
>
> On Fri, Feb 19, 2016 at 6:27 PM, Arko Provo Mukherjee
> <arkoprovomukher...@gmail.com> wrote:
>>
>> Hi,
>>
>> Thanks for your response, that really helped.
>>
>> However, I don't believe the job is being submitted. When I run spark
>> from the shell, I don't need to start it up explicitly. Do I need to
>> start up Spark on my machine before running this program?
>>
>> I see the following in the SPARK_HOME\bin directory:
>> Name
>> 
>> beeline.cmd
>> load-spark-env.cmd
>> pyspark.cmd
>> pyspark2.cmd
>> run-example.cmd
>> run-example2.cmd
>> spark-class.cmd
>> spark-class2.cmd
>> spark-shell.cmd
>> spark-shell2.cmd
>> spark-submit.cmd
>> spark-submit2.cmd
>> sparkR.cmd
>> sparkR2.cmd
>>
>> Do I need to run any one of them before submitting the job via the program?
>>
>> Thanks & regards
>> Arko
>>
>> On Fri, Feb 19, 2016 at 6:01 PM, Holden Karau <hol...@pigscanfly.ca>
>> wrote:
>> > How are you trying to launch your application? Do you have the Spark
>> > jars on
>> > your class path?
>> >
>> >
>> > On Friday, February 19, 2016, Arko Provo Mukherjee
>> > <arkoprovomukher...@gmail.com> wrote:
>> >>
>> >> Hello,
>> >>
>> >> I am trying to submit a spark job via a program.
>> >>
>> >> When I run it, I receive the following error:
>> >> Exception in thread "Thread-1" java.lang.NoClassDefFoundError:
>> >> org/apache/spark/launcher/SparkLauncher
>> >> at Spark.SparkConnector.run(MySpark.scala:33)
>> >> at java.lang.Thread.run(Thread.java:745)
>> >> Caused by: java.lang.ClassNotFoundException:
>> >> org.apache.spark.launcher.SparkLauncher
>> >> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>> >> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>> >> at java.security.AccessController.doPrivileged(Native Method)
>> >> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>> >> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>> >> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>> >> ... 2 more
>> >>
>> >> It seems it cannot find the SparkLauncher class. Any clue to what I am
>> >> doing wrong?
>> >>
>> >> Thanks & regards
>> >> Arko
>> >>
>> >> -
>> >> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> >> For additional commands, e-mail: user-h...@spark.apache.org
>> >>
>> >
>> >
>> > --
>> > Cell : 425-233-8271
>> > Twitter: https://twitter.com/holdenkarau
>> >
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Submitting Jobs Programmatically

2016-02-19 Thread Arko Provo Mukherjee
Hi,

Thanks for your response, that really helped.

However, I don't believe the job is being submitted. When I run spark
from the shell, I don't need to start it up explicitly. Do I need to
start up Spark on my machine before running this program?

I see the following in the SPARK_HOME\bin directory:
Name

beeline.cmd
load-spark-env.cmd
pyspark.cmd
pyspark2.cmd
run-example.cmd
run-example2.cmd
spark-class.cmd
spark-class2.cmd
spark-shell.cmd
spark-shell2.cmd
spark-submit.cmd
spark-submit2.cmd
sparkR.cmd
sparkR2.cmd

Do I need to run any one of them before submitting the job via the program?

Thanks & regards
Arko
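
For what it's worth, none of those bin scripts need to be run first, but a
master and at least one worker do have to be running before the program can
connect to a spark:// URL. A sketch of starting a minimal standalone cluster
on Windows, with the host name as a placeholder; alternatively,
setMaster("local[*]") runs everything inside the launching JVM with no
cluster at all:

    .\spark-class.cmd org.apache.spark.deploy.master.Master
    .\spark-class.cmd org.apache.spark.deploy.worker.Worker spark://<master-host>:7077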

On Fri, Feb 19, 2016 at 6:01 PM, Holden Karau <hol...@pigscanfly.ca> wrote:
> How are you trying to launch your application? Do you have the Spark jars on
> your class path?
>
>
> On Friday, February 19, 2016, Arko Provo Mukherjee
> <arkoprovomukher...@gmail.com> wrote:
>>
>> Hello,
>>
>> I am trying to submit a spark job via a program.
>>
>> When I run it, I receive the following error:
>> Exception in thread "Thread-1" java.lang.NoClassDefFoundError:
>> org/apache/spark/launcher/SparkLauncher
>> at Spark.SparkConnector.run(MySpark.scala:33)
>> at java.lang.Thread.run(Thread.java:745)
>> Caused by: java.lang.ClassNotFoundException:
>> org.apache.spark.launcher.SparkLauncher
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
>> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>> ... 2 more
>>
>> It seems it cannot find the SparkLauncher class. Any clue to what I am
>> doing wrong?
>>
>> Thanks & regards
>> Arko
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
>> For additional commands, e-mail: user-h...@spark.apache.org
>>
>
>
> --
> Cell : 425-233-8271
> Twitter: https://twitter.com/holdenkarau
>

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Submitting Jobs Programmatically

2016-02-19 Thread Arko Provo Mukherjee
Hello,

I am trying to submit a spark job via a program.

When I run it, I receive the following error:
Exception in thread "Thread-1" java.lang.NoClassDefFoundError:
org/apache/spark/launcher/SparkLauncher
at Spark.SparkConnector.run(MySpark.scala:33)
at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.ClassNotFoundException:
org.apache.spark.launcher.SparkLauncher
at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
... 2 more

It seems it cannot find the SparkLauncher class. Any clue to what I am
doing wrong?

Thanks & regards
Arko
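
This NoClassDefFoundError usually just means the launcher artifact is missing
from the application's classpath. A sketch of the sbt dependency, assuming the
Spark 1.5.1 installation referenced elsewhere in these threads:

    // build.sbt -- the version should match the installed Spark
    libraryDependencies += "org.apache.spark" %% "spark-launcher" % "1.5.1"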

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Using sbt assembly

2016-02-18 Thread Arko Provo Mukherjee
Hello,

I am trying to use sbt assembly to generate a fat JAR.

Here is my \project\assembly.sbt file:
resolvers += Resolver.url("bintray-sbt-plugins",
url("http://dl.bintray.com/sbt/sbt-plugin-releases;))(Resolver.ivyStylePatterns)

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.13.9")


However, when I run sbt assembly I get the error:
[error] (*:update) sbt.ResolveException: unresolved dependency:
com.eed3si9n#sbt-assembly;0.13.9: not found

Anyone faced this issue before?

Thanks & regards
Arko
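
For what it's worth, 0.13.9 looks like an sbt version rather than an
sbt-assembly release, which would explain the unresolved dependency. A sketch
of project/assembly.sbt with a plugin version that did exist at the time (the
explicit resolver is usually unnecessary, since sbt already searches the
standard plugin repositories):

    addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.14.1")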

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org



Re: Spark with .NET

2016-02-09 Thread Arko Provo Mukherjee
Doesn't seem to be supported, but thanks! I will probably write some .NET
wrapper in my front end and use the Java API in the backend.
Warm regards
Arko


On Tue, Feb 9, 2016 at 12:05 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> This thread is related:
> http://search-hadoop.com/m/q3RTtwp4nR1lugin1=+NET+on+Apache+Spark+
>
> On Tue, Feb 9, 2016 at 11:43 AM, Arko Provo Mukherjee <
> arkoprovomukher...@gmail.com> wrote:
>
>> Hello,
>>
>> I want to use Spark (preferable Spark SQL) using C#. Anyone has any
>> pointers to that?
>>
>> Thanks & regards
>> Arko
>>
>>
>


Spark with .NET

2016-02-09 Thread Arko Provo Mukherjee
Hello,

I want to use Spark (preferably Spark SQL) from C#. Does anyone have any
pointers to that?

Thanks & regards
Arko


Re: Spark with .NET

2016-02-09 Thread Arko Provo Mukherjee
Hello,

Thanks much for your help, very helpful! Let me explore some of the stuff
suggested :)

Thanks & regards
Arko


On Tue, Feb 9, 2016 at 3:17 PM, Ted Yu <yuzhih...@gmail.com> wrote:

> bq. it is a .NET assembly and not really used by SparkCLR
>
> Then maybe drop the import ?
>
> I was searching the SparkCLR repo to see whether (Spark) DataSet is
> supported.
>
> Cheer
>
> On Tue, Feb 9, 2016 at 3:07 PM, skaarthik oss <skaarthik@gmail.com>
> wrote:
>
>> *Arko* – you could use the following links to get started with SparkCLR
>> API and use C# with Spark for DataFrame processing. If you need the support
>> for interactive scenario, please feel free to share your scenario and
>> requirements to the SparkCLR project. Interactive scenario is one of the
>> focus areas of the current milestone in SparkCLR project.
>>
>> ·
>> https://github.com/Microsoft/SparkCLR/blob/master/examples/JdbcDataFrame/Program.cs
>>
>> ·
>> https://github.com/Microsoft/SparkCLR/blob/master/csharp/Samples/Microsoft.Spark.CSharp/DataFrameSamples.cs
>>
>>
>>
>>
>>
>> *Ted* – System.Data.DataSetExtensions is a reference that is
>> automatically added when a C# project is created in Visual Studio. As
>> Silvio pointed out below, it is a .NET assembly and not really used by
>> SparkCLR.
>>
>>
>>
>> *From:* Silvio Fiorito [mailto:silvio.fior...@granturing.com]
>> *Sent:* Tuesday, February 9, 2016 1:31 PM
>> *To:* Ted Yu <yuzhih...@gmail.com>; Bryan Jeffrey <
>> bryan.jeff...@gmail.com>
>> *Cc:* Arko Provo Mukherjee <arkoprovomukher...@gmail.com>; user <
>> user@spark.apache.org>
>>
>> *Subject:* Re: Spark with .NET
>>
>>
>>
>> That’s just a .NET assembly (not related to Spark DataSets) but doesn’t
>> look like they’re actually using it. It’s typically a default reference
>> pulled in by the project templates.
>>
>>
>>
>> The code though is available from Mono here:
>> https://github.com/mono/mono/tree/master/mcs/class/System.Data.DataSetExtensions
>>
>>
>>
>> *From: *Ted Yu <yuzhih...@gmail.com>
>> *Date: *Tuesday, February 9, 2016 at 3:56 PM
>> *To: *Bryan Jeffrey <bryan.jeff...@gmail.com>
>> *Cc: *Arko Provo Mukherjee <arkoprovomukher...@gmail.com>, user <
>> user@spark.apache.org>
>> *Subject: *Re: Spark with .NET
>>
>>
>>
>> Looks like they have some system support whose source is not in the repo:
>>
>> 
>>
>>
>>
>> FYI
>>
>>
>>
>> On Tue, Feb 9, 2016 at 12:17 PM, Bryan Jeffrey <bryan.jeff...@gmail.com>
>> wrote:
>>
>> Arko,
>>
>>
>> Check this out: https://github.com/Microsoft/SparkCLR
>>
>>
>>
>> This is a Microsoft authored C# language binding for Spark.
>>
>>
>>
>> Regards,
>>
>>
>>
>> Bryan Jeffrey
>>
>>
>>
>> On Tue, Feb 9, 2016 at 3:13 PM, Arko Provo Mukherjee <
>> arkoprovomukher...@gmail.com> wrote:
>>
>> Doesn't seem to be supported, but thanks! I will probably write some .NET
>> wrapper in my front end and use the java api in the backend.
>>
>> Warm regards
>>
>> Arko
>>
>>
>>
>>
>>
>> On Tue, Feb 9, 2016 at 12:05 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>>
>> This thread is related:
>>
>> http://search-hadoop.com/m/q3RTtwp4nR1lugin1=+NET+on+Apache+Spark+
>>
>>
>>
>> On Tue, Feb 9, 2016 at 11:43 AM, Arko Provo Mukherjee <
>> arkoprovomukher...@gmail.com> wrote:
>>
>> Hello,
>>
>>
>>
>> I want to use Spark (preferable Spark SQL) using C#. Anyone has any
>> pointers to that?
>>
>> Thanks & regards
>>
>> Arko
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>>
>
>


Using GraphX with Spark Streaming?

2014-10-03 Thread Arko Provo Mukherjee
Hello Spark Gurus,

I am trying to learn Spark. I am specially interested in GraphX.

Since Spark can be used in a streaming context as well, I wanted to know
whether it is possible to use the Spark toolkits like GraphX or MLlib
in the streaming context.

Apologies if this is a stupid question, but I am trying to understand
why this can or cannot be done. As far as I understand, streaming
algorithms need to be different from batch algorithms, since streaming
algorithms are generally incremental. Hence the question of whether the
RDD transformations can be extended to streaming or not.

Thanks much in advance for all the help!
Warm regards
Arko
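
For what it's worth, the batch toolkits are usable from Spark Streaming
because each micro-batch arrives as an ordinary RDD: inside foreachRDD (or
transform) that RDD can be handed to GraphX, used for MLlib scoring or
training, and so on, though the result is recomputed per batch rather than
maintained incrementally (MLlib's streaming model variants are the ones that
update incrementally). A minimal Scala sketch, with the socket source and edge
format as assumptions:

    import org.apache.spark.SparkConf
    import org.apache.spark.graphx.{Edge, Graph}
    import org.apache.spark.streaming.{Seconds, StreamingContext}

    val conf = new SparkConf().setAppName("graphx-per-batch-sketch")
    val ssc  = new StreamingContext(conf, Seconds(10))

    // hypothetical source emitting lines of "srcId dstId"
    val lines = ssc.socketTextStream("localhost", 9999)

    lines.foreachRDD { rdd =>
      val edges = rdd.map { line =>
        val Array(src, dst) = line.split(" ")
        Edge(src.toLong, dst.toLong, 1)
      }
      // a fresh graph per micro-batch, not an incrementally maintained one
      val graph = Graph.fromEdges(edges, defaultValue = 0)
      println(s"batch vertices: ${graph.numVertices}, edges: ${graph.numEdges}")
    }

    ssc.start()
    ssc.awaitTermination()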

-
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org