RE: we control spark file names before we write them - should we opensource it?

2020-06-08 Thread Stefan Panayotov
Yes, I think so. Stefan Panayotov, PhD spanayo...@outlook.com spanayo...@comcast.net spanayo...@gmail.com -Original Message- From: ilaimalka Sent: Monday, June 8, 2020 9:17 AM To: user@spark.apache.org Subject: we control spark file names before we write them - should we opensource

RE: spark 2 new stuff

2018-02-26 Thread Stefan Panayotov
To me Delta is very valuable. Stefan Panayotov, PhD spanayo...@outlook.com spanayo...@comcast.net Cell: 610-517-5586 From: Mich Talebzadeh <mich.talebza...@gmail.com> Sent: Monday, February 26, 2018 9:26 AM To:

RE: How to output RDD to one file with specific name?

2016-08-25 Thread Stefan Panayotov
You can do something like: dbutils.fs.cp("/foo/part-0","/foo/my-data.csv") Stefan Panayotov, PhD spanayo...@outlook.com spanayo...@comcast.net Cell: 610-517-5586 Home:
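The `dbutils.fs.cp` call above is Databricks-specific, but the pattern is general: Spark writes a directory of part-files, so to get one file with a chosen name you coalesce to a single partition, write, then copy the lone part-file. A minimal local-filesystem sketch of that pattern (the paths and helper name are illustrative, not from the thread):

```python
import glob
import os
import shutil
import tempfile

def copy_single_part(output_dir: str, target: str) -> str:
    """Copy the single part-file Spark wrote into output_dir to target.

    Mirrors the dbutils.fs.cp("/foo/part-0", "/foo/my-data.csv") idea;
    call df.coalesce(1).write... first so exactly one part-file exists.
    """
    parts = sorted(glob.glob(os.path.join(output_dir, "part-*")))
    if len(parts) != 1:
        raise ValueError("expected exactly one part-file; coalesce(1) first")
    shutil.copy(parts[0], target)
    return target

# Demo with a stand-in part-file (no Spark needed)
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "part-00000"), "w") as f:
    f.write("a,b\n1,2\n")
result = copy_single_part(tmp, os.path.join(tmp, "my-data.csv"))
print(result)
```

On HDFS the same rename/copy is done with the Hadoop `FileSystem` API rather than the local filesystem.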

adding rows to a DataFrame

2016-03-11 Thread Stefan Panayotov
achieve this in Spark without doing DF.collect(), which will get everything to the driver and for a big data set I'll get OOM? BTW, I know how to use withColumn() to add new columns to the DataFrame. I need to also add new rows. Any help will be appreciated. Thanks, Stefan Panayotov, PhD Home
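A Spark DataFrame has no append-row operation; the usual answer to the question above is to build a small DataFrame holding the new rows and union it with the original, which stays distributed and avoids `collect()`. A pure-Python sketch of the semantics (the lists stand in for distributed DataFrames; in Spark 1.x the call is `df.unionAll(newDF)`, later `df.union(newDF)`):

```python
# Rows of the existing DataFrame (stand-in for df contents)
existing = [(1, "a"), (2, "b")]

# New rows, built driver-side or from another source
new_rows = [(3, "c")]

# unionAll concatenates the two row sets, preserving order within each
# input and keeping duplicates -- here that is just list concatenation.
combined = existing + new_rows
print(combined)
```

Both sides of the union must have the same schema (column names and types), just as the two lists here share the same tuple shape.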

RE: Spark 1.5.2 memory error

2016-02-03 Thread Stefan Panayotov
ted containers from NM context: [container_1454509557526_0014_01_93] I'll appreciate any suggestions. Thanks, Stefan Panayotov, PhD Home: 610-355-0919 Cell: 610-517-5586 email: spanayo...@msn.com spanayo...@outlook.com spanayo...@comcast.net Date: Tue, 2 Feb 2016 15:40:10 -0800 Subject:

Spark 1.5.2 memory error

2016-02-02 Thread Stefan Panayotov
:47446/user/CoarseGrainedScheduler, executorHostname: 10.0.0.8 16/02/02 20:33:57 INFO yarn.YarnAllocator: Received 1 containers from YARN, launching executors on 1 of them. I'll really appreciate any help here. Thank you, Stefan Panayotov, PhD Home: 610-355-0919 Cell: 610-517-5586 email: s

RE: Spark 1.5.2 memory error

2016-02-02 Thread Stefan Panayotov
For the memoryOverhead I have the default of 10% of 16g, and Spark version is 1.5.2. Stefan Panayotov, PhD Sent from Outlook Mail for Windows 10 phone From: Ted Yu Sent: Tuesday, February 2, 2016 4:52 PM To: Jakob Odersky Cc: Stefan Panayotov; user@spark.apache.org Subject: Re: Spark 1.5.2
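For context on the "default of 10%" mentioned above: in Spark 1.x on YARN, `spark.yarn.executor.memoryOverhead` defaults to 10% of executor memory with a 384 MB floor, and YARN kills containers that exceed executor memory plus this overhead. A small sketch of that sizing rule (as documented for Spark 1.5; verify against your version's docs):

```python
def default_memory_overhead_mb(executor_memory_mb: int) -> int:
    # Spark 1.x default: max(384 MB, 10% of executor memory)
    return max(384, int(executor_memory_mb * 0.10))

def yarn_container_size_mb(executor_memory_mb: int) -> int:
    # YARN requests executor memory plus the overhead; off-heap usage
    # beyond the overhead gets the container killed by the NodeManager.
    return executor_memory_mb + default_memory_overhead_mb(executor_memory_mb)

print(default_memory_overhead_mb(16 * 1024))  # 16g executor
```

So a 16g executor gets roughly 1.6g of overhead by default; raising `spark.yarn.executor.memoryOverhead` explicitly is the common fix for these container-killed errors.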

RE: Python UDFs

2016-01-28 Thread Stefan Panayotov
Thanks, Jakob. But it seems that Python requires the RETURN Type to be specified. And DenseVector is not a valid return type, or I do not know the correct type to put in. Shall I try ArrayType? Any ideas? Stefan Panayotov, PhD Home: 610-355-0919 Cell: 610-517-5586 email: spanayo

Python UDFs

2016-01-27 Thread Stefan Panayotov
ray[Double] = Array(a,b) val dv = new DenseVector(data) dv }) How can I write the corresponding function in Python/PySpark? Thanks for your help Stefan Panayotov, PhD Home: 610-355-0919 Cell: 610-517-5586 email: spanayo...@msn.com spanayo...@outlook.com spanayo...@comcast.net
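The Scala UDF above packs two values into a `DenseVector`. In PySpark 1.x a UDF cannot declare `DenseVector` as its return type directly: you either register it with `VectorUDT()` (from `pyspark.mllib.linalg`) and return a `DenseVector`, or fall back to `ArrayType(DoubleType())` and return a plain list, as the follow-up message suggests. A sketch of the core function, with the hypothetical PySpark registration shown in comments (not executed here):

```python
def to_vector(a: float, b: float) -> list:
    # Returning a plain list of floats works with ArrayType(DoubleType());
    # return DenseVector([a, b]) instead if registering with VectorUDT().
    return [float(a), float(b)]

# Hypothetical PySpark 1.x registration (requires a SparkContext):
# from pyspark.sql.functions import udf
# from pyspark.sql.types import ArrayType, DoubleType
# vec_udf = udf(to_vector, ArrayType(DoubleType()))
# df.withColumn("features", vec_udf(df["a"], df["b"]))
print(to_vector(1, 2))
```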

RE: Spark SQL running totals

2015-10-16 Thread Stefan Panayotov
Thanks Deenar. This works perfectly. I can't test the solution with window functions because I am still on Spark 1.3.1. Hopefully we will move to 1.5 soon. Stefan Panayotov Sent from my Windows Phone From: Deenar Toraskar Sen

RE: Spark SQL running totals

2015-10-15 Thread Stefan Panayotov
Thanks to all of you guys for the helpful suggestions. I'll try these first thing tomorrow morning. Stefan Panayotov Sent from my Windows Phone From: java8964 Sent: 10/15/2015 4:30 PM To: Michael Armbrust

Spark SQL running totals

2015-10-15 Thread Stefan Panayotov
id  value  running_total
1   10     10
2   30     40
3   15     55
4   20     75
5   25     100
Is there a way to achieve this in Spark SQL or maybe with Data frame transformations? Thanks in advance, Stefan Panayotov, PhD Home: 610-355-0919 Cell: 610-517-5586 email
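A running total like this is `SUM(value) OVER (ORDER BY id)` once window functions are available (Spark 1.5+, as the replies in this thread note). Its semantics are just a cumulative sum, sketched here on the example values (10, 30, 15, 20, 25):

```python
from itertools import accumulate

ids = [1, 2, 3, 4, 5]
values = [10, 30, 15, 20, 25]

# Cumulative sum ordered by id -- the same result as
# SELECT id, value, SUM(value) OVER (ORDER BY id) FROM t
running = list(accumulate(values))
for i, v, r in zip(ids, values, running):
    print(i, v, r)
```

On Spark 1.3/1.4, without window functions, the usual workarounds were a self-join (summing all rows with id <= current id) or a HiveContext query, which the thread's suggestions reflect.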

HiveContext error

2015-08-05 Thread Stefan Panayotov
(console) Can anybody help please? Stefan Panayotov, PhD Home: 610-355-0919 Cell: 610-517-5586 email: spanayo...@msn.com spanayo...@outlook.com spanayo...@comcast.net

FW: Executing spark code in Zeppelin

2015-07-29 Thread Stefan Panayotov
Stefan Panayotov Sent from my Windows Phone From: Stefan Panayotov spanayo...@msn.com Sent: 7/29/2015 8:20 AM To: user-subscr...@spark.apache.org Subject: Executing spark code in Zeppelin Hi, I faced a problem

RE: Executing spark code in Zeppelin

2015-07-29 Thread Stefan Panayotov
to the Zeppelin community. Stefan Panayotov, PhD Home: 610-355-0919 Cell: 610-517-5586 email: spanayo...@msn.com spanayo...@outlook.com spanayo...@comcast.net From: silvio.fior...@granturing.com To: spanayo...@msn.com; user@spark.apache.org Subject: Re: Executing spark code in Zeppelin Date: Wed, 29

Zeppelin notebook question

2015-07-23 Thread Stefan Panayotov
the HiveContext tables in %sql? Thanks, Stefan Panayotov, Ph.D. email: spanayo...@msn.com home: 610-355-0919 cell: 610-517-5586

RE: Add column to DF

2015-07-21 Thread Stefan Panayotov
This is working! Thank you so much :) Stefan Panayotov, PhD Home: 610-355-0919 Cell: 610-517-5586 email: spanayo...@msn.com spanayo...@outlook.com spanayo...@comcast.net From: mich...@databricks.com Date: Tue, 21 Jul 2015 12:08:04 -0700 Subject: Re: Add column to DF To: spanayo...@msn.com

Add column to DF

2015-07-21 Thread Stefan Panayotov
you please help? Stefan Panayotov, PhD Home: 610-355-0919 Cell: 610-517-5586 email: spanayo...@msn.com spanayo...@outlook.com spanayo...@comcast.net

RE: import errors with Eclipse Scala

2015-07-01 Thread Stefan Panayotov
Thanks, Jem. I added scala-compiler.jar from C:\Eclipse\eclipse\plugins\org.scala-ide.scala210.jars_4.1.0.201505250838\target\jars And looks like this resolved the issue. Thanks once again. Stefan Panayotov, PhD Home: 610-355-0919 Cell: 610-517-5586 email: spanayo...@msn.com spanayo

RE: import errors with Eclipse Scala

2015-07-01 Thread Stefan Panayotov
Hi Ted, How can I import the relevant Spark projects into Eclipse? Do I need to add anything to the Java Build Path in the project properties? Also, I have installed sbt on my machine. Is there a corresponding sbt command to the maven command below? Stefan Panayotov, PhD Home: 610-355-0919

import errors with Eclipse Scala

2015-07-01 Thread Stefan Panayotov
org.apache.spark.sql._
import org.json4s._
import org.json4s.JsonDSL._
import org.json4s.jackson.JsonMethods
import org.json4s.jackson.JsonMethods._
All report errors of type: “object apache is not member of package org” or “object json4s is not member of package org” How can I resolve this? Thanks, Stefan