Re: Run spark tests on Windows/docker

2018-09-20 Thread Shmuel Blitz
Since I got no feedback, I'll try asking differently: Can anyone point me to any resources regarding how to run the project's tests? Where can I find a good Docker image that would serve as a YARN cluster for submitting jobs? Thanks, Shmuel On Sun, Sep 16, 2018 at 10:09 PM Shmuel Blitz wrote

Run spark tests on Windows/docker

2018-09-16 Thread Shmuel Blitz
on a docker linux container with the Spark build mounted from the host PC. Has anyone done this? Do you have a recommended docker image to work with? 4. any special consideration I should think of? Thanks, Shmuel -- Shmuel Blitz Big Data Developer Email: shmuel.bl...@similarweb.com

Re: Running Production ML Pipelines

2018-07-17 Thread Shmuel Blitz
gar...@gmail.com> wrote: > Hi all, > > Any suggestions on optimization of running ML Pipeline inference in a > webapp in a multi-tenant low-latency mode. > > Suggestions would be appreciated. > > Thank you! > -- Shmuel Blitz Big Data Developer Email: shmuel.bl...@

Re: Create an Empty dataframe

2018-07-08 Thread Shmuel Blitz
> > > > Thank you in advance. > > -- > Apostolos N. Papadopoulos, Associate Professor > Department of Informatics > Aristotle University of Thessaloniki > Thessaloniki, GREECE > tel: ++0030312310991918 > email: papad...@csd.auth.gr > twitter: @papadopoulos_ap > web: http:/

Re: Class cast exception while using Data Frames

2018-03-27 Thread Shmuel Blitz
tion: Caused by: java.lang.ClassCastException: >> org.apache.spark.sql.catalyst.expressions.GenericRowWithSchema cannot be >> cast to scala.Tuple2 >> >> Even the schema says that key is of type struct of (string, string). >> >> Any idea why this is happening?

Re: Re: the issue about the + in column,can we support the string please?

2018-03-26 Thread Shmuel Blitz
easy to understand. > > -- > 1427357...@qq.com > > > *From:* Shmuel Blitz <shmuel.bl...@similarweb.com> > *Date:* 2018-03-26 15:31 > *To:* 1427357...@qq.com > *CC:* spark?users <user@spark.apache.org>; dev <d...@spark.apache.org> > *Subject:* Re: the issue about the

Re: the issue about the + in column,can we support the string please?

2018-03-26 Thread Shmuel Blitz
"(${ctx.javaType(dataType)})($eval1 $symbol $eval2)") > case CalendarIntervalType => > defineCodeGen(ctx, ev, (eval1, eval2) => s"$eval1.add($eval2)") > case _ => > defineCodeGen(ctx, ev, (eval1, eval2) => s"$eval1 $symbol $eval2")

Re: Open sourcing Sparklens: Qubole's Spark Tuning Tool

2018-03-25 Thread Shmuel Blitz
e as we have > 750 cores available for the job. 200 is the default value for > spark.sql.shuffle.partitions. Alternatively you could try increasing the > value of spark.sql.shuffle.partitions to at least 750. > > thanks, > rohitk > > On Sun, Mar 25, 2018 at 1:25 PM, Shmuel Blit
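
For readers following the advice quoted above, here is a minimal sketch of how the setting could be passed on the command line. It only builds `spark-submit` arguments and does not start Spark. The helper name `shuffle_partition_args` and the rule of taking the larger of the default and the core count are assumptions made for illustration; they are not taken from the thread or from Sparklens. The `--conf key=value` form and the default of 200 for `spark.sql.shuffle.partitions` are standard Spark behavior.

```python
from typing import List

# Default value of spark.sql.shuffle.partitions, as stated in the thread.
DEFAULT_SHUFFLE_PARTITIONS = 200


def shuffle_partition_args(available_cores: int) -> List[str]:
    """Build spark-submit arguments for the shuffle partition count.

    Hypothetical helper: it picks the larger of the default and the
    number of available cores, so the setting is never lowered.
    """
    partitions = max(DEFAULT_SHUFFLE_PARTITIONS, available_cores)
    return ["--conf", f"spark.sql.shuffle.partitions={partitions}"]


if __name__ == "__main__":
    # With the 750 cores mentioned in the thread, print the resulting command prefix.
    print(" ".join(["spark-submit"] + shuffle_partition_args(750)))
```

The same value could instead be set once in `spark-defaults.conf`; which option fits depends on whether the change should apply to one job or to the whole cluster.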

Re: Open sourcing Sparklens: Qubole's Spark Tuning Tool

2018-03-25 Thread Shmuel Blitz
> Nice! > > Shmuel, Were you able to run on a cluster level or for a specific job? > > Did you configure it on the spark-defaults.conf? > > On Sun, 25 Mar 2018 at 10:34 Shmuel Blitz <shmuel.bl...@similarweb.com> > wrote: > >> Just to let you know, I have managed to r

Re: Open sourcing Sparklens: Qubole's Spark Tuning Tool

2018-03-25 Thread Shmuel Blitz
you compile the code against the right branch for Spark 1.6. >> >> I tested it and it looks working and now i'm testing the branch for a >> wide tests, Please use the branch for Spark 1.6 >> >> On Fri, Mar 23, 2018 at 12:43 AM, Shmuel Blitz < >> shmuel.bl...@similarweb.com

Re: Open sourcing Sparklens: Qubole's Spark Tuning Tool

2018-03-22 Thread Shmuel Blitz
Hi Rohit, Thanks for sharing this great tool. I tried running a Spark job with the tool, but it failed with an *IncompatibleClassChangeError* exception. I have opened an issue on GitHub (https://github.com/qubole/sparklens/issues/1). Shmuel On Thu, Mar 22, 2018 at 5:05 PM, Shmuel Blitz

Re: Open sourcing Sparklens: Qubole's Spark Tuning Tool

2018-03-22 Thread Shmuel Blitz
> project. It helps in understanding the scalability limits of spark >>>>> applications and can be a useful guide on the path towards tuning >>>>> applications for lower runtime or cost. >>>>> >>>>> Please clone from here: https://githu

Re: Spark Dataframe and HIVE

2018-02-11 Thread Shmuel Blitz
ble\":tru >>>>> e,\"metadata\":{\"name\":\"statecode\",\"scale\":0}},{\"name >>>>> \":\"Socialid\",\"type\":\"string\",\"nullable\":true,\"meta >>>>> data\":

Re: Spark Dataframe and HIVE

2018-02-10 Thread Shmuel Blitz
>> \"name\":\"latitude\",\"scale\":6}},{\"name\":\"longitude\", >>>>> \"type\":\"decimal(6,6)\",\"nullable\":true,\"metadata\":{ >>>>> \"name\":\"longitude\",\"

Re: Spark Tuning Tool

2018-01-24 Thread Shmuel Blitz
> >> for some interest in the community if people find this work interesting > and > >> would like to try to it out. > >> > >> thanks, > >> Rohit Karlupia > >> > >> > > > > - > To unsubscribe e-mail: user-unsubscr...@spark.apache.org > > -- Shmuel Blitz Big Data Developer Email: shmuel.bl...@similarweb.com www.similarweb.com

Spark 2.1 table loaded from Hive Metastore has null values

2017-08-07 Thread Shmuel Blitz
TIES. Why is this happening? How can it be fixed? -- Shmuel Blitz *Big Data Developer* www.similarweb.com